Perspective Multiscale Detection and Tracking of Persons
Abstract
The efficient detection and tracking of persons in videos has widespread applications, especially in CCTV systems for surveillance or forensics. In this paper we present a new method for people detection and tracking based on knowledge of the perspective information of the scene. It alleviates two main drawbacks of existing methods: (i) the high or even excessive computational cost associated with multiscale detection-by-classification methods; and (ii) the inherent difficulty of CCTV, in which partial and full occlusions as well as very high intra-class variability predominate. During the detection stage, we propose to use the homography of the dominant plane to compute the expected sizes of persons at different positions of the image and thus dramatically reduce the number of evaluations of the multiscale sliding window detection scheme. To achieve robustness against false positives and negatives, we have used a combination of full and upper-body detectors, as well as a Data Association Filter (DAF) inspired by the well-known Rao-Blackwellization-based particle filters (RBPF). Our experiments demonstrate the benefit of the proposed perspective multiscale approach compared to conventional sliding window approaches, and also that this perspective information can lead to useful mixes of full-body and upper-body detectors.
BIB_text
author = {Marcos Nieto and Juan Diego Ortega and Andoni Cortés and Seán Gaines},
title = {Perspective Multiscale Detection and Tracking of Persons},
pages = {92--103},
volume = {8326},
keywds = {
Object detection, machine learning, person detection, person tracking, homography, camera calibration
},
abstract = {
The efficient detection and tracking of persons in videos has widespread applications, especially in CCTV systems for surveillance or forensics. In this paper we present a new method for people detection and tracking based on knowledge of the perspective information of the scene. It alleviates two main drawbacks of existing methods: (i) the high or even excessive computational cost associated with multiscale detection-by-classification methods; and (ii) the inherent difficulty of CCTV, in which partial and full occlusions as well as very high intra-class variability predominate. During the detection stage, we propose to use the homography of the dominant plane to compute the expected sizes of persons at different positions of the image and thus dramatically reduce the number of evaluations of the multiscale sliding window detection scheme. To achieve robustness against false positives and negatives, we have used a combination of full and upper-body detectors, as well as a Data Association Filter (DAF) inspired by the well-known Rao-Blackwellization-based particle filters (RBPF). Our experiments demonstrate the benefit of the proposed perspective multiscale approach compared to conventional sliding window approaches, and also that this perspective information can lead to useful mixes of full-body and upper-body detectors.
},
isbn = {978-3-319-04116-2},
isi = {1},
date = {2014-01-08},
year = {2014},
}