Multi-modal object tracking in
a network of audiovisual sensors (MOTINAS)
The aim of this project was to
develop a unified scheme for cooperative multi-modal and
multi-sensor tracking. The multi-sensor network is composed of
STAC sensors (stereo microphones coupled with rectilinear,
omni-directional or pan-tilt-zoom cameras). Sound information is
used to discriminate ambiguous visual observations as well as to
extend the coverage area of the sensors beyond the field of view
of the cameras. Although single-modality as well as
multi-modality trackers have achieved some success, a number of
important tracking issues remain open for enabling the adoption
of these algorithms in real-world scenarios. Among these issues,
three important inter-related problems were addressed in this
project, namely the definition of a generic and flexible feature
representation for a target, a reliable mechanism to update the
target model based on incoming observations, and a robust
multi-sensor handover strategy. First, we developed a robust and
adaptive representation of objects based on their acoustical and
visual attributes as they move across the network of
heterogeneous sensors. Next, a flexible object model was
defined based on multiple features and their weighting over
time. The object model is used to control and guide the
evolution of the target state in order to help intra-sensor
occlusion handling and inter-sensor handover. To evaluate the
tracking scheme, we created a test corpus and its associated
ground-truth data for use in the project as well as for
distribution to the research community through the website
http://www.spevi.org to facilitate comparisons.
Related journal papers
1. H. Zhou, M. Taj and A. Cavallaro, ''Target detection and
tracking with heterogeneous sensors'', IEEE Journal on Selected
Topics in Signal Processing, Vol. 2, Issue 4, August 2008,
pp.503-513
2. E. Maggio, M. Taj and A. Cavallaro, ''Efficient multi-target
visual tracking using Random Finite Sets'', IEEE Trans. on
Circuits and Systems for Video Technology, Vol. 18, Issue 8,
Aug. 2008, pp.1016-1027
3. N. Anjum, A. Cavallaro, ''Multi-feature trajectory clustering
using mean shift'', IEEE Trans. on Circuits and Systems for
Video Technology, Vol. 18, Issue 11, Nov. 2008, pp. 1555-1564
4. S. Karlsson, M. Taj and A. Cavallaro, ''Detection and
tracking of humans and faces'', EURASIP Journal on Image and
Video Processing, Article ID 526191, doi:10.1155/2008/526191,
Vol. 2008 (2008)
5. E. Maggio, F. Smeraldi, A. Cavallaro, ''Adaptive
multi-feature tracking in a particle filtering framework'', IEEE
Trans. on Circuits and Systems for Video Technology, Vol. 17,
Issue 10, Oct. 2007, pp.1348-1359
6. G. Kayumbi and A. Cavallaro, ''Multi-view trajectory mapping
using homography with lens distortion correction'', EURASIP
Journal on Image and Video Processing (in press)
7. E. Maggio and A. Cavallaro, ''Accurate appearance-based
Bayesian tracking for maneuvering targets'', Computer Vision and
Image Understanding, (accepted, with minor revision)
Related conference papers
1. G. Kayumbi, N. Anjum, A. Cavallaro, ''Global trajectory
reconstruction from distributed visual sensors'', Proc. of ACM /
IEEE Int. Conference on Distributed Smart Cameras (ICDSC),
Stanford, California (USA), 7-11 September, 2008
2. M. Taj and A. Cavallaro, ''Object and scene-centric activity
detection using state occupancy duration modelling'', Proc. of
IEEE Int. Conference on Advanced Video and Signal based
Surveillance, Santa Fe, New Mexico (USA), September 1-3, 2008
3. G. Kayumbi and A. Cavallaro, ''Robust homography-based
trajectory transformation for multi-camera scene analysis'',
Proc. of ACM / IEEE Int. Conference on Distributed
Smart Cameras (ICDSC), Vienna (A), 25-28 September, 2007
4. H. Zhou, M. Taj, A. Cavallaro, ''Audiovisual tracking using
STAC sensors'', Proc. of ACM / IEEE Int. Conference on
Distributed Smart Cameras (ICDSC), Vienna (A),
25-28 September, 2007
5. M. Taj and A. Cavallaro, "Multi-camera scene analysis using
an object-centric continuous distribution Hidden Markov Model",
Proc. of IEEE Int. Conf. on Image Processing (ICIP), San
Antonio, Texas (USA), 16-19 September, 2007
6. M. Bregonzio, M. Taj, A. Cavallaro, "Multi-modal particle
filtering tracking using appearance, motion and audio
likelihoods", Proc. of IEEE Int. Conf. on Image Processing (ICIP),
San Antonio, Texas (USA), 16-19 September, 2007
7. N. Anjum and A. Cavallaro, "Single camera calibration for
trajectory-based behaviour analysis", Proc. of IEEE Int.
Conference on Advanced Video and Signal based Surveillance,
London (UK), 5-7 September, 2007
8. N. Anjum, M. Taj, A. Cavallaro, "Relative position estimation
of non-overlapping cameras", Proc. of IEEE Int. Conference on
Acoustics, Speech, and Signal Processing (ICASSP), Honolulu
(USA), April 15-20, 2007
9. G. Monaci, P. Vandergheynst, E. Maggio, A. Cavallaro,
"Tracking atoms with particles for audio-visual source
localization", Proc. of IEEE Int. Conf. on Acoustics, Speech,
and Signal Processing (ICASSP), Honolulu (USA), April 15-20, 2007
10. E. Maggio, E. Piccardo, C. Regazzoni, A. Cavallaro,
"Particle PHD filter for multi-target visual tracking", Proc. of
IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP),
Honolulu (USA), April 15-20, 2007
11. F. Ziliani, A. Cavallaro, "Evaluation of multi-sensor
surveillance event detectors", IET Conference on Imaging for
Crime Detection and Prevention (ICDP 2006), London, UK, 2006
Website for the distribution of the datasets: http://www.spevi.org
(11711 hits, 1130 unique visitors)
Related editorials
1. C. Regazzoni, A. Cavallaro, F. Porikli, ''Video Tracking in
Complex Scenes for Surveillance'', EURASIP Journal on Image and
Video Processing, (in press).
2. A. Cavallaro, ''Multi-sensor object detection and tracking'',
Signal, Image and Video Processing, Vol. 1, No. 2, June 2007
Related book
H. Aghajan, A. Cavallaro, ''Multi-Camera Networks: Concepts and
Applications'', Elsevier, ISBN: 978-0-12-374633-7, May 2009
Related events
- ECCV Workshop on "Multi-camera and Multi-modal Sensor Fusion:
Algorithms and Applications"
Marseille, France - in conjunction with the 10th European
Conference on Computer Vision
http://www.elec.qmul.ac.uk/staffinfo/andrea/M2SFA2.html