SPEVI Datasets
This page provides publicly available benchmark datasets for testing and
evaluating target tracking algorithms for surveillance-related
applications. The datasets are free for research and educational purposes
only and can be used in scientific publications on the condition that the
requested citation acknowledgment is respected.
Because the accuracy of target tracking algorithms is highly data
dependent, they need to be evaluated with large test corpora containing
significant statistical data variability. Current test sets used for target
tracking evaluation are generally composed of a limited number of data
items. This limitation is due to two main reasons: (i) the generation of
ground-truth data is a highly time-consuming and tedious task and (ii)
audiovisual data involving people and their property (e.g., vehicles) are
not easily distributed due to privacy issues. To complement the existing
datasets (links at the bottom of this page), this page distributes
additional data and their associated ground truth.
If you want to contribute to this dataset, please contact us at info@spevi.org.
New - PFT: A protocol for evaluating video trackers
Audiovisual people dataset
This is a dataset for uni-modal and multi-modal (audio and visual) people
detection and tracking. The dataset consists of three sequences recorded in
different scenarios with a video camera and two microphones. Two sequences
(motinas_Room160 and motinas_Room105) are recorded in rooms with
reverberations. The third sequence (motinas_Chamber) is recorded in a room
with reduced reverberations.
Sensor details
- The camera is placed in the centre of a bar that supports two microphones
- Distance between the microphones: 95 cm
- Microphones: Beyerdynamic MCE 530 condenser microphones
- Camera: KOBI KF-31CD analog CCD surveillance camera
Sample frames
motinas_Room160 | motinas_Room105 | motinas_Chamber
Data details
- Location of recording: Department of Electronic Engineering - Queen Mary, University of London
- Number of sequences: 3
- Total number of images: 3271
- Format of images: 8-bit color AVI
- Image size: 360 x 288 pixels
- Video sampling rate: 25 Hz
- Audio sampling rate: 44.1 kHz
Ground truth
The ground truth data are provided together with the sequences in the
corresponding .zip file, as a list of XML files representing the positions
of the objects in the field of view.
Requested citation acknowledgment
Courtesy of EPSRC funded MOTINAS project (EP/D033772/1)
Point of contact
Murtaza Taj, murtaza.taj[at]elec.qmul.ac.uk
Download
To download this dataset click here
Single face dataset
This is a dataset for single person/face visual detection and tracking.
The dataset is composed of five sequences with different illumination
conditions and resolutions. Three sequences (motinas_toni,
motinas_toni_change_ill and motinas_nikola_dark) are shot with a hand-held
camera (JVC GR-20EK). In motinas_toni the target moves under a constant
bright illumination; in motinas_toni_change_ill the illumination changes
from dark to bright; the sequence motinas_nikola_dark is constantly dark.
Two sequences (motinas_emilio_webcam and motinas_emilio_webcam_turning) are
shot with a webcam (Logitech Quickcam) under a fairly constant
illumination.
Sample frames
motinas_toni | motinas_toni_change_ill | motinas_nikola_dark
motinas_emilio_webcam | motinas_emilio_webcam_turning
Sensor details
- Video camera: JVC GR-20EK and Logitech Quickcam
Data details
- Location of recording: Department of Electronic Engineering - Queen Mary, University of London
- Number of sequences: 5
- Total number of images: 3018
- Format of images: DivX 6 compression
- Image size and sampling rate: 640 x 480 pixels, 25 Hz (motinas_toni,
motinas_toni_change_ill,
motinas_nikola_dark)
- Image size and sampling rate: 320 x 240 pixels, 10 Hz (motinas_emilio_webcam
and motinas_emilio_webcam_turning)
Target initialization
The target initialization parameters (the parameters of an ellipse around
the face) are provided in the .zip files together with the sequences.
Ground truth
The ground truth data is available in the .zip files for the sequences
motinas_toni and motinas_emilio_webcam. In the ground truth files each line
of text describes the objects' position and size in a frame. The syntax of
a line is the following:
frame number_of_objects obj_1_name x y half_width half_height angle obj_2_name x y half_width half_height angle ...
Example: first line of a ground truth file in the single object case:
1 1 man 172 77 29 36 -5
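As a minimal sketch of how a line in this format could be read, the following Python function splits a line into the frame number and one record per object. The names TrackedObject and parse_ground_truth_line are hypothetical and are not part of the dataset. The sketch assumes that fields are separated by whitespace, that object names contain no spaces, and that the numeric fields can be read as floating-point values.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Each object contributes: name, x, y, half_width, half_height, angle.
FIELDS_PER_OBJECT = 6


@dataclass
class TrackedObject:
    """One object entry of a ground truth line (hypothetical helper type)."""
    name: str
    x: float
    y: float
    half_width: float
    half_height: float
    angle: float


def parse_ground_truth_line(line: str) -> Tuple[int, List[TrackedObject]]:
    """Parse 'frame number_of_objects obj_1_name x y half_width half_height angle ...'."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected at least a frame number and an object count")

    frame = int(tokens[0])
    count = int(tokens[1])
    fields = tokens[2:]

    # Assumption: the declared object count must match the number of fields.
    if len(fields) != count * FIELDS_PER_OBJECT:
        raise ValueError(
            f"frame {frame}: expected {count * FIELDS_PER_OBJECT} object fields, "
            f"found {len(fields)}"
        )

    objects = []
    for i in range(count):
        start = i * FIELDS_PER_OBJECT
        name, *values = fields[start:start + FIELDS_PER_OBJECT]
        x, y, half_width, half_height, angle = (float(v) for v in values)
        objects.append(TrackedObject(name, x, y, half_width, half_height, angle))
    return frame, objects


# The single-object example line given above.
frame, objects = parse_ground_truth_line("1 1 man 172 77 29 36 -5")
print(frame, objects[0])
```

Checking the declared object count against the number of fields is a design choice of this sketch: it makes a truncated or malformed line fail loudly instead of silently dropping an object.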
Requested citation acknowledgment
E. Maggio, A. Cavallaro, "Hybrid particle filter and mean shift tracker
with adaptive transition model", in Proc. of IEEE Int. Conference on
Acoustics, Speech and Signal Processing (ICASSP 2005), Philadelphia,
19-23 March 2005, pp. 221-224.
Point of contact
Emilio Maggio, emilio.maggio[at]elec.qmul.ac.uk
Download
To download this dataset click here
Multiple faces dataset
This is a dataset for multiple people/faces visual detection and
tracking. The dataset is composed of 3 sequences (same scenario); 4
targets repeatedly occlude each other while appearing and disappearing
from the field of view of the camera. The sequence
motinas_multi_face_frontal shows frontal faces only; in
motinas_multi_face_turning the faces are frontal and rotated; in
motinas_multi_face_fast the targets move faster than in the previous two
sequences.
Sample frames
motinas_multi_face_frontal | motinas_multi_face_turning | motinas_multi_face_fast
Sensor details
- Video camera: JVC GR-20EK
Data details
- Location of recording: Department of Electronic Engineering - Queen Mary, University of London
- Number of sequences: 3
- Total number of images: 2769
- Format of images: DivX 6 compression
- Image size: 640 x 480 pixels
- Sampling rate: 25 Hz
Target initialization
The target initialization parameters (the parameters of an ellipse around
the face) are provided in the .zip files together with the sequences.
Requested citation acknowledgment
E. Maggio, E. Piccardo, C. Regazzoni, A. Cavallaro, "Particle PHD filter
for multi-target visual tracking", in Proc. of IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP 2007),
Honolulu (USA), April 15-20, 2007.
Point of contact
Emilio Maggio, emilio.maggio[at]elec.qmul.ac.uk
Download
To download this dataset click here
"i-Lids (AVSS 2007)" bag and vehicle detection challenge
This is a dataset for event detection in CCTV footage and is a subset of
the i-Lids dataset. The events of interest appearing in the dataset are
abandoned baggage (Task 1) and parked vehicle (Task 2). The description
of the tasks can be found here. Please refer to the description of the
i-Lids bag and vehicle detection challenge for the submission procedure.
Download
To download this dataset click here
The "i-Lids (AVSS 2007)" Evaluation dataset can be found here
Annotation Tool
ViPER
Other datasets
Name: cVSG Dataset
Description: The Chroma Video Segmentation Ground Truth (cVSG) is a corpus
of video sequences and segmentation masks. Chroma-based techniques were
first used to acquire foregrounds and backgrounds separately, which were
then combined to form video sequences. Sequences have been selected to
ensure different complexities.
Download: http://www-vpu.ii.uam.es/CVSG/
Name: OTCBVS Dataset
Description: Videos and images recorded in and beyond the visible spectrum (faces and people)
Download: http://www.cse.ohio-state.edu/otcbvs-bench/
Name: PETS 2001 Dataset
Description: Two-view monitoring of a campus site (people and vehicles)
Download: http://www.cvg.cs.rdg.ac.uk/cgi-bin/PETSMETRICS/page.cgi?dataset
Name: PETS 2006 Dataset
Description: Person and baggage detection in a train station
Download: http://www.cvg.rdg.ac.uk/PETS2006/data.html
Name: VIVID PETS 2005 Dataset
Description: Aerial footage (vehicles)
Download: http://www.vividevaluation.ri.cmu.edu/datasets/datasets.html
Name: AMI Corpora
Description: Meeting room scenarios, with two people sitting around meeting tables
Download: http://corpus.amiproject.org/amicorpus/download/download
Name: PETS 2000
Description: Outdoor people and vehicle tracking (single camera)
Download: ftp://ftp.pets.rdg.ac.uk/pub/PETS2000/
Name: PETS 2002
Description: Moving people
Download: http://www.cvg.cs.rdg.ac.uk/PETS2002/pets2002-db.html
Name: VS - PETS 2003 – INMOVE
Description: Outdoor people tracking - football data (three synchronised views)
Download: http://www.cvg.cs.rdg.ac.uk/VSPETS/vspets-db.html
Name: PETS - ECCV 2004 – CAVIAR
Description: A number of video clips were recorded acting out the different
scenarios of interest. These include people walking alone, meeting with
others, window shopping, fighting and passing out, and abandoned luggage.
Download: http://groups.inf.ed.ac.uk/vision/CAVIAR/CAVIARDATA1/
Name: VISOR Dataset
Description: Multicamera outdoor and indoor scenarios
Download: http://imagelab.ing.unimore.it/visor/video_categories.asp
Name: NGSIM
Description: Detailed vehicle trajectory data on parts of highways
Download: http://ngsim.fhwa.dot.gov/modules.php?op=modload&name=News&file=article&sid=4
Name: IBM
Description: 4 outdoor clips (from PETS2001) of people and vehicles and 11 indoor clips of people
Download: http://domino.research.ibm.com/comm/research_projects.nsf/pages/s3.performanceevaluation.html
Name: CANDELA
Description: "Indoor abandoned object" and "road intersection"
Download: http://www.multitel.be/~va/candela/
Name: Traffic datasets
Description: Traffic databases
Download: http://i21www.ira.uka.de/image_sequences/
Name: WAMOP-PETS'2005
Description: Scenes on water
Download: http://www.vast.uccs.edu/~tboult/PETS05/
Name: DaimlerChrysler Pedestrian Classification Benchmark Dataset
Description: Collection of pedestrian and non-pedestrian images
Download: http://www.gavrila.net/Computer_Vision/Research/Pedestrian_Detection/DC_Pedestrian_Class__Benchmark/dc_pedestrian_class__benchmark.html
Name: PETS-ICVS'2003 – Fgnet
Description: Smart meeting that includes facial expressions, gaze and gesture/action
Download: http://www.cvg.cs.rdg.ac.uk/PETS-ICVS/pets-icvs-db.html
Name: ViHASi dataset
Description: Virtual Human Action Silhouette Data
Download: http://dipersec.king.ac.uk/VIHASI/
Name: MuHAVi dataset
Description: Multicamera Human Action Video Data
Download: http://dipersec.king.ac.uk/MuHAVi-MAS/
Name: PLIA2
Description: Set of common household activities performed during a four-hour period using a set of instructions
Download: http://architecture.mit.edu/house_n/data/PlaceLab/PLIA2.htm
Name: KTH data set
Description: Six types of human actions (walking, jogging, running, boxing,
hand waving and hand clapping) performed several times by 25 subjects in
four different scenarios
Download: http://www.nada.kth.se/cvap/actions/
Name: Weizmann dataset
Description: Actions such as walk, run, jump, gallop sideways, bend,
one-hand wave, two-hands wave, jump in place, jumping jack, skip
Download: http://www.wisdom.weizmann.ac.il/~vision/SpaceTimeActions.html
Name: FRGC dataset
Description: The FRGC database is a collection of biometric images with
both 2D and 3D information. It has 50,000 recordings divided into training
and validation partitions.
Download: http://face.nist.gov/frgc/
Name: 3D_RMA
Description: This database has been acquired in the framework of the M2VTS
project and contains 6 3D scans of 120 individuals
Download: http://www.sic.rma.ac.be/~beumier/DB/3d_rma.html
Name: IPPR Contest motion segmentation dataset
Description:
Download: http://media.ee.ntu.edu.tw/Archer_contest/