Caltech Pedestrian
The Caltech Pedestrian Dataset consists of approximately 10 hours of 640x480 30 Hz video taken from a vehicle driving through regular traffic in an urban environment. About 250,000 frames (in 137 segments, each approximately one minute long) with a total of 350,000 bounding boxes and 2,300 unique pedestrians were annotated.
The annotation includes temporal correspondence between bounding boxes and detailed occlusion labels.
Use cases
Pedestrian detection in images/videos.
Properties
- name: caltech_pedestrian
- keywords: image_processing, detection, pedestrian
- dataset size: 11.9 GB
- is downloadable: yes
- tasks:
    - detection: (default)
        - primary use: object detection
        - description: Contains image filenames, classes and bounding box annotations for pedestrian detection in images/videos.
        - sets: train, test
        - metadata file size on disk: 728.4 kB
        - has annotations: yes
            - which:
                - labels for each class/category.
                - bounding box of pedestrians.
                - occlusion % of annotated pedestrians.
    - detection_10x:
        - primary use: object detection
        - description: Contains image filenames, classes and bounding box annotations for pedestrian detection in images/videos.
        - sets: train, test
        - metadata file size on disk: 6.2 MB
        - has annotations: yes
            - which:
                - labels for each class/category.
                - bounding box of pedestrians.
                - occlusion % of annotated pedestrians.
    - detection_30x:
        - primary use: object detection
        - description: Contains image filenames, classes and bounding box annotations for pedestrian detection in images/videos.
        - sets: train, test
        - metadata file size on disk: 17.4 MB
        - has annotations: yes
            - which:
                - labels for each class/category.
                - bounding box of pedestrians.
                - occlusion % of annotated pedestrians.
Metadata structure (HDF5)
Task: detection
/
├── train/
│   ├── image_filenames # dtype=np.uint8, shape=(4250,90) (note: string in ASCII format)
│   ├── classes # dtype=np.uint8, shape=(4,10) (note: string in ASCII format)
│   ├── boxes # dtype=np.float, shape=(6313,4)
│   ├── boxesv # dtype=np.float, shape=(6313,4)
│   ├── id # dtype=np.int32, shape=(6313,)
│   ├── occlusion # dtype=np.float, shape=(6313,)
│   ├── object_fields # dtype=np.uint8, shape=(6,16) (note: string in ASCII format)
│   ├── object_ids # dtype=np.int32, shape=(6313,6)
│   ├── list_image_filenames_per_class # dtype=np.int32, shape=(4,5033)
│   ├── list_boxes_per_image # dtype=np.int32, shape=(4250,22)
│   ├── list_boxesv_per_image # dtype=np.int32, shape=(4250,22)
│   ├── list_object_ids_per_image # dtype=np.int32, shape=(4250,22)
│   └── list_objects_ids_per_class # dtype=np.int32, shape=(4,5033)
│
└── test/
    ├── image_filenames # dtype=np.uint8, shape=(4024,90) (note: string in ASCII format)
    ├── classes # dtype=np.uint8, shape=(4,10) (note: string in ASCII format)
    ├── boxes # dtype=np.float, shape=(5109,4)
    ├── boxesv # dtype=np.float, shape=(5109,4)
    ├── id # dtype=np.int32, shape=(5109,)
    ├── occlusion # dtype=np.float, shape=(5109,)
    ├── object_fields # dtype=np.uint8, shape=(6,16) (note: string in ASCII format)
    ├── object_ids # dtype=np.int32, shape=(5109,6)
    ├── list_image_filenames_per_class # dtype=np.int32, shape=(4,2010)
    ├── list_boxes_per_image # dtype=np.int32, shape=(4024,13)
    ├── list_boxesv_per_image # dtype=np.int32, shape=(4024,13)
    ├── list_object_ids_per_image # dtype=np.int32, shape=(4024,13)
    └── list_objects_ids_per_class # dtype=np.int32, shape=(4,4371)
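The shapes listed for the train set follow a visible pattern: per-object fields share the first dimension of boxes, per-image lists share the first dimension of image_filenames, and per-class lists share the first dimension of classes. The sketch below checks that pattern on a plain dictionary of shapes. The function name check_shapes and the grouping of fields are assumptions made for illustration, not part of any documented API.

```python
def check_shapes(shapes):
    """Return a list of problems found in a {field name: shape tuple} dict."""
    groups = (
        # (fields, reference field whose first dimension they should match)
        (["boxesv", "id", "occlusion", "object_ids"], "boxes"),
        (["list_boxes_per_image", "list_boxesv_per_image",
          "list_object_ids_per_image"], "image_filenames"),
        (["list_image_filenames_per_class",
          "list_objects_ids_per_class"], "classes"),
    )
    problems = []
    for fields, reference in groups:
        expected = shapes[reference][0]
        for name in fields:
            if shapes[name][0] != expected:
                problems.append(
                    f"{name}: first dim {shapes[name][0]} != {expected} ({reference})"
                )
    # Each object_ids row should hold one index per entry of object_fields.
    if shapes["object_ids"][1] != shapes["object_fields"][0]:
        problems.append("object_ids: column count differs from object_fields")
    return problems


# Shapes copied from the train set of the detection task.
train_shapes = {
    "image_filenames": (4250, 90),
    "classes": (4, 10),
    "boxes": (6313, 4),
    "boxesv": (6313, 4),
    "id": (6313,),
    "occlusion": (6313,),
    "object_fields": (6, 16),
    "object_ids": (6313, 6),
    "list_image_filenames_per_class": (4, 5033),
    "list_boxes_per_image": (4250, 22),
    "list_boxesv_per_image": (4250, 22),
    "list_object_ids_per_image": (4250, 22),
    "list_objects_ids_per_class": (4, 5033),
}
print(check_shapes(train_shapes))  # []
```

An empty list means the shapes are mutually consistent under the assumed grouping; the same check could be run on shapes read from the actual HDF5 file.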
Fields
- image_filenames: image file paths and names
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
- classes: class names
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
- boxes: bounding boxes
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
    - note: bbox format (x1,y1,x2,y2)
- boxesv: bounding boxes (visible)
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
    - note: bbox format (x1,y1,x2,y2)
- id: label ids
    - available in: train, test
    - dtype: np.int32
    - is padded: False
    - fill value: -1
- occlusion: occlusion percentage
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
- object_fields: list of field names of the object id list
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
    - note: key field (field name aggregator)
- object_ids: list of field ids
    - available in: train, test
    - dtype: np.int32
    - is padded: False
    - fill value: -1
    - note: key field (field id aggregator)
- list_image_filenames_per_class: list of images per class
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_boxes_per_image: list of bounding boxes per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_boxesv_per_image: list of (visible) bounding boxes per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_object_ids_per_image: list of object ids per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_objects_ids_per_class: list of object ids per class
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
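String fields such as image_filenames and classes are stored as rows of ASCII codes padded with the fill value 0. A minimal sketch of turning such rows back into Python strings follows. The helper name decode_ascii_rows and the example values are hypothetical, and the sketch assumes the padding sits at the end of each row.

```python
def decode_ascii_rows(rows, fill_value=0):
    """Convert rows of ASCII codes, right-padded with fill_value, to strings."""
    strings = []
    for row in rows:
        codes = [int(c) for c in row]
        # Drop trailing padding only (assumption: padding is at the end).
        while codes and codes[-1] == fill_value:
            codes.pop()
        strings.append(bytes(codes).decode("ascii"))
    return strings


# Hypothetical stand-in for two rows of a (N, 10) uint8 `classes` array.
example_classes = [
    [112, 101, 114, 115, 111, 110, 0, 0, 0, 0],
    [112, 101, 111, 112, 108, 101, 0, 0, 0, 0],
]
print(decode_ascii_rows(example_classes))  # ['person', 'people']
```

The function iterates over rows, so it should accept nested lists as well as 2-D integer arrays read from the HDF5 file.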
Task: detection_10x
/
├── train/
│   ├── image_filenames # dtype=np.uint8, shape=(42782,90) (note: string in ASCII format)
│   ├── classes # dtype=np.uint8, shape=(4,10) (note: string in ASCII format)
│   ├── boxes # dtype=np.float, shape=(63538,4)
│   ├── boxesv # dtype=np.float, shape=(63538,4)
│   ├── id # dtype=np.int32, shape=(63538,)
│   ├── occlusion # dtype=np.float, shape=(63538,)
│   ├── object_fields # dtype=np.uint8, shape=(6,16) (note: string in ASCII format)
│   ├── object_ids # dtype=np.int32, shape=(63538,6)
│   ├── list_image_filenames_per_class # dtype=np.int32, shape=(4,20422)
│   ├── list_boxes_per_image # dtype=np.int32, shape=(42782,22)
│   ├── list_boxesv_per_image # dtype=np.int32, shape=(42782,22)
│   ├── list_object_ids_per_image # dtype=np.int32, shape=(42782,22)
│   └── list_objects_ids_per_class # dtype=np.int32, shape=(4,50605)
│
└── test/
    ├── image_filenames # dtype=np.uint8, shape=(40465,90) (note: string in ASCII format)
    ├── classes # dtype=np.uint8, shape=(4,10) (note: string in ASCII format)
    ├── boxes # dtype=np.float, shape=(51079,4)
    ├── boxesv # dtype=np.float, shape=(51079,4)
    ├── id # dtype=np.int32, shape=(51079,)
    ├── occlusion # dtype=np.float, shape=(51079,)
    ├── object_fields # dtype=np.uint8, shape=(6,16) (note: string in ASCII format)
    ├── object_ids # dtype=np.int32, shape=(51079,6)
    ├── list_image_filenames_per_class # dtype=np.int32, shape=(4,20173)
    ├── list_boxes_per_image # dtype=np.int32, shape=(40465,14)
    ├── list_boxesv_per_image # dtype=np.int32, shape=(40465,14)
    ├── list_object_ids_per_image # dtype=np.int32, shape=(40465,14)
    └── list_objects_ids_per_class # dtype=np.int32, shape=(4,43748)
Fields
- image_filenames: image file paths and names
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
- classes: class names
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
- boxes: bounding boxes
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
    - note: bbox format (x1,y1,x2,y2)
- boxesv: bounding boxes (visible)
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
    - note: bbox format (x1,y1,x2,y2)
- id: label ids
    - available in: train, test
    - dtype: np.int32
    - is padded: False
    - fill value: -1
- occlusion: occlusion percentage
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
- object_fields: list of field names of the object id list
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
    - note: key field (field name aggregator)
- object_ids: list of field ids
    - available in: train, test
    - dtype: np.int32
    - is padded: False
    - fill value: -1
    - note: key field (field id aggregator)
- list_image_filenames_per_class: list of images per class
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_boxes_per_image: list of bounding boxes per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_boxesv_per_image: list of (visible) bounding boxes per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_object_ids_per_image: list of object ids per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_objects_ids_per_class: list of object ids per class
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
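The list_*_per_image fields are padded to a fixed width with the fill value -1, so padded entries have to be skipped before use. The sketch below collects the boxes of one image. It assumes that each entry of list_boxes_per_image is a 0-based row index into boxes, which the field descriptions suggest but do not state outright. The function name boxes_for_image and the toy data are hypothetical.

```python
def boxes_for_image(image_idx, list_boxes_per_image, boxes, fill_value=-1):
    """Return the bounding boxes of one image, skipping padded entries."""
    row = list_boxes_per_image[image_idx]
    # Assumption: each non-padding entry is a 0-based row index into `boxes`.
    return [list(boxes[int(i)]) for i in row if int(i) != fill_value]


# Hypothetical toy data: 3 boxes in (x1, y1, x2, y2) format over 2 images,
# with index rows padded to width 3.
toy_boxes = [
    [10.0, 20.0, 50.0, 120.0],
    [60.0, 25.0, 90.0, 110.0],
    [15.0, 30.0, 40.0, 100.0],
]
toy_list_boxes_per_image = [
    [0, 1, -1],
    [2, -1, -1],
]
print(boxes_for_image(0, toy_list_boxes_per_image, toy_boxes))
print(boxes_for_image(1, toy_list_boxes_per_image, toy_boxes))
```

The same filtering should apply to list_boxesv_per_image and list_object_ids_per_image, since they share the padding convention.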
Task: detection_30x
/
├── train/
│   ├── image_filenames # dtype=np.uint8, shape=(128419,90) (note: string in ASCII format)
│   ├── classes # dtype=np.uint8, shape=(4,10) (note: string in ASCII format)
│   ├── boxes # dtype=np.float, shape=(190598,4)
│   ├── boxesv # dtype=np.float, shape=(190598,4)
│   ├── id # dtype=np.int32, shape=(190598,)
│   ├── occlusion # dtype=np.float, shape=(190598,)
│   ├── object_fields # dtype=np.uint8, shape=(6,16) (note: string in ASCII format)
│   ├── object_ids # dtype=np.int32, shape=(190598,6)
│   ├── list_image_filenames_per_class # dtype=np.int32, shape=(4,61274)
│   ├── list_boxes_per_image # dtype=np.int32, shape=(128419,22)
│   ├── list_boxesv_per_image # dtype=np.int32, shape=(128419,22)
│   ├── list_object_ids_per_image # dtype=np.int32, shape=(128419,22)
│   └── list_objects_ids_per_class # dtype=np.int32, shape=(4,151768)
│
└── test/
    ├── image_filenames # dtype=np.uint8, shape=(121465,90) (note: string in ASCII format)
    ├── classes # dtype=np.uint8, shape=(4,10) (note: string in ASCII format)
    ├── boxes # dtype=np.float, shape=(153305,4)
    ├── boxesv # dtype=np.float, shape=(153305,4)
    ├── id # dtype=np.int32, shape=(153305,)
    ├── occlusion # dtype=np.float, shape=(153305,)
    ├── object_fields # dtype=np.uint8, shape=(6,16) (note: string in ASCII format)
    ├── object_ids # dtype=np.int32, shape=(153305,6)
    ├── list_image_filenames_per_class # dtype=np.int32, shape=(4,60537)
    ├── list_boxes_per_image # dtype=np.int32, shape=(121465,14)
    ├── list_boxesv_per_image # dtype=np.int32, shape=(121465,14)
    ├── list_object_ids_per_image # dtype=np.int32, shape=(121465,14)
    └── list_objects_ids_per_class # dtype=np.int32, shape=(4,131273)
Fields
- image_filenames: image file paths and names
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
- classes: class names
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
- boxes: bounding boxes
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
    - note: bbox format (x1,y1,x2,y2)
- boxesv: bounding boxes (visible)
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
    - note: bbox format (x1,y1,x2,y2)
- id: label ids
    - available in: train, test
    - dtype: np.int32
    - is padded: False
    - fill value: -1
- occlusion: occlusion percentage
    - available in: train, test
    - dtype: np.float
    - is padded: False
    - fill value: -1
- object_fields: list of field names of the object id list
    - available in: train, test
    - dtype: np.uint8
    - is padded: True
    - fill value: 0
    - note: strings stored in ASCII format
    - note: key field (field name aggregator)
- object_ids: list of field ids
    - available in: train, test
    - dtype: np.int32
    - is padded: False
    - fill value: -1
    - note: key field (field id aggregator)
- list_image_filenames_per_class: list of images per class
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_boxes_per_image: list of bounding boxes per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_boxesv_per_image: list of (visible) bounding boxes per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_object_ids_per_image: list of object ids per image
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
- list_objects_ids_per_class: list of object ids per class
    - available in: train, test
    - dtype: np.int32
    - is padded: True
    - fill value: -1
    - note: pre-ordered list
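The object_fields and object_ids fields are described as aggregators of field names and field ids. One plausible reading, assumed here and not confirmed by this page, is that column j of an object_ids row is a 0-based index into the field named by entry j of object_fields. Under that assumption, a single object can be resolved as sketched below. The function name describe_object, the field names in toy_fields, and all toy values are hypothetical.

```python
def describe_object(obj_idx, object_ids, object_fields, data):
    """Map one object_ids row to a {field name: value} dictionary."""
    row = object_ids[obj_idx]
    # Assumption: row[j] indexes into the field named object_fields[j].
    return {name: data[name][int(i)] for name, i in zip(object_fields, row)}


# Hypothetical toy data with three already-decoded fields.
toy_fields = ["image_filenames", "classes", "boxes"]
toy_data = {
    "image_filenames": ["img_0.jpg", "img_1.jpg"],
    "classes": ["person", "people"],
    "boxes": [[10.0, 20.0, 50.0, 120.0], [60.0, 25.0, 90.0, 110.0]],
}
toy_object_ids = [
    [0, 0, 0],  # object 0: first image, first class, first box
    [1, 0, 1],  # object 1: second image, first class, second box
]
print(describe_object(1, toy_object_ids, toy_fields, toy_data))
```

In practice object_fields is itself stored as padded ASCII codes, so its rows would need decoding into strings before being used as dictionary keys.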