Caltech Pedestrian

The Caltech Pedestrian Dataset consists of approximately 10 hours of 640x480, 30 Hz video taken from a vehicle driving through regular traffic in an urban environment. About 250,000 frames (in 137 segments, each approximately one minute long) were annotated, with a total of 350,000 bounding boxes and 2,300 unique pedestrians.

The annotation includes temporal correspondence between bounding boxes and detailed occlusion labels.

Use cases

Pedestrian detection in images/videos.

Properties

name: caltech_pedestrian
keywords: image_processing, detection, pedestrian
dataset size: 11.9 GB
is downloadable: yes
tasks:
- detection: (default)
  
  primary use: object detection
  
  description: Contains image filenames, classes and bounding box annotations for pedestrian detection in images/videos.
  
  sets: train, test
  
  metadata file size on disk: 728.4 kB
  
  has annotations: yes
  
  which:
  
  labels for each class/category.
  
  bounding box of pedestrians.
  
  occlusion % of annotated pedestrians.
- detection_10x:
  
  primary use: object detection
  
  description: Contains image filenames, classes and bounding box annotations for pedestrian detection in images/videos.
  
  sets: train, test
  
  metadata file size on disk: 6.2 MB
  
  has annotations: yes
  
  which:
  
  labels for each class/category.
  
  bounding box of pedestrians.
  
  occlusion % of annotated pedestrians.
- detection_30x:
  
  primary use: object detection
  
  description: Contains image filenames, classes and bounding box annotations for pedestrian detection in images/videos.
  
  sets: train, test
  
  metadata file size on disk: 17.4 MB
  
  has annotations: yes
  
  which:
  
  labels for each class/category.
  
  bounding box of pedestrians.
  
  occlusion % of annotated pedestrians.

Metadata structure (HDF5)

Task: detection

/
├── train/
│   ├── image_filenames   # dtype=np.uint8, shape=(4250,90)  (note: string in ASCII format)
│   ├── classes           # dtype=np.uint8, shape=(4,10)     (note: string in ASCII format)
│   ├── boxes             # dtype=np.float, shape=(6313,4)
│   ├── boxesv            # dtype=np.float, shape=(6313,4)
│   ├── id                # dtype=np.int32, shape=(6313,)
│   ├── occlusion         # dtype=np.float, shape=(6313,)
│   ├── object_fields     # dtype=np.uint8, shape=(6,16)     (note: string in ASCII format)
│   ├── object_ids        # dtype=np.int32, shape=(6313,6)
│   ├── list_image_filenames_per_class   # dtype=np.int32, shape=(4,5033)
│   ├── list_boxes_per_image             # dtype=np.int32, shape=(4250,22)
│   ├── list_boxesv_per_image            # dtype=np.int32, shape=(4250,22)
│   ├── list_object_ids_per_image        # dtype=np.int32, shape=(4250,22)
│   └── list_objects_ids_per_class       # dtype=np.int32, shape=(4,5033)
│
└── test/
    ├── image_filenames   # dtype=np.uint8, shape=(4024,90)  (note: string in ASCII format)
    ├── classes           # dtype=np.uint8, shape=(4,10)     (note: string in ASCII format)
    ├── boxes             # dtype=np.float, shape=(5109,4)
    ├── boxesv            # dtype=np.float, shape=(5109,4)
    ├── id                # dtype=np.int32, shape=(5109,)
    ├── occlusion         # dtype=np.float, shape=(5109,)
    ├── object_fields     # dtype=np.uint8, shape=(6,16)     (note: string in ASCII format)
    ├── object_ids        # dtype=np.int32, shape=(5109,6)
    ├── list_image_filenames_per_class   # dtype=np.int32, shape=(4,2010)
    ├── list_boxes_per_image             # dtype=np.int32, shape=(4024,13)
    ├── list_boxesv_per_image            # dtype=np.int32, shape=(4024,13)
    ├── list_object_ids_per_image        # dtype=np.int32, shape=(4024,13)
    └── list_objects_ids_per_class       # dtype=np.int32, shape=(4,4371)
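
The string fields in the tree above (image_filenames, classes, object_fields) are stored as rows of ASCII codes padded with 0. A minimal sketch of turning such rows back into Python strings is shown here; the helper name `decode_padded_ascii` and the sample values are hypothetical and are not taken from the metadata file. The same approach should work on the rows of a uint8 array read from the HDF5 file.

```python
def decode_padded_ascii(rows):
    """Convert rows of ASCII codes, right-padded with 0, into Python strings."""
    return [bytes(row).rstrip(b"\x00").decode("ascii") for row in rows]


# Hypothetical stand-in for two rows of a (N, 10) uint8 string array.
sample = [
    [112, 101, 114, 115, 111, 110, 0, 0, 0, 0],  # codes for "person"
    [112, 101, 111, 112, 108, 101, 0, 0, 0, 0],  # codes for "people"
]
names = decode_padded_ascii(sample)
print(names)
```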

Fields

image_filenames: image file path+names
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
classes: class names
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
boxes: bounding boxes
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
- note: bbox format (x1,y1,x2,y2)
boxesv: bounding boxes (visible)
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
- note: bbox format (x1,y1,x2,y2)
id: label ids
- available in: train, test
- dtype: np.int32
- is padded: False
- fill value: -1
occlusion: occlusion percentage
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
object_fields: list of field names of the object id list
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
- note: key field (field name aggregator)
object_ids: list of field ids
- available in: train, test
- dtype: np.int32
- is padded: False
- fill value: -1
- note: key field (field id aggregator)
list_image_filenames_per_class: list of images per class
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_boxes_per_image: list of bounding boxes per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_boxesv_per_image: list of (visible) bounding boxes per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_object_ids_per_image: list of object ids per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_objects_ids_per_class: list of object ids per class
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
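
The padded list fields can be used to look up the annotations of a single image. The sketch below assumes that each row of list_boxes_per_image holds 0-based row indices into boxes and that unused slots carry the fill value -1; the function name `boxes_for_image` and the miniature data are hypothetical.

```python
def boxes_for_image(image_idx, list_boxes_per_image, boxes, fill_value=-1):
    """Return the boxes of one image, skipping the padding entries."""
    indices = [i for i in list_boxes_per_image[image_idx] if i != fill_value]
    return [boxes[i] for i in indices]


# Hypothetical miniature data: 2 images, 3 boxes in (x1, y1, x2, y2) format.
boxes = [
    [10.0, 20.0, 50.0, 120.0],
    [200.0, 30.0, 230.0, 110.0],
    [5.0, 5.0, 25.0, 60.0],
]
list_boxes_per_image = [
    [0, 1, -1],   # image 0 owns boxes 0 and 1
    [2, -1, -1],  # image 1 owns box 2
]
print(boxes_for_image(0, list_boxes_per_image, boxes))
```

The same pattern would apply to list_boxesv_per_image and list_object_ids_per_image, since they share the padding convention.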

Task: detection_10x

/
├── train/
│   ├── image_filenames   # dtype=np.uint8, shape=(42782,90)  (note: string in ASCII format)
│   ├── classes           # dtype=np.uint8, shape=(4,10)     (note: string in ASCII format)
│   ├── boxes             # dtype=np.float, shape=(63538,4)
│   ├── boxesv            # dtype=np.float, shape=(63538,4)
│   ├── id                # dtype=np.int32, shape=(63538,)
│   ├── occlusion         # dtype=np.float, shape=(63538,)
│   ├── object_fields     # dtype=np.uint8, shape=(6,16)     (note: string in ASCII format)
│   ├── object_ids        # dtype=np.int32, shape=(63538,6)
│   ├── list_image_filenames_per_class   # dtype=np.int32, shape=(4,20422)
│   ├── list_boxes_per_image             # dtype=np.int32, shape=(42782,22)
│   ├── list_boxesv_per_image            # dtype=np.int32, shape=(42782,22)
│   ├── list_object_ids_per_image        # dtype=np.int32, shape=(42782,22)
│   └── list_objects_ids_per_class       # dtype=np.int32, shape=(4,50605)
│
└── test/
    ├── image_filenames   # dtype=np.uint8, shape=(40465,90)  (note: string in ASCII format)
    ├── classes           # dtype=np.uint8, shape=(4,10)     (note: string in ASCII format)
    ├── boxes             # dtype=np.float, shape=(51079,4)
    ├── boxesv            # dtype=np.float, shape=(51079,4)
    ├── id                # dtype=np.int32, shape=(51079,)
    ├── occlusion         # dtype=np.float, shape=(51079,)
    ├── object_fields     # dtype=np.uint8, shape=(6,16)     (note: string in ASCII format)
    ├── object_ids        # dtype=np.int32, shape=(51079,6)
    ├── list_image_filenames_per_class   # dtype=np.int32, shape=(4,20173)
    ├── list_boxes_per_image             # dtype=np.int32, shape=(40465,14)
    ├── list_boxesv_per_image            # dtype=np.int32, shape=(40465,14)
    ├── list_object_ids_per_image        # dtype=np.int32, shape=(40465,14)
    └── list_objects_ids_per_class       # dtype=np.int32, shape=(4,43748)

Fields

image_filenames: image file path+names
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
classes: class names
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
boxes: bounding boxes
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
- note: bbox format (x1,y1,x2,y2)
boxesv: bounding boxes (visible)
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
- note: bbox format (x1,y1,x2,y2)
id: label ids
- available in: train, test
- dtype: np.int32
- is padded: False
- fill value: -1
occlusion: occlusion percentage
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
object_fields: list of field names of the object id list
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
- note: key field (field name aggregator)
object_ids: list of field ids
- available in: train, test
- dtype: np.int32
- is padded: False
- fill value: -1
- note: key field (field id aggregator)
list_image_filenames_per_class: list of images per class
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_boxes_per_image: list of bounding boxes per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_boxesv_per_image: list of (visible) bounding boxes per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_object_ids_per_image: list of object ids per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_objects_ids_per_class: list of object ids per class
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
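
The two key fields work together: object_fields names the columns of object_ids, and each row of object_ids holds one index per named field. The sketch below shows how a row might be resolved into its values. The field order in `object_fields`, the helper name `resolve_object`, and all sample values are assumptions made for illustration; the real order should be read from the decoded object_fields array.

```python
def resolve_object(obj_idx, object_fields, object_ids, data):
    """Map one row of `object_ids` to a {field name: value} dict."""
    row = object_ids[obj_idx]
    return {name: data[name][idx] for name, idx in zip(object_fields, row)}


# Assumed column order (6 names, matching the 6 columns of object_ids).
object_fields = ["image_filenames", "classes", "boxes", "boxesv", "id", "occlusion"]

# Hypothetical, already-decoded miniature data.
data = {
    "image_filenames": ["img_000.jpg", "img_001.jpg"],
    "classes": ["person", "people"],
    "boxes": [[10.0, 20.0, 50.0, 120.0], [5.0, 5.0, 25.0, 60.0]],
    "boxesv": [[10.0, 20.0, 50.0, 70.0], [5.0, 5.0, 25.0, 60.0]],
    "id": [7, 8],
    "occlusion": [0.5, 0.0],
}
object_ids = [
    [0, 0, 0, 0, 0, 0],  # object 0: image 0, class 0
    [1, 0, 1, 1, 1, 1],  # object 1: image 1, class 0
]
obj = resolve_object(1, object_fields, object_ids, data)
print(obj)
```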

Task: detection_30x

/
├── train/
│   ├── image_filenames   # dtype=np.uint8, shape=(128419,90)  (note: string in ASCII format)
│   ├── classes           # dtype=np.uint8, shape=(4,10)       (note: string in ASCII format)
│   ├── boxes             # dtype=np.float, shape=(190598,4)
│   ├── boxesv            # dtype=np.float, shape=(190598,4)
│   ├── id                # dtype=np.int32, shape=(190598,)
│   ├── occlusion         # dtype=np.float, shape=(190598,)
│   ├── object_fields     # dtype=np.uint8, shape=(6,16)       (note: string in ASCII format)
│   ├── object_ids        # dtype=np.int32, shape=(190598,6)
│   ├── list_image_filenames_per_class   # dtype=np.int32, shape=(4,61274)
│   ├── list_boxes_per_image             # dtype=np.int32, shape=(128419,22)
│   ├── list_boxesv_per_image            # dtype=np.int32, shape=(128419,22)
│   ├── list_object_ids_per_image        # dtype=np.int32, shape=(128419,22)
│   └── list_objects_ids_per_class       # dtype=np.int32, shape=(4,151768)
│
└── test/
    ├── image_filenames   # dtype=np.uint8, shape=(121465,90)  (note: string in ASCII format)
    ├── classes           # dtype=np.uint8, shape=(4,10)       (note: string in ASCII format)
    ├── boxes             # dtype=np.float, shape=(153305,4)
    ├── boxesv            # dtype=np.float, shape=(153305,4)
    ├── id                # dtype=np.int32, shape=(153305,)
    ├── occlusion         # dtype=np.float, shape=(153305,)
    ├── object_fields     # dtype=np.uint8, shape=(6,16)       (note: string in ASCII format)
    ├── object_ids        # dtype=np.int32, shape=(153305,6)
    ├── list_image_filenames_per_class   # dtype=np.int32, shape=(4,60537)
    ├── list_boxes_per_image             # dtype=np.int32, shape=(121465,14)
    ├── list_boxesv_per_image            # dtype=np.int32, shape=(121465,14)
    ├── list_object_ids_per_image        # dtype=np.int32, shape=(121465,14)
    └── list_objects_ids_per_class       # dtype=np.int32, shape=(4,131273)
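
Because the per-class lists are padded to the length of the largest class, the second dimension in the tree above is not the object count of every class. A rough sketch of counting the real entries per class follows, assuming padding slots hold -1 and that row i of list_objects_ids_per_class corresponds to row i of classes. The helper name `count_per_class` and the sample values are hypothetical.

```python
def count_per_class(class_names, list_objects_ids_per_class, fill_value=-1):
    """Count the non-padding object ids in each class row."""
    return {
        name: sum(1 for i in row if i != fill_value)
        for name, row in zip(class_names, list_objects_ids_per_class)
    }


# Hypothetical miniature data: 2 classes, rows padded to length 4.
class_names = ["person", "people"]
list_objects_ids_per_class = [
    [0, 2, 3, -1],
    [1, -1, -1, -1],
]
counts = count_per_class(class_names, list_objects_ids_per_class)
print(counts)
```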

Fields

image_filenames: image file path+names
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
classes: class names
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
boxes: bounding boxes
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
- note: bbox format (x1,y1,x2,y2)
boxesv: bounding boxes (visible)
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
- note: bbox format (x1,y1,x2,y2)
id: label ids
- available in: train, test
- dtype: np.int32
- is padded: False
- fill value: -1
occlusion: occlusion percentage
- available in: train, test
- dtype: np.float
- is padded: False
- fill value: -1
object_fields: list of field names of the object id list
- available in: train, test
- dtype: np.uint8
- is padded: True
- fill value: 0
- note: strings stored in ASCII format
- note: key field (field name aggregator)
object_ids: list of field ids
- available in: train, test
- dtype: np.int32
- is padded: False
- fill value: -1
- note: key field (field id aggregator)
list_image_filenames_per_class: list of images per class
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_boxes_per_image: list of bounding boxes per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_boxesv_per_image: list of (visible) bounding boxes per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_object_ids_per_image: list of object ids per image
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
list_objects_ids_per_class: list of object ids per class
- available in: train, test
- dtype: np.int32
- is padded: True
- fill value: -1
- note: pre-ordered list
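
Both boxes and boxesv use the (x1,y1,x2,y2) corner format. The sketch below converts a box to (x, y, width, height) and estimates how much of a full box is covered by its visible box. It assumes width = x2 - x1 and height = y2 - y1; the source does not state whether the corner coordinates are inclusive, nor how boxesv is filled for unoccluded pedestrians. The function names and sample boxes are hypothetical.

```python
def to_xywh(box):
    """Convert (x1, y1, x2, y2) to (x, y, width, height)."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


def visible_fraction(box, box_visible):
    """Area of the visible box divided by the area of the full box."""
    _, _, w, h = to_xywh(box)
    _, _, wv, hv = to_xywh(box_visible)
    full_area = w * h
    return (wv * hv) / full_area if full_area > 0 else 0.0


# Hypothetical pedestrian whose lower half is hidden.
full = [10.0, 20.0, 50.0, 120.0]
visible = [10.0, 20.0, 50.0, 70.0]
print(to_xywh(full))
print(visible_fraction(full, visible))
```

A value computed this way is only a geometric estimate and need not match the stored occlusion field.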

Disclaimer

For information about the dataset and its terms of use, please see this link.