COCO - Common Objects in Context
The Microsoft Common Objects in COntext (MS COCO) dataset contains 91 common object categories, 82 of which have more than 5,000 labeled instances. In total, the dataset has 2,500,000 labeled instances in 328,000 images.
Use cases
Object detection, segmentation, captioning and human body joint detection.
Properties
name
: coco
keywords
: image_processing, detection, keypoint, captions, human, pose
dataset size
: 40.3 GB
is downloadable
: yes
tasks
: - detection_2015: (default)
    - primary use: object detection
    - description: Contains image filenames, classes, bounding box and segmentation mask annotations for object detection in images.
    - sets: train, val, test
    - metadata file size on disk: 243.6 MB
    - has annotations: yes
    - which:
      - image filenames
      - object categories and supercategories
      - bounding boxes of objects
      - occlusion % of annotated objects
      - segmentation masks
  - detection_2016:
    - primary use: object detection
    - description: Contains image filenames, classes, bounding box and segmentation mask annotations for object detection in images.
    - sets: train, val, test, test_dev
    - metadata file size on disk: 244.7 MB
    - has annotations: yes
    - which:
      - image filenames
      - object categories and supercategories
      - bounding boxes of objects
      - occlusion % of annotated objects
      - segmentation masks
  - caption_2015:
    - primary use: image captioning
    - description: Contains image filenames and captions for image captioning.
    - sets: train, val, test
    - metadata file size on disk: 21.9 MB
    - has annotations: yes
    - which:
      - image filenames
      - captions
  - caption_2016:
    - primary use: image captioning
    - description: Contains image filenames and captions for image captioning.
    - sets: train, val, test, test_dev
    - metadata file size on disk: 23.0 MB
    - has annotations: yes
    - which:
      - image filenames
      - captions
  - keypoints_2016:
    - primary use: human body joint detection
    - description: Contains image filenames, classes, bounding box and segmentation mask annotations for object detection in images.
    - sets: train, val, test, test_dev
    - metadata file size on disk: 106.6 MB
    - has annotations: yes
    - which:
      - image filenames
      - object categories and supercategories
      - bounding boxes of objects
      - occlusion % of annotated objects
      - segmentation masks
      - body joint keypoints
      - skeleton
Metadata structure (HDF5)
Task: detection_2015
/
├── train/
│ ├── image_filenames # dtype=np.uint8, shape=(82783,74) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_annotations_ids # dtype=np.int32, shape=(604907,)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(82783,)
│ ├── coco_urls # dtype=np.uint8, shape=(82783,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(82783,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── annotation_id # dtype=np.int32, shape=(604907,)
│ ├── width # dtype=np.int32, shape=(82783,)
│ ├── height # dtype=np.int32, shape=(82783,)
│ ├── boxes # dtype=np.float, shape=(604907,4)
│ ├── iscrowd # dtype=np.uint8, shape=(2,)
│ ├── segmentation # dtype=np.float, shape=(604907,10043)
│ ├── area # dtype=np.int32, shape=(604907,)
│ ├── object_fields # dtype=np.uint8, shape=(13,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(604907,13)
│ ├── list_boxes_per_image # dtype=np.int32, shape=(82783,93)
│ ├── list_image_filenames_per_category # dtype=np.int32, shape=(80,45174)
│ ├── list_image_filenames_per_supercategory # dtype=np.int32, shape=(12,45174)
│ ├── list_object_ids_per_image # dtype=np.int32, shape=(82783,93)
│ ├── list_objects_ids_per_category # dtype=np.int32, shape=(80,185316)
│ └── list_objects_ids_per_supercategory # dtype=np.int32, shape=(12,185316)
│
├── val/
│ ├── image_filenames # dtype=np.uint8, shape=(40504,74) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_annotations_ids # dtype=np.int32, shape=(291875,)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(40504,)
│ ├── coco_urls # dtype=np.uint8, shape=(40504,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(40504,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── annotation_id # dtype=np.int32, shape=(291875,)
│ ├── width # dtype=np.int32, shape=(40504,)
│ ├── height # dtype=np.int32, shape=(40504,)
│ ├── boxes # dtype=np.float, shape=(291875,4)
│ ├── iscrowd # dtype=np.uint8, shape=(2,)
│ ├── segmentation # dtype=np.float, shape=(291875,7237)
│ ├── area # dtype=np.int32, shape=(291875,)
│ ├── object_fields # dtype=np.uint8, shape=(13,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(291875,13)
│ ├── list_boxes_per_image # dtype=np.int32, shape=(40504,93)
│ ├── list_image_filenames_per_category # dtype=np.int32, shape=(80,21634)
│ ├── list_image_filenames_per_supercategory # dtype=np.int32, shape=(12,21634)
│ ├── list_object_ids_per_image # dtype=np.int32, shape=(40504,93)
│ ├── list_objects_ids_per_category # dtype=np.int32, shape=(80,88153)
│ └── list_objects_ids_per_supercategory # dtype=np.int32, shape=(12,88153)
│
└── test/
  ├── image_filenames # dtype=np.uint8, shape=(40775,72) (note: string in ASCII format)
  ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
  ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
  ├── coco_categories_ids # dtype=np.int32, shape=(80,)
  ├── coco_images_ids # dtype=np.int32, shape=(40775,)
  ├── coco_urls # dtype=np.uint8, shape=(40775,32) (note: string in ASCII format)
  ├── image_id # dtype=np.int32, shape=(40775,)
  ├── category_id # dtype=np.int32, shape=(80,)
  ├── width # dtype=np.int32, shape=(40775,)
  ├── height # dtype=np.int32, shape=(40775,)
  ├── object_fields # dtype=np.uint8, shape=(4,16) (note: string in ASCII format)
  ├── object_ids # dtype=np.int32, shape=(40775,4)
  └── list_object_ids_per_image # dtype=np.int32, shape=(40775,1)
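Several fields in the tree above are marked "string in ASCII format": each string is one row of a `np.uint8` matrix, padded with 0 up to the row width. The following is a minimal sketch of that encoding and its inverse, using only NumPy. The helper names `str_to_ascii` and `ascii_to_str` are made up for this illustration and are not part of dbcollection.

```python
import numpy as np

def str_to_ascii(strings):
    """Encode strings as a zero-padded uint8 matrix, one string per row."""
    width = max(len(s) for s in strings)
    out = np.zeros((len(strings), width), dtype=np.uint8)  # fill value 0
    for i, s in enumerate(strings):
        out[i, :len(s)] = np.frombuffer(s.encode("ascii"), dtype=np.uint8)
    return out

def ascii_to_str(row):
    """Decode one zero-padded uint8 row back into a Python string."""
    raw = np.asarray(row, dtype=np.uint8).tobytes()
    # Strip the trailing fill values before decoding.
    return raw.rstrip(b"\x00").decode("ascii")

names = ["person", "bicycle", "traffic light"]
encoded = str_to_ascii(names)                       # shape (3, 13)
decoded = [ascii_to_str(row) for row in encoded]
print(encoded.shape, decoded)
```

The same `ascii_to_str` helper should apply to a row read from fields such as `category` or `image_filenames`, assuming the rows are stored exactly as described in the Fields list.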
Fields
image_filenames
: image file paths and names
  - available in: train, val, test
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
category
: category names
  - available in: train, val, test
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
supercategory
: supercategory names
  - available in: train, val, test
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
coco_annotations_ids
: reference to COCO annotation ids (useful for evaluating on COCO)
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_categories_ids
: reference to COCO category ids (useful for evaluating on COCO)
  - available in: train, val, test
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_images_ids
: reference to COCO image filename ids (useful for evaluating on COCO)
  - available in: train, val, test
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_urls
: COCO URLs
  - available in: train, val, test
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
image_id
: image filename ids
  - available in: train, val, test
  - dtype: np.int32
  - is padded: False
  - fill value: -1
category_id
: category ids
  - available in: train, val, test
  - dtype: np.int32
  - is padded: False
  - fill value: -1
annotation_id
: annotation ids
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
width
: image width
  - available in: train, val, test
  - dtype: np.int32
  - is padded: False
  - fill value: -1
height
: image height
  - available in: train, val, test
  - dtype: np.int32
  - is padded: False
  - fill value: -1
boxes
: bounding box
  - available in: train, val
  - dtype: np.float
  - is padded: False
  - fill value: -1
  - note: bbox format (x1,y1,x2,y2)
iscrowd
: is crowd (0 - False, 1 - True)
  - available in: train, val
  - dtype: np.uint8
  - is padded: False
  - fill value: -1
segmentation
: segmentation mask
  - available in: train, val
  - dtype: np.float
  - is padded: True
  - fill value: -1
  - note: the masks come in 3 different formats, but they are mostly lists of lists. These have been packed (vectorized) into a one-dimensional array so that they can be stored in the HDF5 metadata file. To unpack these arrays to their original format, use the unsqueeze_list() method in dbcollection.utils.pad.
area
: object area
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
object_fields
: list of field names of the object id list
  - available in: train, val, test
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
  - note: key field (field name aggregator)
object_ids
: list of field ids
  - available in: train, val, test
  - dtype: np.int32
  - is padded: False
  - fill value: -1
  - note: key field (field id aggregator)
list_boxes_per_image
: list of bounding boxes per image
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_image_filenames_per_category
: list of image filenames per category
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_image_filenames_per_supercategory
: list of image filenames per supercategory
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_object_ids_per_image
: list of object ids per image
  - available in: train, val, test
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_objects_ids_per_category
: list of object ids per category
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_objects_ids_per_supercategory
: list of object ids per supercategory
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
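The pre-ordered `list_*` fields are fixed-width rows padded with -1, and `object_ids` and `object_fields` act as key fields that tie the other fields together. The sketch below shows one way to strip the padding and resolve the objects of an image. It assumes that column `j` of `object_ids` holds a zero-based row index into the field named by entry `j` of `object_fields`. That layout is an assumption made for illustration, and the toy arrays and the helpers `unpad` and `objects_of_image` are invented for this example.

```python
import numpy as np

FILL = -1  # fill value documented for the padded list_* fields

def unpad(row, fill=FILL):
    """Return the valid entries of a padded row as a plain list of ints."""
    row = np.asarray(row)
    return row[row != fill].tolist()

# Toy stand-ins for three fields of one set (all values are made up).
object_fields = ["image_filenames", "category", "boxes"]
object_ids = np.array([[0, 2, 0],   # object 0: image 0, category 2, box 0
                       [0, 5, 1],   # object 1: image 0, category 5, box 1
                       [1, 2, 2]],  # object 2: image 1, category 2, box 2
                      dtype=np.int32)
list_object_ids_per_image = np.array([[0, 1],
                                      [2, -1]], dtype=np.int32)

def objects_of_image(image_idx):
    """Map every object of an image to a {field name: row index} dict."""
    ids = unpad(list_object_ids_per_image[image_idx])
    return [dict(zip(object_fields, object_ids[i].tolist())) for i in ids]

print(objects_of_image(0))
print(objects_of_image(1))
```

Each returned index can then be used to read the matching row of the named field, for example the bounding box at `boxes[index]`.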
Task: detection_2016
/
├── train/
│ ├── image_filenames # dtype=np.uint8, shape=(82783,74) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_annotations_ids # dtype=np.int32, shape=(604907,)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(82783,)
│ ├── coco_urls # dtype=np.uint8, shape=(82783,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(82783,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── annotation_id # dtype=np.int32, shape=(604907,)
│ ├── width # dtype=np.int32, shape=(82783,)
│ ├── height # dtype=np.int32, shape=(82783,)
│ ├── boxes # dtype=np.float, shape=(604907,4)
│ ├── iscrowd # dtype=np.uint8, shape=(2,)
│ ├── segmentation # dtype=np.float, shape=(604907,10043)
│ ├── area # dtype=np.int32, shape=(604907,)
│ ├── object_fields # dtype=np.uint8, shape=(13,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(604907,13)
│ ├── list_boxes_per_image # dtype=np.int32, shape=(82783,93)
│ ├── list_image_filenames_per_category # dtype=np.int32, shape=(80,45174)
│ ├── list_image_filenames_per_supercategory # dtype=np.int32, shape=(12,45174)
│ ├── list_object_ids_per_image # dtype=np.int32, shape=(82783,93)
│ ├── list_objects_ids_per_category # dtype=np.int32, shape=(80,185316)
│ └── list_objects_ids_per_supercategory # dtype=np.int32, shape=(12,185316)
│
├── val/
│ ├── image_filenames # dtype=np.uint8, shape=(40504,74) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_annotations_ids # dtype=np.int32, shape=(291875,)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(40504,)
│ ├── coco_urls # dtype=np.uint8, shape=(40504,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(40504,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── annotation_id # dtype=np.int32, shape=(291875,)
│ ├── width # dtype=np.int32, shape=(40504,)
│ ├── height # dtype=np.int32, shape=(40504,)
│ ├── boxes # dtype=np.float, shape=(291875,4)
│ ├── iscrowd # dtype=np.uint8, shape=(2,)
│ ├── segmentation # dtype=np.float, shape=(291875,7237)
│ ├── area # dtype=np.int32, shape=(291875,)
│ ├── object_fields # dtype=np.uint8, shape=(13,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(291875,13)
│ ├── list_boxes_per_image # dtype=np.int32, shape=(40504,93)
│ ├── list_image_filenames_per_category # dtype=np.int32, shape=(80,21634)
│ ├── list_image_filenames_per_supercategory # dtype=np.int32, shape=(12,21634)
│ ├── list_object_ids_per_image # dtype=np.int32, shape=(40504,93)
│ ├── list_objects_ids_per_category # dtype=np.int32, shape=(80,88153)
│ └── list_objects_ids_per_supercategory # dtype=np.int32, shape=(12,88153)
│
├── test/
│ ├── image_filenames # dtype=np.uint8, shape=(81434,72) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(81434,)
│ ├── coco_urls # dtype=np.uint8, shape=(81434,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(81434,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── width # dtype=np.int32, shape=(81434,)
│ ├── height # dtype=np.int32, shape=(81434,)
│ ├── object_fields # dtype=np.uint8, shape=(4,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(81434,4)
│ └── list_object_ids_per_image # dtype=np.int32, shape=(81434,1)
│
└── test_dev/
  ├── image_filenames # dtype=np.uint8, shape=(20288,72) (note: string in ASCII format)
  ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
  ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
  ├── coco_categories_ids # dtype=np.int32, shape=(80,)
  ├── coco_images_ids # dtype=np.int32, shape=(20288,)
  ├── coco_urls # dtype=np.uint8, shape=(20288,32) (note: string in ASCII format)
  ├── image_id # dtype=np.int32, shape=(20288,)
  ├── category_id # dtype=np.int32, shape=(80,)
  ├── width # dtype=np.int32, shape=(20288,)
  ├── height # dtype=np.int32, shape=(20288,)
  ├── object_fields # dtype=np.uint8, shape=(4,16) (note: string in ASCII format)
  ├── object_ids # dtype=np.int32, shape=(20288,4)
  └── list_object_ids_per_image # dtype=np.int32, shape=(20288,1)
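The `boxes` field stores corner coordinates (x1,y1,x2,y2), whereas the original COCO annotation files describe a box as [x, y, width, height]. When results are prepared for evaluation on COCO, a conversion along the following lines may be needed. This sketch assumes that width is simply x2 - x1 with no one-pixel offset, which should be checked against the stored data. The function names are illustrative only.

```python
import numpy as np

def xyxy_to_xywh(boxes):
    """Convert (x1, y1, x2, y2) rows to (x, y, width, height) rows."""
    b = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    return np.column_stack([b[:, 0], b[:, 1],
                            b[:, 2] - b[:, 0],   # width  = x2 - x1
                            b[:, 3] - b[:, 1]])  # height = y2 - y1

def xywh_to_xyxy(boxes):
    """Convert (x, y, width, height) rows back to (x1, y1, x2, y2) rows."""
    b = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    return np.column_stack([b[:, 0], b[:, 1],
                            b[:, 0] + b[:, 2],   # x2 = x + width
                            b[:, 1] + b[:, 3]])  # y2 = y + height

stored = np.array([[10.0, 20.0, 50.0, 80.0]])    # one box as stored in boxes
coco_style = xyxy_to_xywh(stored)
print(coco_style)
```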
Fields
image_filenames
: image file paths and names
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
category
: category names
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
supercategory
: supercategory names
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
coco_annotations_ids
: reference to COCO annotation ids (useful for evaluating on COCO)
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_categories_ids
: reference to COCO category ids (useful for evaluating on COCO)
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_images_ids
: reference to COCO image filename ids (useful for evaluating on COCO)
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_urls
: COCO URLs
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
image_id
: image filename ids
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
category_id
: category ids
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
annotation_id
: annotation ids
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
width
: image width
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
height
: image height
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
boxes
: bounding box
  - available in: train, val
  - dtype: np.float
  - is padded: False
  - fill value: -1
  - note: bbox format (x1,y1,x2,y2)
iscrowd
: is crowd (0 - False, 1 - True)
  - available in: train, val
  - dtype: np.uint8
  - is padded: False
  - fill value: -1
segmentation
: segmentation mask
  - available in: train, val
  - dtype: np.float
  - is padded: True
  - fill value: -1
  - note: the masks come in 3 different formats, but they are mostly lists of lists. These have been packed (vectorized) into a one-dimensional array so that they can be stored in the HDF5 metadata file. To unpack these arrays to their original format, use the unsqueeze_list() method in dbcollection.utils.pad.
area
: object area
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
object_fields
: list of field names of the object id list
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
  - note: key field (field name aggregator)
object_ids
: list of field ids
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
  - note: key field (field id aggregator)
list_boxes_per_image
: list of bounding boxes per image
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_image_filenames_per_category
: list of image filenames per category
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_image_filenames_per_supercategory
: list of image filenames per supercategory
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_object_ids_per_image
: list of object ids per image
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_objects_ids_per_category
: list of object ids per category
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_objects_ids_per_supercategory
: list of object ids per supercategory
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
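The `segmentation` note describes lists of lists being packed into a single padded row. To make that idea concrete, here is a small sketch of one possible packing scheme, in which every inner list is prefixed with its length and the row is padded with -1. This is not necessarily the layout dbcollection uses, and `pack` and `unpack` are hypothetical helpers. Real metadata should be decoded with the `unsqueeze_list()` method mentioned in the note.

```python
import numpy as np

FILL = -1.0  # fill value documented for the segmentation field

def pack(polygons, width):
    """Flatten a list of lists into one fixed-width, -1 padded row."""
    flat = []
    for poly in polygons:
        flat.append(float(len(poly)))        # length prefix (assumed scheme)
        flat.extend(float(v) for v in poly)
    if len(flat) > width:
        raise ValueError("row width is too small for these polygons")
    row = np.full(width, FILL, dtype=np.float64)
    row[:len(flat)] = flat
    return row

def unpack(row, fill=FILL):
    """Rebuild the list of lists from a row produced by pack()."""
    polygons, i = [], 0
    while i < len(row) and row[i] != fill:
        n = int(row[i])                      # number of values that follow
        polygons.append(row[i + 1:i + 1 + n].tolist())
        i += 1 + n
    return polygons

polys = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [7.5, 8.5, 9.5, 10.5]]
row = pack(polys, width=16)
print(row)
print(unpack(row))
```

A length prefix is used here because it cannot be confused with the fill value. A separator-based scheme would work as well, provided the separator never occurs in the data.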
Task: keypoints_2016
/
├── train/
│ ├── image_filenames # dtype=np.uint8, shape=(82783,74) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_annotations_ids # dtype=np.int32, shape=(185316,)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(82783,)
│ ├── coco_urls # dtype=np.uint8, shape=(82783,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(82783,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── annotation_id # dtype=np.int32, shape=(185316,)
│ ├── width # dtype=np.int32, shape=(82783,)
│ ├── height # dtype=np.int32, shape=(82783,)
│ ├── boxes # dtype=np.float, shape=(185316,4)
│ ├── iscrowd # dtype=np.uint8, shape=(2,)
│ ├── segmentation # dtype=np.float, shape=(185316,10043)
│ ├── area # dtype=np.int32, shape=(185316,)
│ ├── keypoint_names # dtype=np.uint8, shape=(17,15) (note: string in ASCII format)
│ ├── keypoints # dtype=np.int32, shape=(185316,51)
│ ├── num_keypoints # dtype=np.uint8, shape=(18,)
│ ├── skeleton # dtype=np.uint8, shape=(19,2)
│ ├── object_fields # dtype=np.uint8, shape=(13,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(185316,13)
│ ├── list_boxes_per_image # dtype=np.int32, shape=(82783,20)
│ ├── list_image_filenames_per_num_keypoints # dtype=np.int32, shape=(17,45174)
│ ├── list_keypoints_per_image # dtype=np.int32, shape=(82783,20)
│ ├── list_object_ids_per_image # dtype=np.int32, shape=(82783,20)
│ └── list_object_ids_per_keypoint # dtype=np.int32, shape=(17,92701)
│
├── val/
│ ├── image_filenames # dtype=np.uint8, shape=(40504,70) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_annotations_ids # dtype=np.int32, shape=(88153,)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(40504,)
│ ├── coco_urls # dtype=np.uint8, shape=(40504,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(40504,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── annotation_id # dtype=np.int32, shape=(88153,)
│ ├── width # dtype=np.int32, shape=(40504,)
│ ├── height # dtype=np.int32, shape=(40504,)
│ ├── boxes # dtype=np.float, shape=(88153,4)
│ ├── iscrowd # dtype=np.uint8, shape=(2,)
│ ├── segmentation # dtype=np.float, shape=(88153,6121)
│ ├── area # dtype=np.int32, shape=(88153,)
│ ├── keypoint_names # dtype=np.uint8, shape=(17,15) (note: string in ASCII format)
│ ├── keypoints # dtype=np.int32, shape=(88153,51)
│ ├── num_keypoints # dtype=np.uint8, shape=(18,)
│ ├── skeleton # dtype=np.uint8, shape=(19,2)
│ ├── object_fields # dtype=np.uint8, shape=(13,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(88153,13)
│ ├── list_boxes_per_image # dtype=np.int32, shape=(40504,16)
│ ├── list_image_filenames_per_num_keypoints # dtype=np.int32, shape=(17,21634)
│ ├── list_keypoints_per_image # dtype=np.int32, shape=(40504,16)
│ ├── list_object_ids_per_image # dtype=np.int32, shape=(40504,16)
│ └── list_object_ids_per_keypoint # dtype=np.int32, shape=(17,43971)
│
├── test/
│ ├── image_filenames # dtype=np.uint8, shape=(81434,72) (note: string in ASCII format)
│ ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
│ ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
│ ├── coco_categories_ids # dtype=np.int32, shape=(80,)
│ ├── coco_images_ids # dtype=np.int32, shape=(81434,)
│ ├── coco_urls # dtype=np.uint8, shape=(81434,32) (note: string in ASCII format)
│ ├── image_id # dtype=np.int32, shape=(81434,)
│ ├── category_id # dtype=np.int32, shape=(80,)
│ ├── width # dtype=np.int32, shape=(81434,)
│ ├── height # dtype=np.int32, shape=(81434,)
│ ├── object_fields # dtype=np.uint8, shape=(4,16) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(81434,4)
│ └── list_object_ids_per_image # dtype=np.int32, shape=(81434,1)
│
└── test_dev/
  ├── image_filenames # dtype=np.uint8, shape=(20288,72) (note: string in ASCII format)
  ├── category # dtype=np.uint8, shape=(80,15) (note: string in ASCII format)
  ├── supercategory # dtype=np.uint8, shape=(12,11) (note: string in ASCII format)
  ├── coco_categories_ids # dtype=np.int32, shape=(80,)
  ├── coco_images_ids # dtype=np.int32, shape=(20288,)
  ├── coco_urls # dtype=np.uint8, shape=(20288,32) (note: string in ASCII format)
  ├── image_id # dtype=np.int32, shape=(20288,)
  ├── category_id # dtype=np.int32, shape=(80,)
  ├── width # dtype=np.int32, shape=(20288,)
  ├── height # dtype=np.int32, shape=(20288,)
  ├── object_fields # dtype=np.uint8, shape=(4,16) (note: string in ASCII format)
  ├── object_ids # dtype=np.int32, shape=(20288,4)
  └── list_object_ids_per_image # dtype=np.int32, shape=(20288,1)
Fields
image_filenames
: image file paths and names
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
category
: category names
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
supercategory
: supercategory names
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
coco_annotations_ids
: reference to COCO annotation ids (useful for evaluating on COCO)
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_categories_ids
: reference to COCO category ids (useful for evaluating on COCO)
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_images_ids
: reference to COCO image filename ids (useful for evaluating on COCO)
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
coco_urls
: COCO URLs
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
image_id
: image filename ids
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
category_id
: category ids
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
annotation_id
: annotation ids
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
width
: image width
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
height
: image height
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
boxes
: bounding box
  - available in: train, val
  - dtype: np.float
  - is padded: False
  - fill value: -1
  - note: bbox format (x1,y1,x2,y2)
iscrowd
: is crowd (0 - False, 1 - True)
  - available in: train, val
  - dtype: np.uint8
  - is padded: False
  - fill value: -1
segmentation
: segmentation mask
  - available in: train, val
  - dtype: np.float
  - is padded: True
  - fill value: -1
  - note: the masks come in 3 different formats, but they are mostly lists of lists. These have been packed (vectorized) into a one-dimensional array so that they can be stored in the HDF5 metadata file. To unpack these arrays to their original format, use the unsqueeze_list() method in dbcollection.utils.pad.
area
: object area
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
keypoint_names
: body joint names
  - available in: train, val
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
keypoints
: body joint coordinates
  - available in: train, val
  - dtype: np.int32
  - is padded: False
  - fill value: -1
  - note: coordinates format [x1,y1,is_visible,x2,y2,is_visible, …]
num_keypoints
: number of body joints
  - available in: train, val
  - dtype: np.uint8
  - is padded: False
  - fill value: -1
skeleton
: pairwise body joints
  - available in: train, val
  - dtype: np.uint8
  - is padded: False
  - fill value: -1
object_fields
: list of field names of the object id list
  - available in: train, val, test, test_dev
  - dtype: np.uint8
  - is padded: True
  - fill value: 0
  - note: strings stored in ASCII format
  - note: key field (field name aggregator)
object_ids
: list of field ids
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: False
  - fill value: -1
  - note: key field (field id aggregator)
list_boxes_per_image
: list of bounding boxes per image
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_image_filenames_per_category
: list of image filenames per category
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_image_filenames_per_supercategory
: list of image filenames per supercategory
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_object_ids_per_image
: list of object ids per image
  - available in: train, val, test, test_dev
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_objects_ids_per_category
: list of object ids per category
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
list_objects_ids_per_supercategory
: list of object ids per supercategory
  - available in: train, val
  - dtype: np.int32
  - is padded: True
  - fill value: -1
  - note: pre-ordered list
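Each row of `keypoints` holds 51 values, which matches 17 joints times the three entries [x, y, is_visible] given in the field note. A minimal sketch of splitting such rows into coordinates and flags follows. The toy data is made up, and treating any non-zero flag as "joint annotated" is an assumption of this example rather than something the metadata file guarantees.

```python
import numpy as np

def split_keypoints(flat):
    """Split flat [x, y, is_visible, ...] rows into coordinates and flags."""
    flat = np.asarray(flat)
    triplets = flat.reshape(flat.shape[0], -1, 3)  # (objects, joints, 3)
    xy = triplets[..., :2]                         # (objects, joints, 2)
    flags = triplets[..., 2]                       # (objects, joints)
    return xy, flags

# Two toy objects with 17 joints each; only two joints of object 0 are set.
flat = np.zeros((2, 51), dtype=np.int32)
flat[0, 0:3] = [120, 80, 1]     # joint 0
flat[0, 15:18] = [140, 95, 1]   # joint 5 (values 15..17 of the row)

xy, flags = split_keypoints(flat)
labeled = (flags > 0).sum(axis=1)  # joints with a non-zero flag per object
print(xy.shape, labeled)
```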