ILSVRC2012 - ImageNet Large Scale Visual Recognition Challenge 2012
ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images.
The Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) uses a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories). The training data contains 1000 categories and 1.2 million images.
Use cases
Image classification.
Properties
name
: ilsvrc2012
keywords
: image_processing, classification
dataset size
: 154.6 GB
is downloadable
: no
data setup
: create a folder or symlink with the name ilsvrc2012/ where you have stored and unpacked the data files and, when loading the dataset, use the data_dir input argument to specify the data's folder path.
tasks
:
- classification: (default)
    primary use
    : image classification
    description
    : Contains image filenames and label annotations for image classification.
    sets
    : train, val
    metadata file size in disk
    : 6.8 MB
    has annotations
    : yes
    which
    :
    - labels for each image class/category.
    - descriptions for each class/category.
- raw256:
    primary use
    : image classification
    description
    : Contains image filenames and label annotations for image classification.
    sets
    : train, val
    metadata file size in disk
    : 6.8 MB
    has annotations
    : yes
    which
    :
    - labels for each image class/category.
    - descriptions for each class/category.
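The data setup step listed in the properties can be sketched as follows. This is a minimal illustration using only the Python standard library; `link_dataset` and its arguments are hypothetical names chosen for this example, not part of any library API. It creates a symlink named `ilsvrc2012` that points at the folder holding the unpacked files.

```python
from pathlib import Path


def link_dataset(unpacked_dir, parent_dir, name="ilsvrc2012"):
    """Create parent_dir/<name> as a symlink to unpacked_dir.

    Hypothetical helper: function and argument names are illustrative.
    """
    source = Path(unpacked_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"unpacked data not found: {source}")

    link = Path(parent_dir) / name
    if link.is_symlink() or link.exists():
        # An existing folder or link is left untouched.
        return link

    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(source, target_is_directory=True)
    return link
```

The folder path is then supplied through the `data_dir` input argument when loading the dataset.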
Metadata structure (HDF5)
Task: classification
/
├── train/
│   ├── image_filenames                 # dtype=np.uint8, shape=(1281166,76)  (note: string in ASCII format)
│   ├── classes                         # dtype=np.uint8, shape=(1000,10)     (note: string in ASCII format)
│   ├── labels                          # dtype=np.uint8, shape=(1000,122)
│   ├── descriptions                    # dtype=np.uint8, shape=(1000,256)    (note: string in ASCII format)
│   ├── object_fields                   # dtype=np.uint8, shape=(2,16)        (note: string in ASCII format)
│   ├── object_ids                      # dtype=np.int32, shape=(1281166,2)
│   └── list_image_filenames_per_class  # dtype=np.int32, shape=(1000,1300)
│
└── val/
    ├── image_filenames                 # dtype=np.uint8, shape=(50000,67)    (note: string in ASCII format)
    ├── classes                         # dtype=np.uint8, shape=(1000,10)     (note: string in ASCII format)
    ├── labels                          # dtype=np.uint8, shape=(1000,122)
    ├── descriptions                    # dtype=np.uint8, shape=(1000,256)    (note: string in ASCII format)
    ├── object_fields                   # dtype=np.uint8, shape=(2,16)        (note: string in ASCII format)
    ├── object_ids                      # dtype=np.int32, shape=(50000,2)
    └── list_image_filenames_per_class  # dtype=np.int32, shape=(1000,50)
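Several entries in the tree above are strings stored as rows of ASCII codes in `np.uint8` arrays. A minimal sketch of converting such rows back to Python strings is shown here. The helper names are hypothetical, and the functions accept any sequence of integers, so they work on NumPy rows as well as on plain lists. The encoder assumes each row is padded to the longest string plus one fill value, which is a guess at the layout made for illustration only.

```python
def decode_ascii_row(row, fill_value=0):
    """Turn one padded row of ASCII codes into a Python string."""
    codes = [int(v) for v in row]
    # Strip trailing padding only, so interior characters are never dropped.
    while codes and codes[-1] == fill_value:
        codes.pop()
    return bytes(codes).decode("ascii")


def decode_ascii_rows(rows, fill_value=0):
    """Decode every row of a 2-D array (or nested list) of ASCII codes."""
    return [decode_ascii_row(row, fill_value) for row in rows]


def encode_ascii_rows(strings, fill_value=0):
    """Inverse operation (assumed layout): pad to the longest length plus one."""
    width = max(len(s) for s in strings) + 1
    return [
        list(s.encode("ascii")) + [fill_value] * (width - len(s))
        for s in strings
    ]


if __name__ == "__main__":
    # Illustrative values only; real rows come from the metadata file.
    rows = encode_ascii_rows(["n01440764", "n015"])
    print(len(rows[0]))             # 10
    print(decode_ascii_rows(rows))  # ['n01440764', 'n015']
```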
Fields
image_filenames
: image file path + name
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
classes
: class names
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
labels
: label names
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
descriptions
: class descriptions
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
object_fields
: list of field names of the object id list
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
    note
    : key field (field name aggregator)
object_ids
: list of field ids
    available in
    : train, val
    dtype
    : np.int32
    is padded
    : False
    fill value
    : -1
    note
    : key field (field id aggregator)
list_image_filenames_per_class
: list of image filenames per class
    available in
    : train, val
    dtype
    : np.int32
    is padded
    : True
    fill value
    : -1
    note
    : pre-ordered list
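The relationship between `object_fields` and `object_ids` can be illustrated with a small sketch. It rests on an assumption drawn from the shapes and the "aggregator" notes above, not on a confirmed specification: column i of an `object_ids` row is taken to be a row index into the field named by entry i of `object_fields`. The helper `resolve_object` and the two field names in the example are hypothetical, and the inputs are already-decoded plain Python lists.

```python
def resolve_object(object_row, field_names, tables, fill_value=-1):
    """Map one object_ids row to the values it points at.

    Assumption: object_row[i] indexes into tables[field_names[i]],
    and fill_value marks "no entry" for that field.
    """
    record = {}
    for name, idx in zip(field_names, object_row):
        idx = int(idx)
        record[name] = None if idx == fill_value else tables[name][idx]
    return record


if __name__ == "__main__":
    # Toy data standing in for decoded metadata (illustrative values only).
    field_names = ["image_filenames", "classes"]
    tables = {
        "image_filenames": ["train/a.JPEG", "train/b.JPEG"],
        "classes": ["n01440764", "n01443537"],
    }
    print(resolve_object([1, 0], field_names, tables))
    # {'image_filenames': 'train/b.JPEG', 'classes': 'n01440764'}
```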
Task: raw256
/
├── train/
│   ├── image_filenames                 # dtype=np.uint8, shape=(1281166,76)  (note: string in ASCII format)
│   ├── classes                         # dtype=np.uint8, shape=(1000,10)     (note: string in ASCII format)
│   ├── labels                          # dtype=np.uint8, shape=(1000,122)
│   ├── descriptions                    # dtype=np.uint8, shape=(1000,256)    (note: string in ASCII format)
│   ├── object_fields                   # dtype=np.uint8, shape=(2,16)        (note: string in ASCII format)
│   ├── object_ids                      # dtype=np.int32, shape=(1281166,2)
│   └── list_image_filenames_per_class  # dtype=np.int32, shape=(1000,1300)
│
└── val/
    ├── image_filenames                 # dtype=np.uint8, shape=(50000,67)    (note: string in ASCII format)
    ├── classes                         # dtype=np.uint8, shape=(1000,10)     (note: string in ASCII format)
    ├── labels                          # dtype=np.uint8, shape=(1000,122)
    ├── descriptions                    # dtype=np.uint8, shape=(1000,256)    (note: string in ASCII format)
    ├── object_fields                   # dtype=np.uint8, shape=(2,16)        (note: string in ASCII format)
    ├── object_ids                      # dtype=np.int32, shape=(50000,2)
    └── list_image_filenames_per_class  # dtype=np.int32, shape=(1000,50)
Fields
image_filenames
: image file path + name
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
classes
: class names
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
labels
: label names
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
descriptions
: class descriptions
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
object_fields
: list of field names of the object id list
    available in
    : train, val
    dtype
    : np.uint8
    is padded
    : True
    fill value
    : 0
    note
    : strings stored in ASCII format
    note
    : key field (field name aggregator)
object_ids
: list of field ids
    available in
    : train, val
    dtype
    : np.int32
    is padded
    : False
    fill value
    : -1
    note
    : key field (field id aggregator)
list_image_filenames_per_class
: list of image filenames per class
    available in
    : train, val
    dtype
    : np.int32
    is padded
    : True
    fill value
    : -1
    note
    : pre-ordered list
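Because `list_image_filenames_per_class` is padded with -1, its rows need trimming before use: the train array has 1000 × 1300 = 1,300,000 slots for 1,281,166 images, so some rows must contain fill values, while the val array (1000 × 50 = 50,000) has no spare slots. The sketch below shows one way to trim them. The helper names are hypothetical, and reading the int32 entries as row indices into `image_filenames` is an assumption based on the dtype.

```python
def images_of_class(per_class_rows, class_idx, fill_value=-1):
    """Return the entries listed for one class, with padding removed."""
    return [int(i) for i in per_class_rows[class_idx] if int(i) != fill_value]


def class_sizes(per_class_rows, fill_value=-1):
    """Count the real (non-padding) entries in every row."""
    return [
        len(images_of_class(per_class_rows, c, fill_value))
        for c in range(len(per_class_rows))
    ]


if __name__ == "__main__":
    # Toy table: class 0 has three images, class 1 has one plus padding.
    rows = [[0, 1, 2], [3, -1, -1]]
    print(images_of_class(rows, 1))  # [3]
    print(class_sizes(rows))         # [3, 1]
```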