MNIST Handwritten Digit Database
The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size 28x28 image.
Use cases
Image classification.
Properties
name: mnist
keywords: image_processing, classification
dataset size: 11.6 MB
is downloadable: yes
tasks: classification (default)
Tasks
classification (default)
How to use
>>> # import the package
>>> import dbcollection as dbc
>>>
>>> # load the dataset
>>> mnist = dbc.load('mnist', 'classification')
>>> mnist
DataLoader: "mnist" (classification task)
Properties
primary use: image classification
description: Contains image tensors and label annotations for image classification.
sets: train, test
metadata file size on disk: 6.8 MB
has annotations: yes
    which:
        - labels for each image class/category.
available fields:
    classes
    images
    labels
    object_fields
    object_ids
    list_images_per_class
HDF5 file structure
/
├── train/
│   ├── classes                 # dtype=np.uint8, shape=(10,2) (note: string in ASCII format)
│   ├── images                  # dtype=np.uint8, shape=(60000,28,28)
│   ├── labels                  # dtype=np.uint8, shape=(60000,)
│   ├── object_fields           # dtype=np.uint8, shape=(2,7) (note: string in ASCII format)
│   ├── object_ids              # dtype=np.int32, shape=(60000,2)
│   └── list_images_per_class   # dtype=np.int32, shape=(10,6742)
│
└── test/
    ├── classes                 # dtype=np.uint8, shape=(10,2) (note: string in ASCII format)
    ├── images                  # dtype=np.uint8, shape=(10000,28,28)
    ├── labels                  # dtype=np.uint8, shape=(10000,)
    ├── object_fields           # dtype=np.uint8, shape=(2,7) (note: string in ASCII format)
    ├── object_ids              # dtype=np.int32, shape=(10000,2)
    └── list_images_per_class   # dtype=np.int32, shape=(10,1135)
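The string-valued entries in the tree above (classes and object_fields) are stored as rows of ASCII codes padded with 0. The following is a minimal sketch of how such rows could be turned back into Python strings, and how the decoded object_fields names could label one row of object_ids. It uses plain lists as stand-ins for the HDF5 datasets. The helper names, the field names "images" and "labels", the padding width, and the sample object_ids row are assumptions chosen to match the shapes (2,7) and (10,2); they are not taken from the package.

```python
def decode_ascii_rows(rows, fill=0):
    """Turn rows of ASCII codes into strings, dropping the fill value."""
    return ["".join(chr(code) for code in row if code != fill) for row in rows]


def encode_ascii_rows(strings, fill=0):
    """Hypothetical inverse: pad each string to the longest length plus one fill."""
    width = max(len(s) for s in strings) + 1
    return [[ord(ch) for ch in s] + [fill] * (width - len(s)) for s in strings]


# Toy stand-ins for the stored arrays (assumed contents, see the note above).
object_fields_raw = encode_ascii_rows(["images", "labels"])    # 2 rows of 7 codes
classes_raw = encode_ascii_rows([str(d) for d in range(10)])   # 10 rows of 2 codes

field_names = decode_ascii_rows(object_fields_raw)
class_names = decode_ascii_rows(classes_raw)

# One object_ids row holds one integer per name in object_fields, in that order.
object_ids_row = [7, 7]  # made-up values, for illustration only
record = dict(zip(field_names, object_ids_row))

print(field_names)
print(class_names)
print(record)
```

With real data, the same decoding would apply to the uint8 arrays read from the file, one row per string.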
Fields
classes: class names
    available in: train, test
    dtype: np.uint8
    is padded: True
    fill value: 0
    note: strings stored in ASCII format
images: images tensor
    available in: train, test
    dtype: np.uint8
    is padded: False
    fill value: -1
labels: class ids
    available in: train, test
    dtype: np.uint8
    is padded: False
    fill value: -1
object_fields: list of field names of the object id list
    available in: train, test
    dtype: np.uint8
    is padded: True
    fill value: 0
    note: strings stored in ASCII format
    note: key field (field name aggregator)
object_ids: list of field ids
    available in: train, test
    dtype: np.int32
    is padded: False
    fill value: -1
    note: key field (field id aggregator)
list_images_per_class: list of image ids per class
    available in: train, test
    dtype: np.int32
    is padded: True
    fill value: -1
    note: pre-ordered list
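Because list_images_per_class is padded with -1, each row has the same length even though the classes have different numbers of images. The sketch below shows one way such a table could be built from a label list and read back without the padding. The function names and the tiny label list are hypothetical, and the assumption that the row width equals the largest class count is inferred from the shapes (10,6742) and (10,1135) rather than stated by the package.

```python
def build_images_per_class(labels, num_classes, fill=-1):
    """Group image ids by label and right-pad each row with `fill`."""
    rows = [[] for _ in range(num_classes)]
    for image_id, label in enumerate(labels):
        rows[label].append(image_id)  # ids end up in ascending order
    width = max(len(row) for row in rows)
    return [row + [fill] * (width - len(row)) for row in rows]


def images_of_class(table, class_id, fill=-1):
    """Return the image ids of one class, dropping the padding."""
    return [image_id for image_id in table[class_id] if image_id != fill]


labels = [2, 0, 1, 1, 2, 1]  # toy labels for six images and three classes
table = build_images_per_class(labels, num_classes=3)

print(table)                      # [[1, -1, -1], [2, 3, 5], [0, 4, -1]]
print(images_of_class(table, 0))  # [1]
```

When reading the stored field, only the filtering step is needed: drop every -1 from the row of the class of interest before using the ids to index images.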