MNIST Handwritten Digit Database
The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size (28x28 pixel) image.
Use cases
Image classification.
Properties
name
: mnist
keywords
: image_processing, classification
dataset size
: 11.6 MB
is downloadable
: yes
tasks
: classification (default)
Tasks
classification (default)
How to use
>>> # import the package
>>> import dbcollection as dbc
>>>
>>> # load the dataset
>>> mnist = dbc.load('mnist', 'classification')
>>> mnist
DataLoader: "mnist" (classification task)
Properties
primary use
: image classification
description
: Contains image tensors and label annotations for image classification.
sets
: train, test
metadata file size on disk
: 6.8 MB
has annotations
: yes
which
: labels for each image class/category.
available fields
: classes, images, labels, object_fields, object_ids, list_images_per_class
HDF5 file structure
/
├── train/
│   ├── classes                # dtype=np.uint8, shape=(10,2) (note: string in ASCII format)
│   ├── images                 # dtype=np.uint8, shape=(60000,28,28)
│   ├── labels                 # dtype=np.uint8, shape=(60000,)
│   ├── object_fields          # dtype=np.uint8, shape=(2,7) (note: string in ASCII format)
│   ├── object_ids             # dtype=np.int32, shape=(60000,2)
│   └── list_images_per_class  # dtype=np.int32, shape=(10,6742)
│
└── test/
    ├── classes                # dtype=np.uint8, shape=(10,2) (note: string in ASCII format)
    ├── images                 # dtype=np.uint8, shape=(10000,28,28)
    ├── labels                 # dtype=np.uint8, shape=(10000,)
    ├── object_fields          # dtype=np.uint8, shape=(2,7) (note: string in ASCII format)
    ├── object_ids             # dtype=np.int32, shape=(10000,2)
    └── list_images_per_class  # dtype=np.int32, shape=(10,1135)
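The layout above can be checked against a local copy of the metadata file. The following is a minimal sketch, not part of dbcollection: it assumes the file can be opened directly with the third-party h5py package, and the helper names `describe_split` and `describe_file` are hypothetical.

```python
def describe_split(group):
    """Return {dataset_name: (shape, dtype_as_str)} for one split group.

    `group` can be any object with an `.items()` method whose values expose
    `.shape` and `.dtype`; an h5py.Group holding datasets behaves this way.
    """
    return {name: (tuple(ds.shape), str(ds.dtype)) for name, ds in group.items()}


def describe_file(path):
    """Open a metadata file read-only and describe its train and test groups.

    `path` is whatever location the metadata file has on your machine
    (an assumption; this page does not state it).
    """
    import h5py  # third-party; imported lazily so describe_split has no dependencies

    with h5py.File(path, "r") as f:
        return {split: describe_split(f[split]) for split in ("train", "test")}
```

If the file matches the structure above, the entry for `train` should list `images` with shape `(60000, 28, 28)` and dtype `uint8`, and `test` should list it with shape `(10000, 28, 28)`.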
Fields
classes
: class names
available in
: train, test
dtype
: np.uint8
is padded
: True
fill value
: 0
note
: strings stored in ASCII format
images
: images tensor
available in
: train, test
dtype
: np.uint8
is padded
: False
fill value
: -1
labels
: class ids
available in
: train, test
dtype
: np.uint8
is padded
: False
fill value
: -1
object_fields
: list of field names of the object id list
available in
: train, test
dtype
: np.uint8
is padded
: True
fill value
: 0
note
: strings stored in ASCII format
note
: key field (field name aggregator)
object_ids
: list of field ids
available in
: train, test
dtype
: np.int32
is padded
: False
fill value
: -1
note
: key field (field id aggregator)
list_images_per_class
: list of image ids per class
available in
: train, test
dtype
: np.int32
is padded
: True
fill value
: -1
note
: pre-ordered list
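The padded fields need a small amount of decoding before use. The sketch below works on plain nested lists, and on NumPy rows as well, and shows three steps: turning zero-padded ASCII rows (`classes`, `object_fields`) into strings, resolving one `object_ids` row into per-field values, and dropping the `-1` padding from `list_images_per_class`. The helper names are hypothetical, the toy data is made up for illustration, and the column-to-field mapping in `resolve_object` is an assumption based on the "key field" notes above rather than documented behaviour.

```python
def decode_ascii_rows(rows, fill_value=0):
    """Decode rows of ASCII codes into strings, dropping trailing padding."""
    pad = bytes([fill_value])
    return [bytes(int(v) for v in row).rstrip(pad).decode("ascii") for row in rows]


def resolve_object(object_row, field_names, fields):
    """Map one object_ids row to {field_name: value}.

    Assumption: column j of the row is an index into the field whose
    name is field_names[j].
    """
    return {name: fields[name][int(idx)] for name, idx in zip(field_names, object_row)}


def images_of_class(list_images_per_class, class_idx, fill_value=-1):
    """Return the image ids of one class with the padding removed."""
    return [int(i) for i in list_images_per_class[class_idx] if int(i) != fill_value]


# Toy stand-ins for the real arrays (illustrative values only).
classes_raw = [[48, 0], [49, 0]]                      # "0" and "1", zero-padded
object_fields_raw = [
    [105, 109, 97, 103, 101, 115, 0],                 # "images"
    [108, 97, 98, 101, 108, 115, 0],                  # "labels"
]
fields = {"images": ["img0", "img1", "img2"], "labels": [1, 0, 1]}
object_ids = [[0, 0], [1, 1], [2, 2]]
per_class = [[1, -1], [0, 2]]                         # class 0 is padded with -1

class_names = decode_ascii_rows(classes_raw)
field_names = decode_ascii_rows(object_fields_raw)
third_object = resolve_object(object_ids[2], field_names, fields)
class0_images = images_of_class(per_class, 0)
```

The padding explains the second dimension of `list_images_per_class`: every row is as wide as the largest class, which in MNIST is the digit 1 with 6742 training images and 1135 test images, and shorter rows are filled with `-1`.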