UCF101 - Action Recognition
UCF101 is an action recognition data set of realistic action videos collected from YouTube, covering 101 action categories. It extends the UCF50 data set, which has 50 action categories.
With 13320 videos across 101 action categories, UCF101 offers a wide diversity of actions. The videos vary widely in camera motion, object appearance and pose, object scale, viewpoint, background clutter, and illumination conditions, which led its authors to describe it as the most challenging data set of its kind at the time of release.
The videos in each of the 101 action categories are split into 25 groups, and each group contains 4-7 videos of that action.
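The group structure can be recovered from the video names. The sketch below assumes the naming convention of the original UCF101 release, `v_<Action>_g<group>_c<clip>.avi`, which this page does not state itself; `parse_video_name` is a hypothetical helper, not part of any library.

```python
import re

# Assumed pattern: v_<Action>_g<group>_c<clip>, with an optional .avi extension.
_NAME_RE = re.compile(
    r"^v_(?P<action>[A-Za-z]+)_g(?P<group>\d+)_c(?P<clip>\d+)(?:\.avi)?$"
)


def parse_video_name(name):
    """Split a UCF101-style video name into (action, group, clip)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected video name: {name!r}")
    return match["action"], int(match["group"]), int(match["clip"])
```

Keeping the group number is useful when building custom splits, since clips from the same group may come from the same source video and are usually kept on one side of a train/test boundary.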
Use cases
Human action recognition in videos.
Properties
name
: ucf_101
keywords
: image_processing, recognition, activity, human, single_person
dataset size
: 6.9 GB
is downloadable
: yes
tasks
:- recognition: (default)
  primary use
  : action recognition in videos
  description
  : Contains videos and action label annotations for action recognition
  sets
  : train01, train02, train03, test01, test02, test03
  metadata file size in disk
  : 14.5 MB
  has annotations
  : yes
  which
  :- activity labels for each video.
Metadata structure (HDF5)
Task: recognition
/
├── train01/
│ ├── activities # dtype=np.uint8, shape=(101,19) (note: string in ASCII format)
│ ├── image_filenames # dtype=np.uint8, shape=(1788425,113) (note: string in ASCII format)
│ ├── total_frames # dtype=np.int32, shape=(9537,)
│ ├── video_filenames # dtype=np.uint8, shape=(9537,60)
│ ├── videos # dtype=np.uint8, shape=(9537,29) (note: string in ASCII format)
│ ├── object_fields # dtype=np.uint8, shape=(5,31) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(9537,5)
│ ├── list_image_filenames_per_video # dtype=np.int32, shape=(9537,1776)
│ └── list_videos_per_activity # dtype=np.int32, shape=(101,121)
│
├── test01/
│ ├── activities # dtype=np.uint8, shape=(101,19) (note: string in ASCII format)
│ ├── image_filenames # dtype=np.uint8, shape=(697865,113) (note: string in ASCII format)
│ ├── total_frames # dtype=np.int32, shape=(3783,)
│ ├── video_filenames # dtype=np.uint8, shape=(3783,60)
│ ├── videos # dtype=np.uint8, shape=(3783,29) (note: string in ASCII format)
│ ├── object_fields # dtype=np.uint8, shape=(5,31) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(3783,5)
│ ├── list_image_filenames_per_video # dtype=np.int32, shape=(3783,900)
│ └── list_videos_per_activity # dtype=np.int32, shape=(101,49)
│
├── train02/
│ ├── activities # dtype=np.uint8, shape=(101,19) (note: string in ASCII format)
│ ├── image_filenames # dtype=np.uint8, shape=(1791290,113) (note: string in ASCII format)
│ ├── total_frames # dtype=np.int32, shape=(9586,)
│ ├── video_filenames # dtype=np.uint8, shape=(9586,60)
│ ├── videos # dtype=np.uint8, shape=(9586,29) (note: string in ASCII format)
│ ├── object_fields # dtype=np.uint8, shape=(5,31) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(9586,5)
│ ├── list_image_filenames_per_video # dtype=np.int32, shape=(9586,1776)
│ └── list_videos_per_activity # dtype=np.int32, shape=(101,122)
│
├── test02/
│ ├── activities # dtype=np.uint8, shape=(101,19) (note: string in ASCII format)
│ ├── image_filenames # dtype=np.uint8, shape=(695000,113) (note: string in ASCII format)
│ ├── total_frames # dtype=np.int32, shape=(3734,)
│ ├── video_filenames # dtype=np.uint8, shape=(3734,60)
│ ├── videos # dtype=np.uint8, shape=(3734,29) (note: string in ASCII format)
│ ├── object_fields # dtype=np.uint8, shape=(5,31) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(3734,5)
│ ├── list_image_filenames_per_video # dtype=np.int32, shape=(3734,833)
│ └── list_videos_per_activity # dtype=np.int32, shape=(101,49)
│
├── train03/
│ ├── activities # dtype=np.uint8, shape=(101,19) (note: string in ASCII format)
│ ├── image_filenames # dtype=np.uint8, shape=(1786111,113) (note: string in ASCII format)
│ ├── total_frames # dtype=np.int32, shape=(9624,)
│ ├── video_filenames # dtype=np.uint8, shape=(9624,60)
│ ├── videos # dtype=np.uint8, shape=(9624,29) (note: string in ASCII format)
│ ├── object_fields # dtype=np.uint8, shape=(5,31) (note: string in ASCII format)
│ ├── object_ids # dtype=np.int32, shape=(9624,5)
│ ├── list_image_filenames_per_video # dtype=np.int32, shape=(9624,900)
│ └── list_videos_per_activity # dtype=np.int32, shape=(101,124)
│
└── test03/
    ├── activities # dtype=np.uint8, shape=(101,19) (note: string in ASCII format)
    ├── image_filenames # dtype=np.uint8, shape=(700157,113) (note: string in ASCII format)
    ├── total_frames # dtype=np.int32, shape=(3696,)
    ├── video_filenames # dtype=np.uint8, shape=(3696,60)
    ├── videos # dtype=np.uint8, shape=(3696,29) (note: string in ASCII format)
    ├── object_fields # dtype=np.uint8, shape=(5,31) (note: string in ASCII format)
    ├── object_ids # dtype=np.int32, shape=(3696,5)
    ├── list_image_filenames_per_video # dtype=np.int32, shape=(3696,1776)
    └── list_videos_per_activity # dtype=np.int32, shape=(101,48)
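The shapes in the tree suggest that each trainNN/testNN pair partitions the full set of 13320 videos. As a quick consistency check, the sketch below adds up the video counts per split; the numbers are copied from the first dimension of `total_frames`, while `SPLITS` and `split_totals` are illustrative names only.

```python
# (train, test) video counts per split, taken from the first dimension of
# `total_frames` in the metadata tree.
SPLITS = {
    "01": (9537, 3783),
    "02": (9586, 3734),
    "03": (9624, 3696),
}


def split_totals(splits):
    """Return {split: train + test} for each split."""
    return {name: train + test for name, (train, test) in splits.items()}


totals = split_totals(SPLITS)
```

All three totals come out to 13320, the number of videos in the data set, so every video appears exactly once in each train/test pair if the sets do not overlap.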
Fields
activities
: activity names
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.uint8
  is padded
  : True
  fill value
  : 0
  note
  : strings stored in ASCII format
image_filenames
: image file path+name
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.uint8
  is padded
  : True
  fill value
  : 0
  note
  : strings stored in ASCII format
total_frames
: number of frames per video
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.int32
  is padded
  : False
  fill value
  : -1
videos
: video name
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.uint8
  is padded
  : True
  fill value
  : 0
  note
  : strings stored in ASCII format
video_filenames
: video file path+name
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.uint8
  is padded
  : True
  fill value
  : 0
  note
  : strings stored in ASCII format
object_fields
: list of field names of the object id list
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.uint8
  is padded
  : True
  fill value
  : 0
  note
  : strings stored in ASCII format
  note
  : key field (field name aggregator)
object_ids
: list of field ids
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.int32
  is padded
  : False
  fill value
  : -1
  note
  : key field (field id aggregator)
list_image_filenames_per_video
: list of image ids per video
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.int32
  is padded
  : True
  fill value
  : -1
  note
  : pre-ordered list
list_videos_per_activity
: list of video ids per activity
  available in
  : train01, train02, train03, test01, test02, test03
  dtype
  : np.int32
  is padded
  : True
  fill value
  : -1
  note
  : pre-ordered list
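The fields above share three storage conventions: strings are zero-padded rows of ASCII codes, index lists are padded with -1, and `object_ids` rows are read against the names in `object_fields`. The sketch below shows one way to unpack each of them once the arrays have been read from the HDF5 file (for example with h5py). The helper names and the toy sample values are hypothetical. It also assumes that column j of `object_ids` refers to the field named in row j of `object_fields`, and that row k of `list_videos_per_activity` matches row k of `activities`; this page does not spell out either convention.

```python
def decode_ascii(row):
    """Decode one zero-padded row of ASCII codes into a string."""
    # int() lets this accept plain lists as well as NumPy uint8 rows.
    return bytes(int(c) for c in row).rstrip(b"\x00").decode("ascii")


def strip_padding(row, fill=-1):
    """Drop fill values from a padded index row."""
    return [int(i) for i in row if i != fill]


def resolve_object(object_fields, object_ids, index):
    """Pair each field name with its id for one object (assumed layout)."""
    names = [decode_ascii(r) for r in object_fields]
    return dict(zip(names, (int(i) for i in object_ids[index])))


def video_to_activity(list_videos_per_activity, fill=-1):
    """Invert the per-activity lists into {video id: activity index}."""
    return {
        video_id: activity_idx
        for activity_idx, row in enumerate(list_videos_per_activity)
        for video_id in strip_padding(row, fill)
    }


# Toy stand-ins for arrays read from the metadata file (not real values).
activities = [[89, 111, 89, 111, 0, 0]]  # "YoYo" padded to width 6
object_fields = [list(b"videos\x00\x00\x00\x00"), list(b"activities")]
object_ids = [[0, 0], [1, 0]]
list_videos_per_activity = [[0, 1, -1]]

activity_names = [decode_ascii(r) for r in activities]
first_object = resolve_object(object_fields, object_ids, 1)
labels = video_to_activity(list_videos_per_activity)
```

Converting each element with `int()` is slower than vectorized NumPy operations, but it keeps the helpers independent of how the arrays were loaded; for the large `image_filenames` arrays a vectorized decode would likely be preferable.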