Video-Action-Recognition

🔥🔥🔥The Journey of Action Recognition✈️

👋👋👋 A collection of methods and datasets in the journey of action recognition.

📌 For more details, please refer to our paper.

🛠️ If you find a mistake or have any suggestions, please let us know by e-mail: [email protected]

📑 Citation

If you find our work useful for your research, please cite the following paper:

@inproceedings{10.1145/3701716.3717746,
author = {Ding, Xi and Wang, Lei},
title = {The Journey of Action Recognition},
year = {2025},
isbn = {9798400713316},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3701716.3717746},
doi = {10.1145/3701716.3717746},
booktitle = {Companion Proceedings of the ACM on Web Conference 2025},
pages = {1869–1884},
numpages = {16},
keywords = {action recognition, data, learning paradigm, model architectures},
location = {Sydney NSW, Australia},
series = {WWW '25}
}

🚀 News

[10/02/2025] 🎁 The GitHub repository for our paper has been released.
[27/01/2025] 🎈 Our paper has been accepted as an oral presentation in the Companion Proceedings of The Web Conference 2025 (WWW 2025).

🧰 Methods Used in The Journey of Action Recognition

Handcrafted Methods

Table 1

Model	Venue	Learning	Dataset	Modality	Code
HL-STIP	IJCV 2005	Supervised	Outdoor scenes	RGB	-
Spatio-temporal Cuboids	VS-PETS 2005	Supervised	Human Action Dataset	RGB	-
3D-SURF	ECCV 2006	Supervised	Mikolajczyk	RGB	-
3D-SIFT	ACM MM 2007	Supervised	Weizmann	RGB	-
NNMF Detector	ICCV 2007	Supervised	KTH	RGB	-
HOG3D	BMVC 2008	Supervised	KTH, Weizmann, Hollywood	RGB	-
Laptev et al.	CVPR 2008	Supervised	KTH	RGB + Optical flow	-
Action MACH	CVPR 2008	Supervised	KTH, Weizmann	RGB + Optical flow	-
Extended SURF	ECCV 2008	Supervised	KTH, TRECVID 2006	RGB	-
LTP	ICCV 2009	Supervised	KTH, Hollywood, Kissing and slapping dataset, UCF Sports	RGB	-
Messing et al.	ICCV 2009	Supervised	KTH	RGB	-
Bregonzio et al.	CVPR 2009	Supervised	KTH, Weizmann	RGB	-
Tracklet Descriptors	ECCV 2010	Supervised	KTH, ADL, Hollywood	RGB + Optical flow	-
Dense Long-Duration Trajectories	ICME 2010	Supervised	KTH	RGB + Optical flow	-
Dense Trajectories	IJCV 2013	Supervised	KTH, YouTube, Hollywood2, UCF Sports, IXMAS, Olympic Sports, UCF50, UIUC, HMDB51	RGB + Optical flow	-
iDT	ICCV 2013	Supervised	Hollywood2, HMDB51, Olympic Sports, UCF50	RGB + Optical flow	-
Taylor videos	ICML 2024	Supervised	HMDB51, CATER, MPII Cooking, Kinetics-400, -600, Something-Something V2, NTU RGB+D, Kinetics-skeleton	RGB + Skeleton	GitHub

2D-based Methods

Table 2

Model	Venue	Learning	Dataset	Modality	Code
Slow fusion	CVPR 2014	Supervised	Sports-1M, UCF101	RGB	GitHub
CNN-LSTM	CVPR 2015	Supervised	Sports-1M, UCF101	RGB + Optical flow	GitHub
LRCN	CVPR 2015	Supervised	UCF101	RGB + Optical flow	GitHub
Composite LSTM	ICML 2015	Unsupervised	UCF101, HMDB51	RGB	GitHub
Rank Pooling	TPAMI 2016	Supervised	HMDB51, Hollywood2, MPII Cooking	RGB + Optical flow	-
LENN	CVPR 2016	Supervised	UCF101	RGB	-
Bilen et al.	TPAMI 2017	Supervised	UCF101, HMDB51	RGB	-
TSN	TPAMI 2018	Supervised	HMDB51, UCF101, Kinetics-400, ActivityNet, THUMOS14	RGB + RGB differences + Optical flow + Audio	GitHub
Attention-LSTM	CVPR 2018	Supervised	UCF101, HMDB51, Kinetics-400	RGB + Optical flow + Audio	GitHub
PEAR	ICME 2019	Reinforcement	UCF101, Sports-1M	RGB + Optical flow	-
TSM	ICCV 2019	Supervised	Something-Something V1, V2, Kinetics-400, UCF101, HMDB51	RGB	GitHub
VINCE	arXiv 2020	Self-supervised	Kinetics-400	RGB	GitHub
C²LSTM	Neurocomputing 2020	Supervised	UCF101, HMDB51	RGB	-
MoCo	CVPR 2021	Self-supervised	Kinetics-400, UCF101, HMDB51	RGB	GitHub
TCL	CVPR 2021	Semi-supervised + Contrastive	Mini-Something-V2, Kinetics-400, Charades-Ego	RGB	GitHub
TDN	CVPR 2021	Supervised	Something-Something V1, V2, Kinetics-400	RGB	GitHub
DB-LSTM	Neurocomputing 2021	Supervised	UCF101, HMDB51	RGB + Optical flow	-
SeCo	AAAI 2021	Self-supervised	Kinetics-400, UCF101, HMDB51, ActivityNet	RGB	GitHub
Xiao et al.	CVPR 2022	Semi-supervised + Contrastive	Kinetics-400, UCF101, HMDB51	RGB	GitHub
GCSM	ACM MM 2023	Few-shot	UCF101, HMDB51, Kinetics-400	RGB	-
GgHM	ICCV 2023	Few-shot	HMDB51, UCF101, Kinetics-400, Something-Something V2	RGB	GitHub

3D-based Methods

Table 3

Model	Venue	Learning	Dataset	Modality	Code
C3D	ICCV 2015	Supervised	UCF101	RGB	GitHub
I3D	CVPR 2017	Supervised	Kinetics-400, UCF101, HMDB51	RGB	GitHub
P3D	ICCV 2017	Supervised	Sports-1M, UCF101, ActivityNet	RGB	GitHub
ResNet3D	CVPR 2018	Supervised	Kinetics-400, UCF101, HMDB51, ActivityNet	RGB	GitHub
S3D	ECCV 2018	Supervised	Kinetics-400, Something-Something V1, UCF101, HMDB51	RGB + Optical flow	GitHub
CSN	ICCV 2019	Supervised	Sports-1M, Kinetics-400, Something-Something V1	RGB	GitHub
SlowFast	ICCV 2019	Supervised	Kinetics-400, Kinetics-600, Charades, AVA	RGB	GitHub
STM	ICCV 2019	Supervised	Something-Something V1, Something-Something V2, Kinetics-400, UCF101, HMDB51	RGB	-
DEEP-HAL	ICCV 2019	Self-supervised	HMDB51, Charades, MPII Cooking	RGB + Optical flow	-
Xv et al.	CVPR 2019	Self-supervised	UCF101, HMDB51	RGB	-
X3D	CVPR 2020	Supervised	Kinetics-400, Kinetics-600, Charades, AVA	RGB	GitHub
TPN	CVPR 2020	Supervised	Kinetics-400, Something-Something V1, Something-Something V2, Epic-Kitchens	RGB	GitHub
SpeedNet	CVPR 2020	Self-supervised	Kinetics-400, UCF101, HMDB51, NfS	RGB	GitHub
CoCLR	NeurIPS 2020	Self-supervised	UCF101, HMDB51, Kinetics-400	RGB + Optical flow	GitHub
VTHCL	arXiv 2020	Self-supervised	Kinetics-400, UCF101, HMDB51	RGB	GitHub
MvPL	ICCV 2021	Semi-supervised	Kinetics-400, UCF101, HMDB51	RGB + Optical flow	-
CVRL	CVPR 2021	Self-supervised	Kinetics-400, Kinetics-600, UCF101, HMDB51	RGB	GitHub
Yang et al.	CVPR 2021	Supervised	Kinetics-400, Kinetics-700, Charades, Something-Something V1, AVA	RGB	-
3DResNet+ATFR	CVPR 2021	Supervised	Kinetics-400, Kinetics-600, UCF101, HMDB51, Something-Something V2	RGB	-
MoViNet	CVPR 2021	Supervised	Kinetics-400, Kinetics-600, Kinetics-700, Something-Something V2, Epic-Kitchens-100, MiT, Charades	RGB	GitHub
ODF+SDF	ACM MM 2021	Self-supervised	HMDB51, Charades, MPII Cooking, Epic-Kitchens	RGB + Optical flow + object/saliency detectors	-
CLASTER	ECCV 2022	Reinforcement+Zero-shot	UCF101, HMDB51, Olympic Sports	RGB + Optical flow + Semantic embeddings	-
TFCNet	arXiv 2022	Supervised	Diving48, CATER	RGB	-
Multi-Transforms	ICMEW 2024	Self-supervised	UCF101, HMDB51	RGB	-
HoT	ICASSP 2024	Supervised	HMDB51, MPII Cooking	RGB + Optical flow	-
Flow corr.	ICASSP 2024	Supervised	HMDB51, Charades, MPII Cooking	RGB + Optical flow	-

Two-stream Methods

Table 4

Model	Venue	Learning	Dataset	Modality	Code
Two-Stream ConvNet	NeurIPS 2014	Supervised	UCF101, HMDB51	RGB + Optical flow	GitHub
P-CNN	ICCV 2015	Supervised	JHMDB, MPII Cooking	RGB + Optical flow + Joint	-
TDD	CVPR 2015	Supervised	HMDB51, UCF101	RGB + Optical flow	GitHub
Two-Stream Fusion	CVPR 2016	Supervised	UCF101, HMDB51	RGB + Optical flow	GitHub
TSN-Two-Stream	ECCV 2016	Supervised	HMDB51, UCF101	RGB + RGB differences + Optical flow + Warped optical flow	GitHub
DOVF	CVPR 2017	Supervised	UCF101, HMDB51	RGB + Optical flow	GitHub
TLE	CVPR 2017	Supervised	UCF101, HMDB51	RGB + Optical flow	GitHub
ActionVLAD	CVPR 2017	Supervised	HMDB51, UCF101, Charades	RGB + Optical flow	-
TRN-Two-Stream	ECCV 2018	Supervised	Something-Something V1, Something-Something V2, Charades	RGB	GitHub
TSM-Two-Stream	ICCV 2019	Supervised	Something-Something V1, Something-Something V2, Kinetics-400, UCF101, HMDB51	RGB + Optical flow	GitHub
KTSN	arXiv 2020	Supervised	FSD-10	RGB + Optical flow + Skeleton	-
MSM-ResNets	IVC 2021	Supervised	UCF101, HMDB51	RGB + Optical flow + Motion saliency	-
MAT-EffNet	MMSys 2023	Supervised	UCF101, HMDB51, Kinetics-400	RGB + Optical flow	-
TTFA	SPL 2024	Few-shot	Something-Something V2, Kinetics-400	RGB + Optical flow	-

(2+1)D-based Methods

Table 5

Model	Venue	Learning	Dataset	Modality	Code
R(2+1)D	CVPR 2018	Supervised	Kinetics-400, Sports-1M, UCF101, HMDB51	RGB + Optical flow	GitHub
R(2+1)D+BERT	ECCVW 2020	Supervised	HMDB51, UCF101	RGB	GitHub
XDC	NeurIPS 2020	Self-supervised	HMDB51, UCF101	RGB + Audio	GitHub
ELo	CVPR 2020	Self-supervised	Kinetics-400, UCF101, HMDB51	RGB + Optical flow + Audio	-
Jin et al.	ICICSP 2021	Supervised	UCF101	RGB	-
GDT	arXiv 2021	Self-supervised	Kinetics-400, UCF101, HMDB51	RGB + Audio	-
AVID	CVPR 2021	Self-supervised	Kinetics-400, UCF101, HMDB51	RGB + Audio	GitHub

Transformer-based Methods

Table 6

Model	Venue	Learning	Dataset	Modality	Code
VTN	ICCV 2021	Supervised	Kinetics-400, MiT	RGB	GitHub
TimeSformer	ICML 2021	Supervised	Kinetics-400, Kinetics-600	RGB	GitHub
STAM	arXiv 2021	Supervised	Kinetics-400, UCF101, Charades	RGB	GitHub
ViViT	ICCV 2021	Supervised	Kinetics-400, Kinetics-600, Epic-Kitchens-100, MiT, Something-Something V2	RGB	GitHub
MViT	ICCV 2021	Supervised	Kinetics-400, Kinetics-600, Something-Something V2, Charades, AVA	RGB	GitHub
Motionformer	NeurIPS 2021	Supervised	Kinetics-400, Kinetics-600, Something-Something V2, Epic-Kitchens-100	RGB	GitHub
X-ViT	NeurIPS 2021	Supervised	Kinetics-400, Kinetics-600, Something-Something V2, Epic-Kitchens-100	RGB	GitHub
TallFormer	ECCV 2022	Supervised	THUMOS14, ActivityNet	RGB	GitHub
VideoSwin	CVPR 2022	Supervised	Kinetics-400, Kinetics-600, Something-Something V2	RGB	GitHub
ORViT	CVPR 2022	Supervised	Something-Something V2, SomethingElse, Diving48, AVA, Epic-Kitchens-100	RGB	GitHub
BEVT	CVPR 2022	Self-supervised	Kinetics-400, Something-Something V2, Diving48	RGB	GitHub
MaskFeat	CVPR 2022	Self-supervised	Kinetics-400, Kinetics-600, Kinetics-700	RGB	GitHub
UniFormer	arXiv 2022	Supervised	Kinetics-400, Kinetics-600, Something-Something V1, V2	RGB	GitHub
VideoMAE	NeurIPS 2022	Self-supervised	Kinetics-400, Something-Something V2, UCF101, HMDB51, AVA	RGB	GitHub
MTV	CVPR 2022	Supervised	Kinetics-400, Kinetics-600, Kinetics-700, Something-Something V2, Epic-Kitchens-100, MiT	RGB	GitHub
MAE-ST	arXiv 2022	Self-supervised	Kinetics-400, Something-Something V2, AVA	RGB	GitHub
CAST	NeurIPS 2023	Supervised	Kinetics-400, Something-Something V2, Epic-Kitchens-100	RGB	GitHub
UniFormerV2	ICCV 2023	Supervised+Contrastive	Kinetics-400, Kinetics-600, Kinetics-700, MiT, Something-Something V1, V2, ActivityNet, HACS	RGB	-
OmniMAE	CVPR 2023	Self-supervised	Something-Something V2, Epic-Kitchens-100, Kinetics-400	RGB	GitHub
MVD	CVPR 2023	Self-supervised	Kinetics-400, Something-Something V2, UCF101, HMDB51	RGB	GitHub
Hiera	ICML 2023	Self-supervised	Kinetics-400, Kinetics-600, Kinetics-700, Something-Something V2, AVA	RGB	GitHub
VideoMAE V2	CVPR 2023	Self-supervised	Kinetics-400, Something-Something V2, UCF101, HMDB51	RGB	GitHub
SOAP	ACM MM 2024	Few-shot	Something-Something V2, Kinetics-400, UCF101, HMDB51	RGB	GitHub
C2C	ECCV 2024	Zero-shot	Sth-com	RGB	GitHub
VMPs	ACML 2024	Supervised	HMDB51, MPII Cooking 2, FineGym	RGB + Motion prompts	GitHub
TIME Layer	arXiv 2024	Self-supervised	UCF101, HMDB51, UWA3D Multiview Activity II, NTU RGB+D, NTU RGB+D 120	RGB + Depth	-

Skeleton-based Methods

Table 7

Model	Venue	Learning	Dataset	Modality	Code
Dynamic Skeletons	CVPR 2015	Supervised	MSRDailyActivity, CAD-60, SYSU 3D HOI	Depth + Joint	-
HBRNN-L	CVPR 2015	Supervised	MSRAction3D, Berkeley MHAD, HDM05	Joint	-
Part-aware LSTM	CVPR 2016	Supervised	NTU RGB+D	RGB + Depth + Joint + Infrared	GitHub
LARP-SO	CVPR 2016	Supervised	Florence3D-Action, MSRActionPairs3D, G3D-Gaming	Joint	-
STA-LSTM	AAAI 2017	Supervised	NTU RGB+D	Joint	-
LieNet	CVPR 2017	Supervised	NTU RGB+D, HDM05, G3D-Gaming	Joint + Bone	-
Two-Stream RNN	CVPR 2017	Supervised	NTU RGB+D	Joint	-
Ke et al.	CVPR 2017	Supervised	NTU RGB+D	Joint	-
VA-LSTM	ICCV 2017	Supervised	NTU RGB+D, SYSU 3D HOI	Joint	GitHub
View Invariant	Pattern Recognit. 2017	Supervised	NTU RGB+D, Northwestern-UCLA, UWA3D Multiview Activity II, MSRC-12	Joint	-
Two-Stream CNN	ICMEW 2017	Supervised	NTU RGB+D, PKU-MMD I	Joint + Skeleton motion	GitHub
LSTM-CNN	ICMEW 2017	Supervised	NTU RGB+D	Joint	-
ST-LSTM+Trust Gate	TPAMI 2018	Supervised	NTU RGB+D, MSRAction3D, SYSU 3D HOI, Berkeley MHAD	Joint	-
ST-GCN	AAAI 2018	Supervised	Kinetics-400, NTU RGB+D	Joint	GitHub
Tang et al.	CVPR 2018	Reinforcement	NTU RGB+D, SYSU 3D HOI, UTKinect-Action3D	Joint + Bone	-
AS-GCN	CVPR 2019	Supervised	NTU RGB+D, Kinetics-400	Joint + Bone	GitHub
2s-AGCN	CVPR 2019	Fully-supervised	NTU RGB+D, Kinetics-skeleton	Joint + Bone	GitHub
DGNN	CVPR 2019	Supervised	NTU RGB+D, Kinetics-skeleton	Joint + Bone	GitHub
EfficientGCN	ACM MM 2020	Supervised	NTU RGB+D, NTU RGB+D 120	Joint + Velocity + Bone	-
RA-GCN	TCSVT 2020	Supervised	NTU RGB+D, NTU RGB+D 120	Joint + Bone	gitee
Shift-GCN	CVPR 2020	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	GitHub
MS-G3D	CVPR 2020	Supervised	NTU RGB+D 60, NTU RGB+D 120, Kinetics-skeleton	Joint + Bone	GitHub
DSTA-Net	ACCV 2020	Supervised	NTU RGB+D, NTU RGB+D 120	Joint + Bone	-
SCK+DCK / SCK$\oplus$+DCK$\oplus$	TPAMI 2020	Supervised	UTKinect-Action3D, Florence3D-Action, MSRAction3D, NTU RGB+D 60, Kinetics-400, HMDB51, MPII Cooking	Joint	-
CTR-GCN	ICCV 2021	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	-
FGCN	TIP 2022	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	-
AGE-Ens	TNNLS 2022	Supervised	NTU RGB+D, NTU RGB+D 120	Joint + Bone	GitHub
PoseConv3D	CVPR 2022	Supervised	Kinetics-400, UCF101, HMDB51	Joint + Bone + RGB	GitHub
InfoGCN	CVPR 2022	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	GitHub
DASTM	ECCV 2022	Few-shot	NTU RGB+D 120, Kinetics-skeleton	Joint + Bone	-
Uncertainty-DTW	ECCV 2022	Supervised/Unsupervised few-shot	NTU RGB+D, NTU RGB+D 120, Kinetics-skeleton	Skeleton sequences	GitHub
TranSkeleton	TCSVT 2023	Supervised	NTU RGB+D, NTU RGB+D 120	Joint + Bone	-
HiCo	AAAI 2023	Unsupervised + Contrastive	NTU RGB+D, NTU RGB+D 120, PKU-MMD I, PKU-MMD II	Joint	GitHub
FR-Head	CVPR 2023	Supervised + Contrastive	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	GitHub
3Mformer	CVPR 2023	Supervised	NTU RGB+D, NTU RGB+D 120, Kinetics-400, Northwestern-UCLA	Joint + Hyper-edge	-
HYSP	ICLR 2023	Self-supervised	NTU RGB+D, NTU RGB+D 120, PKU-MMD I	Joint	GitHub
PAINet	ICCV 2023	Few-shot	NTU RGB+D 120, Kinetics-skeleton	Joint + Bone	-
PCM³	ACM MM 2023	Self-supervised	NTU RGB+D, NTU RGB+D 120, PKU-MMD I	Joint + Bone + Motion	GitHub
Stream-GCN	IJCAI 2023	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	-
SkeletonGCL	arXiv 2023	Self-supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	GitHub
DSCNet	ESWA 2024	Supervised + Multimodal	NTU RGB+D, NTU RGB+D 120, PKU-MMD I, UAV-Human, IKEA ASM, Northwestern-UCLA	RGB + Joint + Bone	-
Skeleton-OOD	Neurocomputing 2024	Supervised	NTU RGB+D, NTU RGB+D 120, Kinetics-400	Joint	GitHub
ViA	IJCV 2024	Self-supervised	Posetics, NTU RGB+D, NTU RGB+D 120, Toyota Smarthome, UAV-Human, Penn Action	Joint + Motion	GitHub
DeGCN	TIP 2024	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	GitHub
Js-SaPR-GCN	TCSVT 2024	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone + Motion	-
BlockGCN	CVPR 2024	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone + Motion	GitHub
JEANIE	IJCV 2024	Supervised/Unsupervised few-shot	NTU RGB+D, NTU RGB+D 120, Kinetics-skeleton, MSRAction3D, UWA3D Multiview Activity	Skeleton sequences	-
SA-DVAE	arXiv 2024	Zero-shot	NTU RGB+D, NTU RGB+D 120, PKU-MMD I	Joint	GitHub
ProtoGCN	arXiv 2024	Self-supervised + Prototype	NTU RGB+D, NTU RGB+D 120, Kinetics-skeleton, FineGym	Joint	GitHub
HSIC-based	arXiv 2024	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA	Joint + Bone	-
USDRL	AAAI 2025	Self-supervised	NTU RGB+D, NTU RGB+D 120, PKU-MMD I, PKU-MMD II	Joint + Bone + Motion	GitHub

Depth-based Methods

Table 8

Model	Venue	Learning	Dataset	Modality	Code
HON4D	CVPR 2013	Supervised	MSRAction3D, MSRDailyActivity3D, MSRActionPairs3D	Depth	-
HOPC	ECCV 2014	Supervised	MSRAction3D, MSRActionPairs3D, UWA3D Multiview Activity	Depth + Point cloud	-
Wang et al.	Trans. Human-Mach. Syst. 2016	Supervised	MSRAction3D, MSRDailyActivity3D, UTKinect-Action3D	Depth	-
Rahmani et al.	CVPR 2016	Supervised	Northwestern-UCLA, UWA3D Multiview Activity II	Depth	-
S²DDI	ICCVW 2017	Supervised	MSRAction3D, G3D-Gaming, MSRDailyActivity3D, SYSU 3D HOI, UTD-MHAD	Depth	-
Wang et al.	TMM 2018	Supervised	NTU RGB+D	Depth	-
MVDI	Inf. Sci. 2018	Supervised	NTU RGB+D, Northwestern-UCLA, UWA3D Multiview Activity II	Depth	GitHub
3DFCNN	Multimed. Tools Appl. 2020	Supervised	NTU RGB+D, Northwestern-UCLA, UWA3D Multiview Activity II	Depth	-
Liu et al.	ICASSP 2017	Supervised	MSRAction3D, DHA	Depth	-
Dhiman et al.	TIP 2020	Supervised	NTU RGB+D, UWA3D Multiview Activity II, Northwestern-UCLA	RGB + Depth	-
Stateful ConvLSTM	arXiv 2020	Supervised	NTU RGB+D	Depth	-
DEAR	arXiv 2024	Supervised	Something-Something V2	RGB + Depth	GitHub

Infrared-based Methods

Table 9

Model	Venue	Learning	Dataset	Modality	Code
Gao et al.	Neurocomputing 2016	Supervised	InfAR	Infrared + Optical flow	-
Jiang et al.	CVPRW 2017	Supervised	InfAR	Infrared + Optical flow	-
Kawashima et al.	AVSS 2017	Supervised	Custom Dataset	Infrared	-
Shah et al.	SPIE 2018	Supervised	Custom IR Dataset	Infrared	-
TSTDDs	SPL 2018	Supervised	InfAR, NTU RGB+D	Infrared + Optical flow	-
Akula et al.	CSR 2018	Supervised	Custom IR Dataset	Infrared	-
Imran et al.	Infrared Phys. Technol. 2019	Supervised	InfAR, IITR-IAR	Infrared + Optical flow	-
Meglouli et al.	CEAI 2019	Supervised	InfAR	Infrared + Optical flow	-
Mehta et al.	ICPR 2020	Adversarial	TSF	Infrared + Optical flow	GitHub

Point Cloud Methods

Table 10

Model	Venue	Learning	Dataset	Modality	Code
MeteorNet	ICCV 2019	Supervised	MSRAction3D	Point cloud	GitHub
PointLSTM	CVPR 2020	Supervised	MSRAction3D	Point cloud	GitHub
3DV-PointNet++	CVPR 2020	Supervised	NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA, UWA3D Multiview Activity II	Depth	GitHub
ASTA3DConv	Trans. Instrum. Meas. 2020	Supervised	MSRAction3D	Point cloud	-
Wang et al.	WACV 2021	Self-supervised	NTU RGB+D, NTU-PCL, MSRAction3D	Point cloud	-
P4Transformer	CVPR 2021	Supervised	MSRAction3D, NTU RGB+D, NTU RGB+D 120	Point cloud	GitHub
PSTNet	ICLR 2021	Supervised	MSRAction3D, NTU RGB+D, NTU RGB+D 120	Point cloud	GitHub
PST²	WACV 2022	Supervised	MSRAction3D	Point cloud	-
MaST-Pre	ICCV 2023	Self-supervised	MSRAction3D, NTU RGB+D	Point cloud	GitHub
PointCPSC	ICCV 2023	Self-supervised	MSRAction3D, NTU RGB+D	Point cloud	-
3DInAction	CVPR 2024	Supervised	MSRAction3D	Point cloud	GitHub
KAN-HyperpointNet	arXiv 2024	Supervised	NTU RGB+D, MSRAction3D	Point cloud	-

Text/Audio Methods

Table 11

Model	Venue	Learning	Dataset	Modality	Code
CPD	arXiv 2020	Self-supervised	Kinetics-400, HMDB51, UCF101	RGB + Text	GitHub
G-Blend	CVPR 2020	Multi-task	Kinetics-400, Mini-Sports, Epic-Kitchens	RGB + Optical flow + Audio	-
MIL-NCE	CVPR 2020	Self-supervised	HowTo100M, HMDB51, UCF101	RGB + Text	GitHub
MMV	NeurIPS 2020	Self-supervised	UCF101, HMDB51, Kinetics-600	RGB + Audio + Text	GitHub
VIMPAC	arXiv 2021	Self-supervised	Something-Something V2, Diving48, UCF101, HMDB51	RGB + Text	GitHub
InternVideo	CVPR 2023	Self-supervised	Kinetics-400, Kinetics-600, Kinetics-700, Something-Something V1, V2, ActivityNet, HACS, HMDB51	RGB + Text	GitHub
Side4Video	arXiv 2023	Self-supervised	Something-Something V1, Something-Something V2, Kinetics-400	RGB + Text	GitHub
EZ-CLIP	arXiv 2024	Zero-shot	Kinetics-400, HMDB51, UCF101, Something-Something V2	RGB + Text	GitHub
SATA	arXiv 2024	Zero-shot	UCF101, HMDB51	RGB + Text	GitHub
TC-CLIP	ECCV 2024	Zero-shot/Few-shot/Fully-supervised	HMDB51, UCF101, Kinetics-400, Something-Something V2	RGB + Text	-
InternVideo2	arXiv 2024	Self-supervised + Multimodal	Kinetics-400, Kinetics-600, Kinetics-700, MiT, Something-Something V2, ActivityNet, HACS, Charades, HMDB51	RGB + Audio + Text	GitHub
OmniViD	CVPR 2024	Supervised	Kinetics-400, Something-Something V2, UCF101, HMDB51	RGB + Text	GitHub
LoCATe-GAT	TETCI 2024	Zero-shot	UCF101, HMDB51, ActivityNet, Kinetics-400	RGB + Text	GitHub
STDD	arXiv 2024	Zero-shot	Kinetics-600, UCF101, HMDB51	RGB + Text	GitHub

💻 Datasets Used in The Journey of Action Recognition

Table 12

Datasets	Year	# Classes	# Subjects	# Views	# Video clips	Sensor	Modalities	Dataset type
KTH	2004	6	25	1	2,391	Static camera	RGB	Human actions (e.g., walking, jogging)
Weizmann	2005	10	9	1	90	-	RGB	Human actions (e.g., jumping, running)
IXMAS	2006	11	10	5	330	-	RGB	Human actions (e.g., checking watch, kicking)
Hollywood	2008	8	-	-	1,422	-	RGB	Movie Scenes (e.g., eating, driving)
Hollywood2	2009	12	-	-	1,709	-	RGB	Movie Scenes (e.g., running, kissing)
ADL	2009	10	5	-	150	Static camera	RGB	Daily Activities (e.g., brushing teeth, reading)
Olympic Sports	2010	16	-	-	783	-	RGB	Sports (e.g., high jumping, diving)
MSRAction3D	2010	20	10	1	567	Kinect v1	Depth+3DJoints	Daily Activities (e.g., drinking, walking)
CAD-60	2011	14	4	-	68	Kinect v1	RGB+Depth+3DJoints	Human performing activities (e.g., cleaning objects)
HMDB51	2011	51	-	-	6,766	-	RGB	Human actions (e.g., jumping, running)
MSRDailyActivity3D	2012	16	10	1	320	Kinect v1	RGB+Depth+3DJoints	Daily Activities (e.g., calling, playing game)
UCF101	2012	101	-	-	13,320	-	RGB	Body motion, Human-object interactions, sports etc.
UTKinect-Action3D	2012	10	10	1	199	Kinect v1	RGB+Depth+3DJoints	Human actions (e.g., waving hands, pushing)
MPII Cooking	2012	64	12	1	3,748	-	RGB	Cooking
G3D-Gaming	2012	20	10	1	-	Kinect v1	RGB+Depth+3DJoints	Gaming scenario (e.g., defending, climbing)
Berkeley MHAD	2013	11	12	4	660	Multi-baseline stereo cameras	RGB+Depth+3DJoints+Accelerometer+Audio	Human actions (e.g., throwing, clapping hands)
CAD-120	2013	10	4	-	120	Kinect v1	RGB+Depth+3DJoints	Human performing activities (e.g., picking objects)
UCF50	2013	50	-	-	6,676	-	RGB	Body motion, Human-object interactions, sports etc.
Florence3D-Action	2013	9	10	1	215	Kinect v1	RGB+Depth+3DJoints	Human actions (e.g., bowing, drinking)
MSRActionPairs3D	2013	12	10	1	360	Kinect v1	RGB+Depth+3DJoints	Human actions (e.g., picking up, putting down)
Sports-1M	2014	487	-	-	1,000,000	-	RGB	Sports (e.g., swimming, skiing)
THUMOS14	2014	101	-	-	5,613	-	RGB	Human Actions (e.g., making up, archery)
Northwestern-UCLA	2014	10	10	3	1,494	Kinect v1	RGB+Depth+3DJoints	Human actions (e.g., dropping trash)
UWA3D Multiview Activity	2014	30	10	1	701	Kinect v1	RGB+Depth+3DJoints	Daily Activities (e.g., holding head, walking)
ActivityNet	2015	203	-	-	27,801	-	RGB	Human actions (e.g., drawing, washing)
MPII Cooking 2	2015	67	30	1	273	Static camera	RGB	Cooking
UWA3D Multiview Activity II	2015	30	9	4	1,070	Kinect v1	RGB+Depth+3DJoints	Daily Activities (e.g., waving head, jumping)
SYSU 3D HOI	2015	12	40	-	480	Kinect v1	RGB+Depth+3DJoints	Human-Object Interactions (e.g., sweeping the floor)
NTU RGB+D	2016	60	40	80	56,880	Kinect v2	RGB+Depth+3DJoints	Daily actions, health-related actions etc.
InfAR	2016	12	40	-	600	Infrared camera	Infrared	Human actions (e.g., jogging)
TSF	2016	2	-	1	44	FLIR ONE	Infrared	Falls and normal activities
Charades	2016	157	-	-	66,500	-	RGB+Flow	Indoor activities (e.g., cleaning)
PKU-MMD I	2017	51	66	3	1,076	Kinect v2	RGB+Depth+Infrared+3DJoints	Human actions (e.g., walking)
NfS	2017	-	-	-	100	240 FPS camera	RGB	Visual object tracking
Kinetics-400	2017	400	-	-	306,245	-	RGB	Human-centered actions (e.g., playing instruments)
Something-Something V1	2017	174	-	-	108,499	-	RGB	Human performing actions with everyday objects
Kinetics-skeleton	2017	400	-	-	260,232	-	2DJoints	Human-centered actions
HACS	2017	200	-	-	1,500,000	-	RGB+Flow	Human actions (e.g., dancing)
Charades-Ego	2018	157	112	2	68,536	Head-mounted+standard camera	RGB	Egocentric indoor activities
AVA	2018	80	-	-	211,000	-	RGB+Flow	Human actions (e.g., talking, sitting)
Diving48	2018	48	-	-	18,404	-	RGB+Flow	Diving actions
Epic-Kitchens	2018	149	32	-	39,594	-	RGB+Flow	Cooking
Something-Something V2	2018	174	-	-	220,847	-	RGB	Human performing actions with everyday objects
MiT	2018	339	-	-	1,000,000+	-	RGB+Audio+Flow	Dynamic actions (e.g., human, animals)
Kinetics-600	2018	600	-	-	495,547	-	RGB	Human-centered actions (e.g., playing instruments)
NTU RGB+D 120	2019	120	106	155	114,480	Kinect v2	RGB+Depth+3DJoints+Infrared	Daily actions, health-related actions etc.
IITR-IAR	2019	21	35	-	1,470	FLIR T1020	Infrared	Human actions (e.g., hugging, fighting)
Kinetics-700	2019	700	-	-	650,317	-	RGB	Human-centered actions (e.g., playing instruments)
HowTo100M	2019	23,611	-	-	136,000,000	-	RGB	Instructional videos (e.g., cooking)
CATER	2019	301	-	-	5,500	-	RGB	Compositional actions and temporal reasoning
FineGym	2020	530	-	-	32,697	-	RGB	Gymnasium videos (e.g., balance beam)
PKU-MMD II	2020	41	13	3	1,009	Kinect v2	RGB+Depth+Infrared+3DJoints	Human actions (e.g., standing)
Epic-Kitchens-100	2020	4,053	37	-	89,977	GoPro Hero7 Black	RGB+Flow	Cooking
UAV-Human	2021	155	119	-	22,476	UAV Camera	RGB+3DJoints	Human Actions (e.g., walking, jogging)

❤️‍🔥❤️‍🔥❤️‍🔥 Contribution

We warmly invite everyone to contribute to this repository and help enhance its quality and scope. Feel free to submit pull requests to add new methods, datasets or other useful resources, as well as to correct any errors you discover. To ensure consistency, please format your pull requests using our tables' structures. We greatly appreciate your valuable contributions and support!
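For contributors preparing a pull request, the following is a minimal sketch of how a new method entry could be checked and rendered before it is pasted into one of the method tables. The column order (Model, Venue, Learning, Dataset, Modality, Code) comes from the tables above. The `MethodEntry` class, its field names, and the pipe-delimited Markdown row layout are illustrative assumptions, not part of this repository.

```python
from dataclasses import dataclass
from typing import List, Optional

# Column order taken from the method tables in this README.
COLUMNS = ("Model", "Venue", "Learning", "Dataset", "Modality", "Code")


@dataclass
class MethodEntry:
    """Hypothetical helper describing one row of a method table."""

    model: str
    venue: str                      # e.g. "AAAI 2017"
    learning: str                   # e.g. "Supervised"
    datasets: List[str]
    modalities: List[str]
    code_url: Optional[str] = None  # None is rendered as "-", as in the tables

    def to_row(self) -> str:
        """Render the entry as one pipe-delimited Markdown row (assumed layout)."""
        code = f"[GitHub]({self.code_url})" if self.code_url else "-"
        cells = [
            self.model,
            self.venue,
            self.learning,
            ", ".join(self.datasets),      # datasets are comma-separated
            " + ".join(self.modalities),   # modalities are joined with " + "
            code,
        ]
        for name, cell in zip(COLUMNS, cells):
            if not cell.strip():
                raise ValueError(f"column {name!r} must not be empty")
            if "|" in cell:
                raise ValueError(f"column {name!r} must not contain '|'")
        return "| " + " | ".join(cells) + " |"


if __name__ == "__main__":
    # Reproduces an existing row from the skeleton-based table.
    entry = MethodEntry(
        model="STA-LSTM",
        venue="AAAI 2017",
        learning="Supervised",
        datasets=["NTU RGB+D"],
        modalities=["Joint"],
    )
    print(entry.to_row())
```

Rendering the row from structured fields keeps separators consistent (`, ` between datasets, ` + ` between modalities) and catches empty cells before review.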
