open-mmlab / mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Repository Overview (README excerpt)
🥳 🚀 What's New

**The default branch has been switched, and we encourage users to migrate to the latest version, which brings more supported models, stronger pre-training checkpoints and simpler coding. Please refer to the Migration Guide for more details.**

**Release (2023.10.12)**: v1.2.0 with the following new features:
• Support the VindLU multi-modality algorithm and the training of ActionCLIP
• Support the lightweight model MobileOne TSN/TSM
• Support the video retrieval dataset MSVD
• Support SlowOnly K700 features to train localization models
• Support Video and Audio Demos

📖 Introduction

MMAction2 is an open-source toolbox for video understanding based on PyTorch. It is a part of the OpenMMLab project.

[Figure: Action Recognition on Kinetics-400 (left) and Skeleton-based Action Recognition on NTU-RGB+D-120 (right)]
[Figure: Skeleton-based Spatio-Temporal Action Detection and Action Recognition Results on Kinetics-400]
[Figure: Spatio-Temporal Action Detection Results on AVA-2.1]

🎁 Major Features

• **Modular design**: We decompose a video understanding framework into different components. One can easily construct a customized video understanding framework by combining different modules.
• **Support five major video understanding tasks**: MMAction2 implements various algorithms for multiple video understanding tasks, including action recognition, action localization, spatio-temporal action detection, skeleton-based action detection and video retrieval.
• **Well tested and documented**: We provide detailed documentation and API reference, as well as unit tests.
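The modular design described above can be illustrated with a minimal sketch. It is not MMAction2's or MMEngine's actual API: the `Registry`, `ToyBackbone`, `ToyHead` and `ToyRecognizer` names are hypothetical stand-ins, chosen only to show how components registered by name might be combined from a plain config dictionary.

```python
from typing import Any, Callable, Dict


class Registry:
    """Toy name-to-class registry (hypothetical; not MMEngine's Registry)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, Callable[..., Any]] = {}

    def register(self, cls):
        # Store the class under its own name so configs can refer to it.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: Dict[str, Any]) -> Any:
        # Copy the config, pop the "type" key, pass the rest as kwargs.
        args = dict(cfg)
        cls = self._modules[args.pop("type")]
        return cls(**args)


BACKBONES = Registry("backbone")
HEADS = Registry("head")


@BACKBONES.register
class ToyBackbone:
    def __init__(self, depth: int) -> None:
        self.depth = depth

    def __call__(self, clip):
        # Stand-in for feature extraction: the mean of the frame values.
        return sum(clip) / len(clip)


@HEADS.register
class ToyHead:
    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes

    def __call__(self, feature):
        # Stand-in for classification: map the feature to a class index.
        return int(feature) % self.num_classes


class ToyRecognizer:
    """Combines one backbone and one head, both built from config dicts."""

    def __init__(self, backbone: Dict[str, Any], head: Dict[str, Any]) -> None:
        self.backbone = BACKBONES.build(backbone)
        self.head = HEADS.build(head)

    def predict(self, clip):
        return self.head(self.backbone(clip))


# Swapping a component only requires changing its "type" entry.
model_cfg = dict(
    backbone=dict(type="ToyBackbone", depth=50),
    head=dict(type="ToyHead", num_classes=400),
)
recognizer = ToyRecognizer(**model_cfg)
prediction = recognizer.predict([401.0, 403.0, 405.0])
```

In this sketch a different backbone or head could be swapped in by registering another class and editing the config, without touching `ToyRecognizer`. The real toolbox's config and registry mechanisms are described in its own documentation.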
🛠️ Installation

MMAction2 depends on PyTorch, MMCV, MMEngine, MMDetection (optional) and MMPose (optional). Please refer to install.md for detailed instructions.

Quick instructions

👀 Model Zoo

Results and models are available in the model zoo.

Supported model

Action Recognition
• C3D (CVPR'2014)
• TSN (ECCV'2016)
• I3D (CVPR'2017)
• C2D (CVPR'2018)
• I3D Non-Local (CVPR'2018)
• R(2+1)D (CVPR'2018)
• TRN (ECCV'2018)
• TSM (ICCV'2019)
• TSM Non-Local (ICCV'2019)
• SlowOnly (ICCV'2019)
• SlowFast (ICCV'2019)
• CSN (ICCV'2019)
• TIN (AAAI'2020)
• TPN (CVPR'2020)
• X3D (CVPR'2020)
• MultiModality: Audio (ArXiv'2020)
• TANet (ArXiv'2020)
• TimeSformer (ICML'2021)
• ActionCLIP (ArXiv'2021)
• VideoSwin (CVPR'2022)
• VideoMAE (NeurIPS'2022)
• MViT V2 (CVPR'2022)
• UniFormer V1 (ICLR'2022)
• UniFormer V2 (ArXiv'2022)
• VideoMAE V2 (CVPR'2023)

Action Localization
• BSN (ECCV'2018)
• BMN (ICCV'2019)
• TCANet (CVPR'2021)

Spatio-Temporal Action Detection
• ACRN (ECCV'2018)
• SlowOnly+Fast R-CNN (ICCV'2019)
• SlowFast+Fast R-CNN (ICCV'2019)
• LFB (CVPR'2019)
• VideoMAE (NeurIPS'2022)

Skeleton-based Action Recognition
• ST-GCN (AAAI'2018)
• 2s-AGCN (CVPR'2019)
• PoseC3D (CVPR'2022)
• STGCN++ (ArXiv'2022)
• CTRGCN (CVPR'2021)
• MSG3D (CVPR'2020)

Video Retrieval
• CLIP4Clip (ArXiv'2022)

Supported dataset

Action Recognition
• HMDB51 (ICCV'2011)
• UCF101 (CRCV-IR-12-01)
• ActivityNet (CVPR'2015)
• Kinetics-[400/600/700] (CVPR'2017)
• SthV1 (ICCV'2017)
• SthV2 (ICCV'2017)
• Diving48 (ECCV'2018)
• Jester (ICCV'2019)
• Moments in Time (TPAMI'2019)
• Multi-Moments in Time (ArXiv'2019)
• HVU (ECCV'2020)
• OmniSource (ECCV'2020)
• FineGYM (CVPR'2020)
• Kinetics-710 (ArXiv'2022)

Action Localization
• THUMOS14 (THUMOS Challenge 2014)
• ActivityNet (CVPR'2015)
• HACS (ICCV'2019)

Spatio-Temporal Action Detection
• UCF101-24* (CRCV-IR-12-01)
• JHMDB* (ICCV'2015)
• AVA (CVPR'2018)
• AVA-Kinetics (ArXiv'2020)
• MultiSports (ICCV'2021)

Skeleton-based Action Recognition
• PoseC3D-FineGYM (ArXiv'2021)
• PoseC3D-NTURGB+D (ArXiv'2021)

<a href="https://github.com/open-mmlab/mmaction2/b _...truncated for preview_