Package for action recognition with multimodal masked autoencoders. It enables behavioral analysis from video, audio, depth-camera recordings, and other modalities.
The work was developed for, and road-tested in, two recent articles from the lab:
- MammalAlps (CVPR highlight 2025)
- EPFL smart kitchen (NeurIPS 2025).
The package will be released in early December (in time for NeurIPS). Stay tuned!