Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
- Updated Jan 31, 2025
- Python
kapture is a file format as well as a set of tools for manipulating datasets, and in particular Visual Localization and Structure from Motion data.
A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal"
Emotional Video to Audio Transformation with ANFIS-DeepRNN (Vanilla RNN and LSTM-DeepRNN) [MPE 2020]
Original VinVL visual backbone with simplified APIs to easily extract features, boxes, and object detections in a few lines of Python code.
Stitching and fusion of on-board surround-view BEV (bird's-eye view) real-world image sequences, with odometry estimation and output of a large pixel map
Recommends apparel based on text, visual features, and a weighted similarity that combines brand and color similarity.