Optimize OTX Data Augmentations pipeline

### 📄 Description

**Current Problem:**

1. **Mixed augmentation pipelines**: OTX currently combines `torchvision` augmentations with OpenMMLab-style augmentations (some of them re-implemented in the repo). Some later stages of the pipeline operate on NumPy arrays, while `torchvision` transforms expect PyTorch tensors. Images are therefore converted back and forth between tensors and NumPy arrays, which introduces a **performance bottleneck** that slows down training.

2. **Inconsistent interfaces**: Parameter names, formats, and APIs differ across the various augmentation implementations, making the codebase harder to maintain and extend.  

3. **Redundant self-implemented augmentations**: Several augmentations are re-implemented locally in OTX, even though robust third-party solutions exist and are actively maintained.  

---

**Proposed Solution:**

Evaluate **Kornia** as the primary augmentation library for OTX. Kornia is a differentiable computer vision library built on top of PyTorch, with an extensive set of augmentation operators.  

Key benefits of Kornia:  
- 🟢 **PyTorch-first design**: All augmentations operate directly on PyTorch tensors, removing unnecessary conversions from/to NumPy and improving training efficiency.  
- 🟢 **Rich functionality**: Provides a wide range of augmentations, including geometric, color, and intensity transformations, as well as advanced ones (e.g., motion blur, random perspective, cutout).  
- 🟢 **Differentiable & GPU-accelerated**: Augmentations are differentiable, making them suitable for integration with modern deep learning pipelines, and can be executed efficiently on GPUs.  
- 🟢 **Unified API & consistency**: Standardized function signatures simplify maintainability and reduce code complexity.  
- 🟢 **Extended support**: Handles not only images but also masks, bounding boxes, and keypoints, aligning with the requirements of detection and segmentation tasks.  

---

**Benchmarking Plan (Kornia vs `torchvision.transforms.v2`):**  
- Compare **throughput (images/sec)** during training with augmentation-heavy pipelines.  
- Measure **GPU utilization and memory footprint** when using Kornia vs `torchvision.transforms.v2`.  
- Evaluate **coverage of transformations** (ensure Kornia includes all augmentations currently used in OTX).  
- Validate **annotation consistency** (masks, bounding boxes, keypoints remain synchronized after augmentation).  
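
The throughput and annotation-consistency checks in the plan above could be prototyped with a small, library-agnostic harness. The following is a minimal sketch, not OTX code: `measure_throughput`, `mask_to_box`, `annotations_consistent`, and `hflip_sample` are hypothetical helper names, masks are plain nested lists, and boxes are assumed to use inclusive pixel indices. In a real run, `transform` would be the Kornia or `torchvision.transforms.v2` pipeline under test, and `sync` would be something like `torch.cuda.synchronize` so that asynchronous GPU work is included in the timing.

```python
import time
from typing import Any, Callable, Iterable, List, Optional, Sequence, Tuple

# Assumption: boxes are (x_min, y_min, x_max, y_max) with inclusive pixel indices.
Box = Tuple[int, int, int, int]
Mask = Sequence[Sequence[int]]


def measure_throughput(
    transform: Callable[[Any], Any],
    batches: Iterable[Any],
    batch_size: int,
    warmup: int = 2,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return images/sec for `transform` applied to every batch."""
    batch_list = list(batches)
    # Warm-up iterations are excluded from the measurement.
    for batch in batch_list[:warmup]:
        transform(batch)
    if sync is not None:
        sync()  # wait for pending asynchronous work before starting the clock
    start = time.perf_counter()
    for batch in batch_list:
        transform(batch)
    if sync is not None:
        sync()  # make sure all timed work has actually finished
    elapsed = max(time.perf_counter() - start, 1e-12)
    return len(batch_list) * batch_size / elapsed


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Tight bounding box around the non-zero pixels of a binary mask."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def annotations_consistent(mask: Mask, box: Box) -> bool:
    """True if the box still tightly encloses the mask after augmentation."""
    return mask_to_box(mask) == tuple(box)


def hflip_sample(mask: Mask, box: Box) -> Tuple[List[List[int]], Box]:
    """Toy horizontal flip standing in for a real augmentation under test."""
    width = len(mask[0])
    flipped_mask = [list(reversed(row)) for row in mask]
    x_min, y_min, x_max, y_max = box
    return flipped_mask, (width - 1 - x_max, y_min, width - 1 - x_min, y_max)
```

Running the same batches through both candidate pipelines with `measure_throughput` gives comparable images/sec figures. `annotations_consistent` flags cases where a transform moved the mask but not the box, or the reverse.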

---

### 🎯 Objective

- Unify the OTX augmentation pipeline under a **PyTorch-first** approach.  
- Improve **training performance** by reducing tensor/NumPy conversions.  
- Increase **maintainability and clarity** of augmentation code by eliminating redundant implementations.  
- Provide **benchmark results** to decide between Kornia and `torchvision.transforms.v2`.  

Optimize OTX Data Augmentations pipeline #5020

📄 Description

🎯 Objective

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Optimize OTX Data Augmentations pipeline #5020

Description

📄 Description

🎯 Objective

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions