Commit 9f3c50e

Merge remote-tracking branch 'origin/develop' into vs/kp_clean
2 parents c244084 + 3249377 commit 9f3c50e

190 files changed

+28280
-1086
lines changed

CHANGELOG.md

Lines changed: 22 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22

33
All notable changes to this project will be documented in this file.
44

5-
## \[unreleased\]
5+
## \[2.2.0\]
66

77
### New features
88

@@ -45,15 +45,31 @@ All notable changes to this project will be documented in this file.
4545
(<https://github.com/openvinotoolkit/training_extensions/pull/3769>)
4646
- Refactoring `ConvModule` by removing `conv_cfg`, `norm_cfg`, and `act_cfg`
4747
(<https://github.com/openvinotoolkit/training_extensions/pull/3783>, <https://github.com/openvinotoolkit/training_extensions/pull/3816>, <https://github.com/openvinotoolkit/training_extensions/pull/3809>)
48+
- Support ImageFromBytes
49+
(<https://github.com/openvinotoolkit/training_extensions/pull/3948>)
50+
- Enable model export
51+
(<https://github.com/openvinotoolkit/training_extensions/pull/3952>)
52+
- Move templates from OTX1.X to OTX2.X
53+
(<https://github.com/openvinotoolkit/training_extensions/pull/3951>)
54+
- Include Geti arrow dataset subset names
55+
(<https://github.com/openvinotoolkit/training_extensions/pull/3962>)
56+
- Include full image with anno in case there's no tile in tile dataset
57+
(<https://github.com/openvinotoolkit/training_extensions/pull/3964>)
58+
- Add type checker in converter for callable functions (optimizer, scheduler)
59+
(<https://github.com/openvinotoolkit/training_extensions/pull/3968>)
4860

4961
### Bug fixes
5062

5163
- Fix Combined Dataloader & unlabeled warmup loss in Semi-SL
52-
(https://github.com/openvinotoolkit/training_extensions/pull/3723)
64+
(<https://github.com/openvinotoolkit/training_extensions/pull/3723>)
5365
- Revert #3579 to fix issues with replacing coco_instance with a different format in some datasets
54-
(https://github.com/openvinotoolkit/training_extensions/pull/3753)
66+
(<https://github.com/openvinotoolkit/training_extensions/pull/3753>)
5567
- Add num_devices in Engine for multi-gpu training
56-
(https://github.com/openvinotoolkit/training_extensions/pull/3778)
68+
(<https://github.com/openvinotoolkit/training_extensions/pull/3778>)
69+
- Add missing tile recipes and various tile recipe changes
70+
(<https://github.com/openvinotoolkit/training_extensions/pull/3942>)
71+
- Change categories mapping logic
72+
(<https://github.com/openvinotoolkit/training_extensions/pull/3946>)
5773

5874
## \[v2.1.0\]
5975

@@ -191,6 +207,8 @@ All notable changes to this project will be documented in this file.
191207
(<https://github.com/openvinotoolkit/training_extensions/pull/3684>)
192208
- Fix MaskRCNN SwinT NNCF Accuracy Drop
193209
(<https://github.com/openvinotoolkit/training_extensions/pull/3685>)
210+
- Fix MaskRCNN SwinT NNCF Accuracy Drop By Adding More PTQ Configs
211+
(<https://github.com/openvinotoolkit/training_extensions/pull/3929>)
194212

195213
### Known issues
196214

README.md

Lines changed: 29 additions & 68 deletions
Original file line numberDiff line numberDiff line change
@@ -166,83 +166,44 @@ In addition to the examples above, please refer to the documentation for tutoria
166166

167167
---
168168

169-
## Updates
170-
171-
### v2.1.0 (3Q24)
172-
173-
> _**NOTES**_
174-
>
175-
> OpenVINO™ Training Extensions, version 2.1.0 does not include the latest functional and security updates. OpenVINO™ Training Extensions, version 2.2.0 is targeted to be released in September 2024 and will include additional functional and security updates. Customers should update to the latest version as it becomes available.
169+
## Updates - v2.2.0 (3Q24)
176170

177171
### New features
178172

179-
- Add a flag to enable OV inference on dGPU
180-
- Add early stopping with warmup. Remove mandatory background label in semantic segmentation task
181-
- RTMDet-tiny enablement for detection task
182-
- Add data_format validation and update in OTXDataModule
183-
- Add torchvision.MaskRCNN
184-
- Add Semi-SL for Multi-class Classification (EfficientNet-B0)
185-
- Decoupling mmaction for action classification (MoviNet, X3D)
186-
- Add Semi-SL Algorithms for mv3-large, effnet-v2, deit-tiny, dino-v2
187-
- RTMDet-tiny enablement for detection task (export/optimize)
188-
- Enable ruff & ruff-format into otx/algo/classification/backbones
189-
- Add TV MaskRCNN Tile Recipe
190-
- Add rotated det OV recipe
173+
- Add RT-DETR model for Object Detection
174+
- Add Multi-Label & H-label Classification with torchvision models
175+
- Add Hugging-Face Model Wrapper for Classification
176+
- Add LoRA finetuning capability for ViT Architectures
177+
- Add Hugging-Face Model Wrapper for Object Detection
178+
- Add Hugging-Face Model Wrapper for Semantic Segmentation
179+
- Enable torch.compile to work with classification
180+
- Add `otx benchmark` subcommand
181+
- Add RTMPose for Keypoint Detection Task
182+
- Add Semi-SL MeanTeacher algorithm for Semantic Segmentation
183+
- Update head and h-label format for hierarchical label classification
184+
- Support configurable input size
191185

192186
### Enhancements
193187

194-
- Change load_stat_dict to on_load_checkpoint
195-
- Add try - except to keep running the remaining tests
196-
- Update instance_segmentation.py to resolve conflict with 2.0.0
197-
- Update XPU install
198-
- Sync rgb order between torch and ov inference of action classification task
199-
- Make Perf test available to load pervious Perf test to skip training stage
200-
- Reenable e2e classification XAI tests
201-
- Remove action detection task support
202-
- Increase readability of pickling error log during HPO & fix minor bug
203-
- Update RTMDet checkpoint url
204-
- Refactor Torchvision Model for Classification Semi-SL
205-
- Add coverage omit mm-related code
206-
- Add docs semi-sl part
207-
- Refactor docs design & Add contents
208-
- Add execution example of auto batch size in docs
209-
- Add Semi-SL for cls Benchmark Test
210-
- Move value to device before logging for metric
211-
- Add .codecov.yaml
212-
- Update benchmark tool for otx2.1
213-
- Collect pretrained weight binary files in one place
214-
- Minimize compiled dependency files
215-
- Update README & CODEOWNERS
216-
- Update Engine's docstring & CLI --help outputs
217-
- Align integration test to exportable code interface update for release branch
218-
- Refactor exporter for anomaly task and fix a bug with exportable code
219-
- Update pandas version constraint
220-
- Include more models to export test into test_otx_e2e
221-
- Move assigning tasks to Models from Engine to Anomaly Model Classes
222-
- Refactoring detection modules
188+
- Reimplement of ViT Architecture following TIMM
189+
- Enable to override data configurations
190+
- Enable to use input_size at transforms in recipe
191+
- Enable to use polygon and bitmap mask as prompt inputs for zero-shot learning
192+
- Refactoring `ConvModule` by removing `conv_cfg`, `norm_cfg`, and `act_cfg`
193+
- Support ImageFromBytes
194+
- Enable model export
195+
- Move templates from OTX1.X to OTX2.X
196+
- Include Geti arrow dataset subset names
197+
- Include full image with anno in case there's no tile in tile dataset
198+
- Add type checker in converter for callable functions (optimizer, scheduler)
223199

224200
### Bug fixes
225201

226-
- Fix conflicts between develop and 2.0.0
227-
- Fix polygon mask
228-
- Fix vpm intg test error
229-
- Fix anomaly
230-
- Bug fix in Semantic Segmentation + enable DINOV2 export in ONNX
231-
- Fix some export issues. Remove EXPORTABLE_CODE as export parameter.
232-
- Fix `load_from_checkpoint` to apply original model's hparams
233-
- Fix `load_from_checkpoint` args to apply original model's hparams
234-
- Fix zero-shot `learn` for ov model
235-
- Various fixes for XAI in 2.1
236-
- Fix tests to work in a mm-free environment
237-
- Fix a bug in benchmark code
238-
- Update exportable code dependency & fix a bug
239-
- Fix getting wrong shape during resizing
240-
- Fix detection prediction outputs
241-
- Fix RTMDet PTQ performance
242-
- Fix segmentation fault on VPM PTQ
243-
- Fix NNCF MaskRCNN-Eff accuracy drop
244-
- Fix optimize with Semi-SL data pipeline
245-
- Fix MaskRCNN SwinT NNCF Accuracy Drop
202+
- Fix Combined Dataloader & unlabeled warmup loss in Semi-SL
203+
- Revert #3579 to fix issues with replacing coco_instance with a different format in some datasets
204+
- Add num_devices in Engine for multi-gpu training
205+
- Add missing tile recipes and various tile recipe changes
206+
- Change categories mapping logic
246207

247208
### Known issues
248209

docker/build.sh

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,9 @@
11
#!/bin/bash
2-
# shellcheck disable=SC2154
2+
# shellcheck disable=SC2154,SC2035,SC2046
33

4-
OTX_VERSION=$(python -c 'import otx; print(otx.__version__)')
4+
if [ "$OTX_VERSION" == "" ]; then
5+
OTX_VERSION=$(python -c 'import otx; print(otx.__version__)')
6+
fi
57
THIS_DIR=$(dirname "$0")
68

79
echo "Build OTX ${OTX_VERSION} CUDA Docker image..."
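The guard above lets a caller pin the image version through the environment, and only queries the installed package when `OTX_VERSION` is unset or empty. The same precedence can be sketched in Python. Note that `resolve_otx_version` and its `fallback` argument are hypothetical names used for illustration; they are not part of OTX or of `build.sh`.

```python
import os
from typing import Callable, Mapping


def resolve_otx_version(
    fallback: Callable[[], str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return OTX_VERSION from the environment, else call the fallback."""
    version = env.get("OTX_VERSION", "")
    # An empty string is treated like an unset variable, as in the shell test.
    if version == "":
        return fallback()
    return version
```

Passing the fallback as a callable means the package is only imported when no override is supplied, which is the point of the shell change.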

docker/download_pretrained_weights.py

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -32,10 +32,6 @@ def download_all() -> None:
3232
msg = f"Skip {config_path} since it is not a PyTorch config."
3333
logger.warning(msg)
3434
continue
35-
if "anomaly_" in str(config_path) or "dino_v2" in str(config_path) or "h_label_cls" in str(config_path):
36-
msg = f"Skip {config_path} since those models show errors on instantiation."
37-
logger.warning(msg)
38-
continue
3935

4036
config = OmegaConf.load(config_path)
4137
init_model = next(iter(partial_instantiate_class(config.model)))
Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
Configurable Input Size
2+
=======================
3+
4+
The Configurable Input Size feature allows users to adjust the input resolution of their deep learning models
5+
to balance between training and inference speed and model performance.
6+
This flexibility enables users to tailor the input size to their specific needs without manually altering
7+
the data pipeline configurations.
8+
9+
To utilize this feature, simply specify the desired input size as an argument during the train command.
10+
Additionally, OTX ensures compatibility with models trained on non-default input sizes by automatically adjusting
11+
the data pipeline to match the input size during other engine entry points.
12+
13+
Usage example:
14+
15+
.. code-block::
16+
17+
$ otx train \
18+
--config ... \
19+
20+
.. tab-set::
21+
22+
.. tab-item:: API 1
23+
24+
.. code-block:: python
25+
26+
from otx.algo.detection.yolox import YOLOXS
27+
from otx.core.data.module import OTXDataModule
28+
from otx.engine import Engine
29+
30+
input_size = (512, 512)
31+
model = YOLOXS(label_info=5, input_size=input_size) # should be tuple[int, int]
32+
datamodule = OTXDataModule(..., input_size=input_size)
33+
engine = Engine(model=model, datamodule=datamodule)
34+
engine.train()
35+
36+
.. tab-item:: API 2
37+
38+
.. code-block:: python
39+
40+
from otx.core.data.module import OTXDataModule
41+
from otx.engine import Engine
42+
43+
datamodule = OTXDataModule(..., input_size=(512, 512))
44+
engine = Engine(model="yolox_s", datamodule=datamodule) # model input size will be aligned with the datamodule input size
45+
engine.train()
46+
47+
.. tab-item:: CLI
48+
49+
.. code-block:: bash
50+
51+
(otx) ...$ otx train ... --data.input_size 512
52+
53+
.. _adaptive-input-size:
54+
55+
Adaptive Input Size
56+
-------------------
57+
58+
The Adaptive Input Size feature intelligently determines an optimal input size for the model
59+
by analyzing the dataset's statistics.
60+
It operates in two distinct modes: "auto" and "downscale".
61+
In "auto" mode, the input size may increase or decrease based on the dataset's characteristics.
62+
In "downscale" mode, the input size will either decrease or remain unchanged, ensuring that the model training or inference speed doesn't drop.
63+
64+
65+
To activate this feature, use the following command with the desired mode:
66+
67+
.. tab-set::
68+
69+
.. tab-item:: API
70+
71+
.. code-block:: python
72+
73+
from otx.algo.detection.yolox import YOLOXS
74+
from otx.core.data.module import OTXDataModule
75+
from otx.engine import Engine
76+
77+
datamodule = OTXDataModule(
78+
...
79+
adaptive_input_size="auto", # auto or downscale
80+
input_size_multiplier=YOLOXS.input_size_multiplier, # should set the input_size_multiplier of the model
81+
)
82+
model = YOLOXS(label_info=5, input_size=datamodule.input_size)
83+
engine = Engine(model=model, datamodule=datamodule)
84+
engine.train()
85+
86+
.. tab-item:: CLI
87+
88+
.. code-block:: bash
89+
90+
(otx) ...$ otx train ... --data.adaptive_input_size auto  # or: downscale
91+
92+
The adaptive process includes the following steps:
93+
94+
1. OTX computes robust statistics from the input dataset.
95+
96+
2. The initial input size is set based on the typical large image size within the dataset.
97+
98+
3. (Optional) The input size may be further refined to account for the sizes of objects present in the dataset.
99+
The model's minimum recognizable object size, typically ranging from 16x16 to 32x32 pixels, serves as a reference to
100+
proportionally adjust the input size relative to the average small object size observed in the dataset.
101+
For instance, if objects are generally 64x64 pixels in a 512x512 image and the minimum recognizable size is 32x32, the input size would be adjusted
102+
to 256x256 to maintain detectability.
103+
104+
Adjustments are subject to the following constraints:
105+
106+
* If the recalculated input size exceeds the maximum image size determined in the previous step, it will be capped at that maximum size.
107+
* If the recalculated input size falls below the minimum threshold defined by MIN_DETECTION_INPUT_SIZE, the input size will be scaled up. This is done by increasing the smaller dimension (width or height) to MIN_DETECTION_INPUT_SIZE while maintaining the aspect ratio, ensuring that the model's minimum criteria for object detection are met.
108+
109+
4. (downscale only) Any scale-up beyond the default model input size is restricted.
110+
111+
112+
.. Note::
113+
Opting for a smaller input size can be advantageous for datasets with lower-resolution images or larger objects,
114+
as it may improve speed with minimal impact on model accuracy. However, it is important to consider that selecting
115+
a smaller input size could affect model performance depending on the task, model architecture, and dataset
116+
properties.

docs/source/guide/explanation/additional_features/hpo.rst

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -143,10 +143,16 @@ Here is explanation of all HPO configuration.
143143

144144
- **mode** (*str*, *default='max'*) - Optimization mode for the metric. It determines whether the metric should be maximized or minimized. The possible values are 'max' and 'min', respectively.
145145

146-
- **num_workers** (*int*, *default=1*) How many trials will be executed in parallel.
146+
- **num_trials** (*int*, *default=None*) The number of training trials to perform during HPO. If not provided, the number of trials will be determined based on the expected time ratio. Defaults to None.
147+
148+
- **num_workers** (*int*, *default=None*) The number of trials that will be run concurrently.
147149

148150
- **expected_time_ratio** (*int*, *default=4*) The time budget for HPO, expressed as a multiple of the time a single training run takes.
149151

152+
- **metric_name** (*str*, *default=None*) The name of the performance metric to be optimized during HPO. If not specified, the metric will be selected based on the configured callbacks. Defaults to None.
153+
154+
- **adapt_bs_search_space_max_val** (*Literal["None", "Safe", "Full"]*, *default="None"*) Whether to execute `Auto-adapt batch size` prior to HPO. This step finds the maximum batch size value, which then serves as the upper limit for the batch size search space during HPO. For further information on `Auto-adapt batch size`, please refer to the `Auto-configuration` documentation. Defaults to "None".
155+
150156
- **maximum_resource** (*int*, *default=None*) - Maximum number of training epochs for each trial. When the number of training epochs reaches this value, the trial stops training.
151157

152158
- **minimum_resource** (*int*, *default=None*) - Minimum number of training epochs for each trial. Each trial will run for at least this many epochs, even if the performance of the model is not improving.
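The options listed above, with their documented defaults, can be collected in one place as a reading aid. The sketch below is a hypothetical `HpoOptions` dataclass, not OTX's own configuration class; the field names and defaults are taken from the list, while the validation rules (in particular the check that the minimum does not exceed the maximum) are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HpoOptions:
    """Hypothetical container mirroring the documented HPO options."""

    mode: str = "max"
    num_trials: Optional[int] = None  # None: derived from expected_time_ratio
    num_workers: Optional[int] = None
    expected_time_ratio: int = 4
    metric_name: Optional[str] = None  # None: chosen from configured callbacks
    adapt_bs_search_space_max_val: str = "None"
    maximum_resource: Optional[int] = None
    minimum_resource: Optional[int] = None

    def __post_init__(self) -> None:
        if self.mode not in ("max", "min"):
            raise ValueError(f"mode must be 'max' or 'min', got {self.mode!r}")
        if self.adapt_bs_search_space_max_val not in ("None", "Safe", "Full"):
            raise ValueError("adapt_bs_search_space_max_val must be 'None', 'Safe' or 'Full'")
        # Assumed sanity check: a trial cannot be required to run longer
        # than it is allowed to run.
        if (
            self.minimum_resource is not None
            and self.maximum_resource is not None
            and self.minimum_resource > self.maximum_resource
        ):
            raise ValueError("minimum_resource must not exceed maximum_resource")
```

For example, `HpoOptions(num_trials=8, maximum_resource=20)` fixes the number of trials and caps each one at 20 epochs, leaving every other option at its documented default.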

docs/source/guide/explanation/additional_features/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,3 +14,4 @@ Additional Features
1414
fast_data_loading
1515
tiling
1616
class_incremental_sampler
17+
configurable_input_size

docs/source/guide/explanation/algorithms/action/action_detection.rst

Lines changed: 0 additions & 47 deletions
This file was deleted.

docs/source/guide/explanation/algorithms/action/index.rst

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,4 +6,3 @@ Action Recognition
66

77

88
action_classification
9-
action_detection

docs/source/guide/get_started/cli_commands.rst

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -339,11 +339,11 @@ The results will be saved in ``./otx-workspace/`` folder by default. The output
339339
340340
(otx) ...$ otx train --model <model-class-path-or-name> --task <task-type> --data_root <dataset-root>
341341
342-
For example, if you want to use the ``otx.algo.detection.atss.ATSS`` model class, you can train it as shown below.
342+
For example, if you want to use the ``otx.algo.classification.torchvision_model.TVModelForMulticlassCls`` model class, you can train it as shown below.
343343
344344
.. code-block:: shell
345345
346-
(otx) ...$ otx train --model otx.algo.detection.atss.ATSS --model.variant mobilenetv2 --task DETECTION ...
346+
(otx) ...$ otx train --model otx.algo.classification.torchvision_model.TVModelForMulticlassCls --model.backbone mobilenet_v3_small ...
347347
348348
.. note::
349349
You also can visualize the training using ``Tensorboard`` as these logs are located in ``<work_dir>/tensorboard``.
