Porting over qurator-spk/sbb_pixelwise_segmentation#25 #201

kba · 2025-10-16T18:24:13Z

Don't merge: this is just a first try at porting over a PR from a repository that has been merged into this one with `--allow-unrelated-histories`. It's not usable as such, but at least we have a proper diff.

Changed unsafe basename extraction: `file_name = i.split('.')[0]` to `file_name = os.path.splitext(i)[0]` and `filename = n[i].split('.')[0]` to `filename = os.path.splitext(n[i])[0]`, because the old code turned `Vat.sam.2_206.jpg` into `Vat` instead of `Vat.sam.2_206`.

Safely keep the full basename without the extension.
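The difference between the two extraction approaches can be seen in a short sketch. The helper names are hypothetical and exist only for illustration; the file name is the one quoted in the commit message.

```python
import os


def basename_by_split(name):
    """Old approach: keep only the text before the first dot."""
    return name.split('.')[0]


def basename_by_splitext(name):
    """New approach: strip only the final extension."""
    return os.path.splitext(name)[0]


name = "Vat.sam.2_206.jpg"
print(basename_by_split(name))     # Vat
print(basename_by_splitext(name))  # Vat.sam.2_206
```

Any file name that contains dots before its extension is affected in the same way, so two different images could previously collapse to the same basename.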

# Learning Rate Warmup and Optimization Implementation

## Overview

Added learning rate warmup functionality to improve training stability, especially when using pretrained weights. The implementation uses TensorFlow's native learning rate scheduling for better performance.

## Changes Made

### 1. Configuration Updates (`runs/train_no_patches_448x448.json`)

Added new configuration parameters for warmup:

```json
{
    "warmup_enabled": true,
    "warmup_epochs": 5,
    "warmup_start_lr": 1e-6
}
```

### 2. Training Script Updates (`train.py`)

#### A. Optimizer and Learning Rate Schedule

- Replaced fixed learning rate with dynamic scheduling
- Implemented warmup using `tf.keras.optimizers.schedules.PolynomialDecay`
- Maintained compatibility with existing ReduceLROnPlateau and EarlyStopping

```python
if warmup_enabled:
    lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
        initial_learning_rate=warmup_start_lr,
        decay_steps=warmup_epochs * steps_per_epoch,
        end_learning_rate=learning_rate,
        power=1.0  # Linear ramp from the start rate to the target rate
    )
    optimizer = Adam(learning_rate=lr_schedule)
else:
    optimizer = Adam(learning_rate=learning_rate)
```

#### B. Learning Rate Behavior

- Initial learning rate: 1e-6 (configurable via `warmup_start_lr`)
- Target learning rate: 5e-5 (configurable via `learning_rate`)
- Linear increase over 5 epochs (configurable via `warmup_epochs`)
- After warmup, learning rate remains at target value until ReduceLROnPlateau triggers

## Benefits

1. Improved training stability during initial epochs
2. Better handling of pretrained weights
3. Efficient implementation using TensorFlow's native scheduling
4. Configurable through JSON configuration file
5. Maintains compatibility with existing callbacks (ReduceLROnPlateau, EarlyStopping)

## Usage

To enable warmup:

1. Set `warmup_enabled: true` in the configuration file
2. Adjust `warmup_epochs` and `warmup_start_lr` as needed
3. The warmup will automatically integrate with existing learning rate reduction and early stopping

To disable warmup:

- Set `warmup_enabled: false` or remove the warmup parameters from the configuration file
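The warmup behavior described above can be sketched in plain Python. This is a minimal illustration of the linear ramp rather than the code in `train.py`: it assumes the usual polynomial-decay formula with `power=1.0` and no cycling, the function name `warmup_lr` is hypothetical, and `steps_per_epoch = 100` is a made-up value.

```python
def warmup_lr(step, warmup_start_lr, learning_rate, warmup_steps):
    """Learning rate at a given optimizer step, with linear warmup.

    The step is clamped to warmup_steps, so the rate stays at
    learning_rate once the warmup phase is over.
    """
    step = min(step, warmup_steps)
    remaining = 1.0 - step / warmup_steps
    return (warmup_start_lr - learning_rate) * remaining + learning_rate


steps_per_epoch = 100  # assumed value, for illustration only
warmup_epochs = 5
warmup_steps = warmup_epochs * steps_per_epoch

# Print the rate at the start of each epoch; it rises linearly during
# the first five epochs and is constant afterwards.
for epoch in range(7):
    lr = warmup_lr(epoch * steps_per_epoch, 1e-6, 5e-5, warmup_steps)
    print(f"epoch {epoch}: lr = {lr:.2e}")
```

Because the ramp is defined per optimizer step rather than per epoch, the rate changes smoothly within each epoch instead of jumping at epoch boundaries.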

# Training Script Improvements

## Learning Rate Management Fixes

### 1. ReduceLROnPlateau Implementation

- Fixed the learning rate reduction mechanism by replacing the manual epoch loop with a single `model.fit()` call
- This ensures proper tracking of validation metrics across epochs
- Configured with:

```python
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.2,       # More aggressive reduction
    patience=3,       # Quick response to plateaus
    min_lr=1e-6,      # Minimum learning rate
    min_delta=1e-5,   # Minimum change to be considered improvement
    verbose=1
)
```

### 2. Warmup Implementation

- Added learning rate warmup using TensorFlow's native scheduling
- Gradually increases learning rate from 1e-6 to target (2e-5) over 5 epochs
- Helps stabilize initial training phase
- Implemented using `PolynomialDecay` schedule:

```python
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=warmup_start_lr,
    decay_steps=warmup_epochs * steps_per_epoch,
    end_learning_rate=learning_rate,
    power=1.0  # Linear ramp from the start rate to the target rate
)
```

### 3. Early Stopping

- Added early stopping to prevent overfitting
- Configured with:

```python
early_stopping = EarlyStopping(
    monitor='val_loss',
    patience=6,
    restore_best_weights=True,
    verbose=1
)
```

## Model Saving Improvements

### 1. Epoch-based Model Saving

- Implemented custom `ModelCheckpointWithConfig` to save both model and config
- Saves after each epoch with corresponding config.json
- Maintains compatibility with original script's saving behavior

### 2. Best Model Saving

- Saves the best model at training end
- If early stopping triggers: saves the best model from training
- If no early stopping: saves the final model

## Configuration

All parameters are configurable through the JSON config file:

```json
{
    "reduce_lr_enabled": true,
    "reduce_lr_monitor": "val_loss",
    "reduce_lr_factor": 0.2,
    "reduce_lr_patience": 3,
    "reduce_lr_min_lr": 1e-6,
    "reduce_lr_min_delta": 1e-5,
    "early_stopping_enabled": true,
    "early_stopping_monitor": "val_loss",
    "early_stopping_patience": 6,
    "early_stopping_restore_best_weights": true,
    "warmup_enabled": true,
    "warmup_epochs": 5,
    "warmup_start_lr": 1e-6
}
```

## Benefits

1. More stable training with proper learning rate management
2. Better handling of training plateaus
3. Automatic saving of best model
4. Maintained compatibility with existing config saving
5. Improved training monitoring and control
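To make the effect of the `reduce_lr_*` settings concrete, here is a rough simulation of plateau-based reduction in plain Python. It is a simplified sketch that ignores cooldown and is not the Keras callback itself; the function name and the validation losses are invented for illustration, while the default parameter values are the ones from the configuration above.

```python
def simulate_reduce_on_plateau(val_losses, lr, factor=0.2, patience=3,
                               min_lr=1e-6, min_delta=1e-5):
    """Return the learning rate in effect after each epoch.

    If val_loss has not improved by more than min_delta for `patience`
    consecutive epochs, the rate is multiplied by `factor`, but it is
    never allowed to drop below min_lr.
    """
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best - min_delta:
            # Real improvement: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


# Hypothetical validation losses: one improvement, then a plateau.
losses = [0.50, 0.40, 0.40, 0.40, 0.40, 0.39]
print(simulate_reduce_on_plateau(losses, lr=2e-5))
```

With these inputs the rate is cut once, after the third epoch without improvement. With `early_stopping_patience` set to 6, early stopping would only trigger if the plateau continued for roughly another three epochs after that first reduction.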

Rezanezhad, Vahid and others added 30 commits December 5, 2019 12:01

code to produce models

136a767

add files needed for training

d4bc814

add files needed for training

038d776

Update config_params.json

cd1990d

Update README

a216dcc

Update README

036e2e9

Update README

f7a5a57

Update README

bbe6f99

Delete README

e4013fe

Add new file

f69d445

📝 howto: Be more verbose with the subtree pull

4897fd3

Update README

0cddfff

Update README.md

c5e1e2d

Merge commit 'c5e1e2dda7542c6d8a9787fa496b538ce8519794'

3ac99b4

Update main.py

bb212da

Add LICENSE

2e768e4

Merge pull request #2 from cneud/add-license-1

8bdb295

Add LICENSE

Update README.md

d2a8119

Update README.md

a9c86b2

Merge pull request #7 from qurator-spk/update-readme

5b4df66

Update README.md

Update README.md

7063789

Update README.md

63fcb96

first updates, padding, rotations

5fb7552

continue training, losses and etc

4bea9fd

Merge pull request #15 from vahidrezanezhad/master

75dc5f3

continue training, loss functions, rotation and ...

Update README.md

040d3cf

Update README.md

e698463

Update README.md

3ec551d

Update README.md

57f8827

Update README.md

9221b6c

vahidrezanezhad and others added 28 commits July 16, 2024 18:29

printspace_as_class_in_layout is integrated. Printspace can be define…

55f3cb9

…d as a class for layout segmentation

adding degrading and brightness augmentation to no patches case training

9521768

brightness augmentation modified

f2692cf

increasing margin in the case of pixelwise inference

c340fbb

erosion and dilation parameters are changed & separators are written …

30894dd

…in label images after artificial label

inference updated

5fbe941

erosion rate changed

59e5892

add documentation from wiki as markdown file to the codebase

b6bdf94

save only layout output. different from overlayed layout on image

f4bad09

update

85dd59f

augmentation function for red textlines, rgb background and scaling f…

7be326d

…or no patch case

updating augmentations

95bbdf8

scaling, channels shuffling, rgb background and red content added to …

f31219b

…no patch augmentation

using prepared binarized images in the case of augmentation

9904846

early dilation for textline artificial class

4f0e3ef

adding foreground rgb to augmentation

c502e67

fixing artificial class bug

5f456cf

new augmentations for patchwise training

cca4d17

Update inference.py to check if save_layout was passed as argument ot…

df4a47a

…herwise can give an cv2 error

Changed deprecated lr to learning_rate and model.fit_generator …

451188c

…to `model.fit`

Update utils.py

be57f13

Update utils.py

102b04c

Changed unsafe basename extraction: `file_name = i.split('.')[0]` to `file_name = os.path.splitext(i)[0]` and `filename = n[i].split('.')[0]` to `filename = os.path.splitext(n[i])[0]`, because the old code turned `Vat.sam.2_206.jpg` into `Vat` instead of `Vat.sam.2_206`.

Update gt_gen_utils.py

1bf8019

Safely keep the full basename without the extension.

move src/.../train.py to root to accommodate old PR

30fe51f

Merge remote-tracking branch 'pixelwise_local/ReduceLROnPlateau' into…

54132a4

… ReduceLROnPlateau # Conflicts: # LICENSE # README.md # requirements.txt # train.py

move train.py back

ad53ea3

kba changed the base branch from main to training-installation October 16, 2025 18:37

Base automatically changed from training-installation to integrate-training-from-sbb_pixelwise_segmentation October 16, 2025 18:39
