Process
Vladimir Mandic edited this page Feb 14, 2023
Dataset preprocessing is a critical and complex step required for any successful training.
This repository includes an end-to-end processing solution in the cli/modules/process.py script.
Note that processing can produce zero, one, or multiple images for each input, as it performs these optional steps:
- Extract image frames from video files
- Extract faces from images
- Extract body from images
- Keep original image as well as extracted faces and bodies
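To make the "zero or multiple images per input" point concrete, here is a minimal sketch of how the extraction switches could combine. The function `count_outputs` and its arguments are hypothetical and not part of process.py. It assumes one output per detected face or body and ignores the quality filters described further down, so treat it as an illustration rather than the script's actual logic.

```python
def count_outputs(params, faces_found, bodies_found):
    """Return how many images one input would yield in this simplified model."""
    count = 0
    if params.get('keep_original', False):
        count += 1  # the original image is kept alongside any extractions
    if params.get('extract_face', False):
        count += faces_found  # assumption: one output per detected face
    if params.get('extract_body', False):
        count += bodies_found  # assumption: one output per detected body
    return count


# Only the original is kept when both extraction steps are disabled.
keep_only = {'keep_original': True, 'extract_face': False, 'extract_body': False}
print(count_outputs(keep_only, faces_found=2, bodies_found=1))

# With the original dropped and no face detected, the input yields nothing.
faces_only = {'keep_original': False, 'extract_face': True, 'extract_body': False}
print(count_outputs(faces_only, faces_found=0, bodies_found=1))
```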
Additionally, for each processed image it can:
- For extracted faces:
  - Run upscaling to improve resolution
  - Run face restoration to improve quality
- Remove background from images containing extracted faces
- Verify that all extracted images are of sufficient quality:
  - Meet minimum resolution
  - Meet face visibility requirement
  - Meet framing requirement (e.g. head is not cut off)
- Run similarity checks to remove near-duplicates
- Run brightness dynamic range checks to remove images with low contrast
- Run blur detection to remove blurry images
- Create captions and tags from images using multiple interrogate models
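The dynamic range and near-duplicate checks in the list above can be sketched in plain Python. This is not how process.py implements them: the average-hash comparison, the function names, and the nested-list grayscale images are assumptions chosen to keep the example self-contained. The default thresholds only mirror the style of the script's range and similarity settings.

```python
def dynamic_range(pixels):
    """Spread between darkest and brightest pixel, scaled to 0..1 for 8-bit values."""
    flat = [p for row in pixels for p in row]
    return (max(flat) - min(flat)) / 255


def average_hash(pixels):
    """One bit per pixel: 1 where the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def similarity(a, b):
    """Fraction of matching hash bits between two equally sized images."""
    ha, hb = average_hash(a), average_hash(b)
    return sum(x == y for x, y in zip(ha, hb)) / len(ha)


def filter_images(images, min_range=0.15, max_similarity=0.8):
    """Drop low-contrast images, then drop near-duplicates of images already kept."""
    kept = []
    for img in images:
        if dynamic_range(img) < min_range:
            continue  # too little contrast
        if any(similarity(img, k) > max_similarity for k in kept):
            continue  # too close to an image that was already accepted
        kept.append(img)
    return kept


flat_gray = [[100, 110], [105, 108]]   # low contrast
stripes = [[0, 255], [0, 255]]
stripes_copy = [[10, 250], [5, 245]]   # near-duplicate of stripes
inverted = [[255, 0], [255, 0]]
result = filter_images([flat_gray, stripes, stripes_copy, inverted])
print(len(result))
```

Comparing images at a reduced size keeps the check cheap, which is presumably the reason the script exposes a reduced-size setting for similarity detection.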
Processing can be run manually using the cli/modules/process.py script or as part of any of the included training solutions.
Currently, to adjust processing parameters you need to edit the params definition in the cli/modules/process.py script:
params = Map({
# general settings
'clear_dst': True, # remove all files from destination at the start
'format': '.jpg', # image format
'target_size': 512, # target resolution
'square_images': True, # should output images be squared
'segmentation_model': 0, # segmentation model 0/general 1/landscape
'segmentation_background': (192, 192, 192), # segmentation background color
'blur_samplesize': 60, # sample size to use for blur detection
'similarity_size': 64, # base similarity detection on reduced images
# original image processing settings
'keep_original': True, # keep original image
# face processing settings
'extract_face': False, # extract face from image
'face_score': 0.7, # min face detection score
'face_pad': 0.2, # pad face image percentage
'face_model': 1, # which face model to use 0/close-up 1/standard
'face_blur_score': 1.5, # max score for face blur detection
'face_range_score': 0.15, # min score for face dynamic range detection
'face_restore': True, # attempt to restore face quality
'face_upscale': True, # attempt to scale small faces
'face_segmentation': False, # segmentation enabled
# body processing settings
'extract_body': False, # extract body from image
'body_score': 0.9, # min body detection score
'body_visibility': 0.5, # min visibility score for each detected body part
'body_parts': 15, # min number of detected body parts with sufficient visibility
'body_pad': 0.2, # pad body image percentage
'body_model': 2, # body model to use 0/low 1/medium 2/high
'body_blur_score': 1.8, # max score for body blur detection
'body_range_score': 0.15, # min score for body dynamic range detection
'body_segmentation': False, # segmentation enabled
# similarity detection settings
'similarity_score': 0.8, # maximum similarity score before image is discarded
# interrogate settings
'interrogate_model': ['clip', 'deepdanbooru'], # interrogate models
'interrogate_captions': True, # write captions to file
'tag_limit': 5, # number of tags to extract
})
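The `Map` type used above is defined elsewhere in the repository and is not shown on this page. As a rough illustration of how such a settings container typically behaves, the sketch below uses a hypothetical stand-in: a dict subclass that also allows attribute-style access. The repository's actual class may differ.

```python
class Map(dict):
    """Hypothetical stand-in: a dict whose keys can also be read and set as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


# A small subset of the settings listed above.
params = Map({'target_size': 512, 'extract_face': False, 'face_score': 0.7})

params.extract_face = True            # attribute-style write
params.update({'target_size': 768})   # ordinary dict methods still work

print(params.target_size, params['extract_face'], params.face_score)
```

With a container like this, enabling face extraction or changing the target resolution comes down to changing one value in the definition.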