Process

DataSet Preprocessing

Dataset preprocessing is a critical and complex step required for any successful training.
This repository includes an end-to-end processing solution in the cli/modules/process.py script.

Note that processing can produce zero, one, or multiple images for each input, depending on which of these optional steps are enabled:

- Extract image frames from video files
- Extract faces from images
- Extract bodies from images
- Keep the original image as well as the extracted faces and bodies
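
To make the "zero or multiple images per input" behavior concrete, here is a minimal sketch of how the optional steps fan out. It is not the code in cli/modules/process.py: the `Detection` class, the `plan_outputs` function, and the output naming are hypothetical and only illustrate the idea. The option names mirror `keep_original`, `extract_face`, and `extract_body` from the parameters listed further down.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector result for one input image."""
    faces: int = 0   # number of faces found in the image
    bodies: int = 0  # number of bodies found in the image


def plan_outputs(name, detection, keep_original=True,
                 extract_face=False, extract_body=False):
    """Return the output names a single input would produce (illustrative only)."""
    outputs = []
    if keep_original:
        outputs.append(f"{name}-original")
    if extract_face:
        outputs += [f"{name}-face-{i}" for i in range(detection.faces)]
    if extract_body:
        outputs += [f"{name}-body-{i}" for i in range(detection.bodies)]
    return outputs


# One input with two faces and one body, original kept: several outputs
many = plan_outputs("img", Detection(faces=2, bodies=1), True, True, True)
print(len(many), many)

# No face detected and original not kept: no outputs at all
none = plan_outputs("img", Detection(), keep_original=False, extract_face=True)
print(len(none), none)
```

When only extraction is enabled and nothing is detected, the input is effectively dropped from the dataset, which is why the number of output images can be lower than the number of inputs.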

Additionally, for each processed image it can:

- For extracted faces:
  - Run upscaling to improve resolution
  - Run face restoration to improve quality
  - Remove background from images containing extracted faces
- Verify that all extracted images are of sufficient quality:
  - Meet minimum resolution
  - Meet face visibility requirement
  - Meet framing requirement (e.g. reject images where the head is cut off)
  - Run similarity checks to remove near-duplicates
  - Run brightness dynamic range checks to remove images with low contrast
  - Run blur detection to remove blurry images
- Create captions and tags from images using multiple interrogate models
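
The quality checks above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: images are plain 2D lists of 8-bit grayscale values, the near-duplicate check uses a simple average hash (a swapped-in technique, chosen because it also compares reduced images), and all function names are hypothetical. The default thresholds `min_range=0.15` and `max_similarity=0.8` borrow the values of the range-score and `similarity_score` parameters listed further down; `min_size=16` is a toy value for the example.

```python
def dynamic_range(gray):
    """Spread between darkest and brightest pixel, scaled to 0..1."""
    flat = [p for row in gray for p in row]
    return (max(flat) - min(flat)) / 255.0


def average_hash(gray, size=8):
    """Sample a size x size grid from the image, then threshold at its mean."""
    h, w = len(gray), len(gray[0])
    small = [gray[(y * h) // size][(x * w) // size]
             for y in range(size) for x in range(size)]
    mean = sum(small) / len(small)
    return [p > mean for p in small]


def similarity(a, b):
    """Fraction of matching hash bits; 1.0 means identical hashes."""
    ha, hb = average_hash(a), average_hash(b)
    return sum(x == y for x, y in zip(ha, hb)) / len(ha)


def passes(gray, kept, min_size=16, min_range=0.15, max_similarity=0.8):
    """Apply resolution, dynamic range, and near-duplicate checks in order."""
    if len(gray) < min_size or len(gray[0]) < min_size:
        return False  # below minimum resolution
    if dynamic_range(gray) < min_range:
        return False  # too little contrast
    # discard if too similar to any image that was already kept
    return all(similarity(gray, other) <= max_similarity for other in kept)


horizontal = [[x * 16 for x in range(16)] for _ in range(16)]  # left-to-right gradient
vertical = [[y * 16 for _ in range(16)] for y in range(16)]    # top-to-bottom gradient
flat = [[128] * 16 for _ in range(16)]                         # no contrast at all

kept = []
for name, img in [("horizontal", horizontal), ("flat", flat),
                  ("copy", horizontal), ("vertical", vertical)]:
    ok = passes(img, kept)
    if ok:
        kept.append(img)
    print(name, "kept" if ok else "discarded")
```

Ordering the checks from cheapest to most expensive means the pairwise similarity comparison only runs for images that already passed the per-image checks.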

Processing can be run manually with the cli/modules/process.py script or as part of any of the included training solutions.

Currently, to adjust processing parameters you need to edit the params map in the cli/modules/process.py script:

params = Map({
    # general settings
    'clear_dst': True, # remove all files from destination at the start
    'format': '.jpg', # image format
    'target_size': 512, # target resolution
    'square_images': True, # should output images be squared
    'segmentation_model': 0, # segmentation model 0/general 1/landscape
    'segmentation_background': (192, 192, 192), # segmentation background color
    'blur_samplesize': 60, # sample size to use for blur detection
    'similarity_size': 64, # base similarity detection on reduced images
    # original image processing settings
    'keep_original': True, # keep original image
    # face processing settings
    'extract_face': False, # extract face from image
    'face_score': 0.7, # min face detection score
    'face_pad': 0.2, # pad face image percentage
    'face_model': 1, # which face model to use 0/close-up 1/standard
    'face_blur_score': 1.5, # max score for face blur detection
    'face_range_score': 0.15, # min score for face dynamic range detection
    'face_restore': True, # attempt to restore face quality
    'face_upscale': True, # attempt to scale small faces
    'face_segmentation': False, # segmentation enabled
    # body processing settings
    'extract_body': False, # extract body from image
    'body_score': 0.9, # min body detection score
    'body_visibility': 0.5, # min visibility score for each detected body part
    'body_parts': 15, # min number of detected body parts with sufficient visibility
    'body_pad': 0.2,  # pad body image percentage
    'body_model': 2, # body model to use 0/low 1/medium 2/high
    'body_blur_score': 1.8, # max score for body blur detection
    'body_range_score': 0.15, # min score for body dynamic range detection
    'body_segmentation': False, # segmentation enabled
    # similarity detection settings
    'similarity_score': 0.8, # maximum similarity score before image is discarded
    # interrogate settings
    'interrogate_model': ['clip', 'deepdanbooru'], # interrogate models
    'interrogate_captions': True, # write captions to file
    'tag_limit': 5, # number of tags to extract
})
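
If you would rather not edit the defaults in place, one possible pattern is to keep them untouched and build a modified copy. The sketch below assumes that `Map` behaves like a dict whose keys can also be read as attributes; the `Map` class shown here is a stand-in written for this example, not the repository's own class, and `with_overrides` is a hypothetical helper.

```python
class Map(dict):
    """Stand-in for the repository's Map: a dict whose keys are also attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as err:
            raise AttributeError(key) from err

    def __setattr__(self, key, value):
        self[key] = value


# A small subset of the defaults listed above
defaults = Map({
    'target_size': 512,
    'keep_original': True,
    'extract_face': False,
    'face_score': 0.7,
})


def with_overrides(base, **overrides):
    """Return a copy of base with overrides applied; reject unknown keys."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return Map({**base, **overrides})


params = with_overrides(defaults, extract_face=True, target_size=768)
print(params.extract_face, params.target_size, params.keep_original)
```

Rejecting unknown keys catches misspelled parameter names early instead of silently running with the default value.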
