Comprehensive face parsing model with 5 different backbones (ResNet18/34, EfficientNet B0/B1/B2). Features PyTorch training, ONNX export, and performance comparison tools. Optimized for real-time facial feature segmentation.

Mrkomiljon/face-parsing

BiSeNet: Bilateral Segmentation Network for Real-time Face Parsing

This is a face parsing model for high-precision facial feature segmentation, based on BiSeNet (Bilateral Segmentation Network for Real-time Semantic Segmentation). It segments facial components such as the eyes, nose, mouth, and the contour of the face from images.

[Figure: sample results. Columns: Input Images, ResNet34, ResNet18, EfficientNet B0, EfficientNet B1, EfficientNet B2.]

Project Description

The face parsing model segments facial features accurately, which makes it suitable for applications in digital makeup, augmented reality, facial recognition, and emotion detection. It processes an input image and outputs a per-pixel mask that separates individual facial components such as skin, hair, eyes, and other facial regions.

Supported Backbones:

  • ResNet18 - Fast and efficient (52.8MB, 407 FPS, ~11.7M params)
  • ResNet34 - Balanced performance (91.4MB, 295 FPS, ~21.8M params)
  • EfficientNet B0 - Lightweight and fast (26.6MB, 170 FPS, ~5.3M params)
  • EfficientNet B1 - Good accuracy/speed balance (36.3MB, 150 FPS, ~7.8M params)
  • EfficientNet B2 - Higher accuracy (41.5MB, 150 FPS, ~9.2M params)

Recent Updates:

  • [2025-07-23] Added comprehensive performance comparison tools
  • [2025-07-23] Fixed ONNX export compatibility issues
  • [2025-07-23] Added input size testing and analysis
  • [2025-07-23] Improved training with optimized parameters
  • [2025-07-23] Added EfficientNet B0, B1, B2 backbones
  • [2025-07-23] Enhanced download script for all models

Model Performance Comparison

Speed and Size Comparison (256x256 input)

| Model | PyTorch Speed | ONNX Speed | PyTorch Size | ONNX Size | Parameters | Best Use Case |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet18 | 2.45ms (407 FPS) | 10.58ms (94 FPS) | 52.8MB | 50.7MB | ~11.7M | Fastest inference |
| ResNet34 | 3.38ms (295 FPS) | 12.83ms (78 FPS) | 91.4MB | 89.3MB | ~21.8M | Balanced performance |
| EfficientNet B0 | 5.85ms (170 FPS) | 11.69ms (85 FPS) | 26.6MB | 26.3MB | ~5.3M | Smallest model |
| EfficientNet B1 | 6.65ms (150 FPS) | 13.6ms (73 FPS) | 36.3MB | 35.8MB | ~7.8M | Good accuracy/speed |
| EfficientNet B2 | 6.66ms (150 FPS) | 13.57ms (73 FPS) | 41.5MB | 41.0MB | ~9.2M | Higher accuracy |
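The FPS figures follow from the per-image latency (FPS = 1000 / latency in ms). The short sketch below recomputes them from the table and also reports how much slower ONNX was than PyTorch in these measurements. The dictionary and function names are illustrative and not part of the repository; values recomputed from the rounded latencies may differ from the table by about one FPS.

```python
# Latencies in milliseconds, copied from the table above (256x256 input).
LATENCY_MS = {
    "resnet18": {"pytorch": 2.45, "onnx": 10.58},
    "resnet34": {"pytorch": 3.38, "onnx": 12.83},
    "efficientnet_b0": {"pytorch": 5.85, "onnx": 11.69},
    "efficientnet_b1": {"pytorch": 6.65, "onnx": 13.6},
    "efficientnet_b2": {"pytorch": 6.66, "onnx": 13.57},
}


def fps_from_latency(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def onnx_slowdown(model: str) -> float:
    """Ratio of ONNX latency to PyTorch latency for one backbone."""
    entry = LATENCY_MS[model]
    return entry["onnx"] / entry["pytorch"]


if __name__ == "__main__":
    for name, entry in LATENCY_MS.items():
        print(
            f"{name}: PyTorch {fps_from_latency(entry['pytorch']):.0f} FPS, "
            f"ONNX {fps_from_latency(entry['onnx']):.0f} FPS, "
            f"ONNX/PyTorch latency ratio {onnx_slowdown(name):.2f}"
        )
```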

Input Size Impact Analysis

Key Findings:

  • Model size stays constant regardless of input size
  • 📊 Inference time grows with input area (width × height)
  • 💾 Memory usage increases with larger inputs
  • 🎯 256x256 provides the best balance of speed and accuracy

| Input Size | Speed Impact | Memory Impact | Recommendation |
| --- | --- | --- | --- |
| 128x128 | Fastest | Lowest | Maximum speed |
| 256x256 | Good | Moderate | Optimal balance |
| 512x512 | Slower | Higher | Higher accuracy |
| 1024x1024 | Slowest | Highest | Maximum accuracy |
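As a rough first-order estimate, the cost of a fully convolutional model grows with the number of input pixels. The sketch below scales a measured 256x256 latency by the area ratio. This is an assumption for back-of-the-envelope planning only: real timings depend on hardware and tend to deviate at small sizes, so measure with `input_size_test.py` before relying on the numbers. The function names are illustrative.

```python
def relative_cost(width: int, height: int, base=(256, 256)) -> float:
    """Input area relative to the base resolution (1.0 means same cost)."""
    return (width * height) / (base[0] * base[1])


def estimate_latency_ms(base_latency_ms: float, width: int, height: int,
                        base=(256, 256)) -> float:
    """Scale a latency measured at `base` resolution by the area ratio."""
    return base_latency_ms * relative_cost(width, height, base)


if __name__ == "__main__":
    # 2.45 ms is the ResNet18 PyTorch latency reported for 256x256.
    for size in (128, 256, 512, 1024):
        est = estimate_latency_ms(2.45, size, size)
        print(f"{size}x{size}: ~{est:.2f} ms (area-scaled estimate)")
```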

Model Rankings Summary

🏆 Speed Rankings (Fastest to Slowest):

  1. ResNet18 - 2.45ms (407 FPS) - Fastest inference
  2. ResNet34 - 3.38ms (295 FPS) - Fast inference
  3. EfficientNet B0 - 5.85ms (170 FPS) - Fast inference
  4. EfficientNet B1 - 6.65ms (150 FPS) - Medium speed
  5. EfficientNet B2 - 6.66ms (150 FPS) - Medium speed

📦 Size Rankings (Smallest to Largest):

  1. EfficientNet B0 - 26.6MB - Smallest model
  2. EfficientNet B1 - 36.3MB - Small model
  3. EfficientNet B2 - 41.5MB - Medium model
  4. ResNet18 - 52.8MB - Medium model
  5. ResNet34 - 91.4MB - Largest model

🎯 Accuracy Rankings (highest to lowest, qualitative):

  1. EfficientNet B2 - Highest accuracy
  2. EfficientNet B1 - Better accuracy
  3. ResNet34 - Better accuracy
  4. ResNet18 - Good accuracy
  5. EfficientNet B0 - Good accuracy
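The rankings can be combined into a simple selection rule, for example "fastest model that fits a size budget". The sketch below uses the PyTorch latency and model size figures reported above; the helper name and the selection rule itself are illustrative assumptions, not a tool shipped with the repository.

```python
# (name, PyTorch latency in ms, PyTorch model size in MB), from the tables above.
MODELS = [
    {"name": "resnet18", "latency_ms": 2.45, "size_mb": 52.8},
    {"name": "resnet34", "latency_ms": 3.38, "size_mb": 91.4},
    {"name": "efficientnet_b0", "latency_ms": 5.85, "size_mb": 26.6},
    {"name": "efficientnet_b1", "latency_ms": 6.65, "size_mb": 36.3},
    {"name": "efficientnet_b2", "latency_ms": 6.66, "size_mb": 41.5},
]


def fastest_under(max_size_mb: float):
    """Return the name of the lowest-latency model within the size budget,
    or None if no model fits."""
    candidates = [m for m in MODELS if m["size_mb"] <= max_size_mb]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m["latency_ms"])["name"]


if __name__ == "__main__":
    for budget in (30, 50, 100):
        print(f"budget {budget} MB -> {fastest_under(budget)}")
```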

Installation

To get started with the Face Parsing Model, clone this repository and install the required dependencies:

git clone https://github.com/Mrkomiljon/face-parsing.git
cd face-parsing
pip install -r requirements.txt

Quick Start

1. Download Pre-trained Models

# Download all models (ResNet18, ResNet34, EfficientNet B0/B1/B2)
bash download.sh

2. Run Inference

# PyTorch inference with ResNet18
python inference.py --model resnet18 --weight ./weights/resnet18.pt --input ./assets/images --output ./assets/results/resnet18

# ONNX inference with ResNet18
python onnx_inference.py --model ./weights/resnet18.onnx --input ./assets/images --output ./assets/results/onnx/resnet18

3. Performance Testing

# Compare PyTorch vs ONNX performance
python speed_comparison.py

# Test different input sizes
python input_size_test.py

Dataset Preparation

1. Download CelebAMask-HQ Dataset

# Download dataset
wget https://github.com/switchablenorms/CelebAMask-HQ/releases/download/v1.0/CelebAMask-HQ.zip

# Extract dataset
unzip CelebAMask-HQ.zip

# Organize dataset structure
mkdir -p data/CelebAMask-HQ
mv CelebAMask-HQ/* data/CelebAMask-HQ/

2. Prepare Labels

python utils/prepare_labels.py --data-root ./data/CelebAMask-HQ

Dataset Structure:

data/CelebAMask-HQ/
├── image/
│   ├── 0.jpg
│   ├── 1.jpg
│   └── ...
├── label/
│   ├── 0.png
│   ├── 1.png
│   └── ...
└── list/
    ├── train.txt
    ├── val.txt
    └── test.txt
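Before training, it can help to confirm that every image has a matching label. The sketch below assumes the layout shown above (`image/*.jpg` and `label/*.png` with matching file stems); the function name is hypothetical and this check is not part of the repository.

```python
from pathlib import Path


def find_unpaired(root):
    """Return (images without a label, labels without an image) as sorted
    lists of file stems, assuming image/<n>.jpg and label/<n>.png."""
    root = Path(root)
    images = {p.stem for p in (root / "image").glob("*.jpg")}
    labels = {p.stem for p in (root / "label").glob("*.png")}
    return sorted(images - labels), sorted(labels - images)


if __name__ == "__main__":
    missing_labels, missing_images = find_unpaired("./data/CelebAMask-HQ")
    print(f"images without labels: {len(missing_labels)}")
    print(f"labels without images: {len(missing_images)}")
```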

Training

Training Configuration

Default Parameters (Optimized):

  • Image Size: 256x256 (optimized for speed/accuracy balance)
  • Batch Size: 16 (adjust based on GPU memory)
  • Learning Rate: 0.01 (with warmup and polynomial decay)
  • Epochs: 150
  • Backbone: resnet18 (can be changed)

Training Commands

1. Basic Training (ResNet18)

python train.py --backbone resnet18 --image-size 256 256 --batch-size 16

2. Training with Different Backbones

# ResNet34
python train.py --backbone resnet34 --image-size 256 256 --batch-size 12

# EfficientNet B0
python train.py --backbone efficientnet_b0 --image-size 256 256 --batch-size 16

# EfficientNet B1
python train.py --backbone efficientnet_b1 --image-size 256 256 --batch-size 12

# EfficientNet B2
python train.py --backbone efficientnet_b2 --image-size 256 256 --batch-size 8

3. Advanced Training Options

python train.py \
    --backbone resnet18 \
    --image-size 256 256 \
    --batch-size 16 \
    --lr-start 0.01 \
    --epochs 150 \
    --momentum 0.9 \
    --weight-decay 0.0005 \
    --lr-warmup-epochs 5 \
    --print-freq 100

Training Arguments

| Parameter | Default | Description |
| --- | --- | --- |
| --backbone | resnet18 | Model backbone (resnet18, resnet34, efficientnet_b0, efficientnet_b1, efficientnet_b2) |
| --image-size | 256 256 | Input image size (width height) |
| --batch-size | 16 | Training batch size |
| --lr-start | 0.01 | Initial learning rate |
| --epochs | 150 | Number of training epochs |
| --momentum | 0.9 | SGD momentum |
| --weight-decay | 0.0005 | Weight decay |
| --lr-warmup-epochs | 5 | Learning rate warmup epochs |
| --data-root | ./data/CelebAMask-HQ | Dataset root directory |
| --resume | False | Resume from checkpoint |
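For reference, the argument table maps naturally onto an `argparse` parser. The sketch below mirrors the documented flags and defaults; it is an illustration of the interface, not a copy of the actual `train.py`, and treating `--resume` as a boolean switch is an assumption.

```python
import argparse

BACKBONES = ["resnet18", "resnet34", "efficientnet_b0",
             "efficientnet_b1", "efficientnet_b2"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags and defaults listed in the table."""
    p = argparse.ArgumentParser(description="Face parsing training (sketch)")
    p.add_argument("--backbone", default="resnet18", choices=BACKBONES)
    p.add_argument("--image-size", type=int, nargs=2, default=[256, 256],
                   metavar=("WIDTH", "HEIGHT"))
    p.add_argument("--batch-size", type=int, default=16)
    p.add_argument("--lr-start", type=float, default=0.01)
    p.add_argument("--epochs", type=int, default=150)
    p.add_argument("--momentum", type=float, default=0.9)
    p.add_argument("--weight-decay", type=float, default=0.0005)
    p.add_argument("--lr-warmup-epochs", type=int, default=5)
    p.add_argument("--data-root", default="./data/CelebAMask-HQ")
    # Assumption: --resume is a switch that is off unless given.
    p.add_argument("--resume", action="store_true")
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```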

Training Monitoring

Loss Function: OhemLossWrapper (combines multiple scale losses)

  • Primary Loss: OhemCELoss (CrossEntropy with Online Hard Example Mining)
  • Multi-scale Training: 3 scales (0.75x, 1.0x, 1.25x)
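The idea behind Online Hard Example Mining (OHEM) is to average the loss only over the hardest pixels. The pure-Python sketch below shows one common formulation: keep pixels whose cross-entropy exceeds the loss of a prediction made with probability `thresh`, but never fewer than `min_kept` pixels. The function name, `thresh=0.7`, and `min_kept` are illustrative assumptions; the repository's `OhemCELoss` operates on tensors and may use different values.

```python
import math


def ohem_mean_loss(pixel_losses, thresh=0.7, min_kept=2):
    """Average the loss over hard pixels only.

    pixel_losses: per-pixel cross-entropy values (non-empty sequence).
    thresh: probability below which a pixel counts as "hard".
    min_kept: minimum number of pixels that always contribute.
    """
    if not pixel_losses:
        raise ValueError("pixel_losses must not be empty")
    # A pixel predicted with probability `thresh` has loss -log(thresh).
    loss_thresh = -math.log(thresh)
    ranked = sorted(pixel_losses, reverse=True)
    hard = [value for value in ranked if value > loss_thresh]
    if len(hard) < min_kept:
        # Too few hard pixels: fall back to the top `min_kept` losses.
        hard = ranked[:min_kept]
    return sum(hard) / len(hard)


if __name__ == "__main__":
    print(ohem_mean_loss([2.0, 1.0, 0.1, 0.05]))
    print(ohem_mean_loss([0.1, 0.2, 0.05]))
```

Easy pixels (for example, large uniform skin or background regions) are dropped from the average, so the gradient concentrates on boundaries and small components.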

Learning Rate Schedule:

  • Warmup: Linear warmup for first 5 epochs
  • Decay: Polynomial decay with power=0.9
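The schedule described above can be sketched as a function of the epoch index. The sketch assumes the rate is updated once per epoch, that warmup ramps linearly up to `lr_start`, and that the polynomial decay runs over the remaining epochs; the actual training script may update per iteration or use a different warmup start value.

```python
def lr_at_epoch(epoch, lr_start=0.01, epochs=150, warmup_epochs=5, power=0.9):
    """Learning rate for a 0-based epoch index under linear warmup
    followed by polynomial decay."""
    if epoch < warmup_epochs:
        # Linear ramp that reaches lr_start at the last warmup epoch.
        return lr_start * (epoch + 1) / warmup_epochs
    decay_epochs = max(1, epochs - warmup_epochs)
    progress = (epoch - warmup_epochs) / decay_epochs
    return lr_start * (1.0 - progress) ** power


if __name__ == "__main__":
    for e in (0, 4, 5, 50, 100, 149):
        print(f"epoch {e:3d}: lr = {lr_at_epoch(e):.6f}")
```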

Checkpoints:

  • Models saved every 10 epochs
  • Best model saved based on validation accuracy
  • Final model saved as {backbone}.pt

ONNX Export

Export PyTorch Models to ONNX

1. Basic Export

python onnx_export.py --model resnet18 --weight ./weights/resnet18.pt

2. Export All Models

# ResNet18
python onnx_export.py --model resnet18 --weight ./weights/resnet18.pt

# ResNet34
python onnx_export.py --model resnet34 --weight ./weights/resnet34.pt

# EfficientNet B0
python onnx_export.py --model efficientnet_b0 --weight ./weights/efficientnet_b0.pt

# EfficientNet B1
python onnx_export.py --model efficientnet_b1 --weight ./weights/efficientnet_b1.pt

# EfficientNet B2
python onnx_export.py --model efficientnet_b2 --weight ./weights/efficientnet_b2.pt

ONNX Export Features

Optimizations Applied:

  • OpSet 19 compatibility (ONNX Runtime support)
  • Dynamic batch size support
  • ONNX-safe operations (adaptive pooling)
  • Constant folding enabled
  • Input/Output naming for easy integration

Export Parameters:

  • OpSet Version: 19 (compatible with ONNX Runtime)
  • Dynamic Axes: Batch size dimension
  • Input Names: ['input']
  • Output Names: ['output', 'output16', 'output32']
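The export parameters listed above correspond to keyword arguments of `torch.onnx.export`. The sketch below collects them in a helper and shows how they might be passed; the helper names and the dummy input shape are assumptions, and `onnx_export.py` in the repository may be organised differently. PyTorch is imported lazily so the helper itself runs without it.

```python
def onnx_export_kwargs(opset: int = 19) -> dict:
    """Keyword arguments matching the export parameters listed above."""
    names_in = ["input"]
    names_out = ["output", "output16", "output32"]
    return {
        "opset_version": opset,
        "do_constant_folding": True,
        "input_names": names_in,
        "output_names": names_out,
        # Mark dimension 0 (batch) as dynamic on every input and output.
        "dynamic_axes": {n: {0: "batch_size"} for n in names_in + names_out},
    }


def export(model, onnx_path: str, image_size=(256, 256)) -> None:
    """Export a PyTorch model to ONNX (requires PyTorch to be installed)."""
    import torch  # lazy import: only needed when actually exporting

    width, height = image_size
    dummy = torch.randn(1, 3, height, width)  # NCHW dummy input
    model.eval()
    torch.onnx.export(model, dummy, onnx_path, **onnx_export_kwargs())


if __name__ == "__main__":
    print(onnx_export_kwargs())
```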

Inference

PyTorch Inference

1. Basic Inference

python inference.py --model resnet18 --weight ./weights/resnet18.pt --input ./assets/images --output ./assets/results/resnet18

2. Inference with Different Models

# ResNet18
python inference.py --model resnet18 --weight ./weights/resnet18.pt --input ./assets/images --output ./assets/results/resnet18

# ResNet34
python inference.py --model resnet34 --weight ./weights/resnet34.pt --input ./assets/images --output ./assets/results/resnet34

# EfficientNet B0
python inference.py --model efficientnet_b0 --weight ./weights/efficientnet_b0.pt --input ./assets/images --output ./assets/results/efficientnet_b0

ONNX Inference

1. Basic ONNX Inference

python onnx_inference.py --model ./weights/resnet18.onnx --input ./assets/images --output ./assets/results/onnx/resnet18

2. ONNX Inference with Different Models

# ResNet18 ONNX
python onnx_inference.py --model ./weights/resnet18.onnx --input ./assets/images --output ./assets/results/onnx/resnet18

# ResNet34 ONNX
python onnx_inference.py --model ./weights/resnet34.onnx --input ./assets/images --output ./assets/results/onnx/resnet34

# EfficientNet B0 ONNX
python onnx_inference.py --model ./weights/efficientnet_b0.onnx --input ./assets/images --output ./assets/results/onnx/efficientnet_b0

Inference Arguments

| Parameter | Default | Description |
| --- | --- | --- |
| --model | - | Model name (PyTorch) or path (ONNX) |
| --weight | - | Path to PyTorch model weights |
| --input | ./assets/images | Input image or directory |
| --output | ./assets/results | Output directory |

Performance Testing

1. PyTorch vs ONNX Speed Comparison

python speed_comparison.py

Output:

  • Detailed speed comparison table
  • Average speedup statistics
  • Best performing models
  • Recommendations for deployment
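A latency measurement of this kind usually discards a few warmup runs and then averages over many timed runs. The sketch below shows that pattern for any callable; it is not the code of `speed_comparison.py`, and the function name and default run counts are assumptions. When timing a model on a GPU, the callable should also wait for the device to finish (for example with `torch.cuda.synchronize()`), otherwise only the kernel launch is timed.

```python
import time


def benchmark(fn, warmup: int = 10, runs: int = 100) -> dict:
    """Time a zero-argument callable and return mean latency and FPS."""
    for _ in range(warmup):
        fn()  # warmup runs are not timed (caches, lazy initialisation)
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    latency_ms = elapsed * 1000.0 / runs
    fps = runs / elapsed if elapsed > 0 else float("inf")
    return {"latency_ms": latency_ms, "fps": fps}


if __name__ == "__main__":
    # Stand-in workload; replace with a model forward pass.
    result = benchmark(lambda: sum(range(10_000)))
    print(f"{result['latency_ms']:.3f} ms per call, {result['fps']:.0f} calls/s")
```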

2. Input Size Impact Analysis

python input_size_test.py

Output:

  • Comprehensive input size comparison
  • Model size consistency verification
  • Speed scaling analysis
  • Memory usage patterns
  • Recommendations for different use cases

3. Model Size and Speed Analysis

python final_comparison.py

Output:

  • Model rankings by speed and size
  • Training status verification
  • Performance recommendations

Model Architecture

BiSeNet Components

  1. Context Path (Backbone)

    • Feature extraction from different scales
    • Attention Refinement Modules (ARM)
    • Multi-scale feature fusion
  2. Spatial Path

    • High-resolution feature preservation
    • Spatial detail enhancement
  3. Feature Fusion Module (FFM)

    • Combines context and spatial features
    • Attention mechanism for feature selection
  4. Output Heads

    • Multi-scale output for better accuracy
    • 19-class segmentation (facial components)
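The 19-class output becomes a label mask by taking, for each pixel, the class with the highest score. The pure-Python sketch below shows this on nested lists shaped `[class][row][col]`; the function name is illustrative. In practice the same step is a single call such as `logits.argmax(axis=0)` in NumPy or `output.argmax(dim=1)` on a batched PyTorch tensor.

```python
NUM_CLASSES = 19  # number of facial component classes, as stated above


def logits_to_labels(logits):
    """Convert per-class score maps [class][row][col] into a label map
    [row][col] holding the index of the highest-scoring class."""
    num_classes = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    return [
        [max(range(num_classes), key=lambda c: logits[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    # Tiny example with 3 classes and a 1x2 image.
    scores = [
        [[0.1, 0.9]],  # class 0
        [[0.8, 0.2]],  # class 1
        [[0.3, 0.3]],  # class 2
    ]
    print(logits_to_labels(scores))
```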

Supported Backbones

| Backbone | Parameters | Input Size | Output Channels |
| --- | --- | --- | --- |
| ResNet18 | ~11.7M | 256x256 | 128, 256, 512 |
| ResNet34 | ~21.8M | 256x256 | 128, 256, 512 |
| EfficientNet B0 | ~5.3M | 256x256 | 24, 40, 80 |
| EfficientNet B1 | ~7.8M | 256x256 | 24, 40, 80 |
| EfficientNet B2 | ~9.2M | 256x256 | 24, 48, 88 |

Troubleshooting

Common Issues

  1. CUDA Out of Memory

    # Reduce batch size
    python train.py --batch-size 8
    
    # Reduce image size
    python train.py --image-size 224 224
  2. ONNX Export Errors

    # Ensure PyTorch version compatibility
    pip install torch==2.0.1 torchvision==0.15.2
    
    # Use CPU for export if GPU issues
    export CUDA_VISIBLE_DEVICES=""
    python onnx_export.py --model resnet18 --weight ./weights/resnet18.pt
  3. Training Not Converging

    # Reduce learning rate
    python train.py --lr-start 0.001
    
    # Increase warmup epochs
    python train.py --lr-warmup-epochs 10

Performance Optimization

  1. For Speed:

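    • Use ResNet18 backbone
    • Use 256x256 input size
    • Use ONNX when deploying without a PyTorch runtime; in the measurements reported above, PyTorch inference was faster than ONNX for every backbone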
  2. For Accuracy:

    • Use EfficientNet B2 backbone
    • Use 512x512 input size
    • Train for more epochs
  3. For Memory Efficiency:

    • Use EfficientNet B0 backbone
    • Use smaller batch size
    • Use 224x224 input size

Contributing

Contributions to improve the Face Parsing Model are welcome. Feel free to fork the repository and submit pull requests, or open issues to suggest features or report bugs.

License

The project is licensed under the MIT License.

Citation

@misc{face-parsing,
  author = {Komiljon Mukhammadiev},
  title = {face-parsing},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/Mrkomiljon/face-parsing}},
  note = {GitHub repository}
}

Reference

The project is based on earlier face-parsing implementations.

The original model was created by zllrunning/face-parsing.PyTorch. It was then re-implemented and improved by yakhyo/face-parsing, and this repository further modifies and optimizes it for better performance and usability.
