This is a comprehensive face parsing model for high-precision facial feature segmentation based on BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation. This model accurately segments various facial components such as the eyes, nose, mouth, and the contour of the face from images.
Sample results (figure): input images alongside the segmentation masks produced with the ResNet34, ResNet18, EfficientNet B0, EfficientNet B1, and EfficientNet B2 backbones.
- Project Description
- Model Performance Comparison
- Installation
- Quick Start
- Dataset Preparation
- Training
- ONNX Export
- Inference
- Performance Testing
- Contributing
- License
The face parsing model assigns a class label to every pixel of a face image, which makes it suitable for applications such as digital makeup, augmented reality, facial recognition, and emotion detection. It takes an input image and outputs a segmentation mask that separates skin, hair, eyes, and the other facial components.
- ResNet18 - Fast and efficient (52.8MB, 407 FPS, ~11.7M params)
- ResNet34 - Balanced performance (91.4MB, 295 FPS, ~21.8M params)
- EfficientNet B0 - Lightweight and fast (26.6MB, 170 FPS, ~5.3M params)
- EfficientNet B1 - Good accuracy/speed balance (36.3MB, 150 FPS, ~7.8M params)
- EfficientNet B2 - Higher accuracy (41.5MB, 150 FPS, ~9.2M params)
- [2025-07-23] Added comprehensive performance comparison tools
- [2025-07-23] Fixed ONNX export compatibility issues
- [2025-07-23] Added input size testing and analysis
- [2025-07-23] Improved training with optimized parameters
- [2025-07-23] Added EfficientNet B0, B1, B2 backbones
- [2025-07-23] Enhanced download script for all models
| Model | PyTorch Speed | ONNX Speed | PyTorch Size | ONNX Size | Parameters | Best Use Case |
|---|---|---|---|---|---|---|
| ResNet18 | 2.45ms (407 FPS) | 10.58ms (94 FPS) | 52.8MB | 50.7MB | ~11.7M | Fastest inference |
| ResNet34 | 3.38ms (295 FPS) | 12.83ms (78 FPS) | 91.4MB | 89.3MB | ~21.8M | Balanced performance |
| EfficientNet B0 | 5.85ms (170 FPS) | 11.69ms (85 FPS) | 26.6MB | 26.3MB | ~5.3M | Smallest model |
| EfficientNet B1 | 6.65ms (150 FPS) | 13.6ms (73 FPS) | 36.3MB | 35.8MB | ~7.8M | Good accuracy/speed |
| EfficientNet B2 | 6.66ms (150 FPS) | 13.57ms (73 FPS) | 41.5MB | 41.0MB | ~9.2M | Higher accuracy |
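The latency and FPS columns in the table are two views of the same measurement. As a minimal sketch (the function names here are illustrative and not part of the repository), the conversion and the PyTorch-to-ONNX ratio can be computed as follows. Small differences from the table are expected because the latencies shown there are rounded.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def latency_ratio(latency_a_ms: float, latency_b_ms: float) -> float:
    """Return how many times slower A is than B (values > 1 mean A is slower)."""
    return latency_a_ms / latency_b_ms


# ResNet18 figures taken from the table above.
resnet18_pytorch_ms = 2.45
resnet18_onnx_ms = 10.58

pytorch_fps = fps_from_latency_ms(resnet18_pytorch_ms)
onnx_vs_pytorch = latency_ratio(resnet18_onnx_ms, resnet18_pytorch_ms)
```

In this table the ONNX latency for ResNet18 is roughly four times the PyTorch latency, so it is worth benchmarking both runtimes on the target hardware before choosing one.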
Key Findings:
- ✅ Model size stays constant regardless of input size
- 📊 Speed scales with input area (width × height)
- 💾 Memory usage increases with larger inputs
- 🎯 256x256 provides optimal balance of speed and accuracy
| Input Size | Speed Impact | Memory Impact | Recommendation |
|---|---|---|---|
| 128x128 | Fastest | Lowest | Maximum speed |
| 256x256 | Good | Moderate | Optimal balance |
| 512x512 | Slower | Higher | Higher accuracy |
| 1024x1024 | Slowest | Highest | Maximum accuracy |
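To make the "speed scales with input area" finding concrete, the sketch below estimates the relative compute cost of each input size against the 256x256 default. It assumes cost is proportional to pixel count, which is only a first-order approximation; measured timings from `input_size_test.py` should take precedence. The function name is illustrative.

```python
def relative_cost(size: int, baseline: int = 256) -> float:
    """Estimate compute cost of a square input relative to the baseline size.

    Assumes cost grows linearly with the number of pixels (width x height).
    """
    return (size * size) / (baseline * baseline)


# Relative cost for the input sizes listed in the table above.
costs = {size: relative_cost(size) for size in (128, 256, 512, 1024)}
```

Under this assumption a 512x512 input costs about four times as much as 256x256, and 1024x1024 about sixteen times as much.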
🏆 Speed Rankings (Fastest to Slowest):
- ResNet18 - 2.45ms (407 FPS) - Fastest inference
- ResNet34 - 3.38ms (295 FPS) - Fast inference
- EfficientNet B0 - 5.85ms (170 FPS) - Fast inference
- EfficientNet B1 - 6.65ms (150 FPS) - Medium speed
- EfficientNet B2 - 6.66ms (150 FPS) - Medium speed
📦 Size Rankings (Smallest to Largest):
- EfficientNet B0 - 26.6MB - Smallest model
- EfficientNet B1 - 36.3MB - Small model
- EfficientNet B2 - 41.5MB - Medium model
- ResNet18 - 52.8MB - Medium model
- ResNet34 - 91.4MB - Largest model
🎯 Accuracy Rankings (Best to Good):
- EfficientNet B2 - Highest accuracy
- EfficientNet B1 - Better accuracy
- ResNet34 - Better accuracy
- ResNet18 - Good accuracy
- EfficientNet B0 - Good accuracy
To get started with the Face Parsing Model, clone this repository and install the required dependencies:

    git clone https://github.com/Mrkomiljon/face-parsing.git
    cd face-parsing
    pip install -r requirements.txt

    # Download all models (ResNet18, ResNet34, EfficientNet B0/B1/B2)
    bash download.sh

    # PyTorch inference with ResNet18
    python inference.py --model resnet18 --weight ./weights/resnet18.pt --input ./assets/images --output ./assets/results/resnet18

    # ONNX inference with ResNet18
    python onnx_inference.py --model ./weights/resnet18.onnx --input ./assets/images --output ./assets/results/onnx/resnet18

    # Compare PyTorch vs ONNX performance
    python speed_comparison.py

    # Test different input sizes
    python input_size_test.py

    # Download dataset
    wget https://github.com/switchablenorms/CelebAMask-HQ/releases/download/v1.0/CelebAMask-HQ.zip

    # Extract dataset
    unzip CelebAMask-HQ.zip

    # Organize dataset structure
    mkdir -p data/CelebAMask-HQ
    mv CelebAMask-HQ/* data/CelebAMask-HQ/

    # Prepare label masks
    python utils/prepare_labels.py --data-root ./data/CelebAMask-HQ

Dataset Structure:

    data/CelebAMask-HQ/
    ├── image/
    │   ├── 0.jpg
    │   ├── 1.jpg
    │   └── ...
    ├── label/
    │   ├── 0.png
    │   ├── 1.png
    │   └── ...
    └── list/
        ├── train.txt
        ├── val.txt
        └── test.txt

Default Parameters (Optimized):
- Image Size: 256x256 (optimized for speed/accuracy balance)
- Batch Size: 16 (adjust based on GPU memory)
- Learning Rate: 0.01 (with warmup and polynomial decay)
- Epochs: 150
- Backbone: resnet18 (can be changed)

    # ResNet18 (default backbone)
    python train.py --backbone resnet18 --image-size 256 256 --batch-size 16

    # ResNet34
    python train.py --backbone resnet34 --image-size 256 256 --batch-size 12
    # EfficientNet B0
    python train.py --backbone efficientnet_b0 --image-size 256 256 --batch-size 16
    # EfficientNet B1
    python train.py --backbone efficientnet_b1 --image-size 256 256 --batch-size 12
    # EfficientNet B2
    python train.py --backbone efficientnet_b2 --image-size 256 256 --batch-size 8

    # ResNet18 with the main hyperparameters set explicitly
    python train.py \
        --backbone resnet18 \
        --image-size 256 256 \
        --batch-size 16 \
        --lr-start 0.01 \
        --epochs 150 \
        --momentum 0.9 \
        --weight-decay 0.0005 \
        --lr-warmup-epochs 5 \
        --print-freq 100

| Parameter | Default | Description |
|---|---|---|
| `--backbone` | `resnet18` | Model backbone (`resnet18`, `resnet34`, `efficientnet_b0`, `efficientnet_b1`, `efficientnet_b2`) |
| `--image-size` | `256 256` | Input image size (width height) |
| `--batch-size` | `16` | Training batch size |
| `--lr-start` | `0.01` | Initial learning rate |
| `--epochs` | `150` | Number of training epochs |
| `--momentum` | `0.9` | SGD momentum |
| `--weight-decay` | `0.0005` | Weight decay |
| `--lr-warmup-epochs` | `5` | Learning rate warmup epochs |
| `--data-root` | `./data/CelebAMask-HQ` | Dataset root directory |
| `--resume` | `False` | Resume from checkpoint |

Loss Function: OhemLossWrapper (combines multiple scale losses)
- Primary Loss: OhemCELoss (CrossEntropy with Online Hard Example Mining)
- Multi-scale Training: 3 scales (0.75x, 1.0x, 1.25x)
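The idea behind Online Hard Example Mining can be illustrated without any deep learning framework. The sketch below is not the repository's `OhemCELoss`; it only shows the selection step on a flat list of per-pixel cross-entropy values. The probability threshold of 0.7 and the `min_kept` floor are assumed values chosen for illustration.

```python
import math


def ohem_mean_loss(pixel_losses, prob_thresh=0.7, min_kept=2):
    """Average only the hard pixels of a list of per-pixel cross-entropy losses.

    A pixel counts as hard when its loss exceeds -log(prob_thresh), i.e. the
    model gave the correct class a probability below prob_thresh. If fewer
    than min_kept pixels qualify, the min_kept largest losses are used.
    """
    if not pixel_losses:
        return 0.0
    loss_thresh = -math.log(prob_thresh)
    ranked = sorted(pixel_losses, reverse=True)  # hardest pixels first
    hard = [loss for loss in ranked if loss > loss_thresh]
    if len(hard) < min_kept:
        hard = ranked[:min_kept]
    return sum(hard) / len(hard)


# Two hard pixels (2.0 and 1.0) dominate; the easy ones are ignored.
example = ohem_mean_loss([2.0, 1.0, 0.1, 0.05])
```

Averaging only the hard pixels keeps large, easy regions such as background or skin from drowning out small, difficult ones such as eyes or lips.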
Learning Rate Schedule:
- Warmup: Linear warmup for first 5 epochs
- Decay: Polynomial decay with power=0.9
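A schedule of this shape can be sketched as a plain function of the epoch index. This is an approximation under stated assumptions: the learning rate is updated once per epoch and the decay runs from the end of warmup to the last epoch. The actual training script may step per iteration, and `lr_at_epoch` is a hypothetical helper name.

```python
def lr_at_epoch(epoch, base_lr=0.01, total_epochs=150, warmup_epochs=5, power=0.9):
    """Learning rate for a 0-indexed epoch: linear warmup, then polynomial decay."""
    if epoch < warmup_epochs:
        # Ramp linearly so that the last warmup epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup schedule that has elapsed, clamped to [0, 1].
    progress = min((epoch - warmup_epochs) / (total_epochs - warmup_epochs), 1.0)
    return base_lr * (1.0 - progress) ** power


# Learning rate for every epoch with the default parameters listed above.
schedule = [lr_at_epoch(epoch) for epoch in range(150)]
```

With the defaults, the rate climbs from 0.002 to 0.01 over the first five epochs and then decays smoothly toward zero by epoch 150.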
Checkpoints:
- Models saved every 10 epochs
- Best model saved based on validation accuracy
- Final model saved as `{backbone}.pt`

    # ResNet18
    python onnx_export.py --model resnet18 --weight ./weights/resnet18.pt
    # ResNet34
    python onnx_export.py --model resnet34 --weight ./weights/resnet34.pt
    # EfficientNet B0
    python onnx_export.py --model efficientnet_b0 --weight ./weights/efficientnet_b0.pt
    # EfficientNet B1
    python onnx_export.py --model efficientnet_b1 --weight ./weights/efficientnet_b1.pt
    # EfficientNet B2
    python onnx_export.py --model efficientnet_b2 --weight ./weights/efficientnet_b2.pt

Optimizations Applied:
- ✅ OpSet 19 compatibility (ONNX Runtime support)
- ✅ Dynamic batch size support
- ✅ ONNX-safe operations (adaptive pooling)
- ✅ Constant folding enabled
- ✅ Input/Output naming for easy integration
Export Parameters:
- OpSet Version: 19 (compatible with ONNX Runtime)
- Dynamic Axes: Batch size dimension
- Input Names: ['input']
- Output Names: ['output', 'output16', 'output32']
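The export parameters above correspond to keyword arguments of `torch.onnx.export`. The sketch below only assembles those arguments as a dictionary so it can run without PyTorch installed. `build_export_kwargs` is a hypothetical helper, and marking the batch dimension as dynamic on the outputs as well as the input is an assumption rather than something confirmed from `onnx_export.py`.

```python
def build_export_kwargs(opset_version=19):
    """Assemble keyword arguments matching the export parameters listed above."""
    input_names = ["input"]
    output_names = ["output", "output16", "output32"]
    # Mark dimension 0 (batch) as dynamic for every named tensor (assumed).
    dynamic_axes = {name: {0: "batch_size"} for name in input_names + output_names}
    return {
        "opset_version": opset_version,
        "do_constant_folding": True,
        "input_names": input_names,
        "output_names": output_names,
        "dynamic_axes": dynamic_axes,
    }


export_kwargs = build_export_kwargs()

# With PyTorch available, the dictionary could be passed along these lines:
#   torch.onnx.export(model, dummy_input, "resnet18.onnx", **export_kwargs)
```

Keeping the names fixed means downstream code can look up the `input` tensor and the three outputs by name instead of by position.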

    # ResNet18
    python inference.py --model resnet18 --weight ./weights/resnet18.pt --input ./assets/images --output ./assets/results/resnet18
    # ResNet34
    python inference.py --model resnet34 --weight ./weights/resnet34.pt --input ./assets/images --output ./assets/results/resnet34
    # EfficientNet B0
    python inference.py --model efficientnet_b0 --weight ./weights/efficientnet_b0.pt --input ./assets/images --output ./assets/results/efficientnet_b0

    # ResNet18 ONNX
    python onnx_inference.py --model ./weights/resnet18.onnx --input ./assets/images --output ./assets/results/onnx/resnet18
    # ResNet34 ONNX
    python onnx_inference.py --model ./weights/resnet34.onnx --input ./assets/images --output ./assets/results/onnx/resnet34
    # EfficientNet B0 ONNX
    python onnx_inference.py --model ./weights/efficientnet_b0.onnx --input ./assets/images --output ./assets/results/onnx/efficientnet_b0

| Parameter | Default | Description |
|---|---|---|
| `--model` | - | Model name (PyTorch) or path (ONNX) |
| `--weight` | - | Path to PyTorch model weights |
| `--input` | `./assets/images` | Input image or directory |
| `--output` | `./assets/results` | Output directory |


    python speed_comparison.py

Output:
- Detailed speed comparison table
- Average speedup statistics
- Best performing models
- Recommendations for deployment

    python input_size_test.py

Output:
- Comprehensive input size comparison
- Model size consistency verification
- Speed scaling analysis
- Memory usage patterns
- Recommendations for different use cases

    python final_comparison.py

Output:
- Model rankings by speed and size
- Training status verification
- Performance recommendations
- Context Path (Backbone)
  - Feature extraction from different scales
  - Attention Refinement Modules (ARM)
  - Multi-scale feature fusion
- Spatial Path
  - High-resolution feature preservation
  - Spatial detail enhancement
- Feature Fusion Module (FFM)
  - Combines context and spatial features
  - Attention mechanism for feature selection
- Output Heads
  - Multi-scale output for better accuracy
  - 19-class segmentation (facial components)
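A 19-class output head produces one score map per class, and the final mask is obtained by picking the highest-scoring class at each pixel. The sketch below shows that step on nested Python lists, assuming the scores are laid out as `[class][row][column]`. `logits_to_mask` is an illustrative name, and with PyTorch tensors the same step would normally be a single `argmax` over the class dimension.

```python
def logits_to_mask(logits):
    """Turn per-class score maps (C x H x W nested lists) into an H x W label mask."""
    num_classes = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    return [
        [
            # Index of the class with the highest score at pixel (y, x).
            max(range(num_classes), key=lambda c: logits[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


# Tiny example with 3 classes and a 2x2 image instead of 19 classes.
toy_logits = [
    [[0.1, 2.0], [0.3, 0.2]],  # class 0 scores
    [[1.0, 0.5], [0.1, 0.9]],  # class 1 scores
    [[0.2, 0.1], [0.8, 0.4]],  # class 2 scores
]
toy_mask = logits_to_mask(toy_logits)
```

Each value in the resulting mask is a class index, which can then be mapped to a color for visualization.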
| Backbone | Parameters | Input Size | Output Channels |
|---|---|---|---|
| ResNet18 | ~11.7M | 256x256 | 128, 256, 512 |
| ResNet34 | ~21.8M | 256x256 | 128, 256, 512 |
| EfficientNet B0 | ~5.3M | 256x256 | 24, 40, 80 |
| EfficientNet B1 | ~7.8M | 256x256 | 24, 40, 80 |
| EfficientNet B2 | ~9.2M | 256x256 | 24, 48, 88 |
- CUDA Out of Memory

      # Reduce batch size
      python train.py --batch-size 8
      # Reduce image size
      python train.py --image-size 224 224

- ONNX Export Errors

      # Ensure PyTorch version compatibility
      pip install torch==2.0.1 torchvision==0.15.2
      # Use CPU for export if GPU issues
      export CUDA_VISIBLE_DEVICES=""
      python onnx_export.py --model resnet18 --weight ./weights/resnet18.pt

- Training Not Converging

      # Reduce learning rate
      python train.py --lr-start 0.001
      # Increase warmup epochs
      python train.py --lr-warmup-epochs 10

- For Speed:
  - Use ResNet18 backbone
  - Use 256x256 input size
  - Use ONNX for inference
- For Accuracy:
  - Use EfficientNet B2 backbone
  - Use 512x512 input size
  - Train for more epochs
- For Memory Efficiency:
  - Use EfficientNet B0 backbone
  - Use smaller batch size
  - Use 224x224 input size
Contributions to improve the Face Parsing Model are welcome. Feel free to fork the repository and submit pull requests, or open issues to suggest features or report bugs.
The project is licensed under the MIT license.
@misc{face-parsing,
author = {Komiljon Mukhammadiev},
title = {face-parsing},
year = {2025},
publisher = {GitHub},
howpublished = {\url{https://github.com/Mrkomiljon/face-parsing}},
note = {GitHub repository}
}
The project is based on face-parsing.
The original model was created by zllrunning/face-parsing.PyTorch. It was then re-implemented and improved by yakhyo/face-parsing, and I have further modified and optimized it for better performance and usability.
























