STMicroelectronics – STM32 model zoo services

Welcome to STM32 model zoo services!

🎉 We are excited to announce that the STM32 AI model zoo now includes comprehensive PyTorch support, joining TensorFlow and ONNX. It features a vast library of PyTorch models, all seamlessly integrated with our end-to-end workflows. Whether you want to train, evaluate, quantize, benchmark, or deploy, you’ll find everything you need – plus the flexibility to choose between PyTorch, TensorFlow, and ONNX. Dive into the expanded STM32 model zoo and take your AI projects further than ever on STM32 devices.


The STM32 AI model zoo services are a set of services and scripts that ease end-to-end integration of AI models on ST devices. This repository can be used in conjunction with the STM32 model zoo, which contains a collection of reference machine learning models optimized to run on STM32 microcontrollers. Available on GitHub, it is a valuable resource for anyone looking to add AI capabilities to their STM32-based projects.

  • Scripts to easily retrain or fine-tune any model from user datasets (BYOD and BYOM)
  • A set of services and chained services to quantize, benchmark, predict, and evaluate any model (BYOM)
  • Application code examples automatically generated from user AI models

These models can be useful for quick deployment if you are interested in the categories they were trained on. We also provide training scripts to perform transfer learning or to train your own model from scratch on your custom dataset.

The performance on reference STM32 MCUs and MPUs is provided for both float and quantized models. This project is organized by application; for each application, you will find a step-by-step guide describing how to train and deploy the models.

To clone the repository, use:

git clone https://github.com/STMicroelectronics/stm32ai-modelzoo-services.git --depth 1

What's new in releases:

4.0:
  • Major PyTorch support for Image Classification (IC) and Object Detection (OD)
  • Support of STEdgeAI Core v3.0.0
  • New training and evaluation scripts for PyTorch models
  • Expanded model selection and improved documentation
  • Unified workflow for TensorFlow and PyTorch
  • Performance and usability improvements
  • New use cases: Face Detection (FD), Arc Fault Detection (AFD), Re-Identification (ReID)
  • New mixed-precision models (4-bit weights, 8-bit activations) for the IC and OD use cases
  • Support for Keras 3.8.0, TensorFlow 2.18.0, PyTorch 2.7.1, and ONNX 1.16.1
  • Python software architecture rework
  • Docker-based setup available, with a ready-to-use image including the full software stack.
3.2:
  • Support of STEdgeAI Core v2.2.0.
  • Support of X-Linux-AI v6.1.0 for MPU.
  • New use cases added: StyleTransfer and FastDepth.
  • New models added: Face Detection, available in the Object Detection use case, and Face Landmarks, available in the Pose Estimation use case.
  • Architecture and codebase clean-up.
3.1:
  • Support for STEdgeAI Core v2.1.0.
  • Application code for the STM32N6 board is now directly available in the STM32 model zoo repository, eliminating the need for separate downloads.
  • On-device evaluation and prediction on the STM32N6570-DK board, integrated into the evaluation and prediction services.
  • More models supported: Yolo v11, an LSTM model added in Speech Enhancement, and ST Yolo X variants.
  • ClearML support.
  • A few bug fixes and improvements, such as proper imports and OD metrics alignment.
3.0:
  • Full support of the new STM32N6570-DK board.
  • Included additional models compatible with the STM32N6.
  • Included support for STEdgeAI Core v2.0.0.
  • Split of the model zoo and services into two GitHub repositories.
  • Integrated support for ONNX model quantization and evaluation from h5 models.
  • Expanded use case support to include Instance Segmentation and Speech Enhancement.
  • Added PyTorch support through the Speech Enhancement use case.
  • On-device evaluation and prediction on the STM32N6570-DK board.
  • Model Zoo hosted on Hugging Face.
2.1:
  • Included additional models compatible with the STM32MP257F-EV1 board.
  • Added support for per-tensor quantization.
  • Integrated support for ONNX model quantization and evaluation.
  • Included support for STEdgeAI Core v1.0.0.
  • Expanded use case support to include Pose Estimation and Semantic Segmentation.
  • Standardized logging information for a unified experience.
2.0:
  • An aligned and uniform architecture for all the use cases
  • A modular design to run different operation modes (training, benchmarking, evaluation, deployment, quantization) independently or with an option of chaining multiple modes in a single launch.
  • A simple, single entry point to the code: a .yaml configuration file to configure all the needed services.
  • Support of the Bring Your Own Model (BYOM) feature, allowing users to (re-)train their own model. An example is provided here, chapter 5.1.
  • Support of the Bring Your Own Data (BYOD) feature, allowing users to fine-tune pretrained models with their own datasets. An example is provided here, chapter 2.3.


Available use-cases

The ST model zoo provides a collection of independent services and pre-built chained services that can be used to perform various functions related to machine learning. The individual services include tasks such as training or quantization of a model, while the chained services combine multiple services to perform more complex functions, such as training a model, quantizing it, and evaluating the quantized model before benchmarking it on the hardware of your choice.

All trained models in the STM32 model zoo are provided with their configuration .yaml file used to generate them. This is a very good baseline to start with!
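
As a minimal sketch of how a service or chained service is launched (the script name, folder layout, and operation mode below are assumptions for illustration; each use case's README gives the exact commands and supported operation_mode values):

cd image_classification/src
python stm32ai_main.py --config-path ./config_file_examples --config-name chain_tqe_config.yaml

The operation_mode field inside the .yaml file selects which service, or chain of services (for example training, then quantization, then evaluation), is executed.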

Tip

All services are available for the following use cases, with quick and easy examples that can be executed for a fast ramp-up (click on the use case links below).

Image classification is used to classify the content of an image within a predefined set of classes. Only one class is predicted from an input image.


Image classification (IC) models
Suitable Targets for Deployment | Models
STM32H747I-DISCO | MobileNet v1 0.25, MobileNet v1 0.5, MobileNet v2 0.35, ResNet8 v1, ST ResNet8, ResNet32 v1, SqueezeNet v1.1, FD MobileNet 0.25, ST FD MobileNet, ST EfficientNet, Mnist
NUCLEO-H743ZI2 | MobileNet v1 0.25, MobileNet v1 0.5, MobileNet v2 0.35, ResNet8 v1, ST ResNet8, ResNet32 v1, SqueezeNet v1.1, FD MobileNet 0.25, ST FD MobileNet, ST EfficientNet, Mnist
STM32MP257F-EV1 | MobileNet v1 1.0, MobileNet v2 1.0, MobileNet v2 1.4, ResNet50 v2, EfficientNet v2
STM32N6570-DK | MobileNet v1 1.0, MobileNet v2 1.0, MobileNet v2 1.4, ResNet50 v2, EfficientNet v2, DarkNet_pt, Dla_pt, FdMobileNet_pt, HardNet_pt, MnasNet_pt, MobileNet_pt, MobileNetv2_pt, MobileNetv4_pt, PeleeNet_pt, PreresNet18_pt, ProxylessNas_pt, RegNet_pt, SemnasNet_pt, ShuffleNetv2_pt, Sqnxt_pt, SqueezeNet_pt, St_ResNet_pt

Selecting a model for a specific task or a specific device is not always easy, and relying on metrics like the inference time and the accuracy, as in the example figure on food-101 classification below, can help you make the right choice before fine-tuning your model.

[figure: accuracy vs. inference time on food-101 classification]

Please find below some tutorials for a quick ramp up!

Image Classification top readme here

Object detection is used to detect and locate predefined objects in input images and to estimate their occurrence probability.


Object Detection (OD) Models
Suitable Targets for Deployment | Models
STM32H747I-DISCO | ST Yolo LC v1
STM32N6570-DK | Tiny Yolo v2, ST Yolo X, Yolo v8, Yolo v11, Blazeface front, SSD_MobileNetV1_pt, SSD_MobileNetV2_pt, SSDLite_MobileNetV1_pt, SSDLite_MobileNetV2_pt, SSDLite_MobileNetV3Large_pt, SSDLite_MobileNetV3Small_pt, ST_YoloDv2Milli_pt, ST_YoloDv2Tiny_pt
Relying on metrics like the inference time and the mean Average Precision (mAP), as in the example figure on people detection below, can help you make the right choice before fine-tuning your model, as well as check hardware capabilities for the OD task.

[figure: mAP vs. inference time on people detection]

Please find below some tutorials for a quick ramp up!

Object Detection top readme here

Face detection is used to detect and locate faces in input images and to estimate their occurrence probability.


Face Detection (FD) Models
Models | Input Resolutions | Supported Services | Targets for deployment
Blazeface front | 128x128x3 | Benchmarking / Prediction / Deployment / Evaluation | STM32N6570-DK
Yunet | 3x320x320 | Benchmarking / Prediction / Deployment / Evaluation | STM32N6570-DK

Full FD Services: evaluation, quantization, benchmarking, prediction, deployment

Please find below some tutorials for a quick ramp up!

Face Detection top readme here

Pose estimation detects key points on specific objects (people, hands, faces, ...). It can be single-pose, where key points are extracted from a single object, or multi-pose, where the locations of key points are estimated on all detected objects in the input images.


Pose Estimation (PE) Models
Models | Input Resolutions | Supported Services | Targets for deployment
Yolo v8n pose | 192x192x3, 256x256x3, 320x320x3 | Evaluation / Benchmarking / Prediction / Deployment | STM32N6570-DK
Yolo v11n pose | 256x256x3, 320x320x3 | Benchmarking / Prediction / Deployment | STM32N6570-DK
ST MoveNet | 192x192x3, 224x224x3, 256x256x3 | All services | STM32N6570-DK, STM32MP257F-EV1
MoveNet | 192x192x3, 256x256x3 | Evaluation / Quantization / Benchmarking / Prediction | STM32MP257F-EV1
Face landmarks | 192x192x3 | Benchmarking / Prediction | STM32N6570-DK
Hand landmarks | 224x224x3 | Benchmarking / Prediction | STM32N6570-DK

Full PE Services: training, evaluation, quantization, benchmarking, prediction, deployment

Various metrics can be used to estimate the quality of a single or multi-pose estimation use case. Metrics like the inference time and the Object Keypoint Similarity (OKS), as in the example figure on single pose estimation below, can help you make the right choice before fine-tuning your model, as well as check hardware capabilities for the PE task.

[figure: OKS vs. inference time on single pose estimation]

Please find below some tutorials for a quick ramp up!

Pose Estimation top readme here

Semantic segmentation is an algorithm that associates a label with every pixel in an image. It is used to recognize a collection of pixels that form distinct categories. It does not differentiate instances of the same category, which is the main difference between instance and semantic segmentation.


Semantic Segmentation (SemSeg) Models
Models | Input Resolutions | Supported Services | Targets for deployment
DeepLab v3 | 256x256x3, 320x320x3, 416x416x3, 512x512x3 | Full Seg Services | STM32MP257F-EV1, STM32N6570-DK

Full Seg Services: training, evaluation, quantization, benchmarking, prediction, deployment

Various metrics can be used to estimate the quality of a segmentation use case. Metrics like the inference time and IoU, as in the example figure on person segmentation below, can help you make the right choice before fine-tuning your model, as well as check hardware capabilities for the segmentation task.

[figure: IoU vs. inference time on person segmentation]

Please find below some tutorials for a quick ramp up!

Semantic Segmentation top readme here

Instance segmentation is an algorithm that associates a label with every pixel in an image and also outputs bounding boxes on detected class objects. It is used to recognize a collection of pixels that form distinct categories and the instances of each category. It differentiates instances of the same category, which is the main difference between instance and semantic segmentation.


Instance Segmentation (InstSeg) Models
Models | Input Resolutions | Supported Services | Targets for deployment
yolov8n_seg | 256x256x3, 320x320x3 | Prediction, Benchmark, Deployment | STM32N6570-DK
yolov11n_seg | 256x256x3, 320x320x3 | Prediction, Benchmark, Deployment | STM32N6570-DK

Please find below some tutorials for a quick ramp up!

Instance Segmentation top readme here

Depth estimation predicts the distance to objects from an image as a pixel-wise depth map.


Depth Estimation (DE) Models
Models | Input Resolutions | Supported Services
fast_depth | 224x224x3, 256x256x3, 320x320x3 | Benchmarking / Prediction

Depth Estimation top readme here.

Neural Style Transfer is a deep learning technique that applies the artistic style of one image to the content of another image by optimizing a new image to simultaneously match the content features of the original and the style features of the reference image.


Neural style transfer (NST) Models
Models | Input Resolutions | Supported Services
Xinet_picasso_muse | 160x160x3 | Prediction, Benchmark

Neural style transfer top readme here

Re-Identification is used to recognize a specific object (person, vehicle, ...) from a set of images.


Re-Identification (ReID) models
Models | Input Resolutions | Supported Services | Suitable Targets for deployment
MobileNet v2 | 256x128x3 | Full IC Services | STM32N6570-DK
OSNet | 256x128x3 | Full IC Services | STM32N6570-DK

Full IC Services: training, evaluation, quantization, benchmarking, prediction, deployment

Re-Identification top readme here

Audio event detection is used to detect a set of predefined audio events.


Audio Event Detection (AED) Models


Models | Input Resolutions | Supported Services | Targets for deployment
miniresnet | 64x50x1 | Full AED Services | B-U585I-IOT02A
miniresnet v2 | 64x50x1 | Full AED Services | B-U585I-IOT02A
yamnet 256/1024 | 64x96x1 | Full AED Services | B-U585I-IOT02A, STM32N6570-DK

Full AED Services: training, evaluation, quantization, benchmarking, prediction, deployment

Various metrics can be used to estimate the quality of an audio event detection use case. The main ones are the inference time and the accuracy (percentage of correct detections) on the esc-10 dataset, as in the example figure below. This may help you make the right choice before fine-tuning your model, as well as check hardware capabilities for the AED task.

[figure: accuracy vs. inference time on the esc-10 dataset]

Please find below some tutorials for a quick ramp up!

Audio Event Detection top readme here

Speech Enhancement is an algorithm that enhances audio perception in a noisy environment.


Speech Enhancement (SE) Models
Models | Input Resolutions | Supported Services | Targets for deployment
stft_tcnn | 257x40 | Full SE Services | STM32N6570-DK

Full SE Services: training, evaluation, quantization, benchmarking, deployment

Speech Enhancement top readme here

Human activity recognition identifies various activities such as walking, running, etc.


Human Activity Recognition (HAR) Models


Models | Input Resolutions | Supported Services | Targets for deployment
gmp | 24x3x1, 48x3x1 | Training / Evaluation / Benchmarking / Deployment | B-U585I-IOT02A
ign | 24x3x1, 48x3x1 | Training / Evaluation / Benchmarking / Deployment | B-U585I-IOT02A

Please find below some tutorials for a quick ramp up!

Human Activity Recognition top readme here

Hand posture recognition identifies a set of hand postures using a Time-of-Flight (ToF) sensor.


Hand Posture Recognition (HPR) Models


Models | Input Resolutions | Supported Services | Targets for deployment
ST CNN 2D Hand Posture | 64x50x1 | Training / Evaluation / Benchmarking / Deployment | NUCLEO-F401RE with X-NUCLEO-53LxA1 Time-of-Flight Nucleo expansion board

Hand Posture Recognition top readme here

Arc fault detection is used to classify electrical signals as normal or arc fault conditions.


Arc Fault Detection (AFD) Models


Models | Input Resolutions | Supported Services
st_conv | 4x512x1, 1x512x1 | Training, Evaluation, Quantization, Benchmarking, Prediction
st_dense | 8x512x1, 1x512x1 | Training, Evaluation, Quantization, Benchmarking, Prediction

Please find below some tutorials for a quick ramp up!

Arc Fault Detection top readme here

A Docker-based setup is available for the STM32AI Model Zoo, including a ready-to-use image that captures the full software stack (tools, dependencies, and configuration) in a single, consistent environment. This Docker configuration reduces host-specific installation and compatibility issues, and offers a straightforward way to run the project on different platforms with identical behavior. It also makes it easier to share and reproduce workflows, whether training, evaluating, or running experiments, by keeping the runtime environment standardized across machines.
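
As a minimal sketch of that workflow (the image tag and mount point are assumptions for illustration; the repository's Docker README documents the actual build and run commands):

docker build -t stm32ai-modelzoo .
docker run -it --rm -v "$(pwd)":/workspace stm32ai-modelzoo

Running inside the container this way keeps the host untouched while giving every machine the same tool versions.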

The Model Zoo Dashboard is hosted in a Docker environment under the STMicroelectronics Organization. This dashboard is developed using Dash Plotly and Flask, and it operates within a Docker container. It can also run locally if Docker is installed on your system. The dashboard provides the following features:

  • Training: Train machine learning models.
  • Evaluation: Evaluate the performance of models.
  • Benchmarking: Benchmark your model using ST Edge AI Developer Cloud.
  • Visualization: Visualize model performance and metrics.
  • User Configuration Update: Update and modify user configurations directly from the dashboard.
  • Output Download: Download model results and outputs.

You can also find our models on Hugging Face under the STMicroelectronics Organization. Each model from the STM32AI Model Zoo is represented by a model card on Hugging Face, providing all the necessary information about the model and linking to dedicated scripts.

Before you start

For a detailed guide on installing and setting up the model zoo and its requirements, especially when operating behind a proxy in a corporate environment, refer to the wiki article How to install STM32 model zoo.

  • Create an account on myST and sign in to STEdgeAI Developer Cloud to access the service.
  • Alternatively, install STEdgeAI Core locally and obtain the path to the stm32ai executable.
  • If using a GPU, install the appropriate GPU driver. For NVIDIA GPUs, refer to the CUDA and cuDNN installation guide at https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html. On Windows, avoid using WSL to get the best GPU training acceleration. If using conda, see below for installation.
  • For Docker-based execution of the Model Zoo, see README.md.
  • Python 3.12.9 is required. Download it from python.org.
    • On Windows, ensure the Add python.exe to PATH option is selected during installation.
    • On Windows, if you plan to use the pesq library (for speech quality evaluation), you must have Visual Studio with C++ build tools installed. Download from Visual Studio Downloads.

Clone this repository:

git clone https://github.com/STMicroelectronics/stm32ai-modelzoo-services.git --depth 1
cd stm32ai-modelzoo-services

Create a Python environment using either venv or conda:

  • With venv:
    python -m venv st_zoo
    
  • With conda:
    conda create -n st_zoo python=3.12.9
    

Activate your environment:

  • venv (Windows):
    st_zoo\Scripts\activate.bat
    
  • venv (Unix/Mac):
    source st_zoo/bin/activate
    
  • conda:
    conda activate st_zoo
    

If using an NVIDIA GPU with conda, install CUDA libraries and set the path:

# Install the CUDA runtime libraries into the conda environment
conda install -c conda-forge cudatoolkit=11.8 cudnn
# Make the conda-provided CUDA libraries visible whenever the environment is activated
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/' > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

Then install all required Python packages:

pip install -r requirements.txt
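
If you set up GPU support, a quick sanity check is to ask TensorFlow which GPUs it can see (this assumes the TensorFlow package from requirements.txt is installed):

python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

An empty list means the CUDA libraries or driver are not visible to the environment.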

Initialize Git Submodules

Some application code in this repository is provided as git submodules. These submodules contain essential code for specific use cases and are not included in the main repository by default. To ensure all features and application examples work correctly, you need to initialize and update the submodules after cloning the repository:

git submodule update --init --recursive

This command downloads all necessary submodule content; it is only needed if you plan to use the deployment features.
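
Alternatively, you can fetch the submodules at clone time in a single step:

git clone --recurse-submodules --depth 1 https://github.com/STMicroelectronics/stm32ai-modelzoo-services.git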

Practical Notes

Important

stm32ai-tao is a GitHub repository that provides Python scripts and Jupyter notebooks to manage the complete life cycle of a model, from training to compression, optimization, and benchmarking, using the NVIDIA TAO Toolkit and STEdgeAI Developer Cloud.

Caution

White spaces in the paths (for Python, STM32CubeIDE, or the STEdgeAI Core local installation) can result in errors, so avoid paths containing white spaces.

Tip

In this project, we are using the ClearML library to log the results of different runs.

ClearML Setup

  1. Sign Up: Sign up for free to the ClearML Hosted Service. Alternatively, you can set up your own server as described here.

  2. Create Credentials: Go to your ClearML workspace and create new credentials.

  3. Configure ClearML: Create a clearml.conf file and paste the credentials into it. If you are behind a proxy or using SSL portals, add verify_certificate = False to the configuration to make it work. Here is an example of what your clearml.conf file might look like:

    api {
        web_server: https://app.clear.ml
        api_server: https://api.clear.ml
        files_server: https://files.clear.ml
        # Add this line if you are behind a proxy or using SSL portals
        verify_certificate = False
        credentials {
            "access_key" = "YOUR_ACCESS_KEY"
            "secret_key" = "YOUR_SECRET_KEY"
        }
    }
    

Once configured, your experiments will be logged directly and shown in the project section under the name of your project.
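
As an alternative to writing clearml.conf by hand, the clearml Python package also ships an interactive helper that creates the file from pasted credentials:

clearml-init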

MLflow Setup

In this project, we are also using the MLflow library to log the results of different runs.
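
Runs logged locally by MLflow can be browsed in a web UI. A minimal sketch, assuming the default local ./mlruns tracking directory (the model zoo may write runs to a different output folder; adjust the path accordingly):

mlflow ui --backend-store-uri ./mlruns

Then open http://localhost:5000 in a browser.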

Windows Path Length Limitation

Depending on your version of Windows and where you place the project, the output log files may have very long paths, which can cause an error when logging the results. By default, Windows enforces a path length limitation (MAX_PATH) of 260 characters. To avoid this potential error, follow these steps:

  1. Enable Long Paths: Create (or edit) a value named LongPathsEnabled in the Registry Editor under Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem and set it to 1. This raises the maximum allowed file path length on Windows machines and avoids errors caused by this limit; the same setting can also be applied with the command shown after this list. For more details, refer to Naming Files, Paths, and Namespaces.

  2. Git Configuration: If you are using Git, the line below may help solve the long path issue:

    git config --system core.longpaths true
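
As an alternative to editing the registry by hand in step 1, the same value can be set from an elevated (administrator) command prompt:

reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" /v LongPathsEnabled /t REG_DWORD /d 1 /f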