This repository contains a Python script to convert Hugging Face models to ONNX and ORT formats, perform quantization, and generate README files for the converted models. The script automates the process of optimizing models for deployment, making it easier to use models in different environments.
- Model Conversion: Convert Hugging Face models to ONNX and ORT formats.
- Model Optimization: Optimize the ONNX models for better performance.
- Quantization: Perform FP16, INT8, and UINT8 quantization on the models (see the sketch after this list).
- README Generation: Automatically generate English and Japanese README files for the converted models.
- Hugging Face Integration: Optionally upload the converted models to Hugging Face Hub.
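As a point of reference, FP16 conversion of an ONNX model is commonly done with onnxconverter-common, and INT8/UINT8 quantization with ONNX Runtime's dynamic quantization API. The following is a minimal sketch under that assumption; it is not the exact code in `convert_model.py`, and the output file names are placeholders.

```python
import onnx
from onnxconverter_common import float16
from onnxruntime.quantization import quantize_dynamic, QuantType

# FP16: cast the optimized ONNX model's tensors to float16
model_fp32 = onnx.load('onnx_models/model_opt.onnx')
model_fp16 = float16.convert_float_to_float16(model_fp32)
onnx.save(model_fp16, 'onnx_models/model_fp16.onnx')  # placeholder output name

# INT8 / UINT8: dynamic quantization (weights quantized offline, activations at runtime)
quantize_dynamic('onnx_models/model_opt.onnx', 'onnx_models/model_int8.onnx',
                 weight_type=QuantType.QInt8)
quantize_dynamic('onnx_models/model_opt.onnx', 'onnx_models/model_uint8.onnx',
                 weight_type=QuantType.QUInt8)
```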
Requirements:

- Python 3.11 or higher
- Install the required packages using `requirements.txt`:

  ```bash
  pip install -r requirements.txt
  ```

  Alternatively, you can install the packages individually:

  ```bash
  pip install torch transformers onnx onnxruntime onnxconverter-common onnxruntime-tools onnxruntime-transformers huggingface_hub
  ```
Usage:

- Clone the Repository

  ```bash
  git clone https://github.com/yourusername/model_conversion.git
  cd model_conversion
  ```

- Install Dependencies

  Ensure that you have Python 3.11 or higher installed, then install the required packages using `requirements.txt`:

  ```bash
  pip install -r requirements.txt
  ```
- Run the Conversion Script

  The script `convert_model.py` converts and quantizes the model:

  ```bash
  python convert_model.py --model your-model-name --output_dir output_directory
  ```

  - Replace `your-model-name` with the name or path of the Hugging Face model you want to convert.
  - The `--output_dir` argument specifies the output directory. If not provided, it defaults to the model name.

  Example:

  ```bash
  python convert_model.py --model bert-base-uncased --output_dir bert_onnx
  ```

  For a rough idea of what the conversion involves, see the export sketch after these steps.
- Upload to Hugging Face (Optional)

  To upload the converted models to Hugging Face Hub, add the `--upload` flag:

  ```bash
  python convert_model.py --model your-model-name --output_dir output_directory --upload
  ```

  Make sure you are logged in to the Hugging Face CLI:

  ```bash
  huggingface-cli login
  ```

  A manual alternative using the `huggingface_hub` API is sketched after these steps.
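For a rough idea of what the conversion step involves: exporting a Hugging Face model to ONNX generally boils down to a `torch.onnx.export` call, and the ORT-format model is then produced with ONNX Runtime's conversion tool. The sketch below illustrates that flow for `bert-base-uncased`; it is an assumption about the general approach, not a copy of `convert_model.py`, and the file names are placeholders.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical export flow (placeholder file names, not the exact code in convert_model.py)
model_name = 'bert-base-uncased'
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()

# Trace the model with a dummy input and export it to ONNX
dummy = tokenizer('Hello, world!', return_tensors='pt')
torch.onnx.export(
    model,
    (dummy['input_ids'], dummy['attention_mask']),
    'model.onnx',
    input_names=['input_ids', 'attention_mask'],
    output_names=['last_hidden_state', 'pooler_output'],
    dynamic_axes={'input_ids': {0: 'batch', 1: 'sequence'},
                  'attention_mask': {0: 'batch', 1: 'sequence'}},
    opset_version=17,
)

# The ORT-format model can then be generated with:
#   python -m onnxruntime.tools.convert_onnx_models_to_ort model.onnx
```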
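If you prefer to push the files yourself instead of using the `--upload` flag, the `huggingface_hub` API can upload the output directory directly. A minimal sketch, assuming the `bert_onnx` output directory from the example above and a placeholder repository id:

```python
from huggingface_hub import HfApi, create_repo

# Placeholder repository id; replace with your own namespace/name
repo_id = 'your-username/bert-base-uncased-onnx'

create_repo(repo_id, repo_type='model', exist_ok=True)

api = HfApi()
api.upload_folder(
    folder_path='bert_onnx',   # the --output_dir used during conversion
    repo_id=repo_id,
    repo_type='model',
)
```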
After running the conversion script, you can use the converted models as shown below:
```python
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer
import os

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained('your-model-name')

# Prepare inputs
text = 'Replace this text with your input.'
inputs = tokenizer(text, return_tensors='np')

# Specify the model paths
# Test both the ONNX model and the ORT model
model_paths = [
    'onnx_models/model_opt.onnx',  # ONNX model
    'ort_models/model.ort'         # ORT format model
]

# Run inference with each model
for model_path in model_paths:
    print(f'\n===== Using model: {model_path} =====')

    # Get the model extension
    model_extension = os.path.splitext(model_path)[1]

    # Load the model
    if model_extension == '.ort':
        # Load the ORT format model
        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
    else:
        # Load the ONNX model
        session = ort.InferenceSession(model_path)

    # Run inference
    outputs = session.run(None, dict(inputs))

    # Display the output shapes
    for idx, output in enumerate(outputs):
        print(f'Output {idx} shape: {output.shape}')

    # Display the results (add further processing if needed)
    print(outputs)
```
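What the raw outputs mean depends on the model head. For a sequence-classification model, the first output usually contains logits, which a softmax turns into probabilities; a sketch under that assumption:

```python
# Assuming a sequence-classification head: outputs[0] holds logits of shape (batch, num_labels)
logits = outputs[0]
shifted = logits - logits.max(axis=-1, keepdims=True)   # subtract max for numerical stability
probabilities = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
print('Predicted class:', probabilities.argmax(axis=-1))
```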
- Ensure that your ONNX Runtime version is 1.15.0 or higher to use ORT format models.
- Adjust the `providers` parameter based on your hardware (e.g., `'CUDAExecutionProvider'` for GPUs).
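For example, to prefer the GPU and fall back to the CPU (assuming the `onnxruntime-gpu` package is installed):

```python
# Prefer CUDA when available; ONNX Runtime falls back to the next provider in the list
session = ort.InferenceSession(
    model_path,
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
)
```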
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Contributions are welcome! Please open an issue or submit a pull request for improvements.