
Conversation

@ahsan-ca commented Sep 26, 2025

Motivation

Doing an exhaustive tune for these models improves performance on MI350. We can add them to the quick tune list to improve out-of-the-box performance on MI350.

Technical Details

Add problem configs for models run using onnxruntime examples on MI350.

Test Plan

Test Result

Submission Checklist

-t f32 -out_datatype f32 -transA false -transB false -g 1 -m 1 -n 4096 -k 4096
-t f32 -out_datatype f32 -transA false -transB false -g 1 -m 1 -n 32000 -k 4096
#inception_v3_int8_bs16.onnx
conv -F 1 -f GNC01 -I NGC01 -O NGC01 -n 16 -c 3 -H 299 -W 299 -k 32 -y 3 -x 3 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -g 1
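For context, the conv line above corresponds to inception_v3's stem convolution at batch size 16. A minimal sanity check of that reading, assuming the flags follow the usual MIOpen/rocMLIR driver conventions (-n batch, -c input channels, -H/-W input size, -k output channels, -y/-x kernel, -p/-q padding, -u/-v stride, -l/-j dilation, -g groups):

n, c, h, w = 16, 3, 299, 299         # input batch / channels / height / width
k, y, x = 32, 3, 3                   # 32 output channels, 3x3 kernel
p, q, u, v, l, j = 0, 0, 2, 2, 1, 1  # padding, stride, dilation

# Standard conv output-size formula; should match inception_v3's 149x149 stem output.
out_h = (h + 2 * p - l * (y - 1) - 1) // u + 1
out_w = (w + 2 * q - j * (x - 1) - 1) // v + 1
print(out_h, out_w)  # 149 149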
Contributor:

The convolution should be int8, right? This should start with convint8 instead of conv.

Author:

This was captured from the script that we have. I will double-check it.

Author:

I have double-checked; that is what the model produces. This comes from one of the onnxruntime-inference-examples. The quantization appears to be done at runtime, which is why we don't see convint8 here. The int8 label is used because the model comes from migx_onnxrt_inception_v3_int8_benchmarks in DLM.

Contributor:

what does it mean that the quantization is done at runtime?

Author:

There is a calibration table that is used for int8 inference. I have shared more details in a Teams message with you regarding how it's being done.
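Roughly, the onnxruntime-inference-examples flow enables int8 through the MIGraphX execution provider at session-creation time, with the calibration table supplied as a provider option. A minimal sketch, assuming the provider option names from the MIGraphX EP documentation (not our exact script; file names are placeholders):

import onnxruntime as ort

# Sketch only: int8 is requested when the session is created, so the ONNX graph
# itself still contains regular float conv nodes, which is why the captured
# problem configs say conv rather than convint8. Option names and file names
# are assumptions, not the exact benchmark script.
providers = [
    ("MIGraphXExecutionProvider", {
        "migraphx_int8_enable": True,
        "migraphx_int8_calibration_table_name": "calibration.flatbuffers",
        "migraphx_use_native_calibration_table": False,
    }),
]
session = ort.InferenceSession("inception_v3.onnx", providers=providers)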

@dhernandez0 (Contributor) commented Oct 3, 2025:

Thanks, I've answered in Teams as well. Something seems wrong: if we calibrate for int8, why don't we do inference in int8?

Author:

@dhernandez0 The problem configs are collected from the inception_v3 model, which is not quantized to int8. I have updated the label name to reflect that.
