
Support exporting static fp8 models #662


Open · wants to merge 7 commits into base: main
Conversation

@n1ck-guo (Contributor) commented Jul 16, 2025

Example command:

```bash
python -m auto_round --format "auto_round" --bits 8 --act_bits 8 --group_size -1 --iters 0 --nsample 2 --seqlen 1024 --model /models/opt-125m/ --data_type fp8 --act_data_type "fp8" --disable_act_dynamic
```
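For readers unfamiliar with the static fp8 scheme this command configures (fp8 weights, fp8 activations with calibration-time scales frozen by `--disable_act_dynamic`), here is a minimal illustrative sketch. This is not auto-round's actual implementation; the function names and the per-tensor `amax / fp8_max` scale recipe are assumptions for illustration only.

```python
# Illustrative sketch of static fp8 quantization (NOT auto-round's code).
# Assumption: scales are computed as calibration amax / fp8 e4m3 max value.

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3


def static_fp8_scale(calibration_amax: float) -> float:
    """Per-tensor scale frozen at calibration time (static, not dynamic)."""
    return calibration_amax / FP8_E4M3_MAX


def fake_quant_fp8(x: float, scale: float) -> float:
    """Quantize-dequantize: scale down, clip to the fp8 range, scale back."""
    clipped = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x / scale))
    return clipped * scale
```

With `--disable_act_dynamic`, the activation scale is computed once from the calibration data (here, `--nsample 2 --seqlen 1024`) and reused at inference, rather than recomputed per batch.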

Signed-off-by: n1ck-guo <heng.guo@intel.com>
@wenhuach21 (Contributor) commented:

@yiliu30 @n1ck-guo Could we rename fp8_sym to something simpler like fp8? That was my mistake, sorry about that.

@@ -486,15 +486,30 @@ def tune(args):
model_name = args.model.rstrip("/")

if model_name.split('/')[-1].strip('.') == "" and "gguf" not in args.format:
export_dir = os.path.join(args.output_dir, f"w{autoround.bits}g{autoround.group_size}")
if autoround.group_size == -1:
@wenhuach21 (Contributor) commented on this diff, Jul 16, 2025:

We now support group_size=0 as per-tensor. Please check the config, or add support for this case.
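The convention the reviewer refers to could be sketched as follows. This is an illustrative mapping only, assuming auto-round's convention that group_size=-1 means per-channel and group_size=0 means per-tensor; the helper name is hypothetical.

```python
# Hypothetical helper illustrating the group_size convention discussed above
# (assumption: -1 = per-channel, 0 = per-tensor, N > 0 = groups of N).
def granularity(group_size: int) -> str:
    if group_size == -1:
        return "per_channel"
    if group_size == 0:
        return "per_tensor"
    if group_size > 0:
        return f"per_group_{group_size}"
    raise ValueError(f"unsupported group_size: {group_size}")
```

Under this reading, the diff's `group_size == -1` branch covers per-channel export naming but would miss the newer per-tensor (`group_size == 0`) case.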

n1ck-guo added 4 commits July 17, 2025 01:07
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@wenhuach21 (Contributor) commented Jul 17, 2025

You also need to handle all-zero activations during calibration: either raise an error or apply a workaround such as using a unit scale.
Also note that an MoE layer's experts may receive no activations at all during calibration.
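A minimal sketch of the zero-activation guard suggested above, assuming the per-tensor `amax / fp8_max` scale recipe; the function name and the fallback-to-unit-scale choice are illustrative, not auto-round's actual code.

```python
# Illustrative guard for the calibration edge case discussed above
# (assumption: scale = amax / fp8 e4m3 max; unit scale as the fallback).
def activation_scale(amax: float, fp8_max: float = 448.0) -> float:
    # A layer that only saw zeros, or saw no tokens at all (e.g. an
    # unrouted MoE expert), reports amax == 0; fall back to a unit
    # scale instead of producing a zero (division-by-zero) scale.
    if amax == 0.0:
        return 1.0
    return amax / fp8_max
```

The alternative is to raise an error so the user can enlarge the calibration set until every layer receives data.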

@yiliu30 (Contributor) commented Jul 17, 2025

> @yiliu30 @n1ck-guo Could we rename fp8_sym to something simpler like fp8? That was my mistake, sorry about that.

That is OK with me.

n1ck-guo added 2 commits July 17, 2025 04:34
Signed-off-by: n1ck-guo <heng.guo@intel.com>
3 participants