support to export static afp8 model #662

n1ck-guo · 2025-07-16T09:00:56Z

Example command:

python -m auto_round --format "auto_round" --bits 8 --act_bits 8 --group_size -1 --iters 0 --nsample 2 --seqlen 1024 --model /models/opt-125m/ --data_type fp8 --act_data_type "fp8" --disable_act_dynamic

Signed-off-by: n1ck-guo <heng.guo@intel.com>

auto_round/export/export_to_autoround/export_to_fp8_woq.py

test/test_cpu/test_export.py

wenhuach21 · 2025-07-16T09:18:38Z

@yiliu30 @n1ck-guo Could we rename fp8_sym to something simpler like fp8? That was my mistake, sorry about that.

auto_round/autoround.py

wenhuach21 · 2025-07-16T09:27:45Z

auto_round/script/llm.py

@@ -486,15 +486,30 @@ def tune(args):
    model_name = args.model.rstrip("/")

    if model_name.split('/')[-1].strip('.') == "" and "gguf" not in args.format:
-        export_dir = os.path.join(args.output_dir, f"w{autoround.bits}g{autoround.group_size}")
+        if autoround.group_size == -1:


We now support group_size=0 as per-tensor quantization. Please check for that config here as well, or add support for it.
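For reference, here is a minimal sketch of how the group_size values discussed here could map to weight-scale shapes. The helper name `scale_shape` is hypothetical and is not part of auto-round. The sketch assumes that group_size=-1 means per-channel (one scale per output row), group_size=0 means per-tensor as stated above, and a positive value means per-group.

```python
def scale_shape(out_features, in_features, group_size):
    """Return the assumed shape of the weight-scale tensor for a linear layer.

    Hypothetical helper for illustration only (not auto-round code).
    Assumed convention:
      group_size == 0  -> per-tensor: a single scale for the whole weight
      group_size == -1 -> per-channel: one scale per output row
      group_size > 0   -> per-group: one scale per `group_size` input columns
    """
    if group_size == 0:
        return (1,)
    if group_size == -1:
        return (out_features, 1)
    if group_size > 0:
        n_groups = -(-in_features // group_size)  # ceiling division
        return (out_features, n_groups)
    raise ValueError(f"unsupported group_size: {group_size}")
```

Under these assumptions, a branch that only tests `group_size == -1` would send group_size=0 down the per-group path, which is the case the comment above asks to cover.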

auto_round/autoround.py

Signed-off-by: n1ck-guo <heng.guo@intel.com>

wenhuach21 · 2025-07-17T07:19:00Z

You also need to handle all-zero activations during calibration: either raise an error or use a workaround such as a unit scale.
Also, an MoE layer may see no activations at all during calibration.
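One way to read this suggestion, as a minimal sketch rather than the PR's implementation: compute a static per-tensor activation scale from the abs-max values observed during calibration, and fall back to a unit scale (or raise) when the activations are all zero or when nothing was observed, as can happen for an MoE expert that is never routed to. The function name `static_act_scale` and the `on_zero` parameter are hypothetical. The constant 448.0 is the largest finite value of the FP8 E4M3 (e4m3fn) format, which is assumed here as the target data type.

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn


def static_act_scale(observed_amax, on_zero="unit"):
    """Derive a static per-tensor activation scale from calibration statistics.

    observed_amax: list of per-batch abs-max floats collected during calibration.
                   It may be empty, e.g. for a layer that never received input.
    on_zero:       "unit" to fall back to a scale of 1.0, "raise" to raise an error.
    Hypothetical helper for illustration only (not auto-round code).
    """
    amax = max(observed_amax, default=0.0)
    if amax > 0.0:
        return amax / FP8_E4M3_MAX
    if on_zero == "raise":
        raise ValueError("no non-zero activations observed during calibration")
    # Unit scale avoids a division by zero when activations are quantized later.
    return 1.0
```

Raising makes the problem visible at export time, while the unit-scale fallback keeps the export going for layers that calibration data never exercised.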

yiliu30 · 2025-07-17T07:31:35Z

> @yiliu30 @n1ck-guo Could we rename fp8_sym to something simpler like fp8? That was my mistake, sorry about that.

That is fine with me.

Signed-off-by: n1ck-guo <heng.guo@intel.com>

…auto-round into hengguo/export_static_fp8

support to export static afp8 model

4dd6e6b

Signed-off-by: n1ck-guo <heng.guo@intel.com>

n1ck-guo requested review from wenhuach21, yiliu30 and WeiweiZhang1 July 16, 2025 09:01

yiliu30 requested changes Jul 16, 2025


auto_round/export/export_to_autoround/export_to_fp8_woq.py (outdated)

test/test_cpu/test_export.py

wenhuach21 reviewed Jul 16, 2025
