Cannot load Qwen3 models #936

@toanhuynhnguyen

Feature request

I cannot export Qwen3 models with optimum-cli. For example, this command fails:

docker run --entrypoint optimum-cli \
    -v $(pwd)/data:/data \
    --privileged \
    -e HF_TOKEN=${HF_TOKEN} \
    -e HF_AUTO_CAST_TYPE="bf16" \
       ghcr.io/huggingface/neuronx-tgi:latest \
       export neuron \
       --model Qwen/Qwen3-8B \
       --auto_cast_type bf16 \
       --batch_size 20 \
       --sequence_length 13000 \
       /data1/qwen3_neuron_8B_bz20_13k

I get this error:

ubuntu@ip-172-31-50-88:~$ docker run --entrypoint optimum-cli \
    -v $(pwd)/data:/data \
    --privileged \
    -e HF_TOKEN=${HF_TOKEN} \
    -e HF_AUTO_CAST_TYPE="bf16" \
       ghcr.io/huggingface/neuronx-tgi:latest \
       export neuron \
       --model Qwen/Qwen3-8B \
       --auto_cast_type bf16 \
       --batch_size 20 \
       --sequence_length 13000 \
       /data1/qwen3_neuron_8B_bz20_13k
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1071, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 773, in __getitem__
    raise KeyError(key)
KeyError: 'qwen3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/optimum/exporters/neuron/__main__.py", line 786, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/optimum/exporters/neuron/__main__.py", line 732, in main
    input_shapes, neuron_config_class = get_input_shapes_and_config_class(task, args)
  File "/usr/local/lib/python3.10/dist-packages/optimum/exporters/neuron/__main__.py", line 125, in get_input_shapes_and_config_class
    neuron_config_constructor = get_neuron_config_class(task, args.model)
  File "/usr/local/lib/python3.10/dist-packages/optimum/exporters/neuron/__main__.py", line 132, in get_neuron_config_class
    config = AutoConfig.from_pretrained(model_id)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1073, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `qwen3` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
Traceback (most recent call last):
  File "/usr/local/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/optimum/commands/optimum_cli.py", line 208, in main
    service.run()
  File "/usr/local/lib/python3.10/dist-packages/optimum/commands/export/neuronx.py", line 305, in run
    subprocess.run(full_command, shell=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 -m optimum.exporters.neuron --model Qwen/Qwen3-8B --auto_cast_type bf16 --batch_size 20 --sequence_length 13000 /data1/qwen3_neuron_8B_bz20_13k' returned non-zero exit status 1.
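
The `KeyError: 'qwen3'` in the first traceback suggests that the Transformers version shipped in the image does not know the `qwen3` model type. A minimal sketch for checking this from inside the container follows. The helper `diagnose_model_type` is a hypothetical name introduced here for illustration; `CONFIG_MAPPING` is the mapping that appears in the traceback above.

```python
from importlib import metadata


def diagnose_model_type(model_type, known_types, package="transformers"):
    """Report whether `model_type` is a key of `known_types`.

    `diagnose_model_type` is a hypothetical helper, not part of any library.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = "not installed"
    if model_type in known_types:
        return f"{package} {installed}: '{model_type}' is recognized"
    return (
        f"{package} {installed}: '{model_type}' is NOT recognized; "
        f"a newer {package} is likely needed"
    )


if __name__ == "__main__":
    try:
        # Same mapping that raised KeyError('qwen3') in the traceback.
        from transformers.models.auto.configuration_auto import CONFIG_MAPPING
        known = CONFIG_MAPPING
    except ImportError:
        # Assumption: fall back to an empty mapping if transformers is absent.
        known = {}
    print(diagnose_model_type("qwen3", known))
```

If this reports that `qwen3` is not recognized, the failure comes from the Transformers version in the image, not from the export arguments.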

Motivation

Cannot load Qwen3 models

Your contribution

Cannot load Qwen3 models
