
Manually converted Qwen2.5-Coder-0.5B-Instruct does not work, but onnx-community/Qwen2.5-0.5B-Instruct does #1415

@visitsb

System Info

Test environment

Component           Version
transformers.js     ^3.7.2
Chrome browser      latest

Environment used to convert the HF model to ONNX:

Component           Version
transformers.js     main
scripts/convert.py  per requirements.txt

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

I am trying to run the webgpu-chat example from the main branch to simply generate text output in the browser.

When I use the onnx-community/Qwen2.5-0.5B-Instruct ONNX model, it works and I get an output from the model.

However, when I use convert.py from the scripts folder to convert the HF Qwen2.5-Coder-0.5B-Instruct model to ONNX and use the resulting custom model, an exception is thrown and no output is generated.

Reproduction

I cloned the HF model locally into a new folder named custom and successfully generated the ONNX model using the command below:

python -m scripts.convert --model-id custom/Qwen2.5-Coder-0.5B-Instruct --quantize --task text-generation
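
For completeness, this is roughly how the locally converted folder is loaded in the web app (a minimal sketch; the /models/ path and serving setup are assumptions, not part of the original run):

import { env, pipeline } from '@huggingface/transformers';

// Serve the converted folder (config.json, tokenizer files, onnx/) from the web app,
// then tell transformers.js to resolve the model id against local files instead of the Hub.
env.allowLocalModels = true;
env.localModelPath = '/models/'; // assumption: the custom folder is served under /models/

const generator = await pipeline('text-generation', 'custom/Qwen2.5-Coder-0.5B-Instruct', {
  dtype: 'fp16',
  device: 'webgpu',
});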

Below is the code snippet I used:

import { pipeline } from '@huggingface/transformers';

const model_id = "onnx-community/Qwen2.5-0.5B-Instruct";
const generator = await pipeline("text-generation", model_id, { dtype: "fp16", device: "webgpu" });

const text = 'def hello_world():';
const output = await generator(text, {
  max_new_tokens: 45,
  temperature: 0.5,
  top_k: 0.5,
  do_sample: false,
});
console.debug(output[0].generated_text);
  • onnx-community/Qwen2.5-Coder-0.5B-Instruct works correctly and I get an output.
  • custom/Qwen2.5-Coder-0.5B-Instruct throws an exception from await generator with only the message 2374903832 :-/ (see the try/catch sketch just below)
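
A minimal sketch of how that raw error surfaces, assuming the same generator and prompt as above (not output from an actual run):

try {
  const output = await generator(text, { max_new_tokens: 45 });
  console.debug(output[0].generated_text);
} catch (err) {
  // With the custom model this only logs a pointer-like number such as 2374903832.
  console.error('generation failed:', err);
}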

I also tried an alternative code snippet:

import { AutoTokenizer, AutoModelForCausalLM } from '@huggingface/transformers';

const model_id = "onnx-community/Qwen2.5-0.5B-Instruct";
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModelForCausalLM.from_pretrained(model_id, { dtype: 'fp16', device: 'webgpu' });

// Tokenize the prompt and generate token ids (this step was missing from the
// snippet as pasted, but is implied by the use of `outputs` below).
const inputs = tokenizer('def hello_world():');
const outputs = await model.generate({ ...inputs, max_new_tokens: 45 });

const generated_text = tokenizer.decode(outputs[0], { skip_special_tokens: true });
console.debug(generated_text);
  • Again onnx-community/Qwen2.5-0.5B-Instruct works correctly and I get an output.
  • But using custom/Qwen2.5-Coder-0.5B-Instruct gives the runtime error below :-/ (an isolation sketch follows after the trace)
Uncaught (in promise) RuntimeError: Aborted(). Build with -sASSERTIONS for more info.
    at L (ort-wasm-simd-threaded.jsep.mjs:22:101)
    at tb (ort-wasm-simd-threaded.jsep.mjs:60:98)
    at ort-wasm-simd-threaded.jsep.wasm:0x1365fa
    at ort-wasm-simd-threaded.jsep.wasm:0x404b9
    at ort-wasm-simd-threaded.jsep.wasm:0x75f9b
    at ort-wasm-simd-threaded.jsep.wasm:0xd24b
    at ort-wasm-simd-threaded.jsep.wasm:0x1bdf9
    at ort-wasm-simd-threaded.jsep.wasm:0x9d8cc
    at ort-wasm-simd-threaded.jsep.wasm:0xed51eb
    at ort-wasm-simd-threaded.jsep.wasm:0x114f19
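
As an isolation step (not something from the failing run above), the same custom model could be tried on the wasm backend with a quantized dtype to see whether the abort is specific to fp16 on WebGPU; the dtype and device values here are assumptions:

import { pipeline } from '@huggingface/transformers';

// Assumption: --quantize produced a q8 variant alongside the fp16 one.
const generator = await pipeline('text-generation', 'custom/Qwen2.5-Coder-0.5B-Instruct', {
  dtype: 'q8',
  device: 'wasm',
});
const output = await generator('def hello_world():', { max_new_tokens: 45 });
console.debug(output[0].generated_text);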

Expected behavior

  • ONNX models generated using scripts/convert.py should work without errors.
