System Info
Test environment:

| Component | Version |
|---|---|
| transformers.js | ^3.7.2 |
| Chrome browser | Latest |
Environment used to convert the HF model to ONNX:

| Component | Version |
|---|---|
| transformers.js | main |
| scripts/convert.py | |
| requirements.txt | |
Environment/Platform
- [x] Website/web-app
- [ ] Browser extension
- [ ] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Desktop app (e.g., Electron)
- [ ] Other (e.g., VSCode extension)
Description
I am trying to run the webgpu-chat example from the main branch to generate text output in the browser. When I use the `onnx-community/Qwen2.5-0.5B-Instruct` ONNX model, it works and I get an output from the model. However, when I use `scripts/convert.py` to convert HF Qwen2.5-Coder-0.5B-Instruct to ONNX and load the resulting custom model instead, an exception is thrown and no output is generated.
Reproduction
I cloned the HF model locally into a new folder `custom` and successfully generated the ONNX model using the command below:

```bash
python -m scripts.convert --model-id custom/Qwen2.5-Coder-0.5B-Instruct --quantize --task text-generation
```
Below is the code snippet I used:

```js
import { pipeline } from '@huggingface/transformers';

const model_id = "onnx-community/Qwen2.5-0.5B-Instruct";
const generator = await pipeline("text-generation", model_id, { dtype: "fp16", device: "webgpu" });

const text = 'def hello_world():';
const output = await generator(text, {
  max_new_tokens: 45,
  temperature: 0.5,
  top_k: 0.5,
  do_sample: false,
});
console.debug(output[0].generated_text);
```
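For the failing case, `model_id` points at the locally converted folder instead. As a minimal sketch of how such a local model can be resolved, assuming the output of `scripts/convert.py` is served from a local models directory (the path below is illustrative; `env.localModelPath` and `env.allowRemoteModels` are standard transformers.js settings):

```js
import { env, pipeline } from '@huggingface/transformers';

// Resolve "custom/..." model ids against a local directory instead of the HF Hub.
// The path is an assumption; point it at the folder produced by scripts/convert.py.
env.localModelPath = './models/'; // contains custom/Qwen2.5-Coder-0.5B-Instruct/
env.allowRemoteModels = false;    // fail fast instead of falling back to the Hub

const generator = await pipeline('text-generation', 'custom/Qwen2.5-Coder-0.5B-Instruct', {
  dtype: 'fp16',
  device: 'webgpu',
});
```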
- `onnx-community/Qwen2.5-Coder-0.5B-Instruct` works correctly and I get an output.
- `custom/Qwen2.5-Coder-0.5B-Instruct` throws an exception from `await generator` :-/ (a sketch for capturing the exact message follows below)
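To surface the exception text rather than an uncaught promise rejection, the generator call can be wrapped as follows (a minimal sketch reusing `generator` and `text` from the snippet above):

```js
try {
  const output = await generator(text, { max_new_tokens: 45 });
  console.debug(output[0].generated_text);
} catch (err) {
  // Log the underlying ORT/WebGPU error instead of letting the rejection go uncaught.
  console.error('Generation failed:', err);
}
```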
I also tried an alternative code snippet:

```js
import { AutoTokenizer, AutoModelForCausalLM } from '@huggingface/transformers';

const model_id = "onnx-community/Qwen2.5-0.5B-Instruct";
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModelForCausalLM.from_pretrained(model_id, {
  dtype: 'fp16',
  device: 'webgpu',
});

// Tokenize the prompt and generate the continuation.
const text = 'def hello_world():';
const inputs = tokenizer(text);
const outputs = await model.generate({ ...inputs, max_new_tokens: 45 });

const generated_text = tokenizer.decode(outputs[0], { skip_special_tokens: true });
console.debug(generated_text);
```
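One way to narrow the failure down is to vary the dtype, since the custom model was exported with --quantize and only the fp16 WebGPU path has been exercised so far (a diagnostic sketch; the dtype values used are standard transformers.js options):

```js
// Try other dtypes to see whether only fp16 aborts.
for (const dtype of ['fp32', 'q8', 'q4']) {
  try {
    const m = await AutoModelForCausalLM.from_pretrained(model_id, { dtype, device: 'webgpu' });
    console.debug(`dtype ${dtype}: loaded OK`);
    await m.dispose(); // free the session before trying the next dtype
  } catch (err) {
    console.error(`dtype ${dtype}: failed`, err);
  }
}
```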
- Again, `onnx-community/Qwen2.5-0.5B-Instruct` works correctly and I get an output.
- But using `custom/Qwen2.5-0.5B-Instruct` gives the runtime error below :-/
```
Uncaught (in promise) RuntimeError: Aborted(). Build with -sASSERTIONS for more info.
    at L (ort-wasm-simd-threaded.jsep.mjs:22:101)
    at tb (ort-wasm-simd-threaded.jsep.mjs:60:98)
    at ort-wasm-simd-threaded.jsep.wasm:0x1365fa
    at ort-wasm-simd-threaded.jsep.wasm:0x404b9
    at ort-wasm-simd-threaded.jsep.wasm:0x75f9b
    at ort-wasm-simd-threaded.jsep.wasm:0xd24b
    at ort-wasm-simd-threaded.jsep.wasm:0x1bdf9
    at ort-wasm-simd-threaded.jsep.wasm:0x9d8cc
    at ort-wasm-simd-threaded.jsep.wasm:0xed51eb
    at ort-wasm-simd-threaded.jsep.wasm:0x114f19
```
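The Aborted() message itself carries no detail. Verbose ONNX Runtime Web logging can be enabled before loading the model to get more context (a sketch; `env.backends.onnx` is the onnxruntime-web environment object that transformers.js exposes):

```js
import { env } from '@huggingface/transformers';

// Turn on verbose onnxruntime-web logging to get more than "Aborted()".
env.backends.onnx.logLevel = 'verbose';
env.backends.onnx.debug = true;
```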
Expected behavior

- ONNX models generated using `scripts/convert.py` should work without errors.