Cannot run unsloth finetuned Gemma-3-4b-it model on webgpu #1438

@azizbekabdul

System Info

transformers.js: 3.7.5

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description
We fine-tuned unsloth/gemma-3-4b-it and merged LoRA adapters into the base model in bf16 to prepare for ONNX export.
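For context, a minimal sketch of the merge step (assuming a peft LoRA checkpoint; the adapter path and model class are illustrative, and our actual script may differ):

```python
# Minimal sketch of the bf16 LoRA merge (illustrative, not our exact script).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "unsloth/gemma-3-4b-it"
ADAPTER = "path/to/lora-adapter"   # hypothetical path
OUT = "Gemma3-4B-SFT-merged-bf16"

# Load the base model in bf16 so the merged weights stay bf16.
# Note: the 4B Gemma-3 checkpoints are multimodal, so AutoModelForCausalLM
# may need to be swapped for the conditional-generation class.
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)

model = PeftModel.from_pretrained(base, ADAPTER)
model = model.merge_and_unload()   # fold the LoRA deltas into the base weights
model.save_pretrained(OUT)

AutoTokenizer.from_pretrained(BASE).save_pretrained(OUT)
```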

What we tried

  • Transformers.js conversion script:

    python -m scripts.convert --model_id azizbekabdullaev/gemma-3-4b-it --quantize

    Result: conversion fails during ONNX export for Gemma-3.

  • ONNX Runtime GenAI model builder:

    python -m onnxruntime_genai.models.builder -i "C:\models\Gemma3-4B-SFT-merged-bf16" -o "C:\models\gemma3-4B-webgpu" -e webgpu -p int4

    Result: this produced a WebGPU Q4 package, but it does not run in Transformers.js (see the note below this list).
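Our assumption (based on the files the builder writes, e.g. genai_config.json) is that onnxruntime_genai.models.builder targets the onnxruntime-genai runtime, while Transformers.js loads the optimum-style export layout (config.json, tokenizer.json, and an onnx/ folder), so the two package formats are not interchangeable.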

Expected
A path to convert Gemma-3-4B-IT to a Transformers.js-compatible ONNX package for WebGPU text-generation.
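For reference, the layout we would expect such a package to have, inferred from converted models on the Hub (the file names here are our assumption, not output we have produced):

```
gemma-3-4b-it/
├── config.json
├── generation_config.json
├── tokenizer.json
├── tokenizer_config.json
└── onnx/
    ├── model.onnx       # fp32 weights
    └── model_q4.onnx    # picked up when dtype: "q4" is requested
```

As far as we can tell, dtype: "q4" in the pipeline options makes Transformers.js look for onnx/model_q4.onnx.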

Reproduction

```bash
# Try the official conversion script
python -m scripts.convert --model_id azizbekabdullaev/gemma-3-4b-it --quantize
```

```js
// Then attempt to load the result in JS
import { pipeline } from "@huggingface/transformers";
const pipe = await pipeline("text-generation", "./models/gemma-3-4b-it", {
  device: "webgpu",
  dtype: "q4",
});
```
