[ROCm] gemm_a16w16 upstreaming #26969

maleksan85 · 2025-10-16T01:03:23Z

GPT OSS, m and n to check: ROCm@bcc4e69

HIP_VISIBLE_DEVICES=7 \
HSA_NO_SCRATCH_RECLAIM=1 \
NCCL_MIN_NCHANNELS=112 \
USE_FASTSAFETENSOR=1 \
SAFETENSORS_FAST_GPU=1 \
VLLM_DISABLE_COMPILE_CACHE=1 \
VLLM_ROCM_USE_AITER=1 \
VLLM_USE_AITER_UNIFIED_ATTENTION=1 \
VLLM_ROCM_USE_AITER_MHA=0 \
VLLM_USE_AITER_TRITON_GEMM=0 \
TRITON_HIP_PRESHUFFLE_SCALES=1 \
vllm serve /data/models/openai/gpt-oss-120b \
    --host localhost \
    --port 30000 \
    --tensor-parallel-size 1 \
    --max-num-batched-tokens 8192 \
    --max-num-seqs 32 \
    --gpu-memory-utilization 0.9 \
    --max-model-len 2048 \
    --swap-space 16 \
    --block-size 64 \
    --async-scheduling \
    --no-enable-prefix-caching \
    --disable-log-requests \
    --compilation-config='{"pass_config":{"enable_attn_fusion":true,"enable_noop":true,"enable_fusion":true},"cudagraph_mode":"FULL","custom_ops":["+rms_norm","+silu_and_mul","+quant_fp8"],"splitting_ops":[]}'
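The inline JSON passed to `--compilation-config` is hard to read and easy to mis-quote. A minimal sketch of one way to generate it from a Python dict follows. The helper name `compilation_config_flag` is hypothetical and not part of vLLM; the keys and values are copied from the command above.

```python
import json
import shlex

# Same settings as the --compilation-config flag in the serve command above.
compilation_config = {
    "pass_config": {
        "enable_attn_fusion": True,
        "enable_noop": True,
        "enable_fusion": True,
    },
    "cudagraph_mode": "FULL",
    "custom_ops": ["+rms_norm", "+silu_and_mul", "+quant_fp8"],
    "splitting_ops": [],
}


def compilation_config_flag(config: dict) -> str:
    """Hypothetical helper: compact JSON, quoted for a POSIX shell."""
    compact = json.dumps(config, separators=(",", ":"))
    return "--compilation-config=" + shlex.quote(compact)


# Prints the flag in a form that can be pasted into the serve command.
print(compilation_config_flag(compilation_config))
```

Building the flag this way keeps the booleans and lists valid JSON, and leaves the shell quoting to `shlex.quote`.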

vllm bench serve \
  --host localhost \
  --port 30000 \
  --model /data/models/openai/gpt-oss-120b \
  --dataset-name random \
  --random-input-len 1024 \
  --random-output-len 1024 \
  --random-prefix-len 0 \
  --request-rate "inf" \
  --max-concurrency 32 \
  --num-prompts 640 \
  --ignore-eos \
  --percentile-metrics ttft,tpot,itl,e2el
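The benchmark should only start once the server has finished loading the model. As a rough sketch, assuming the server exposes the `/health` endpoint of the vLLM OpenAI-compatible server on the host and port used above, a small poll loop can gate the benchmark run. The function name `wait_for_server` and its default timeouts are illustrative assumptions, not part of this PR.

```python
import http.client
import time
import urllib.request


def wait_for_server(url: str, timeout_s: float = 600.0, poll_s: float = 5.0) -> bool:
    """Poll `url` until it answers HTTP 200 or `timeout_s` elapses.

    Returns True once the server responds with 200, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=poll_s) as resp:
                if resp.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timeout, or non-2xx status: not ready yet.
            pass
        time.sleep(poll_s)
    return False


# Example usage (assumed host and port taken from the serve command):
#   if wait_for_server("http://localhost:30000/health"):
#       ...launch the benchmark command...
```

Using `time.monotonic()` keeps the deadline unaffected by wall-clock adjustments during a long model load.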

Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>

gemm_a16w16 upstreaming

eef1b16

Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>

mergify bot added the rocm Related to AMD ROCm label Oct 16, 2025

triton fp16 kernel

5538c0f

Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>

mergify bot added the gpt-oss Related to GPT-OSS models label Oct 17, 2025

github-project-automation bot added this to gpt-oss Issues & Enhancements Oct 17, 2025

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Oct 17, 2025

triton fp16 kernel

1350384

Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>
