
GPT-QModel v5.2.0


Released by @Qubitium on 02 Nov 17:14 · commit baf9674

Notable Changes:

  • Model support for MiniMax M2, Granite Nano, Qwen3-VL, and Brumby
  • AWQ quantization is now out of beta and fully integrated into the quantization life cycle (see the usage sketch after this list)
  • New VramStrategy.Balanced property to spread MoE modules across multiple GPUs
  • New pure-Torch AWQ kernel
  • New calibration_concat_separator property
  • Fixed a Hugging Face bug that prevented MTP layers from being saved for GLM 4.5/4.6 (Air) models
  • Fixed multi-GPU CUDA asserts caused by stream synchronization issues
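
Since several of the items above are new API surface, here is a minimal usage sketch. The model id, calibration dataset, the placement of `quant_method`, `calibration_concat_separator`, and `vram_strategy`, and the `VramStrategy` import path are assumptions for illustration only, not the confirmed v5.2.0 API.

```python
# Minimal sketch of the new v5.2.0 options; exact field names, enum values,
# and where each option lives (QuantizeConfig vs. quantize()) are assumptions.
from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig
from gptqmodel import VramStrategy  # assumption: import location may differ

model_id = "Qwen/Qwen3-30B-A3B"          # example MoE model id
quant_path = "Qwen3-30B-A3B-awq-4bit"

# Small text calibration set pulled from C4.
calibration = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train",
).select(range(1024))["text"]

quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
    # Assumption: AWQ is selected via the config's quantization-method field
    # now that it is out of beta; the exact field/value may differ.
    quant_method="awq",
    # New in v5.2.0: separator string inserted between calibration rows when
    # they are concatenated into longer sequences.
    calibration_concat_separator="\n\n",
)

model = GPTQModel.load(model_id, quant_config)

model.quantize(
    calibration,
    batch_size=1,
    # Assumption: the new balanced VRAM strategy is passed at quantize time
    # to spread MoE expert modules across all visible GPUs.
    vram_strategy=VramStrategy.Balanced,
)

model.save(quant_path)
```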

What's Changed

New Contributors

Full Changelog: v5.0.0...v5.2.0