
[Feature]: Add support for Guided Decoding #177

@shen-shanshan

Description


Overview

In our roadmap, we plan to support guided decoding in 2025 Q1 as shown here (#71).

Currently:

I have tested vllm/examples/offline_inference/structured_outputs.py directly on an NPU device, and the results show that guided decoding works natively on NPU with the outlines backend.

In addition, I have analysed the vLLM code and found that the tensors involved in guide logits computation are all on the npu device, which also demonstrates that guided decoding is natively supported on NPU.
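To make the "guide logits computation" point above concrete, here is a toy, pure-Python sketch (not vLLM's actual code) of the core idea behind backends like outlines: before sampling each token, the logits of tokens the grammar disallows are masked to negative infinity. In vLLM this mask is applied to the logits tensor on the accelerator itself, which is why having those tensors on the npu device implies the feature works natively. The vocabulary and logit values below are made up for illustration.

```python
import math

# Hypothetical character-level "vocabulary" for illustration only.
VOCAB = ['{', '}', '"', 'a', 'b', ':', ',', ' ']

def apply_guide_mask(logits, allowed):
    """Set the logits of tokens outside the allowed set to -inf,
    so they can never be sampled."""
    return [l if tok in allowed else -math.inf
            for tok, l in zip(VOCAB, logits)]

def greedy_pick(logits):
    """Pick the highest-logit token (greedy decoding)."""
    return VOCAB[max(range(len(logits)), key=lambda i: logits[i])]

# Suppose the JSON grammar says the next character must open an object.
logits = [0.1, 2.0, 0.5, 1.5, 0.3, 0.2, 0.0, 0.4]
masked = apply_guide_mask(logits, allowed={'{'})
print(greedy_pick(masked))  # '{' — the grammar forces the opening brace
```

Even though '}' has the highest raw logit here, masking guarantees the output stays inside the grammar; real backends do the same with token-level masks computed from a compiled FSM or JSON schema.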

However, some problems still need to be fixed, such as incomplete JSON output and slow inference.
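For anyone reproducing the incomplete-JSON problem mentioned above, a quick way to detect it is to try parsing the generated text with the standard library; a truncated generation fails to parse. This is a generic diagnostic sketch, not part of vllm-ascend:

```python
import json

def is_complete_json(text: str) -> bool:
    """Return True if text parses as a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# A well-formed output versus a generation cut off mid-object:
print(is_complete_json('{"name": "Alice", "age": 30}'))  # True
print(is_complete_json('{"name": "Alice", "age":'))      # False
```

A check like this makes it easy to quantify how often guided decoding on NPU produces truncated JSON, e.g. by running it over a batch of outputs.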

Feel free to report any issues you encounter when using guided decoding with vllm-ascend, and we will try to fix them.

Usage

Coming soon ...

Roadmap

Community news

Adaptation for vllm-ascend
