ValueError: This model's maximum context length is 4096 tokens. However, you requested 18546 tokens in the messages, Please reduce the length of the messages #538

Description

@zlluye

This is the error I get from vLLM. Why does a single question require so many tokens at inference time? And the larger the document, the larger each question's token count becomes. Is there a setting to control this?

[Screenshot: vLLM error traceback]
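
The error means the prompt sent to vLLM (18546 tokens) exceeds the model's 4096-token context window. In a retrieval-augmented setup, the retrieved document chunks are typically concatenated into the prompt together with the question, which is why prompt size grows with document size. A minimal sketch of one way to raise the limit via vLLM's offline `LLM` API, assuming the underlying model actually supports a longer context (the model path is a placeholder):

```python
from vllm import LLM, SamplingParams

# max_model_len controls the context window vLLM reserves for
# prompt + completion tokens. Raising it only helps if the model
# itself supports the longer context; otherwise requests past
# 4096 tokens will still fail.
llm = LLM(model="your-model-path", max_model_len=16384)  # placeholder path

sampling = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(["What does this document say about X?"], sampling)
print(outputs[0].outputs[0].text)
```

For the OpenAI-compatible server, the equivalent setting is the `--max-model-len` flag. If the base model's context really is capped at 4096 tokens, the alternative is to reduce the prompt itself, for example by retrieving fewer or smaller document chunks so the assembled prompt fits.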
