[Feature] The length of vectorized chunks supports customization #3776

@devnotperfect

Description

MaxKB Version

v2.0.1

Please describe your needs or suggestions for improvements

Currently, the default vectorization truncates each chunk at the last complete sentence before 256 characters and then vectorizes it. When tested with embedding models that support long token inputs, such as bge-m3 and qwen-embedding, retrieval recall is unsatisfactory. Could the truncation length limit be made customizable when vectorizing a knowledge base or a document? A rough sketch of the idea follows.
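
As an illustration only (the function and parameter names below are hypothetical and not MaxKB's actual API), a sentence-boundary chunker could expose the current 256-character limit as a parameter so that long-context embedding models can use larger chunks:

```python
import re
from typing import List

# Split on the end of a sentence (Chinese or English punctuation),
# keeping the punctuation attached to the sentence.
SENTENCE_END = re.compile(r'(?<=[。!?.!?])')


def chunk_by_sentence(text: str, max_chunk_chars: int = 256) -> List[str]:
    """Group whole sentences into chunks of at most `max_chunk_chars`.

    `max_chunk_chars=256` mirrors the current default described in this
    issue; the request is to let users raise it per knowledge base or
    per document.
    """
    sentences = [s for s in SENTENCE_END.split(text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is emitted on its own.
        if len(sentence) > max_chunk_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence)
            continue
        if len(current) + len(sentence) > max_chunk_chars:
            chunks.append(current)
            current = sentence
        else:
            current += sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "第一句话。第二句话比较长一些。Third sentence in English. Fourth one."
    # With a long-context model such as bge-m3 or qwen-embedding, a larger
    # limit (for example 1024 or 2048 characters) could be passed instead.
    for chunk in chunk_by_sentence(sample, max_chunk_chars=1024):
        print(repr(chunk))
```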

Please describe the solution you suggest

No response

Additional Information

No response
