feat(server): Add vector index #2856
…fields to the index label.
            "The user_data(dimension and metric) of vector index " +
            "label '%s' " +
            "can't be null", this.name);
}
code format
ok, I will fix it
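The null check quoted above can be sketched in isolation as follows. This is a minimal sketch, not the PR's code: the class name `VectorUserdataCheck`, the method `checkVectorUserdata`, and the use of a plain `Map` with the keys `"dimension"` and `"metric"` are assumptions made for illustration; only the error message comes from the quoted diff.

```java
import java.util.Map;

// Hypothetical helper; names and the Map-based user_data are illustrative only.
final class VectorUserdataCheck {

    // Rejects a vector index label whose user_data lacks "dimension" or "metric".
    static void checkVectorUserdata(String labelName,
                                    Map<String, Object> userdata) {
        boolean missing = userdata == null ||
                          userdata.get("dimension") == null ||
                          userdata.get("metric") == null;
        if (missing) {
            throw new IllegalArgumentException(String.format(
                      "The user_data(dimension and metric) of vector index " +
                      "label '%s' can't be null", labelName));
        }
    }
}
```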
            (cardinality == Cardinality.LIST),
            "vector index can only build on Float List, " +
            "but got %s(%s)", dataType, cardinality);
}
code format
ok, I will fix it
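For reference, the rule enforced by the check above (a vector index may only be built on a Float property with LIST cardinality) might look like this when written out on its own. The enums `DataType` and `Cardinality` here are simplified stand-ins rather than HugeGraph's real types, and `VectorFieldCheck` is a hypothetical name.

```java
// Hypothetical, self-contained sketch of the "Float List" rule.
final class VectorFieldCheck {

    // Simplified stand-ins for the schema enums used in the PR.
    enum DataType { INT, FLOAT, TEXT }
    enum Cardinality { SINGLE, LIST, SET }

    // Throws unless the indexed property is a list of floats.
    static void checkVectorField(DataType dataType, Cardinality cardinality) {
        boolean floatList = dataType == DataType.FLOAT &&
                            cardinality == Cardinality.LIST;
        if (!floatList) {
            throw new IllegalArgumentException(String.format(
                      "vector index can only build on Float List, " +
                      "but got %s(%s)", dataType, cardinality));
        }
    }
}
```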
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #2856      +/-   ##
============================================
- Coverage     40.91%   33.67%    -7.24%
+ Complexity      333      264      -69
============================================
  Files           747      749       +2
  Lines         60168    60436     +268
  Branches       7683     7729      +46
============================================
- Hits          24615    20351    -4264
- Misses        32975    37778    +4803
+ Partials       2578     2307     -271
@hahahahbenny please run the script to fix this check. You could also update the licenses for the newly added dependencies at the same time.
Pull Request Overview
This PR adds vector index support to Apache HugeGraph, enabling the storage and querying of vector data for similarity search operations. The implementation introduces a new vector backend module using the JVector library and integrates vector index functionality into the existing schema and API layers.
- Adds a new VECTOR index type to support vector similarity search
- Implements vector backend store using JVector for vector index operations
- Extends REST API with vector index creation and ANN search endpoints
Reviewed Changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| pom.xml | Adds JVector dependency for vector operations |
| hugegraph-server/pom.xml | Includes new vector module |
| hugegraph-vector/*.java | New vector backend implementation with JVector integration |
| hugegraph-core/src/main/java/org/apache/hugegraph/type/define/IndexType.java | Adds VECTOR enum value |
| hugegraph-core/src/main/java/org/apache/hugegraph/type/HugeType.java | Adds VECTOR_INDEX type |
| hugegraph-core/src/main/java/org/apache/hugegraph/schema/builder/IndexLabelBuilder.java | Adds vector index validation |
| hugegraph-core/src/main/java/org/apache/hugegraph/schema/Userdata.java | Adds validation for vector index metadata |
| hugegraph-core/src/main/java/org/apache/hugegraph/backend/tx/GraphIndexTransaction.java | Handles vector index transactions |
| hugegraph-core/src/main/java/org/apache/hugegraph/backend/store/BackendFeatures.java | Adds vector index feature flag |
| hugegraph-core/src/main/java/org/apache/hugegraph/backend/serializer/*.java | New serializers for vector data |
| hugegraph-api/src/main/java/org/apache/hugegraph/api/schema/IndexLabelAPI.java | REST API for vector index creation |
| hugegraph-api/src/main/java/org/apache/hugegraph/api/graph/VertexAPI.java | REST API for ANN search |
private VectorSerializer getVectorSerializer() {
    // Create VectorSerializer using the same pattern as the main serializer
    HugeConfig config = this.params().configuration();
    return new VectorSerializer(config);
}
Copilot AI, Sep 25, 2025
Creating a new VectorSerializer instance on every call is inefficient. Consider caching the VectorSerializer instance as a class member to avoid repeated object creation.
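One way to follow this suggestion is lazy initialization of a cached member. The sketch below is an assumption about how that could look, not the PR's eventual fix: `Config` and `VectorSerializer` are empty stand-ins for `HugeConfig` and the PR's serializer, and `SerializerCacheSketch` is a hypothetical owner class.

```java
// Hypothetical sketch of caching the serializer instead of re-creating it.
final class SerializerCacheSketch {

    // Stand-ins for HugeConfig and the PR's VectorSerializer.
    static final class Config { }

    static final class VectorSerializer {
        final Config config;

        VectorSerializer(Config config) {
            this.config = config;
        }
    }

    private final Config config;
    // Created on first use, then reused by every later call.
    private VectorSerializer vectorSerializer;

    SerializerCacheSketch(Config config) {
        this.config = config;
    }

    // synchronized keeps the lazy initialization safe across threads.
    synchronized VectorSerializer getVectorSerializer() {
        if (this.vectorSerializer == null) {
            this.vectorSerializer = new VectorSerializer(this.config);
        }
        return this.vectorSerializer;
    }
}
```

If the configuration is already available when the owner is constructed, a `final` field initialized in the constructor would be simpler and would not need synchronization.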
}

// ANN search request class
private static class AnnSearchRequest {
Copilot AI, Sep 25, 2025
The AnnSearchRequest class should be public or moved to a separate public class to allow proper API documentation generation and client code usage.
Suggested change:
- private static class AnnSearchRequest {
+ public static class AnnSearchRequest {
public VectorSerializer() {
    super();
}

Copilot AI, Sep 25, 2025
The no-argument constructor calls super() unnecessarily since it's implicitly called. Consider removing this constructor if it's not needed or documenting its purpose.
Suggested change:
- public VectorSerializer() {
-     super();
- }
        }
    }
    this.tables.clear();
    return false;
Copilot AI, Sep 25, 2025
The method always returns false despite successfully closing tables. Consider returning true on successful closure or changing the return type to void if the boolean return value is not needed.
Suggested change:
- return false;
+ return true;

@Override
protected boolean opened() {
    return false;
Copilot AI, Sep 25, 2025
The opened() method always returns false, which contradicts the opened field being set to true in the Session.open() method. This should return the actual state of the session.
Suggested change:
- return false;
+ return !this.closed;
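The point of this comment is that `opened()` should report the state that `open()` and `close()` actually maintain. A minimal sketch of that idea follows; it uses a single `opened` flag rather than the `closed` field from the suggestion, and the class `SessionStateSketch` is hypothetical, not the PR's session class.

```java
// Hypothetical session whose opened() reflects its real state.
final class SessionStateSketch {

    // Single source of truth for the session state.
    private boolean opened = false;

    void open() {
        this.opened = true;
    }

    void close() {
        this.opened = false;
    }

    // Returns the tracked state instead of a hard-coded false.
    boolean opened() {
        return this.opened;
    }
}
```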
return String.format("AnnSearchRequest{vertex_label=%s, properties=%s, user_vector=%s, metric=%s, dimension=%s}",
                     vertex_label, properties, Arrays.toString(user_vector), metric, dimension);
Copilot AI, Sep 25, 2025
Missing import for Arrays class. Add 'import java.util.Arrays;' to the imports section.
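A compilable version of such a `toString()` with the import in place could look like the sketch below. The field names come from the quoted format string; the field types are assumptions (the PR's types are not shown here), and the class is renamed `AnnSearchRequestSketch` to make clear it is not the PR's class.

```java
import java.util.Arrays;

// Hypothetical request holder; field types are assumed for illustration.
final class AnnSearchRequestSketch {

    String vertex_label;
    String properties;
    float[] user_vector;
    String metric;
    int dimension;

    @Override
    public String toString() {
        // Arrays.toString renders the array contents rather than its identity.
        return String.format("AnnSearchRequest{vertex_label=%s, " +
                             "properties=%s, user_vector=%s, metric=%s, " +
                             "dimension=%s}",
                             vertex_label, properties,
                             Arrays.toString(user_vector), metric, dimension);
    }
}
```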
}

// Vector index must build on float list
if(this.indexType.isVector()){
Copilot AI, Sep 25, 2025
Missing space after 'if' keyword. Should be 'if (this.indexType.isVector()) {' to follow Java coding conventions.
Suggested change:
- if(this.indexType.isVector()){
+ if (this.indexType.isVector()) {
// 基础字段(实现BackendEntry接口)
private final HugeType type;       // VECTOR_INDEX
private final Id id;               // 索引ID
private final Id subId;            // 顶点ID

// 向量核心字段
private final String vectorId;     // 向量唯一标识
private final float[] vector;      // 向量数据
private final String metricType;   // 度量类型 (L2, COSINE, DOT)
private final Integer dimension;   // 向量维度
Copilot AI, Sep 25, 2025
Comments should be in English to maintain consistency with the rest of the codebase. Consider translating these Chinese comments to English.
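For readers who do not read Chinese, the same field declarations with the comments translated into English might read as follows. `HugeType` and `Id` are reduced to empty stand-ins so the sketch compiles on its own, and the wrapper class, constructor, and accessors are hypothetical additions that do not appear in the quoted diff.

```java
// Hypothetical wrapper around the quoted fields, with English comments.
final class VectorEntryFieldsSketch {

    // Stand-ins for HugeGraph's HugeType and Id.
    enum HugeType { VECTOR_INDEX }
    interface Id { }

    // Basic fields (implementing the BackendEntry interface)
    private final HugeType type;       // VECTOR_INDEX
    private final Id id;               // index ID
    private final Id subId;            // vertex ID

    // Core vector fields
    private final String vectorId;     // unique identifier of the vector
    private final float[] vector;      // vector data
    private final String metricType;   // metric type (L2, COSINE, DOT)
    private final Integer dimension;   // vector dimension

    VectorEntryFieldsSketch(HugeType type, Id id, Id subId, String vectorId,
                            float[] vector, String metricType,
                            Integer dimension) {
        this.type = type;
        this.id = id;
        this.subId = subId;
        this.vectorId = vectorId;
        this.vector = vector;
        this.metricType = metricType;
        this.dimension = dimension;
    }

    String metricType() {
        return this.metricType;
    }

    Integer dimension() {
        return this.dimension;
    }
}
```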
ok, I have run this script
e1183dd to 40ac0a9
imbajin left a comment
Merge the PR first; it needs to be enhanced in future work and then merged into master.
# This is the 1st commit message:
add Licensed to files
# This is the commit message #2:
feat(server): support vector index in graphdb (apache#2856)
* feat(server): Add the vector index type and the detection of related fields to the index label.
* fix code format
* add annsearch API
* add doc to explain the plan delete redundency in vertexapi
Purpose of the PR
Main Changes
The following API (index label creation)
POST http://localhost:8080/graphs/hugegraph/schema/indexlabels
now supports this request JSON body:
Attention
Verifying these changes
Does this PR potentially affect the following parts?
Documentation Status
- Doc - TODO
- Doc - Done
- Doc - No Need