Semantic Text Embedding in Elasticsearch 9.0 #127977

Closed
@andrelrmarques

Elasticsearch Version

9.0.1

Installed Plugins

No response

Java Version

bundled

OS Version

N/A - Hosted Deployment

Problem Description

When an index is created with a field of type semantic_text, the text is not automatically chunked and embeddings are not generated during document indexing.
The same setup works in 8.17.3.

Steps to Reproduce

In Kibana > Dev Tools:

Create an index, using semantic_text for the field in question:

PUT /my-index2
{
  "mappings": {
    "properties": {
      "text": {
        "type": "semantic_text",
        "inference_id": ".multilingual-e5-small-elasticsearch"
      }
    }
  }
}

Next, index a document, as in this example:

POST /my-index2/_doc
{
  "text": "This is a long paragraph that we expected to be chunked automatically for embedding..."
}

Then, look at the results:

GET /my-index2/_search
{
  "query": {
    "match_all": {}
  }
}

In the search response, the "text" field is not chunked. This was tested with .multilingual-e5-small-elasticsearch and with an Azure OpenAI inference_id. The same steps on an 8.17.3 cluster do create the chunks.
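
To make the "is it chunked?" check less dependent on eyeballing the response, here is a minimal Python sketch that looks for chunk data anywhere in a search hit. `find_chunks`, `SEARCH_BODY` and the two sample hits are hypothetical and were not captured from a real cluster. Two assumptions are baked in: that chunk data, when present, sits under a key named `chunks` (as in the 8.17 `_source` layout), and that 9.x can return inference metadata when `_inference_fields` is requested through the `fields` parameter. Both should be checked against the semantic_text documentation for the version in use.

```python
from typing import Any

# Hypothetical helper (not part of any Elasticsearch client): walks a search
# hit and collects every non-empty value stored under a key named "chunks".
def find_chunks(node: Any) -> list:
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "chunks" and value:
                found.append(value)
            else:
                found.extend(find_chunks(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_chunks(item))
    return found

# Search body that additionally asks for inference metadata.
# Assumption: the cluster exposes it via the "_inference_fields" field.
SEARCH_BODY = {"query": {"match_all": {}}, "fields": ["_inference_fields"]}

# Illustrative hits only; the shapes are assumptions, not real output.
hit_without_chunks = {"_source": {"text": "This is a long paragraph..."}}
hit_with_chunks = {
    "_source": {
        "text": {
            "text": "This is a long paragraph...",
            "inference": {
                "chunks": [
                    {"text": "This is a long paragraph...", "embeddings": [0.1, 0.2]}
                ]
            },
        }
    }
}

print(bool(find_chunks(hit_without_chunks)))
print(bool(find_chunks(hit_with_chunks)))
```

If a hit returned with `SEARCH_BODY` still contains no `chunks` entry, that would point to embeddings genuinely not being generated rather than simply not being shown in `_source`.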

Logs (if relevant)

No response

Assignees

No one assigned

Labels

:SearchOrg/Relevance, >bug, Team:Search - Relevance, Team:SearchOrg
