Closed
Description
Elasticsearch Version
9.0.1
Installed Plugins
No response
Java Version
bundled
OS Version
N/A - Hosted Deployment
Problem Description
When an index is created with a field of type semantic_text, the field's text is not automatically chunked and no embeddings are generated during document indexing.
This works in 8.17.3.
Steps to Reproduce
In Kibana > Dev Tools:
Create an index, using semantic_text for the field in question:
PUT /my-index2
{
  "mappings": {
    "properties": {
      "text": {
        "type": "semantic_text",
        "inference_id": ".multilingual-e5-small-elasticsearch"
      }
    }
  }
}
Next, index a document, as in this example:
POST /my-index2/_doc
{
  "text": "This is a long paragraph that we expected to be chunked automatically for embedding..."
}
Then, look at the results:
GET /my-index2/_search
{
  "query": {
    "match_all": {}
  }
}
The "text" field in the search results is not chunked. This was tested with .multilingual-e5-small-elasticsearch as well as with an Azure OpenAI inference_id. The same steps on an 8.17.3 cluster do create the chunks.
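To make "not chunked" concrete when comparing the two clusters, a small check along these lines could be run over the JSON returned by the search above. It is only a sketch: it assumes that a chunked document exposes its chunks in _source under text.inference.chunks, which is my reading of the 8.17.3 output and not something confirmed here. The function name chunk_counts and both sample responses are hypothetical.

```python
from typing import Any, Dict, List


def chunk_counts(response: Dict[str, Any], field: str = "text") -> List[int]:
    """Return, for each search hit, how many chunks appear under `field`.

    ASSUMPTION: chunked output is stored as
    _source[field]["inference"]["chunks"]. A plain string value (or a
    missing field) is counted as 0 chunks.
    """
    counts = []
    for hit in response.get("hits", {}).get("hits", []):
        value = hit.get("_source", {}).get(field)
        if isinstance(value, dict):
            chunks = value.get("inference", {}).get("chunks", [])
            counts.append(len(chunks))
        else:
            counts.append(0)
    return counts


# Hypothetical response where the field comes back as a plain string.
plain = {"hits": {"hits": [{"_source": {"text": "This is a long paragraph..."}}]}}

# Hypothetical response using the assumed chunked layout.
chunked = {
    "hits": {
        "hits": [
            {
                "_source": {
                    "text": {
                        "text": "This is a long paragraph...",
                        "inference": {
                            "chunks": [
                                {"text": "This is a", "embeddings": [0.1, 0.2]},
                                {"text": "long paragraph...", "embeddings": [0.3, 0.4]},
                            ]
                        },
                    }
                }
            }
        ]
    }
}

print(chunk_counts(plain))
print(chunk_counts(chunked))
```

A count of 0 for every hit corresponds to the behavior described above for 9.0.1. It only says that no chunks are visible in _source under the assumed layout, not that inference was skipped.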
Logs (if relevant)
No response