How to use custom chunking with external API (NLM Ingestor) in OpenWebUI? #475

### Description  
I want to use a **custom chunking method** for document processing in OpenWebUI. Specifically, I have an external service (`nlm-ingestor`) that extracts **semantic chunks** from PDFs very accurately. I want to integrate this service with OpenWebUI's document handling pipeline.  

### Use Case  
OpenWebUI currently processes documents with its built-in chunking, but I need to **override** this with my own chunking logic. At the moment I chunk the input document with `llmsherpa`'s `LayoutPDFReader`, which calls the `nlm-ingestor` API:

```python
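from llmsherpa.readers import LayoutPDFReader

# Endpoint of the locally running nlm-ingestor service
llmsherpa_api_url = "http://localhost:5010/api/parseDocument?renderFormat=all"
pdf_path = "/home/user/sample.pdf"

# Send the PDF to nlm-ingestor and get back a layout-aware document
pdf_reader = LayoutPDFReader(llmsherpa_api_url)
doc = pdf_reader.read_pdf(pdf_path)

# Flatten each semantic chunk into plain text
chunks = [chunk.to_text() for chunk in doc.chunks()]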
```
I want to inject these extracted chunks into OpenWebUI so that they can be indexed and retrieved via its RAG system.
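
To make the request more concrete, here is a rough sketch of the shape I would expect to hand over for indexing. The helper name `build_chunk_records` and the record layout (`text` plus a `metadata` dict) are my own assumptions, not an existing OpenWebUI API. I do not know which hook or endpoint should receive these records, and that is the part I am asking about.

```python
from typing import Dict, List


def build_chunk_records(chunks: List[str], source: str) -> List[Dict]:
    """Wrap pre-chunked text in records that an indexer could consume as-is.

    Hypothetical helper: the record layout is an assumption, not an OpenWebUI API.
    """
    records: List[Dict] = []
    for text in chunks:
        text = text.strip()
        if not text:
            continue  # skip chunks that are empty after trimming whitespace
        records.append(
            {
                "text": text,
                # chunk_index counts only the chunks that were kept
                "metadata": {"source": source, "chunk_index": len(records)},
            }
        )
    return records


# Stand-in for the `chunks` list produced by LayoutPDFReader above.
example_chunks = ["Introduction\nFirst section text", "   ", "Methods\nSecond section text"]
records = build_chunk_records(example_chunks, source="sample.pdf")
```

Ideally OpenWebUI would embed and store each record unchanged, without re-splitting the text with its own splitter.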
