updated agents api to handle text files on artifact retrieval #3091

Merged
tim-inkeep merged 4 commits into main from bugfix/attachments-txt
Apr 10, 2026

Conversation

@tim-inkeep
Contributor

No description provided.

@vercel vercel bot temporarily deployed to Preview – agents-docs April 9, 2026 20:10 Inactive
@changeset-bot

changeset-bot bot commented Apr 9, 2026

🦋 Changeset detected

Latest commit: 55b3948

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
| Name | Type |
| --- | --- |
| @inkeep/agents-api | Patch |
| @inkeep/agents-manage-ui | Patch |
| @inkeep/agents-cli | Patch |
| @inkeep/agents-core | Patch |
| @inkeep/agents-email | Patch |
| @inkeep/agents-mcp | Patch |
| @inkeep/agents-sdk | Patch |
| @inkeep/agents-work-apps | Patch |
| @inkeep/ai-sdk-provider | Patch |
| @inkeep/create-agents | Patch |

@vercel

vercel bot commented Apr 9, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agents-api | Ready | Preview, Comment | Apr 9, 2026 8:24pm |
| agents-docs | Ready | Preview, Comment | Apr 9, 2026 8:24pm |
| agents-manage-ui | Ready | Preview, Comment | Apr 9, 2026 8:24pm |

@github-actions github-actions bot deleted a comment from claude bot Apr 9, 2026
@pullfrog
Contributor

pullfrog bot commented Apr 9, 2026

TL;DR — When the get_reference_artifact tool retrieves a text-based file (e.g. .txt, .md, .csv), the agent now receives decoded text wrapped in an <attached_file> block instead of a raw base64 file content part. This lets models read and reason over text attachments directly rather than receiving opaque binary data.

Key changes

  • Add buildDecodedTextAttachmentBlock helper — combines byte decoding and XML block wrapping into a single reusable function in text-document-attachments.ts, used by both the conversation-history builder and the artifact tool.
  • Hydrate text artifacts as decoded text in get_reference_artifact: default-tools.ts now detects text MIME types on artifact retrieval and returns decoded <attached_file> text parts instead of base64 file parts, with a graceful fallback to the original binary path on decode failure.
  • Simplify buildTextAttachmentPart in conversation history — refactored to defer decoding to a single try/catch at the end using the new helper, eliminating a duplicated decode-and-fallback block.
  • Add tests for text artifact hydration — three new Agent.test.ts cases cover the happy path, the toModelOutput mapping, and the decode-failure fallback; four new unit tests for buildDecodedTextAttachmentBlock cover correctness, composability, and error propagation.

Summary | 6 files | 2 commits | base: main ← bugfix/attachments-txt


Text document artifacts decoded inline instead of sent as base64

Before: get_reference_artifact returned all blob-backed artifacts — including text files — as a { type: 'file', data: '<base64>', mimeType } content part. Models received opaque binary data for .txt, .md, .csv, etc.
After: Text-MIME artifacts are decoded to UTF-8 and wrapped in an <attached_file filename="…" media_type="…"> text block. If decoding fails (invalid UTF-8, control characters), the tool falls back to the original base64 file-data path.

The detection uses the existing isTextDocumentMimeType predicate. A new filename field was added to BlobBackedArtifactData so the original upload filename propagates into the attached-file block.

How does the decode fallback work?

buildDecodedTextAttachmentBlock calls decodeTextDocumentBytes, which throws InvalidUtf8TextDocumentError or TextDocumentControlCharacterError for non-text-safe payloads. The caller catches these and falls through to the existing type: 'file' base64 code path, logging a warning.
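
The composition described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function and error names come from the summary, but the argument shapes, the attribute escaping, the exact control-character rule, and the CRLF normalization are assumptions.

```typescript
// Hypothetical stand-ins for the error types named in the summary above.
class InvalidUtf8TextDocumentError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "InvalidUtf8TextDocumentError";
  }
}

class TextDocumentControlCharacterError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "TextDocumentControlCharacterError";
  }
}

// Assumed behavior: strict UTF-8 decode, reject C0 control characters other
// than tab, LF, and CR, then normalize CRLF line endings to LF.
function decodeTextDocumentBytes(bytes: Uint8Array): string {
  let text: string;
  try {
    text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    throw new InvalidUtf8TextDocumentError("payload is not valid UTF-8");
  }
  if (/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/.test(text)) {
    throw new TextDocumentControlCharacterError("payload contains control characters");
  }
  return text.replace(/\r\n/g, "\n");
}

// Assumed escaping so filenames cannot break out of the attribute value.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Wraps already-decoded text in the <attached_file> block (layout is a guess).
function buildTextAttachmentBlock(args: { filename: string; mediaType: string; text: string }): string {
  const open = `<attached_file filename="${escapeAttribute(args.filename)}" media_type="${escapeAttribute(args.mediaType)}">`;
  return `${open}\n${args.text}\n</attached_file>`;
}

// The combined helper: decode first, then wrap. Decode errors propagate to the caller.
function buildDecodedTextAttachmentBlock(args: { bytes: Uint8Array; filename: string; mediaType: string }): string {
  return buildTextAttachmentBlock({
    filename: args.filename,
    mediaType: args.mediaType,
    text: decodeTextDocumentBytes(args.bytes),
  });
}
```

Letting the helper throw rather than return a sentinel keeps the fallback decision with each caller, which matches the description of the artifact tool catching the error and continuing down the base64 path.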

default-tools.ts · text-document-attachments.ts · conversation-history.ts · Agent.test.ts

@pullfrog
Contributor

pullfrog bot commented Apr 9, 2026

TL;DR — When the get_reference_artifact tool retrieves a text-based file (e.g. .txt, .md), it now decodes the bytes and returns the content as an inline text part wrapped in an <attached_file> block instead of a base64-encoded file part. This lets LLMs read text attachments directly rather than receiving opaque binary data.

Key changes

  • Decode text artifacts inline in get_reference_artifact — text-document MIME types are now detected and decoded to a human-readable <attached_file> block, with a graceful fallback to base64 file delivery if decoding fails.
  • Add buildDecodedTextAttachmentBlock helper — new convenience function in text-document-attachments.ts that composes decodeTextDocumentBytes + buildTextAttachmentBlock into a single call, used by both the artifact tool and conversation history builder.
  • Simplify buildTextAttachmentPart in conversation history — refactored to defer decoding to the shared helper and consolidate error handling into a single try/catch after byte acquisition.
  • Add comprehensive tests for text artifact hydration — three new test cases covering the happy path, the toModelOutput mapping, and the fallback when decode fails; plus four unit tests for buildDecodedTextAttachmentBlock.

Summary | 6 files | 4 commits | base: main ← bugfix/attachments-txt


Text artifact hydration in get_reference_artifact

Before: All blob-backed artifacts — including .txt and .md files — were returned as base64-encoded file content parts, which models cannot read natively.
After: Artifacts with a text-document MIME type are decoded to UTF-8 and returned as text content parts inside an <attached_file> XML wrapper. If decoding fails (invalid UTF-8, control characters), the tool falls back to the original base64 file path.

The detection uses the existing isTextDocumentMimeType predicate. A new filename field on BlobBackedArtifactData is resolved with preference for the stored filename, falling back to the blob URI basename.
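
A minimal sketch of that filename resolution, assuming a `blobUri` field and a hypothetical `resolveFilename` helper (neither name is confirmed by the PR text; only the `filename` field and the fallback order are described):

```typescript
// Assumed shape; only `filename` is described as newly added by this PR.
interface BlobBackedArtifactData {
  blobUri: string; // hypothetical field name for the blob location
  mimeType: string;
  filename?: string; // original upload filename, when it was stored
}

// Prefer the stored filename; otherwise use the last path segment of the blob URI.
function resolveFilename(data: BlobBackedArtifactData): string {
  const basename = data.blobUri.split("/").pop() ?? data.blobUri;
  return data.filename ?? basename;
}
```

With nullish coalescing, only a missing (`undefined` or `null`) filename triggers the fallback; an empty string would be kept as-is, which may or may not be what the real code intends.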

What happens when decode fails? The buildDecodedTextAttachmentBlock call is wrapped in a try/catch. On failure (e.g. invalid UTF-8 or disallowed control characters), a warning is logged and execution falls through to the existing base64 file part construction — so no artifact retrieval is lost.
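
The fall-through can be illustrated with the sketch below. The content-part shapes follow the `{ type: 'file', data, mimeType }` form quoted earlier in this thread; `hydrateArtifact`, the injected `buildBlock` and `warn` parameters, and the stand-in MIME predicate are hypothetical and exist only to keep the example self-contained.

```typescript
type ContentPart =
  | { type: "text"; text: string }
  | { type: "file"; data: string; mimeType: string; filename?: string };

// Stand-in predicate; the real isTextDocumentMimeType may accept a different set.
function isTextDocumentMimeType(mimeType: string): boolean {
  return mimeType.startsWith("text/");
}

function hydrateArtifact(
  artifact: { bytes: Uint8Array; base64: string; mimeType: string; filename: string },
  // Injected so the sketch does not depend on the real helper's signature.
  buildBlock: (args: { bytes: Uint8Array; filename: string; mediaType: string }) => string,
  warn: (message: string) => void = () => {},
): ContentPart {
  if (isTextDocumentMimeType(artifact.mimeType)) {
    try {
      const text = buildBlock({
        bytes: artifact.bytes,
        filename: artifact.filename,
        mediaType: artifact.mimeType,
      });
      return { type: "text", text };
    } catch (err) {
      // Untyped catch: any decode failure takes the same fallback.
      warn(`text decode failed for ${artifact.filename}: ${String(err)}`);
    }
  }
  // Original path: deliver the artifact as a base64 file part.
  return { type: "file", data: artifact.base64, mimeType: artifact.mimeType, filename: artifact.filename };
}
```

Because the catch does not return, a failed decode lands on the same final `return` as a non-text artifact, so retrieval still succeeds with the base64 payload.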

default-tools.ts · text-document-attachments.ts · conversation-history.ts · Agent.test.ts

Contributor

@pullfrog pullfrog bot left a comment

Good fix. Text-document artifacts (e.g. .txt, .md, .csv) were being delivered as type: 'file' base64 content parts, which downstream toModelOutput maps to file-data — a format many LLMs don't handle well. This PR detects text MIME types early and decodes them into type: 'text' <attached_file> blocks, with a clean fallback to the base64 path on decode failure.

The conversation-history.ts refactor is also an improvement: the old code only had decode-error handling on the blob-download branch, leaving the inline-bytes path unprotected. The new structure handles both sources uniformly.
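
The uniform structure described here might look roughly like the following. This is a sketch under assumptions: the `AttachmentSource` union, the injected `Deps`, and the placeholder fallback text are all hypothetical, since the thread does not show what the real fallback in conversation-history.ts produces.

```typescript
type AttachmentSource =
  | { kind: "inline"; bytes: Uint8Array }
  | { kind: "blob"; blobUri: string };

interface Deps {
  downloadBlob: (blobUri: string) => Promise<Uint8Array>;
  buildBlock: (args: { bytes: Uint8Array; filename: string; mediaType: string }) => string;
}

async function buildTextAttachmentPart(
  source: AttachmentSource,
  meta: { filename: string; mediaType: string },
  deps: Deps,
): Promise<{ type: "text"; text: string }> {
  // Step 1: acquire bytes from whichever source is present.
  const bytes = source.kind === "inline" ? source.bytes : await deps.downloadBlob(source.blobUri);

  // Step 2: a single try/catch covers decoding for both sources.
  try {
    return { type: "text", text: deps.buildBlock({ bytes, ...meta }) };
  } catch {
    // Hypothetical fallback text; the actual fallback is not shown in this thread.
    return { type: "text", text: `[Attachment ${meta.filename} could not be decoded as text]` };
  }
}
```

Placing the try/catch after byte acquisition means inline bytes and downloaded bytes share one decode-error path, which is the gap the review comment says the old code left open.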

Minor (non-blocking): Now that resolvedFilename exists at default-tools.ts:69, the binary fallback path at line 129 (...(filename ? { filename } : {})) still uses the raw blob-URI-derived filename (e.g. sha256-abc.txt) rather than resolvedFilename. Consider updating it for consistency — though this predates the PR and isn't a regression.

Comment thread .changeset/brave-chefs-refuse.md Outdated
"@inkeep/agents-api": patch
---

Patched attachments bug
Contributor

The changeset message is vague — consider describing the actual fix so consumers understand what changed.

Suggested change
- Patched attachments bug
+ Handle text-document artifacts (`text/plain`, `text/markdown`, etc.) as decoded text content parts instead of base64 file parts, improving LLM comprehension of attached text files

Contributor

@pullfrog pullfrog bot left a comment

Approving — the core logic is sound and well-tested. The changeset message nit and resolvedFilename consistency note are non-blocking.

Contributor

@claude claude bot left a comment

PR Review Summary

(2) Total Issues | Risk: Low

🟠⚠️ Major (1) 🟠⚠️

Inline Comments:

  • 🟠 Major: .changeset/brave-chefs-refuse.md:5 Changeset message does not follow style guidelines

💭 Consider (0) 💭

No consider items.

🧹 While You're Here (1) 🧹

Inline Comments:

  • 🧹 While You're Here: default-tools.ts:129 Fallback path uses blobUri-derived filename instead of user-provided filename (pre-existing inconsistency)

🕐 Pending Recommendations (0)

No pending items — first review.


💡 APPROVE WITH SUGGESTIONS

Summary: This PR properly fixes the text file artifact retrieval bug. The implementation is clean: a new buildDecodedTextAttachmentBlock() helper consolidates decoding + wrapping, error handling gracefully falls back to base64 file delivery when text decoding fails, and comprehensive test coverage validates the happy path, toModelOutput mapping, and fallback behavior. The only actionable item is improving the changeset message to follow the repo's changelog conventions. The pre-existing filename inconsistency is a nice-to-fix if you're already in this code.

Discarded (3)
| Location | Issue | Reason Discarded |
| --- | --- | --- |
| conversation-history.test.ts | Missing test for decode failure in buildTextAttachmentPart | The error path is tested indirectly via text-document-attachments.test.ts which validates the helper throws correctly. Integration-level test is nice-to-have but not required since the try/catch is straightforward. |
| Agent.test.ts:2034 | Only tests control character errors, not invalid UTF-8 errors | Both error types trigger the same fallback via a generic catch block. Low risk since the catch is untyped. |
| default-tools.ts:69 | Missing test for filename fallback when data.filename is undefined | The nullish coalescing is simple logic. Existing tests verify the user-provided filename path works. |
Reviewers (3)
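| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-standards | 2 | 0 | 0 | 0 | 2 | 0 | 0 |
| pr-review-tests | 3 | 0 | 0 | 0 | 0 | 0 | 3 |
| pr-review-errors | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 5 | 0 | 0 | 0 | 2 | 0 | 3 |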

Note: Error handling reviewer found all patterns to be well-implemented — errors are logged appropriately, fallbacks are reasonable, and test coverage validates the fallback path.

Comment thread .changeset/brave-chefs-refuse.md Outdated
@github-actions github-actions bot deleted a comment from claude bot Apr 9, 2026
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Preview URLs

Use these stable preview aliases for testing this PR:

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

Contributor

@claude claude bot left a comment

PR Review Summary

(0) Total Issues | Risk: Low

Delta Review

This is a re-review scoped to changes since the previous automated review.

Prior feedback status:

| Prior Issue | Status |
| --- | --- |
| 🟠 Major: Changeset message style | Resolved. Commit 55b39481e updated the message to follow repo conventions |
| 🧹 While You're Here: Pre-existing filename inconsistency in fallback path | Still applicable (pre-existing, not a blocker) |

Changes since last review:

  • 55b39481e — Fixed changeset message: "Fix text file artifacts being returned as base64-encoded file parts instead of decoded text content"
  • 45063b644 — Merge from main (version bumps, unrelated changes not in PR scope)

No new issues introduced. The implementation remains clean with proper error handling and test coverage.


✅ APPROVE

Summary: All prior feedback has been addressed. The changeset message now follows the repo's style guidelines with a clear, action-verb-led description of the fix. The implementation is solid — the buildDecodedTextAttachmentBlock() helper cleanly handles text artifact decoding, error handling gracefully falls back to base64 delivery, and test coverage validates both happy paths and failure scenarios. Ship it! 🚀

Reviewers (1)
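| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| orchestrator (delta) | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 0 | 0 | 0 | 0 | 0 | 0 | 0 |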

Note: Delta review — prior issues already covered in previous review run. No new code changes to evaluate.

@itoqa

itoqa bot commented Apr 9, 2026

Ito Test Report ✅

12 test cases ran. 12 passed.

All 12 test cases passed (0 failures), and the unified local verification found no confirmed production-code defects across the covered run-domain chat and attachment scenarios. Valid text attachments worked end-to-end on both /run/api/chat and SSE /run/v1/chat/completions, including OpenAI-style file_data, CRLF normalization, artifact hydration, filename-optional inputs, explicit conversationId continuation, and 8-way parallel submissions with no 500s. Invalid or unsafe inputs were blocked with controlled 400s, and cross-app/origin auth mismatches were rejected with 403. These results should be read in the context of the dev-only local auth/provider fallback setup and the required model field on /run/v1/chat/completions.

✅ Passed (12)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | Cross-app token and origin mismatches were rejected with 403 as expected. | ADV-4 |
| Adversarial | Prior blocked run was setup-related; after seeding required tenant/project/app state, base chat plus 8 parallel attachment submissions completed with HTTP 200 and no 500s. | ADV-5 |
| Adversarial | Initial failure mode was request-shape related (model required on /run/v1/chat/completions); mixed-route continuation succeeded after corrected payload and shared explicit conversationId. | ADV-6 |
| Edge | URL-based text attachment was correctly rejected with a 400 validation error requiring inline base64 data URIs. | EDGE-1 |
| Edge | Malformed inline base64 data URI was correctly rejected with HTTP 400 and malformed payload validation messaging. | EDGE-2 |
| Edge | Invalid UTF-8 text input was safely rejected with controlled bad_request behavior and no crash. | EDGE-3 |
| Edge | Control-character text attachment handling returned controlled 400 responses on both chat routes with no 500/crash. | EDGE-4 |
| Edge | Text attachments without a filename are accepted and processed correctly with default handling. | EDGE-5 |
| Logic | CRLF text decoding normalizes line endings correctly and matches current implementation behavior. | LOGIC-1 |
| Logic | Text artifact hydration through get_reference_artifact returns decoded text content as expected. | LOGIC-2 |
| Happy-path | Inline text attachment request completed with a non-empty assistant response after a dev-only provider-failure fallback was applied. | ROUTE-1 |
| Happy-path | SSE /run/v1/chat/completions accepted OpenAI-style file_data text attachment and streamed completion events with HTTP 200. | ROUTE-2 |

Commit: 45063b6

@tim-inkeep tim-inkeep added this pull request to the merge queue Apr 10, 2026
Merged via the queue into main with commit ab65543 Apr 10, 2026
28 checks passed
@tim-inkeep tim-inkeep deleted the bugfix/attachments-txt branch April 10, 2026 19:16