
Conversation

@emerzon emerzon commented Jul 5, 2025

Description

This change adds a token counter widget near the chat input box.

[Screenshots: the token counter widget rendered near the chat input box]

Data Flow

  1. Token Capture: When LiteLLM processes a request, it returns usage data that includes prompt_tokens, completion_tokens, and total_tokens.
  2. Storage: The token_usage_tracker stores this data in thread-local storage for cross-thread access (sketched after this list).
  3. Message Creation: When chat messages are created, the token usage is retrieved and stored in the database.
  4. Frontend Display: The frontend receives token usage with each message and updates the counter.
  5. Session Management: Token counters reset when switching between chat sessions.
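
The tracker itself isn't quoted in this thread, but the pattern described in step 2 looks roughly like the following sketch. All names (record_usage, pop_usage) and the 60-second expiry are illustrative, inferred from the review notes below rather than taken from the PR:

import threading
import time
from typing import Any

# Illustrative sketch only: the PR's actual token_usage_tracker is not quoted here.
_local = threading.local()

def record_usage(usage: dict[str, Any]) -> None:
    """Store usage from a LiteLLM response for the current thread."""
    _local.usage = usage
    _local.recorded_at = time.monotonic()

def pop_usage(max_age_seconds: float = 60.0) -> dict[str, Any] | None:
    """Retrieve and clear the stored usage, dropping entries older than max_age_seconds."""
    usage = getattr(_local, "usage", None)
    recorded_at = getattr(_local, "recorded_at", 0.0)
    _local.usage = None
    if usage is None or time.monotonic() - recorded_at > max_age_seconds:
        return None
    return usage

As the review below points out, a purely thread-local approach like this is fragile when the LLM call and the message-creation code run on different threads.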

Key Features

  • Real-time Tracking: Updates token usage as messages stream
  • Visual Feedback: Color-coded progress bar (green → yellow → red based on usage)
  • Detailed Breakdown: Tooltip shows billed vs context tokens, including reasoning tokens for o1 models
  • Model-aware: Displays correct token limits based on the selected LLM model
  • Incremental Billing: Tracks actual billable tokens (not re-counting repeated context); see the example below
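
To make the billed-versus-context distinction and the color thresholds concrete, here is a small illustrative calculation. The 50%/80% cutoffs and the token counts are assumptions for the example, not values from the PR:

# Each turn's prompt re-sends the prior conversation, so billed prompt tokens
# accumulate, while the progress bar measures the latest request against the
# model's context window.
turns = [
    {"prompt_tokens": 120, "completion_tokens": 80},   # turn 1
    {"prompt_tokens": 320, "completion_tokens": 150},  # turn 2 includes turn 1 as context
]

context_used = turns[-1]["prompt_tokens"] + turns[-1]["completion_tokens"]

max_tokens = 200_000  # fallback context limit seen in the frontend snippet below
pct = context_used / max_tokens
color = "green" if pct < 0.5 else ("yellow" if pct < 0.8 else "red")
print(f"{context_used}/{max_tokens} tokens ({pct:.1%}) -> {color}")
# 470/200000 tokens (0.2%) -> green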

How Has This Been Tested?

Start a new chat session; after the LLM responds, the widget should appear.

@emerzon emerzon requested a review from a team as a code owner July 5, 2025 22:54

vercel bot commented Jul 5, 2025

@emerzon is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.


@greptile-apps greptile-apps bot left a comment


PR Summary

Implements a token counter widget with complete backend-to-frontend infrastructure for tracking LLM token usage in chat sessions.

  • The .orig and .rej files added in backend/onyx/file_processing/ appear to be merge artifacts and should be removed
  • The token_usage_tracker.py implementation using thread-local storage could have thread safety issues, particularly around the 60-second cleanup mechanism
  • Document processing library imports (docx, openpyxl, pptx) were removed from extract_file_text.py while their functions are still referenced
  • Token limit calculation logic is duplicated between ChatPage.tsx and ChatInputBar.tsx and should be centralized
  • The new JSONB column token_usage tracks structured usage data (prompt_tokens, completion_tokens, total_tokens) for billing accuracy

17 files reviewed, 19 comments

};

// Calculate percentage based on context tokens (context window usage)
const maxTokens = currentModel?.maxTokens || 200000; // Default fallback
greptile-apps bot:
style: 200000 should be defined as a named constant at the module level for better maintainability

refined_answer_improvement: bool | None = None
is_agentic: bool | None = None
error: str | None = None
token_usage: dict[str, Any] | None = None
greptile-apps bot:

style: The token_usage field should have a more specific type rather than dict[str, Any]. Consider creating a TokenUsage Pydantic model to enforce the expected structure (prompt_tokens, completion_tokens, total_tokens).
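
For reference, the suggested model would look roughly like this sketch (not code from the PR; extra="allow" is one way to keep room for the future token types the author mentions below):

from pydantic import BaseModel, ConfigDict

class TokenUsage(BaseModel):
    # Sketch of the reviewer's suggestion, not code from this PR.
    # extra="allow" keeps room for future token types (audio, image, reasoning).
    model_config = ConfigDict(extra="allow")

    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

The message field would then become token_usage: TokenUsage | None = None.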


Comment on lines +1991 to +1992
# Token usage information from LLM API responses
token_usage: Mapped[dict[str, Any] | None] = mapped_column(postgresql.JSONB(), nullable=True)
greptile-apps bot:
style: Add a field description in the comment to explain which token_usage metrics (prompt_tokens, completion_tokens, total_tokens) can be expected in this field

@emerzon (author):

Probably better not to define this. Several types of tokens could show up in the response in the future (audio tokens, image tokens, etc.).

Comment on lines +3 to +4
Revision ID: 42e26b80
Revises: 58c50ef19f08
greptile-apps bot:

logic: The migration ID should be longer (Alembic's typical auto-generated format is 12 hex characters, e.g. 58c50ef19f08); the current 8-character ID differs from other migrations

@emerzon (author):

This will need review at merge time, I believe.
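
For context, a migration along these lines would look roughly like the sketch below. The revision IDs come from this thread; the chat_message table name and column details are assumptions, not copied from the PR:

"""Add token_usage column

Revision ID: 42e26b80
Revises: 58c50ef19f08
"""
import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects import postgresql

revision = "42e26b80"
down_revision = "58c50ef19f08"

def upgrade() -> None:
    # Nullable JSONB so existing rows need no backfill.
    op.add_column(
        "chat_message",
        sa.Column("token_usage", postgresql.JSONB(), nullable=True),
    )

def downgrade() -> None:
    op.drop_column("chat_message", "token_usage")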

emerzon and others added 7 commits July 5, 2025 17:56
…age.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@emerzon emerzon marked this pull request as draft July 5, 2025 23:12
emerzon and others added 3 commits July 5, 2025 18:13
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

This PR is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.

@github-actions github-actions bot added the Stale label Sep 21, 2025