feat: support openapi tts model tracking #2984
base: main
Conversation
Hey @andrescrz, I have followed up and opened a PR in #2984. cc @vincentkoc
def test_openai_client_audio_speech_create__happyflow(
    respx_mock, fake_backend, project_name, expected_project_name
):
    respx_mock.post("https://api.aimlapi.com/v1/audio/speech").respond(
In our library integration tests we're trying not to mock LLM providers to make sure we're processing them correctly. Let's replace it with a real request.
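As a hedged sketch of what an unmocked test call could look like (the model, voice, and input values here are illustrative, not taken from the PR):

```python
import os

# Illustrative, unmocked call to the real TTS endpoint, per the review
# suggestion. Guarded so it only runs when an API key is configured;
# without OPENAI_API_KEY the block is a no-op.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()
    response = client.audio.speech.create(
        model="tts-1",  # illustrative model name
        voice="alloy",  # illustrative voice
        input="Hello world",
    )
    audio_bytes = response.read()
    assert len(audio_bytes) > 0  # real responses return raw audio bytes
```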
tags=["openai"],
metadata=ANY_DICT,
usage={
    "total_tokens": len(INPUT_FOR_TESTS),
The usage expected here doesn't look like the usage OpenAI declares they send (audio tokens are expected to be nested in input_tokens_details and output_tokens_details).
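For reference, a sketch of the nested usage shape the comment refers to. This is my assumption, modeled on OpenAI's documented token-detail fields; exact keys can vary by endpoint:

```python
# Assumed shape: audio tokens nested under the *_tokens_details fields
# rather than derived from input length.
usage = {
    "input_tokens": 14,
    "output_tokens": 120,
    "total_tokens": 134,
    "input_tokens_details": {"text_tokens": 14, "audio_tokens": 0},
    "output_tokens_details": {"text_tokens": 0, "audio_tokens": 120},
}

# The total should reconcile with the per-direction counts.
assert usage["total_tokens"] == usage["input_tokens"] + usage["output_tokens"]
```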
if current_span_data.input and current_span_data.input.get("input"):
    opik_usage = llm_usage.try_build_opik_usage_or_log_error(
        provider=LLMProvider.OPENAI,
        usage={"total_tokens": len(current_span_data.input["input"])},
This doesn't look correct (already mentioned in the tests comment).
We need to pass the real usage data from the response here.
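A self-contained sketch of that idea, using stand-in objects. The FakeResponse/FakeUsage names are hypothetical; real OpenAI SDK responses expose a similar `usage` attribute on pydantic-style objects:

```python
# Stand-ins for an SDK response carrying real usage data; names are
# hypothetical, mirroring pydantic-style response objects.
class FakeUsage:
    def model_dump(self):
        return {"input_tokens": 14, "output_tokens": 120, "total_tokens": 134}


class FakeResponse:
    usage = FakeUsage()


response = FakeResponse()

# Prefer the usage reported by the provider over a len(input) heuristic.
raw_usage = getattr(response, "usage", None)
usage_dict = raw_usage.model_dump() if raw_usage is not None else None
assert usage_dict["total_tokens"] == 134
```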
        error_message="Failed to log token usage from openai call",
    )
    result = arguments_helpers.EndSpanParameters(
        output={},
Shouldn't be None, let's encode it with base64.
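The base64 suggestion boils down to something like this (the audio bytes below are placeholder data, not real TTS output):

```python
import base64

# Placeholder bytes standing in for the raw audio returned by the TTS call.
audio_bytes = b"RIFF....WAVEfmt "

# Base64-encode so the span output stays JSON-serializable.
encoded = base64.b64encode(audio_bytes).decode("ascii")

# Round-trips losslessly back to the original bytes.
assert base64.b64decode(encoded) == audio_bytes
```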
        output={},
        usage=opik_usage,
        metadata={},
        model=current_span_data.model,
We should extract the exact model name from the provider response.
import pydantic


class OpenAIAudioSpeechUsage(pydantic.BaseModel):
No need for this class; audio tokens are already part of the existing, known OpenAI token usage dicts.
Details
Supports OpenAI TTS model tracking.
Issues
Resolves #2202
Testing
Follow-up of #2547.