Replies: 3 comments
@krrishdholakia for your awareness, is it expected that
A simple local test setup is:

    import asyncio
    import time

    import litellm

    concurrent_requests = 100


    async def acompletion_with_async_sleep(mock_delay):
        # Reference coroutine that waits without blocking the event loop.
        await asyncio.sleep(mock_delay)
        return "pong"


    async def acompletion(i):
        start_time = time.time()
        print(f"Task {i}: Starting")
        response = await litellm.acompletion(
            model="azure/gpt-4",
            messages=[{"role": "user", "content": "ping"}],
            mock_response=True,
            mock_delay=4,
        )
        print(f"Task {i}: Finished in {round(time.time() - start_time, 2)} seconds")
        return response


    async def main():
        tasks = [acompletion(i) for i in range(concurrent_requests)]
        return await asyncio.gather(*tasks)


    # In a notebook, `results = await main()` works directly.
    results = asyncio.run(main())

with output:

But when using
Async `acompletion` runs sync `completion` in an executor of the event loop, wrapped in a partial with all call `kwargs`.

If we set `mock_response` to `True` and specify a `mock_delay`, then `completion` does a sync `time.sleep` for `time_delay`. This blocks the event loop when making multiple `acompletion` calls with `mock_delay` concurrently.

We should instead use `asyncio.sleep` for the case of mocking an async response, to make sure the event loop is not blocked and concurrent calls can be mocked.
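The difference between the two sleeps can be reproduced without LiteLLM. Below is a minimal sketch using only the standard library. The names `mock_call_blocking`, `mock_call_async`, `timed_gather`, and `compare` are hypothetical and are not LiteLLM functions. It assumes the synchronous sleep runs on the event loop thread, which is the situation described above.

```python
import asyncio
import time


async def mock_call_blocking(delay: float) -> str:
    # Hypothetical stand-in for a mocked call that sleeps synchronously
    # on the event loop thread; no other task can run during the sleep.
    time.sleep(delay)
    return "pong"


async def mock_call_async(delay: float) -> str:
    # Hypothetical stand-in that yields control to the event loop while waiting.
    await asyncio.sleep(delay)
    return "pong"


async def timed_gather(call, n: int, delay: float) -> float:
    """Run n calls concurrently and return the elapsed wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(call(delay) for _ in range(n)))
    return time.perf_counter() - start


def compare(n: int = 5, delay: float = 0.05) -> tuple[float, float]:
    """Return (elapsed with time.sleep, elapsed with asyncio.sleep)."""
    blocking = asyncio.run(timed_gather(mock_call_blocking, n, delay))
    non_blocking = asyncio.run(timed_gather(mock_call_async, n, delay))
    return blocking, non_blocking


if __name__ == "__main__":
    blocking, non_blocking = compare()
    print(f"time.sleep:    {blocking:.2f}s")
    print(f"asyncio.sleep: {non_blocking:.2f}s")
```

With the blocking variant the calls run one after another, so the total should grow roughly with `n * delay`. With `asyncio.sleep` the waits overlap and the total should stay close to a single `delay`.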