Does DeepSpeed inference now support replace_with_kernel_inject=True for Llama 2? #4290
Unanswered
liveforfun asked this question in Q&A

I tried inference with the replace_with_kernel_inject=True option on Llama-2-70B, but I got some errors. As far as I know, replace_with_kernel_inject=True injects DeepSpeed's high-performance inference kernels, but it seems this option is not supported for Llama 2 yet, right?

Replies: 1 comment

- Hi @liveforfun, I am adding the support for this here: #4313
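For context, the flag in question is passed to deepspeed.init_inference. Below is a minimal sketch of that usage; the checkpoint id and mp_size are illustrative, and at the time of this discussion kernel injection for Llama 2 was still being added (see the linked PR), so this may fail on older DeepSpeed releases or on hardware without enough GPU memory for the 70B model.

```python
# Hedged sketch: requesting DeepSpeed's fused inference kernels via
# replace_with_kernel_inject=True. Model name and parallelism degree
# are assumptions for illustration, not taken from the discussion.
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-70b-hf"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16
)

# replace_with_kernel_inject=True asks DeepSpeed to swap supported
# transformer layers for its fused high-performance kernels; if the
# architecture is not yet supported, this is where errors surface.
engine = deepspeed.init_inference(
    model,
    mp_size=8,  # tensor-parallel degree across 8 GPUs (illustrative)
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

inputs = tokenizer("Hello, world", return_tensors="pt").to("cuda")
outputs = engine.module.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If kernel injection is unsupported for a given model, a common workaround is to drop replace_with_kernel_inject and rely on DeepSpeed's automatic tensor parallelism instead.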