[Feat][Graph] Support FULL_DECODE_ONLY mode for GQA/MHA models (#2128)
#9770
| Job | Run time |
|---|---|
| | 6m 12s |
| | 6m 12s |