[Feat][Graph] Support FULL_DECODE_ONLY mode for GQA/MHA models #740
| Job | Run time |
|---|---|
| | 5s |
| | 1h 27m 50s |
| | 1h 17m 29s |
| | 1h 21m 25s |
| | 1h 23m 24s |
| Total | 5h 30m 13s |