Add head_dim = 64 in B200 Attention. #18864
| Job | Run time |
|---|---|
| | 16m 39s |
| | 25m 54s |
| | 17m 29s |
| | 10m 5s |
| | 19m 52s |
| | 26m 25s |
| | 20m 6s |
| | 32m 58s |
| | 9m 28s |
| | 26m 27s |
| | 13m 21s |
| | 1h 27m 46s |
| | 25m 42s |
| | 33m 1s |
| | 1h 23m 5s |
| | 12m 40s |
| Total | 7h 40m 58s |