Add head_dim = 64 in B200 Attention. (#4935) #16749

Job Run time
6s
8s
2h 30m 32s
31m 6s
2h 7m 48s
3h 0m 41s
32m 22s
2h 5m 32s
3h 1m 35s
2h 31m 29s
2h 10m 3s
2h 28m 11s
2h 6m 0s
2h 1m 15s
2h 20m 55s
3h 0m 33s
2h 28m 44s
29m 11s
25m 58s
3h 0m 37s
25m 56s
29m 34s
2h 24m 41s
2h 4m 0s
2h 18m 35s
2h 7m 9s
2h 2m 11s
3h 0m 33s
2h 25m 42s
3h 0m 16s
11s
43s
23s
33s
15s
30s
25s
13s
21s
22s
24s
25s
38s
19s
12s
40s
27s
55s
31s
22s
18s
18s
32s
17s
1m 6s
16s
23s
24s
Total: 2d 9h 23m 46s