Support bf16 in blackwell cutlass decode attention kernel #18797
| Job | Run time |
|---|---|
| | 18m 45s |
| | 25m 19s |
| | 12m 27s |
| | 11m 51s |
| | 10m 0s |
| | 25m 58s |
| | 20m 15s |
| | 26m 25s |
| | 26m 19s |
| | 1h 27m 35s |
| | 32m 52s |
| | 1h 22m 57s |
| | 9m 25s |
| | 25m 43s |
| | 32m 48s |
| | 19m 57s |
| Total | 7h 48m 36s |