Description
When implementing a quantized GEMM/convolution with INT8 activations and weights, it's common for the bias to be INT32. The usual trick for adding a bias seems to be to initialize the C matrix to the bias by giving it a stride of 0. That approach would require `ElementC` to be declared as INT32, yet I also want the output of the convolution to be INT8; as far as I can tell, `ElementC` is also implicitly the output data type.
It's not clear how to achieve this with the 2.x APIs/epilogues. Do I need to use EVT (Epilogue Visitor Tree) to accomplish this?
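For concreteness, the desired computation can be sketched as a plain C++ reference loop. This is not CUTLASS code: the function name `gemm_s8_bias_s32_ref`, the row-major layouts, and the single float `scale` used for requantization are assumptions made for illustration only. It shows the bias being read as an INT32 "C" operand with a row stride of 0, while the result is written as INT8.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference routine (not part of CUTLASS).
// Computes D[m][n] = saturate_int8(round(scale * (sum_k A[m][k] * B[k][n] + bias[n]))).
std::vector<int8_t> gemm_s8_bias_s32_ref(
    const std::vector<int8_t>& A,      // M x K, row-major INT8 activations
    const std::vector<int8_t>& B,      // K x N, row-major INT8 weights
    const std::vector<int32_t>& bias,  // N INT32 bias values
    int M, int N, int K,
    float scale) {                     // assumed per-tensor requantization scale
  assert(A.size() == static_cast<std::size_t>(M) * K);
  assert(B.size() == static_cast<std::size_t>(K) * N);
  assert(bias.size() == static_cast<std::size_t>(N));

  // Row stride of 0: every output row reads the same bias vector,
  // which is the "C matrix with stride 0" trick from the description.
  const int ldc = 0;

  std::vector<int8_t> D(static_cast<std::size_t>(M) * N);
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      // The source operand (C) is INT32 and seeds the INT32 accumulator.
      int32_t acc = bias[m * ldc + n];
      for (int k = 0; k < K; ++k) {
        acc += static_cast<int32_t>(A[m * K + k]) *
               static_cast<int32_t>(B[k * N + n]);
      }
      // The destination (D) is INT8: scale, round, then saturate.
      float scaled = std::nearbyint(scale * static_cast<float>(acc));
      scaled = std::min(127.0f, std::max(-128.0f, scaled));
      D[m * N + n] = static_cast<int8_t>(scaled);
    }
  }
  return D;
}
```

The point of the sketch is that the source operand type (INT32) and the destination type (INT8) differ, which is exactly what a single `ElementC` does not appear to express.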