[QST] Quantized conv with s8 output and s32 bias #2271

@jstoecker

When implementing a quantized GEMM/convolution with INT8 activations and weights, it's common for the bias to be INT32. The usual trick for adding a bias seems to be pointing the C matrix at the bias vector with a stride of 0, so the same bias row is broadcast to every output row. That approach requires ElementC to be declared as INT32, yet I also want the output of the convolution to be INT8; as far as I can tell, ElementC is also implicitly the output data type.
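To make the intended arithmetic concrete, here is a minimal host-side reference sketch in plain C++. It is not CUTLASS code and uses no CUTLASS API; the names `quantized_gemm_bias_s8` and `saturate_to_s8`, the single per-tensor `alpha` scale, and the round-to-nearest behaviour are all assumptions chosen for illustration. It only shows the operation being asked for: INT8 inputs, INT32 accumulation, an INT32 bias read through a C operand whose row stride is 0, and an INT8 output.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: round to nearest (current rounding mode) and clamp to the INT8 range.
inline std::int8_t saturate_to_s8(float x) {
    float r = std::nearbyint(x);
    r = std::min(127.0f, std::max(-128.0f, r));
    return static_cast<std::int8_t>(r);
}

// Hypothetical reference for the desired result:
//   D[m][n] = saturate_s8(alpha * (sum_k A[m][k] * B[k][n] + C[m * ldc + n]))
// A is M x K and B is K x N, both row-major INT8. C holds INT32 values.
// With ldc == 0, C is a single bias row of length N that is reused for every m.
std::vector<std::int8_t> quantized_gemm_bias_s8(
    const std::vector<std::int8_t>& A,
    const std::vector<std::int8_t>& B,
    const std::vector<std::int32_t>& C,
    std::size_t M, std::size_t N, std::size_t K,
    std::size_t ldc, float alpha) {
    assert(A.size() == M * K);
    assert(B.size() == K * N);
    assert(M == 0 || C.size() >= (M - 1) * ldc + N);

    std::vector<std::int8_t> D(M * N);
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            std::int32_t acc = 0;  // INT32 accumulator
            for (std::size_t k = 0; k < K; ++k) {
                acc += static_cast<std::int32_t>(A[m * K + k]) *
                       static_cast<std::int32_t>(B[k * N + n]);
            }
            acc += C[m * ldc + n];  // INT32 source operand, broadcast when ldc == 0
            D[m * N + n] = saturate_to_s8(alpha * static_cast<float>(acc));
        }
    }
    return D;
}
```

Calling this with `ldc = 0` and a bias vector of length N mirrors the stride-0 trick described above. The point of the sketch is that the source operand type (INT32) and the output type (INT8) differ, which is the combination in question.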

It's not clear how to achieve what I want with the 2.x APIs/epilogues. Do I need to use EVT to accomplish this?
