Fp8 scaling per-sequence #79
Conversation
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
@@ -140,8 +140,8 @@ def generate(
     from fms_mo.aiu_addons.fp8.fp8_utils import ScaledTensor
     kwargs["past_key_value_states"] = [
         (
-            ScaledTensor(torch.zeros(NUM_BLOCKS, BLOCK_SIZE, kvheads, head_size, dtype=torch.float8_e4m3fn), torch.tensor(1.0), False),
-            ScaledTensor(torch.zeros(NUM_BLOCKS, BLOCK_SIZE, kvheads, head_size, dtype=torch.float8_e4m3fn), torch.tensor(1.0), False),
+            ScaledTensor(torch.zeros(NUM_BLOCKS, BLOCK_SIZE, kvheads, head_size, dtype=torch.float8_e4m3fn), torch.tensor([1.0] * input_ids.shape[0], dtype=torch.float32), False),
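For context, the hunk above replaces the single scalar scale shared by the K and V cache tensors with one fp32 scale per sequence in the batch. A minimal illustrative sketch of that shape change (placeholder sizes standing in for NUM_BLOCKS, BLOCK_SIZE, kvheads, head_size, and input_ids.shape[0]; not code from this PR):

```python
import torch

# Placeholder sizes; in the PR these come from NUM_BLOCKS, BLOCK_SIZE,
# kvheads, head_size, and input_ids.shape[0].
num_blocks, block_size, kvheads, head_size = 4, 16, 8, 64
batch_size = 2

# The fp8 cache tensor itself is unchanged by the PR.
cache = torch.zeros(
    num_blocks, block_size, kvheads, head_size, dtype=torch.float8_e4m3fn
)

# Before: one scalar scale shared by every sequence in the batch.
old_scale = torch.tensor(1.0)
# After: one fp32 scale per sequence.
new_scale = torch.tensor([1.0] * batch_size, dtype=torch.float32)

print(old_scale.shape, new_scale.shape)  # torch.Size([]) torch.Size([2])
```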
why is the dtype torch.float32 here?
FP8 scales are always stored in fp32.
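A rough sketch of why the scale tensor is fp32: the fp8 payload only holds the quantized values, while the per-sequence scale computation and the dequantization multiply happen in float32. This is an illustrative quantize/dequantize pair under that assumption, not the fms_mo ScaledTensor implementation:

```python
import torch

def quantize_per_sequence(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """x: [batch, seq, kvheads, head_size]; returns fp8 values plus fp32 per-sequence scales."""
    fp8_max = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for e4m3fn
    # One amax per sequence: reduce over every dim except the batch dim.
    amax = x.abs().amax(dim=(1, 2, 3)).clamp(min=1e-12)
    scale = (amax / fp8_max).to(torch.float32)  # scales stay in fp32
    x_fp8 = (x / scale.view(-1, 1, 1, 1)).to(torch.float8_e4m3fn)
    return x_fp8, scale

def dequantize(x_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Dequantization multiplies by the fp32 scale, so keeping the scale in a
    # wider dtype avoids compounding the fp8 rounding error.
    return x_fp8.to(torch.float32) * scale.view(-1, 1, 1, 1)

x = torch.randn(2, 16, 8, 64)
x_fp8, scale = quantize_per_sequence(x)
print(scale.dtype, scale.shape)  # torch.float32 torch.Size([2])
```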
bot:test
bot:test
lgtm
No description provided.