added fixes for handling multiple shape warmup #13

Open: wants to merge 2 commits into main
Conversation

JRosenkranz (Contributor)
This PR addresses issues with warming up multiple shapes on the AIU. It introduces a prepare_model_inputs_hook that marks certain dimensions as static or dynamic prior to the forward pass. It relies on foundation-model-stack/foundation-model-stack#388.

…dded tests for multiple shape warmup

Signed-off-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>
@JRosenkranz JRosenkranz requested a review from ani300 April 1, 2025 14:38
@JRosenkranz JRosenkranz self-assigned this Apr 1, 2025

for layer in kwargs["past_key_value_states"]:
for tensor in layer:
torch._dynamo.mark_static(tensor, 0)
Contributor

We could also move the code that marks the KV cache sequence dimension as dynamic here.

Comment on lines 27 to 28
torch._dynamo.mark_dynamic(kwargs["mask"], 1)
torch._dynamo.mark_dynamic(kwargs["mask"], 2)
Contributor

We probably only need to mark dim 2 as dynamic here.

Comment on lines +17 to +23
torch._dynamo.mark_static(input_ids, 0)
torch._dynamo.mark_static(input_ids, 1)
torch._dynamo.mark_static(kwargs["mask"], 0)
torch._dynamo.mark_static(kwargs["mask"], 1)
torch._dynamo.mark_static(kwargs["mask"], 2)
torch._dynamo.mark_static(kwargs["position_ids"], 0)
torch._dynamo.mark_static(kwargs["position_ids"], 1)
Contributor

Do we need to mark all the sequence dimensions as static, or is marking just the batch dimensions enough?

Contributor (Author)

It probably is enough; however, I marked everything as static to ensure we get a static prefill. I believe symbolic ints can cause changes to the prefill graph that we may not want to introduce.

Signed-off-by: Joshua Rosenkranz <jmrosenk@us.ibm.com>