Fix incorrect value_info setting in MergeONNXModels #152

Merged: 1 commit merged into fastmachinelearning:main on Dec 15, 2024

Conversation

fpjentzsch
Contributor

We connect the output of the pre_model to the input tensor of the post_model.

Old behavior: The output tensor of pre_model is added to the value_info of the output model, even though this tensor is no longer used. The input tensor of post_model is missing a value_info entry. This causes errors in FINN shape inference because this tensor ends up without a shape annotation, while CustomOps expect their input tensor shape to be annotated.

New behavior: The input tensor of post_model is added to the value_info of the output model.
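
As a rough illustration (a simplified sketch, not the actual transformation code; single-input/single-output models assumed), the corrected behavior amounts to carrying over the value_info of post_model's input, since that is the tensor that survives in the merged graph:

import onnx

def carry_over_value_info(post_model: onnx.ModelProto, merged_graph: onnx.GraphProto) -> None:
    # post_model's input is the tensor kept in the merged graph after rewiring;
    # graph inputs are already ValueInfoProto, so the entry can be appended as-is
    post_in_vi = post_model.graph.input[0]
    merged_graph.value_info.append(post_in_vi)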

@maltanar
Collaborator

maltanar commented Dec 9, 2024

Thanks @fpjentzsch. Do you have a testcase suggestion that would catch the bug in the old version? Calling InferShapes() on the merged model in test_merge_onnx_models did not raise any errors for me.

@fpjentzsch
Contributor Author

> Thanks @fpjentzsch. Do you have a testcase suggestion that would catch the bug in the old version? Calling InferShapes() on the merged model in test_merge_onnx_models did not raise any errors for me.

The problem occurs in make_shape_compatible_op because many FINN CustomOps do the following there:

ishape = tuple(model.get_tensor_shape(self.onnx_node.input[0]))
assert ishape == exp_ishape

This fails because get_tensor_shape() returns None when no value_info entry is found for the input. During shape inference, make_shape_compatible_op is called on all nodes before the actual ONNX shape inference begins, so essentially we expect the model to have all shapes correctly inferred before any offending CustomOp is placed in the graph.
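
For illustration, a minimal sketch (a hypothetical helper, not actual FINN/QONNX code) of that failure mode:

from qonnx.core.modelwrapper import ModelWrapper

def checked_input_shape(model: ModelWrapper, tensor_name: str) -> tuple:
    # returns None when the tensor has no value_info entry, which is exactly
    # the state the old MergeONNXModels behavior left behind
    ishape = model.get_tensor_shape(tensor_name)
    if ishape is None:
        raise RuntimeError("tensor '%s' has no shape annotation" % tensor_name)
    # without the guard above, tuple(None) would raise a TypeError here,
    # before any shape assert in make_shape_compatible_op is even evaluated
    return tuple(ishape)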

I'm not sure this assumption makes sense; maybe we should redesign the way tensor shapes are checked against node attributes, or the "make_shape_compatible_op" mechanism in general?

At first glance, only the ONNX CustomOps "quantavgpool2d", "max_pool", "maxpoolnhwc", and "conv" do this in make_shape_compatible_op. So we could build a failing test for the old behavior by applying MergeONNXModels() to a model where one of these ops is the first layer.
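
A rough sketch of what such a test could look like (MaxPoolNHWC picked as the offending first layer; shapes, attribute values, and the qonnx.custom_op.general domain are assumptions on my side):

import numpy as np
import onnx.helper as oh
from onnx import TensorProto
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.infer_shapes import InferShapes
from qonnx.transformation.merge_onnx_models import MergeONNXModels
from qonnx.util.basic import qonnx_make_model

def make_pre_model():
    # trivial NHWC passthrough: global_in -> Mul(scale) -> pre_out
    inp = oh.make_tensor_value_info("global_in", TensorProto.FLOAT, [1, 8, 8, 4])
    out = oh.make_tensor_value_info("pre_out", TensorProto.FLOAT, [1, 8, 8, 4])
    mul = oh.make_node("Mul", ["global_in", "scale"], ["pre_out"])
    model = ModelWrapper(qonnx_make_model(oh.make_graph([mul], "pre", [inp], [out])))
    model.set_initializer("scale", np.asarray([2.0], dtype=np.float32))
    return model

def make_post_model():
    # first node is MaxPoolNHWC, whose make_shape_compatible_op reads the shape
    # of "post_in" and therefore needs its value_info in the merged model
    inp = oh.make_tensor_value_info("post_in", TensorProto.FLOAT, [1, 8, 8, 4])
    out = oh.make_tensor_value_info("global_out", TensorProto.FLOAT, [1, 4, 4, 4])
    mp = oh.make_node(
        "MaxPoolNHWC", ["post_in"], ["global_out"],
        domain="qonnx.custom_op.general",
        kernel_shape=[2, 2], pads=[0, 0, 0, 0], strides=[2, 2], ceil_mode=0,
    )
    return ModelWrapper(qonnx_make_model(oh.make_graph([mp], "post", [inp], [out])))

def test_merge_keeps_post_input_value_info():
    merged = make_post_model().transform(MergeONNXModels(make_pre_model()))
    # raises with the old value_info handling, passes with this fix
    merged = merged.transform(InferShapes())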

@maltanar
Collaborator

Thanks for the clarification, Felix. I'm happy to merge this as-is, though it would be even better to address the underlying issues with how we do shape inference for QONNX custom ops. I'll open an issue so we can track this separately.

On a separate note, for merging models https://onnx.ai/onnx/api/compose.html also exists, and we should consider making MergeONNXModels a wrapper around it instead, provided that we get the same functionality. This would mean maintaining less code as part of QONNX, which would be desirable.
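
For reference, a minimal usage sketch of that API (file names and the "pre_out"/"post_in" stitch tensors are placeholders):

import onnx
from onnx import compose

pre_model = onnx.load("pre_model.onnx")
post_model = onnx.load("post_model.onnx")

# io_map wires pre_model outputs to post_model inputs by tensor name
merged = compose.merge_models(pre_model, post_model, io_map=[("pre_out", "post_in")])
onnx.save(merged, "merged.onnx")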

maltanar merged commit 2c91d6d into fastmachinelearning:main on Dec 15, 2024
5 checks passed
maltanar added this to the v0.4.0 milestone on Dec 20, 2024