@jiafatom
Contributor

Description

Fix GatherBlockQuantized shape inference test

Motivation and Context

In the GatherBlockQuantized op definition in contrib_defs, shape inference includes the following check:

        for (int i = 0; i < r; ++i) {
          if (!data_shape.dim(i).has_dim_value() ||
              !scales_shape.dim(i).has_dim_value() ||
              (i == quantize_axis && (data_shape.dim(i).dim_value() * components + block_size - 1) / block_size != scales_shape.dim(i).dim_value()) ||
              (i != quantize_axis && data_shape.dim(i).dim_value() != scales_shape.dim(i).dim_value())) {
            fail_shape_inference("data shape and scales shape do not match");
          }
        }

This code was introduced last year. However, when I try to share weights for the phi-4-mini-instruct model
[image]
I need to feed a Reshape operator into GatherBlockQuantized. The output shape of that Reshape is not inferred directly from an initializer, but from a Concat, which requires some constant folding. Therefore, on the first sweep of shape inference, data_shape.dim(i).has_dim_value() is false, which fails shape inference, and the model cannot work. The check should therefore only run when data_shape.dim(i).has_dim_value() is true, and likewise for scales_shape.
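The relaxed check described above can be sketched in isolation as follows. This is a minimal illustration, not the code merged in this PR: `Dim`, `CeilDiv`, and `ShapesCompatible` are hypothetical names, and `std::optional` stands in for a protobuf dimension whose `has_dim_value()` may be false. Treating a rank mismatch as a failure is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a shape dimension: std::nullopt means the value
// is not known yet (the case where has_dim_value() would return false).
using Dim = std::optional<int64_t>;

// Ceiling division, the same arithmetic as (a + b - 1) / b in the original check.
int64_t CeilDiv(int64_t a, int64_t b) {
  assert(b > 0);
  return (a + b - 1) / b;
}

// Returns false only when both dimensions are known and inconsistent.
// Dimensions that are still unknown are skipped instead of failing.
bool ShapesCompatible(const std::vector<Dim>& data_shape,
                      const std::vector<Dim>& scales_shape,
                      int quantize_axis, int64_t components,
                      int64_t block_size) {
  // Assumption for this sketch: a rank mismatch is treated as a failure.
  if (data_shape.size() != scales_shape.size()) return false;

  for (std::size_t i = 0; i < data_shape.size(); ++i) {
    // Skip the comparison until both values are available.
    if (!data_shape[i].has_value() || !scales_shape[i].has_value()) continue;

    const int64_t expected =
        (static_cast<int>(i) == quantize_axis)
            ? CeilDiv(*data_shape[i] * components, block_size)
            : *data_shape[i];
    if (expected != *scales_shape[i]) return false;
  }
  return true;
}
```

With this shape of check, a dimension that is still unknown on the first inference sweep no longer causes a failure, while a mismatch between two known dimensions is still rejected once constant folding has filled in the values.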

@jiafatom jiafatom requested a review from tianleiwu August 16, 2025 20:15
Contributor

@github-actions github-actions bot left a comment

You can commit the suggested changes from lintrunner.

@jiafatom jiafatom merged commit 27cbebe into main Aug 18, 2025
103 of 104 checks passed
@jiafatom jiafatom deleted the gather_shape branch August 18, 2025 17:30
adrianlizarraga pushed a commit that referenced this pull request Aug 21, 2025
adrianlizarraga added a commit that referenced this pull request Aug 25, 2025
### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:
- #25592
- #25622
- #25688
- #25729
- #25743
- #25769
- #25745
- #25761
- #25751
- #25716
- #25228
- #25768
- #25788
- #25747
- #25800
- #25818
- #25762
- #25749
- #25831


### Motivation and Context

---------

Co-authored-by: quic-tirupath <[email protected]>
Co-authored-by: quic-calvnguy <[email protected]>
Co-authored-by: qti-kromero <[email protected]>
Co-authored-by: Jeff Kilpatrick <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: David Fan <[email protected]>
Co-authored-by: kuanyul-qti <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Chunye Wang@AMD <[email protected]>
Co-authored-by: minfhong-qti <[email protected]>
Co-authored-by: Vishal Agarwal <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Changming Sun <[email protected]>
Co-authored-by: adrastogi <[email protected]>
Co-authored-by: Aditya Rastogi <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
gedoensmax pushed a commit to gedoensmax/onnxruntime that referenced this pull request Sep 2, 2025

4 participants