Merged
Changes from 1 commit
Commits
75 commits
b52a1ce
finished prepack
fajin-corp Apr 28, 2025
0523106
changed interface to support blocksum2
fajin-corp Apr 29, 2025
fd92ab8
finished quantb for quant a unsigned
fajin-corp Apr 30, 2025
ed5cf8d
finished quantize a
fajin-corp May 1, 2025
b9b9691
finished Q8Int8GemmR2xC8Neon
fajin-corp May 5, 2025
685baff
finished kernels
fajin-corp May 5, 2025
6747330
fixed build
fajin-corp May 6, 2025
b087317
passed prepack
fajin-corp May 8, 2025
196c04c
finished ut for quant a
fajin-corp May 9, 2025
353d460
fixed build
fajin-corp May 9, 2025
4d62e32
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
hariharans29 Jun 18, 2025
e88e32d
Comment out some 4 bit tests
hariharans29 Jun 19, 2025
58011b0
Apple I8MM check
hariharans29 Jun 20, 2025
acc4b81
Tests
hariharans29 Jun 20, 2025
2700493
Tests 2
hariharans29 Jun 20, 2025
76de326
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 Jun 23, 2025
159d4d3
Changes
hariharans29 Jun 23, 2025
e4bc74e
Fixes
hariharans29 Jun 23, 2025
e92055b
Re-enable 4 bit tests
hariharans29 Jun 23, 2025
94f3022
Stage
hariharans29 Jun 25, 2025
61c1872
Some tests work
hariharans29 Jun 25, 2025
16da92b
Git attempt
hariharans29 Jun 25, 2025
3ce481d
Lint attempt
hariharans29 Jun 25, 2025
29f66bd
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 Jun 25, 2025
987574b
More changesc
hariharans29 Jun 25, 2025
d921b06
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 Jun 25, 2025
cf92e6f
Fix tests
hariharans29 Jun 25, 2025
8156fc7
Stage
hariharans29 Jun 26, 2025
9a1fe22
Stage
hariharans29 Jun 26, 2025
31c8f93
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 Jun 26, 2025
92ec5ff
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 Jun 26, 2025
7159d5e
Try fix x86 builds
hariharans29 Jun 26, 2025
7ad1d36
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 Jun 26, 2025
03f2916
Try fix lint errors
hariharans29 Jun 26, 2025
47420b5
Yipee zero point tests are all passing
hariharans29 Jun 27, 2025
2a5100d
Comments and Nits
hariharans29 Jun 27, 2025
d64568b
Enable MatmulNBits test
hariharans29 Jun 27, 2025
0c55755
Fixes
hariharans29 Jun 27, 2025
01d4a98
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
hariharans29 Jun 27, 2025
c8188d4
a
hariharans29 Jun 27, 2025
635eec9
I8MM support re-enable
hariharans29 Jun 27, 2025
f736fae
Fix warning
hariharans29 Jun 27, 2025
aa79467
Enable tests with ZP = false
hariharans29 Jun 28, 2025
10e3afa
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Jun 28, 2025
c4331e0
I8MM fixes
hariharans29 Jun 28, 2025
5b7c3af
Remove unnecessary template
hariharans29 Jun 28, 2025
9ae58ee
Resolve conflicts and update PR with more fixes
hariharans29 Jul 31, 2025
b6cd309
Fix warning
hariharans29 Jul 31, 2025
98f5fe0
Properly remove warning
hariharans29 Jul 31, 2025
0d9442b
Merge remote-tracking branch 'origin' into hari/matmul8bits_arm
hariharans29 Aug 5, 2025
9c2faa6
PR feedback
hariharans29 Sep 2, 2025
47e2420
Refine
hariharans29 Sep 3, 2025
5eb9ed9
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
bb978f7
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
9b5c389
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
76d085b
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
12e3a1d
Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
hariharans29 Sep 3, 2025
3827317
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
d8f4235
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
46aa362
Ignore sending scales while pre-packing weights on ARM64
hariharans29 Sep 3, 2025
2c956ae
Fix warning
hariharans29 Sep 3, 2025
83296bb
4 bit fix
hariharans29 Sep 3, 2025
8f14500
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 Sep 3, 2025
405105b
Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
hariharans29 Sep 3, 2025
303e867
Lint
hariharans29 Sep 3, 2025
91de908
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 Sep 3, 2025
890a046
Fix lintrunner mess-up once and for all
hariharans29 Sep 3, 2025
ec0c8ab
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
1f71f6c
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 Sep 3, 2025
376fc1b
Lint
hariharans29 Sep 3, 2025
eefa72c
More fixes
hariharans29 Sep 3, 2025
e1da3d5
PR comments
hariharans29 Sep 4, 2025
77dff22
Missed out on one
hariharans29 Sep 4, 2025
7404cb3
Remove guards
hariharans29 Sep 4, 2025
edb3d72
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
hariharans29 Sep 4, 2025
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
hariharans29 and github-actions[bot] authored Sep 3, 2025
commit ec0c8abe25e5cfe07d598eb9992e21d683f1a6ea
10 changes: 5 additions & 5 deletions onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
@@ -230,11 +230,11 @@ Status MatMulNBits<T1>::PrePack(const Tensor& tensor, int input_idx, /*out*/ All
 }

 #if defined(MLAS_TARGET_ARM64)
-if (input_idx == InputIndex::scales && packed_b_ != nullptr &&
-MlasQNBitGemmScalesPacked(K_, nbits_, block_size_, compute_type_, has_zp_input_)) {
-scales_are_packed_ = true;
-is_packed = true;
-}
+if (input_idx == InputIndex::scales && packed_b_ != nullptr &&
+MlasQNBitGemmScalesPacked(K_, nbits_, block_size_, compute_type_, has_zp_input_)) {
+scales_are_packed_ = true;
+is_packed = true;
+}
 #endif  // MLAS_TARGET_ARM64
 }
