[MLAS] Add 8-bit weights ARM64 Gemm implementation #25110
Merged
75 commits
b52a1ce
finished prepack
fajin-corp 0523106
changed interface to support blocksum2
fajin-corp fd92ab8
finished quantb for quant a unsigned
fajin-corp ed5cf8d
finished quantize a
fajin-corp b9b9691
finished Q8Int8GemmR2xC8Neon
fajin-corp 685baff
finished kernels
fajin-corp 6747330
fixed build
fajin-corp b087317
passed prepack
fajin-corp 196c04c
finished ut for quant a
fajin-corp 353d460
fixed build
fajin-corp 4d62e32
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
hariharans29 e88e32d
Comment out some 4 bit tests
hariharans29 58011b0
Apple I8MM check
hariharans29 acc4b81
Tests
hariharans29 2700493
Tests 2
hariharans29 76de326
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 159d4d3
Changes
hariharans29 e4bc74e
Fixes
hariharans29 e92055b
Re-enable 4 bit tests
hariharans29 94f3022
Stage
hariharans29 61c1872
Some tests work
hariharans29 16da92b
Git attempt
hariharans29 3ce481d
Lint attempt
hariharans29 29f66bd
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 987574b
More changesc
hariharans29 d921b06
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 cf92e6f
Fix tests
hariharans29 8156fc7
Stage
hariharans29 9a1fe22
Stage
hariharans29 31c8f93
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 92ec5ff
Update onnxruntime/test/mlas/unittest/test_sq8bitgemm.cpp
hariharans29 7159d5e
Try fix x86 builds
hariharans29 7ad1d36
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 03f2916
Try fix lint errors
hariharans29 47420b5
Yipee zero point tests are all passing
hariharans29 2a5100d
Comments and Nits
hariharans29 d64568b
Enable MatmulNBits test
hariharans29 0c55755
Fixes
hariharans29 01d4a98
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
hariharans29 c8188d4
a
hariharans29 635eec9
I8MM support re-enable
hariharans29 f736fae
Fix warning
hariharans29 aa79467
Enable tests with ZP = false
hariharans29 10e3afa
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 c4331e0
I8MM fixes
hariharans29 5b7c3af
Remove unnecessary template
hariharans29 9ae58ee
Resolve conflicts and update PR with more fixes
hariharans29 b6cd309
Fix warning
hariharans29 98f5fe0
Properly remove warning
hariharans29 0d9442b
Merge remote-tracking branch 'origin' into hari/matmul8bits_arm
hariharans29 9c2faa6
PR feedback
hariharans29 47e2420
Refine
hariharans29 5eb9ed9
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 bb978f7
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 9b5c389
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 76d085b
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 12e3a1d
Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
hariharans29 3827317
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 d8f4235
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 46aa362
Ignore sending scales while pre-packing weights on ARM64
hariharans29 2c956ae
Fix warning
hariharans29 83296bb
4 bit fix
hariharans29 8f14500
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 405105b
Update onnxruntime/test/contrib_ops/matmul_8bits_test.cc
hariharans29 303e867
Lint
hariharans29 91de908
Merge branch 'hari/matmul8bits_arm' of https://github.com/microsoft/o…
hariharans29 890a046
Fix lintrunner mess-up once and for all
hariharans29 ec0c8ab
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 1f71f6c
Update onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc
hariharans29 376fc1b
Lint
hariharans29 eefa72c
More fixes
hariharans29 e1da3d5
PR comments
hariharans29 77dff22
Missed out on one
hariharans29 7404cb3
Remove guards
hariharans29 edb3d72
Merge remote-tracking branch 'origin/main' into hari/matmul8bits_arm
hariharans29