Add INC WoQ and remove ITREX dependency #880
Conversation
Force-pushed from 7a3e2e5 to c078ca2.
```python
trainer.model = quantizer._quantized_model
```

```python
if optim_args.apply_quantization and optim_args.quantization_approach in {"weight_only"}:
```
Suggested change:

```diff
- if optim_args.apply_quantization and optim_args.quantization_approach in {"weight_only"}:
+ if optim_args.apply_quantization and optim_args.quantization_approach == "weight_only":
```
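A set membership test like `in {"weight_only"}` only earns its keep once several approaches are accepted; with a single accepted value, a plain equality comparison states the intent more directly.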
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
warnings.warn(
    "Weight only quantization model loading provided by intel_extension_for_transformers is deprecated and it is provided by INC now.",
    DeprecationWarning,
)
```
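One practical note, not from the PR itself: Python's default warning filters suppress `DeprecationWarning` for code running outside `__main__`, so users of the library will often not see this message unless they opt in:

```python
import warnings

# Surface DeprecationWarning raised in library code; Python's default
# filters hide it outside __main__.
warnings.simplefilter("always", DeprecationWarning)
```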
Could this be determined from the model itself (that the model was quantized through ITREX)?
This is not noticeable to users; only the code inside optimum-intel changes, from importing ITREX to importing INC. Unfortunately, the model does not have an attribute indicating which tool produced it.
Added a check here in 08091bc: it inspects the quantization configuration (if present, and whether the algorithm parameter matches).
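A sketch of what such a check could look like; the function name and the `quantization_config`/`quant_method` key layout are illustrative assumptions, not the actual code from 08091bc:

```python
import json
import os

def looks_weight_only_quantized(model_dir: str) -> bool:
    """Hypothetical helper: guess whether a checkpoint came from a
    weight-only quantization flow by inspecting its config.json."""
    config_path = os.path.join(model_dir, "config.json")
    if not os.path.isfile(config_path):
        return False
    with open(config_path) as f:
        config = json.load(f)
    quant_config = config.get("quantization_config") or {}
    # Assumed key name and algorithm set; the real check matches the
    # algorithm parameter recorded by the quantizer.
    algorithm = str(quant_config.get("quant_method", "")).upper()
    return algorithm in {"RTN", "AWQ", "TEQ", "GPTQ", "AUTOROUND"}
```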
Force-pushed from 03e2629 to 08091bc.
* add inc woq and remove itrex dependency
* Update optimum/intel/neural_compressor/modeling_base.py (Co-authored-by: Ella Charlaix)
* Update optimum/intel/neural_compressor/modeling_base.py (Co-authored-by: Ella Charlaix)
* Update optimum/intel/neural_compressor/modeling_base.py (Co-authored-by: Ella Charlaix)
* Update optimum/intel/neural_compressor/modeling_base.py (Co-authored-by: Ella Charlaix)
* fix code according comment
* add logger setting
* improve ut
* move woq quantization to quantization.py
* Update examples/neural_compressor/language-modeling/run_clm.py (Co-authored-by: Ilyas Moutawwakil)
* Update examples/neural_compressor/language-modeling/run_clm.py (Co-authored-by: Ilyas Moutawwakil)
* remove dependency
* Update examples/neural_compressor/language-modeling/run_clm.py
* add woq saving and loading ut and logger info
* set transformers version limit
* fix installation neural_compressor[pt]
* improve ut
* refactoring
* Refactor
* revert
* fix datasets loading issue
* fix

Signed-off-by: changwangss <[email protected]>
Co-authored-by: Ella Charlaix <[email protected]>
Co-authored-by: Ilyas Moutawwakil <[email protected]>
What does this PR do?
This PR is based on #841. INC 3.0.2 has been released, so we plan to remove the ITREX dependency and rely on INC to apply WOQ. @echarlaix
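For context, a minimal sketch of weight-only (RTN) quantization on the INC 3.x PyTorch API that this PR moves to; the model name and RTN settings are illustrative, and the actual integration lives in optimum-intel's quantization code:

```python
from neural_compressor.torch.quantization import RTNConfig, convert, prepare
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # example model

# Round-to-nearest weight-only quantization: 4-bit weights grouped per
# 32 input channels. Parameter values here are examples, not the PR's defaults.
quant_config = RTNConfig(bits=4, group_size=32)
model = prepare(model, quant_config)  # mark modules for quantization
model = convert(model)               # replace weights with quantized versions
```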
Fixes # (issue)
Before submitting