modify 3.x ipex example structure #1858
Merged
Commits (17):
- 4fb8843 modify 3.x ipex example structure (violetch24)
- c1597cc Merge branch 'master' into zixuan/3x_ipex_example (violetch24)
- 28be72b add json path (violetch24)
- a1a0916 Merge branch 'master' into zixuan/3x_ipex_example (violetch24)
- dd7e71b Merge branch 'master' into zixuan/3x_ipex_example (violetch24)
- d3f9bee Merge branch 'master' into zixuan/3x_ipex_example (violetch24)
- 7236eb2 fix for sq (violetch24)
- 2cbf238 Merge branch 'master' into zixuan/3x_ipex_example (violetch24)
- 5b5ba7d minor fix (violetch24)
- c982739 Update run_clm_no_trainer.py (violetch24)
- 34282d0 Update run_clm_no_trainer.py (violetch24)
- 383b6a2 Update run_clm_no_trainer.py (violetch24)
- 5f4fecf Merge branch 'master' into zixuan/3x_ipex_example (xin3he)
- 6b83c9e minor fix (violetch24)
- 00fe9d9 remove old files (violetch24)
- 959170d fix act_algo (violetch24)
- 910d9e9 Merge branch 'master' into zixuan/3x_ipex_example (violetch24)
New file (46 additions): the run configuration JSON that registers the restructured example topologies.

```json
{
    "pytorch": {
        "gpt_j_ipex": {
            "model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
            "dataset_location": "",
            "input_model": "",
            "main_script": "run_clm_no_trainer.py",
            "batch_size": 1
        },
        "gpt_j_ipex_sq": {
            "model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
            "dataset_location": "",
            "input_model": "",
            "main_script": "run_clm_no_trainer.py",
            "batch_size": 1
        },
        "llama2_7b_ipex": {
            "model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
            "dataset_location": "",
            "input_model": "",
            "main_script": "run_clm_no_trainer.py",
            "batch_size": 1
        },
        "llama2_7b_ipex_sq": {
            "model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
            "dataset_location": "",
            "input_model": "",
            "main_script": "run_clm_no_trainer.py",
            "batch_size": 1
        },
        "opt_125m_ipex": {
            "model_src_dir": "nlp/huggingface_models/language-modeling/quantization/static_quant",
            "dataset_location": "",
            "input_model": "",
            "main_script": "run_clm_no_trainer.py",
            "batch_size": 8
        },
        "opt_125m_ipex_sq": {
            "model_src_dir": "nlp/huggingface_models/language-modeling/quantization/smooth_quant",
            "dataset_location": "",
            "input_model": "",
            "main_script": "run_clm_no_trainer.py",
            "batch_size": 8
        }
    }
}
```
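A config like this is typically consumed by a small driver that looks up a topology entry and launches its `main_script` with the recorded batch size. The sketch below is an illustrative assumption only; the `run_topology` helper and the `run_config.json` filename are hypothetical, not the repository's actual test harness:

```python
import json
import subprocess

def run_topology(config_path: str, topology: str, extra_args: list) -> None:
    """Look up a topology entry in the run config and launch its main script."""
    with open(config_path) as f:
        entries = json.load(f)["pytorch"]
    entry = entries[topology]
    script = f"{entry['model_src_dir']}/{entry['main_script']}"
    cmd = ["python", script, "--batch_size", str(entry["batch_size"])] + extra_args
    subprocess.run(cmd, check=True)

# Example: launch the OPT-125m smooth-quant entry (hypothetical config path)
# run_topology("run_config.json", "opt_125m_ipex_sq", ["--quantize", "--sq", "--ipex"])
```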
`...ch/nlp/huggingface_models/language-modeling/quantization/smooth_quant/README.md` (new file, 64 additions):
Step-by-Step
============
This document provides step-by-step instructions for running large language models (LLMs) with smooth quantization on the 4th Gen Intel® Xeon® Scalable Processor (codenamed Sapphire Rapids) using PyTorch and Intel® Extension for PyTorch.

The script `run_clm_no_trainer.py` supports quantization of `GPTJ`, `OPT`, `LLaMA2`, `BLOOM` and `Falcon`, and validates last-word prediction accuracy with [lm_eval](https://github.com/EleutherAI/lm-evaluation-harness.git); more models are being added.

# Prerequisite
## 1. Create Environment
```
# Installation
pip install -r requirements.txt
```

# Run

Here is how to run the scripts:

**Causal Language Modeling (CLM)**

`run_clm_no_trainer.py` quantizes large language models using [NeelNanda/pile-10k](https://huggingface.co/datasets/NeelNanda/pile-10k) as the calibration dataset, and validates accuracy on `lambada_openai`, `piqa`, `winogrande`, `hellaswag` and other datasets provided by lm_eval. Example commands follow.

### GPT-J-6b

#### Quantization
```bash
# "--sq" enables smooth quant
python run_clm_no_trainer.py \
  --model EleutherAI/gpt-j-6B \
  --quantize \
  --sq \
  --alpha 1.0 \
  --ipex \
  --output_dir "saved_results"
```
**Note**: Smooth quantization here is based on torch.jit. Without past key values in `example_inputs`, the quantized model cannot be used for text generation.
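To illustrate that note, here is a minimal sketch of what `example_inputs` carrying past key values can look like for GPT-J-6B (28 layers, 16 attention heads, 256-dim heads). The exact tuple layout must match what the quantization script actually traces, so treat this as an assumption rather than the script's own code:

```python
import torch

# GPT-J-6B dimensions: 28 layers, 16 attention heads, head_dim = 4096 / 16 = 256
num_layers, num_heads, head_dim = 28, 16, 256
batch_size, past_len = 1, 1

# One (key, value) pair per layer; zero tensors suffice for shape-only tracing
past_key_values = tuple(
    (
        torch.zeros(batch_size, num_heads, past_len, head_dim),
        torch.zeros(batch_size, num_heads, past_len, head_dim),
    )
    for _ in range(num_layers)
)
input_ids = torch.ones(batch_size, 1, dtype=torch.long)
attention_mask = torch.ones(batch_size, past_len + 1, dtype=torch.long)
example_inputs = (input_ids, attention_mask, past_key_values)
```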
### OPT-125m

#### Quantization

```bash
# "--sq" enables smooth quant
python run_clm_no_trainer.py \
  --model facebook/opt-125m \
  --quantize \
  --sq \
  --alpha 0.5 \
  --ipex \
  --output_dir "saved_results"
```

### LLAMA2-7b/13b/70b
>Note: LLaMA requires IPEX >= 2.1 for better accuracy.
#### Quantization

```bash
# "--sq" enables smooth quant
python run_clm_no_trainer.py \
  --model meta-llama/Llama-2-7b-hf \
  --quantize \
  --sq \
  --alpha 0.8 \
  --ipex \
  --output_dir "saved_results"
```
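After quantization, the saved model can be checked for accuracy. The README does not spell this step out; the command below is a plausible combination assembled from the flags `run_benchmark.sh` (further down) passes to the same script, so treat the exact flag set as an assumption:

```bash
# Plausible evaluation run (flags taken from run_benchmark.sh below)
python run_clm_no_trainer.py \
  --model meta-llama/Llama-2-7b-hf \
  --accuracy \
  --int8 \
  --ipex \
  --sq \
  --alpha 0.8 \
  --task lambada_openai \
  --output_dir "saved_results"
```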
`...torch/nlp/huggingface_models/language-modeling/quantization/smooth_quant/requirements.txt` (new file, 13 additions):

```
accelerate
protobuf
sentencepiece != 0.1.92
datasets >= 1.1.3
torch >= 1.10
transformers
pytest
wandb
einops
neural-compressor
intel-extension-for-transformers
lm_eval==0.4.2
peft
```
`...torch/nlp/huggingface_models/language-modeling/quantization/smooth_quant/run_benchmark.sh` (new file, 94 additions):

```bash
#!/bin/bash
set -x

function main {
  init_params "$@"
  run_benchmark
}

# init params
function init_params {
  iters=100
  batch_size=16
  approach=static
  tuned_checkpoint=saved_results
  task=lambada_openai
  for var in "$@"
  do
    case $var in
      --topology=*)
          topology=$(echo $var |cut -f2 -d=)
      ;;
      --dataset_location=*)
          dataset_location=$(echo $var |cut -f2 -d=)
      ;;
      --input_model=*)
          input_model=$(echo $var |cut -f2 -d=)
      ;;
      --mode=*)
          mode=$(echo $var |cut -f2 -d=)
      ;;
      --batch_size=*)
          batch_size=$(echo $var |cut -f2 -d=)
      ;;
      --iters=*)
          iters=$(echo ${var} |cut -f2 -d=)
      ;;
      --int8=*)
          int8=$(echo ${var} |cut -f2 -d=)
      ;;
      --config=*)
          tuned_checkpoint=$(echo $var |cut -f2 -d=)
      ;;
      *)
          echo "Error: No such parameter: ${var}"
          exit 1
      ;;
    esac
  done
}

# run_benchmark
function run_benchmark {
  extra_cmd=''

  if [[ ${mode} == "accuracy" ]]; then
    mode_cmd=" --accuracy "
  elif [[ ${mode} == "performance" ]]; then
    mode_cmd=" --performance --iters ${iters}"
  else
    echo "Error: No such mode: ${mode}"
    exit 1
  fi

  if [[ ${int8} == "true" ]]; then
    extra_cmd=$extra_cmd" --int8"
  fi
  echo $extra_cmd

  # map topology names to model ids and their smooth-quant alpha settings
  if [ "${topology}" = "opt_125m_ipex_sq" ]; then
    model_name_or_path="facebook/opt-125m"
    extra_cmd=$extra_cmd" --ipex --sq --alpha 0.5"
  elif [ "${topology}" = "llama2_7b_ipex_sq" ]; then
    model_name_or_path="meta-llama/Llama-2-7b-hf"
    extra_cmd=$extra_cmd" --ipex --sq --alpha 0.8"
  elif [ "${topology}" = "gpt_j_ipex_sq" ]; then
    model_name_or_path="EleutherAI/gpt-j-6b"
    extra_cmd=$extra_cmd" --ipex --sq --alpha 1.0"
  fi

  python -u run_clm_no_trainer.py \
    --model ${model_name_or_path} \
    --approach ${approach} \
    --output_dir ${tuned_checkpoint} \
    --task ${task} \
    --batch_size ${batch_size} \
    ${extra_cmd} ${mode_cmd}
}

main "$@"
```
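For reference, a typical invocation of this script, using only the parameters its `init_params` function accepts (the argument values here are illustrative):

```bash
# Evaluate int8 accuracy of the OPT-125m smooth-quant topology
bash run_benchmark.sh \
    --topology=opt_125m_ipex_sq \
    --mode=accuracy \
    --batch_size=8 \
    --int8=true \
    --config=saved_results
```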