PyTorch implementation of "Near-Lossless Post-Training Quantization of Deep Neural Networks via a Piecewise Linear Approximation"
There are 5 main arguments:
- quantize: whether to quantize parameters (per-channel) and activations (per-tensor)
- imagenet_path: path to the folder that contains the train/val folders of the ImageNet data
- model: the model architecture, one of ['mobilenetv2', 'resnet50', 'inceptionv3']; defaults to mobilenetv2
- qtype: the weight-quantization scheme, one of ['uniform', 'pws', 'pwg', 'pwl']; defaults to uniform
- bits_weight: number of bits for weight quantization; defaults to 8
Run the 4-bit PWS-quantized mobilenetv2 model with:

```
python main_cls.py --quantize --qtype pws --model mobilenetv2 --bits_weight 4
```
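The other documented flags combine the same way; for example, an 8-bit uniform-quantized resnet50 with an explicit data path:

```
python main_cls.py --quantize --qtype uniform --model resnet50 --bits_weight 8 --imagenet_path /path/to/imagenet
```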
The quantization in this repo is fake (simulated) quantization: tensors are quantized and immediately dequantized, so inference is NOT done in pure INT8 arithmetic.
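As a rough illustration of what fake quantization means here, the sketch below rounds a weight tensor onto an integer grid and immediately maps it back to float, so all arithmetic stays in floating point. The function name and the symmetric per-channel scheme are illustrative assumptions, not this repo's exact code:

```python
import torch

def fake_quantize_per_channel(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Simulated (fake) symmetric per-channel weight quantization.

    Illustrative sketch only, not this repo's implementation. Each output
    channel (dim 0) gets its own scale; values are rounded to the integer
    grid and mapped straight back to float.
    """
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    flat = w.reshape(w.shape[0], -1)
    scale = flat.abs().amax(dim=1) / qmax         # one scale per channel
    scale = scale.clamp(min=1e-8)                 # avoid division by zero
    scale = scale.reshape((-1,) + (1,) * (w.dim() - 1))  # broadcast over dim 0
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale                              # back to float right away

# Per-tensor activation quantization is analogous, with a single scale
# for the whole tensor instead of one per channel.
```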
Done:
- Uniform quantization
- PWS quantization (the piecewise idea is sketched below)

TODO:
- update results for classification models
- PWG quantization
- PWL quantization
- detection model
- segmentation model
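The piecewise schemes (pws/pwg/pwl) follow the paper's core idea: instead of one uniform grid over the full range, the range is split at a breakpoint into a dense center region and sparse tails, each quantized with its own uniform scale. The sketch below uses an arbitrary fixed breakpoint and gives each region the full bit budget for simplicity; the paper instead optimizes the breakpoint (e.g., by search or from a Gaussian/Laplacian fit) and accounts for the extra region bookkeeping, so treat this only as a hedged illustration:

```python
import torch

def piecewise_fake_quantize(w: torch.Tensor, bits: int = 4,
                            p_ratio: float = 0.25) -> torch.Tensor:
    """Hedged sketch of piecewise linear (fake) quantization.

    The range [-m, m] is split at a breakpoint p into a dense center
    region [-p, p] and sparse tails; each region gets its own uniform
    grid. Here p is a fixed fraction of m purely for illustration.
    """
    qmax = 2 ** (bits - 1) - 1
    m = w.abs().max()
    p = p_ratio * m

    in_center = w.abs() <= p
    scale_center = (p / qmax).clamp(min=1e-8)      # fine grid for the center
    scale_tail = ((m - p) / qmax).clamp(min=1e-8)  # coarse grid for the tails

    # Center region: round on the fine grid.
    q_center = torch.round(w / scale_center).clamp(-qmax, qmax) * scale_center

    # Tail regions: quantize the offset beyond the breakpoint, keep the sign.
    offset = (w.abs() - p).clamp(min=0)
    q_tail = torch.sign(w) * (
        p + torch.round(offset / scale_tail).clamp(0, qmax) * scale_tail
    )

    return torch.where(in_center, q_center, q_tail)
```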