
Piecewise-Quantization

PyTorch implementation of Near-Lossless Post-Training Quantization of Deep Neural Networks via a Piecewise Linear Approximation

Usage

There are five main arguments:

  1. quantize: whether to quantize parameters (per-channel) and activations (per-tensor).
  2. imagenet_path: path to the folder containing the train/val folders of ImageNet data.
  3. model: the model type; one of ['mobilenetv2', 'resnet50', 'inceptionv3']. Defaults to mobilenetv2.
  4. qtype: the quantization type for weights; one of ['uniform', 'pws', 'pwg', 'pwl']. Defaults to uniform.
  5. bits_weight: number of bits for weight quantization. Defaults to 8.

Run the 4-bit PWS-quantized mobilenetv2 model with:

python main_cls.py --quantize --qtype pws --model mobilenetv2 --bits_weight 4
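The piecewise schemes (pws/pwg/pwl) follow the paper's idea of splitting the weight range into regions, each with its own uniform grid, so the dense center of the distribution keeps a finer step than a single grid over the full range would. A rough two-piece sketch of that idea (the fixed breakpoint and function names here are illustrative, not this repo's actual breakpoint selection):

```python
import torch

def piecewise_quantize(w, num_bits=8, breakpoint_ratio=0.5):
    # Illustrative two-piece symmetric quantization: values inside [-t, t]
    # and outside it get separate uniform grids. The breakpoint t (a fixed
    # fraction of max|w|) is a placeholder, not the paper's optimal choice.
    t = breakpoint_ratio * w.abs().max()
    levels = 2 ** (num_bits - 1) - 1  # positive levels per region
    inner = w.abs() <= t
    out = torch.empty_like(w)
    # Center region: fine grid over [-t, t].
    scale_in = t / levels
    out[inner] = torch.round(w[inner] / scale_in) * scale_in
    # Tail region: separate grid over (t, max|w|], applied symmetrically.
    scale_out = (w.abs().max() - t) / levels
    shifted = w[~inner].abs() - t
    out[~inner] = torch.sign(w[~inner]) * (t + torch.round(shifted / scale_out) * scale_out)
    return out

w = torch.randn(1000)
wq = piecewise_quantize(w)
```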

Notes

Fake quantization

The quantization in this repo is fake quantization: inference is NOT pure INT8 arithmetic.
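Fake quantization means tensors are rounded to the integer grid and immediately de-quantized, so the quantization error is simulated while all computation stays in floating point. A minimal sketch of that quantize-dequantize step (an assumed asymmetric per-tensor scheme, not necessarily this repo's exact formulation):

```python
import torch

def fake_quantize(x, num_bits=8):
    # Simulated ("fake") uniform quantization: round to the integer grid,
    # then immediately de-quantize back to float. Downstream arithmetic
    # stays in floating point; no real INT8 kernels are involved.
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = qmin - torch.round(x.min() / scale)
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale  # back to float

x = torch.randn(4, 4)
xq = fake_quantize(x)       # same dtype as x: still a float tensor
```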

TODO

  • Uniform quantization
  • PWS quantization
  • Update results for classification models
  • PWG quantization
  • PWL quantization
  • Detection model
  • Segmentation model
