A Survey of Quantization Methods for Efficient Neural Network Inference

Gholami, Amir; Kim, Sehoon; Dong, Zhen; Yao, Zhewei; Mahoney, Michael W.; Keutzer, Kurt

arXiv:2103.13630 (cs)

[Submitted on 25 Mar 2021 (v1), last revised 21 Jun 2021 (this version, v3)]

Abstract: As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
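
The quantization problem posed in the abstract (mapping continuous real values onto a small discrete set) can be illustrated with a minimal sketch of uniform symmetric quantization. This is an illustration only, not a method taken from the survey: the function names `quantize` and `dequantize`, the sample `weights`, and the choice of a 32-bit floating-point baseline are all assumptions made for the example.

```python
def quantize(values, num_bits=4):
    """Map floats to signed integers in [-qmax, qmax] with one shared scale.

    Hypothetical helper for illustration; not an API from the survey.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 7 when num_bits is 4
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level, then clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate real values from the integer levels."""
    return [qi * scale for qi in q]


weights = [0.9, -0.35, 0.02, -1.4, 0.7]       # made-up sample values
num_bits = 4
q, scale = quantize(weights, num_bits)
recovered = dequantize(q, scale)

# Rounding moves each value by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, recovered))

# Storage ratio relative to an assumed 32-bit floating-point baseline.
storage_ratio = 32 / num_bits

print(q, scale, max_error, storage_ratio)
```

Under the assumed 32-bit baseline, the storage ratio is simply `32 / num_bits`, so fewer bits per value give a larger reduction at the cost of a coarser step size `scale` and therefore a larger worst-case rounding error.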

Comments:	Book Chapter: Low-Power Computer Vision: Improving the Efficiency of Artificial Intelligence
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2103.13630 [cs.CV]
	(or arXiv:2103.13630v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2103.13630

Submission history

From: Amir Gholami
[v1] Thu, 25 Mar 2021 06:57:11 UTC (1,191 KB)
[v2] Thu, 22 Apr 2021 23:59:25 UTC (2,204 KB)
[v3] Mon, 21 Jun 2021 21:01:12 UTC (1,549 KB)
