Abstract
Deep Neural Networks (DNNs) have demonstrated tremendous success in many applications, but incur a high computational burden at inference time. The 2:4 sparsity pruning method has recently been developed to effectively compress and accelerate DNNs with little to no loss in performance. The method comprises a training phase followed by a pruning step in which 2 out of every 4 consecutive weights are eliminated to obtain a pruned matrix, which is then retrained to fine-tune the remaining weights. The accuracy of the resultant sparse network is maximized by permuting the matrix along the channel dimension so as to maximize the total magnitude of weights preserved during pruning. While earlier works have proposed heuristic methods to generate good permutations, we formalize the problem as a discrete optimization problem. In this paper, we propose four different mathematical programs to determine the optimal permutations and compare their performance on small instances using a standard solver. Further, we develop a complementary column generation scheme to handle DNNs with a realistic number of channels.
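The core objects in the abstract can be made concrete with a small sketch: 2:4 pruning zeroes the two smallest-magnitude weights in each group of four consecutive entries along a row, and a channel permutation reorders columns before pruning so that large weights do not compete within the same group. The snippet below is a minimal illustration, not the paper's formulation; the function names `prune_2_4` and `preserved_magnitude` and the toy matrix are assumptions for demonstration.

```python
import numpy as np

def prune_2_4(W):
    """Zero the 2 smallest-magnitude weights in each group of 4 along each row."""
    W = np.asarray(W, dtype=float)
    rows, cols = W.shape
    assert cols % 4 == 0, "2:4 sparsity needs the column count divisible by 4"
    groups = W.copy().reshape(rows, cols // 4, 4)
    # Indices of the two smallest |w| within each group of four.
    idx = np.argsort(np.abs(groups), axis=2)[:, :, :2]
    np.put_along_axis(groups, idx, 0.0, axis=2)
    return groups.reshape(rows, cols)

def preserved_magnitude(W, perm):
    """Total |w| retained after 2:4 pruning when channels are reordered by perm."""
    return float(np.abs(prune_2_4(np.asarray(W)[:, perm])).sum())

# Toy example: all large weights sit in the first group of four.
W = np.array([[9.0, 8.0, 7.0, 6.0, 1.0, 2.0, 3.0, 4.0]])
identity = list(range(8))
interleave = [0, 1, 4, 5, 2, 3, 6, 7]  # spread large weights across groups
print(preserved_magnitude(W, identity))    # 24.0 (keeps 9,8 and 3,4)
print(preserved_magnitude(W, interleave))  # 30.0 (keeps 9,8 and 7,6)
```

The permutation raises the preserved magnitude from 24 to 30 by separating the four large weights into different groups, which is exactly the objective the paper's mathematical programs optimize over all channel orderings.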
Acknowledgements
We acknowledge the immense contributions of Dr. Jeff Pool, NVIDIA, who introduced us to this problem and supported the numerical testing by sharing the APEX repository and his expertise. Rakesh Nagi also acknowledges NVIDIA for an equipment gift under the Applied Accelerator Program.
Cite this article
Mahajan, M., Hwu, WM. & Nagi, R. Determining optimal channel partition for 2:4 fine grained structured sparsity. Optim Lett (2024). https://doi.org/10.1007/s11590-023-02084-8