Determining optimal channel partition for 2:4 fine grained structured sparsity

Original Paper · Published in Optimization Letters

Abstract

Deep Neural Networks (DNNs) have demonstrated tremendous success in many applications, but incur a high computational burden at inference time. The 2:4 sparsity pruning method was recently developed to compress and accelerate DNNs with little to no loss in performance. The method comprises a training phase followed by a pruning step, in which 2 out of every 4 consecutive weights are eliminated to obtain a pruned matrix; the network is then retrained to fine-tune the remaining weights. The accuracy of the resulting sparse network can be improved by permuting the weight matrix along the channel dimension so as to maximize the total magnitude of the weights preserved during pruning. While earlier works have proposed heuristics to generate good permutations, we formalize the problem as a discrete optimization problem. In this paper, we propose four different mathematical programs to determine optimal permutations and compare their performance on small instances using a standard solver. Further, we develop a complementary column generation scheme to handle DNNs with a realistic number of channels.
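To make the objective concrete, the NumPy sketch below illustrates (i) 2:4 magnitude pruning of a weight matrix and (ii) how permuting the channels (columns) changes the total magnitude preserved, with the optimum found by brute force on a toy instance. This is a minimal illustration under our own assumptions, not the paper's method: the function names and the exhaustive search are ours, and the paper's mathematical programs and column generation scheme replace this brute force at realistic channel counts.

```python
import itertools

import numpy as np


def prune_2_4(row):
    """Keep the 2 largest-magnitude weights in every group of 4
    consecutive weights and zero the rest (assumes len(row) % 4 == 0).
    Returns the total magnitude preserved."""
    w = row.copy()
    for g in range(0, len(w), 4):
        group = w[g:g + 4]
        drop = np.argsort(np.abs(group))[:2]  # two smallest magnitudes
        group[drop] = 0.0                     # mutates w through the view
    return np.abs(w).sum()


def preserved_magnitude(W, perm):
    """Total magnitude kept after permuting the channels (columns)
    of W by `perm` and then 2:4-pruning every row."""
    Wp = W[:, list(perm)]
    return sum(prune_2_4(r) for r in Wp)


def best_permutation_bruteforce(W):
    """Exhaustive search over channel permutations; tractable only for
    a handful of channels, which is why scalable methods are needed."""
    n = W.shape[1]
    return max(itertools.permutations(range(n)),
               key=lambda p: preserved_magnitude(W, p))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 8))  # 8 outputs, 8 input channels
    identity = tuple(range(W.shape[1]))
    best = best_permutation_bruteforce(W)
    print("identity permutation keeps:", preserved_magnitude(W, identity))
    print("best permutation", best, "keeps:", preserved_magnitude(W, best))
```

Note that the objective depends only on which channels land in the same block of four after permutation, so the search is really over partitions of the channels into groups of 4 (hence "channel partition" in the title); the brute force above evaluates many equivalent permutations per partition.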

Acknowledgements

We acknowledge the immense contributions of Dr. Jeff Pool of NVIDIA, who introduced us to this problem and supported our numerical testing by sharing the APEX repository and his expertise. Rakesh Nagi also acknowledges NVIDIA for an equipment gift under the Applied Accelerator Program.

Author information

Corresponding author

Correspondence to Rakesh Nagi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mahajan, M., Hwu, WM. & Nagi, R. Determining optimal channel partition for 2:4 fine grained structured sparsity. Optim Lett (2024). https://doi.org/10.1007/s11590-023-02084-8
