Determining optimal channel partition for 2:4 fine grained structured sparsity
Optimization Letters (IF 1.6) Pub Date: 2024-01-11, DOI: 10.1007/s11590-023-02084-8
Mohit Mahajan , Wen-Mei Hwu , Rakesh Nagi

Abstract

Deep Neural Networks (DNNs) have demonstrated tremendous success in many applications, but incur a high computational burden at inference time. The 2:4 sparsity pruning method was recently developed to compress and accelerate DNNs with little to no loss in performance. The method comprises a training phase followed by a pruning step in which 2 out of every 4 consecutive weights are eliminated to obtain a pruned matrix, which is then retrained to fine-tune the remaining weights. The accuracy of the resulting sparse network is maximized by permuting the matrix along the channel dimension so as to maximize the total magnitude of the weights preserved during pruning. While earlier works proposed heuristic methods to generate good permutations, we formalize the problem as a discrete optimization problem. In this paper, we propose four different mathematical programs to determine optimal permutations and compare their performance on small instances using a standard solver. Further, we develop a complementary column generation scheme to handle DNNs with a realistic number of channels.
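As an illustration of the objective described above (not the paper's code or its optimization models), the following sketch applies 2:4 magnitude pruning to a weight matrix and searches channel (column) permutations exhaustively for the one that maximizes the preserved magnitude. Exhaustive search is feasible only for tiny matrices; the paper's mathematical programs and column generation scheme replace it at realistic channel counts. All function names here are hypothetical.

```python
import itertools
import numpy as np

def prune_2_of_4(row):
    """Keep the 2 largest-magnitude weights in each group of 4; zero the rest."""
    row = row.copy()
    for g in range(0, len(row), 4):
        group = row[g:g + 4]
        # indices of the 2 smallest magnitudes within this group of 4
        drop = np.argsort(np.abs(group))[:2]
        group[drop] = 0.0
    return row

def preserved_magnitude(W):
    """Total |weight| surviving 2:4 pruning, summed over all rows of W."""
    return sum(np.abs(prune_2_of_4(r)).sum() for r in W)

def best_channel_permutation(W):
    """Brute-force search over column (channel) permutations for the
    ordering that maximizes the magnitude preserved by 2:4 pruning."""
    n = W.shape[1]
    best_perm, best_val = None, -1.0
    for perm in itertools.permutations(range(n)):
        val = preserved_magnitude(W[:, list(perm)])
        if val > best_val:
            best_perm, best_val = perm, val
    return best_perm, best_val

# Toy example: one output row, 8 input channels (two groups of 4).
# In channel order, the second group keeps only small weights; a good
# permutation pairs large weights so all four survive pruning.
W = np.array([[0.9, 0.8, 0.7, 0.6, 0.1, 0.1, 0.1, 0.1]])
perm, val = best_channel_permutation(W)
```

For this example, the identity ordering preserves 0.9 + 0.8 + 0.1 + 0.1 = 1.9, while the optimal permutation splits the four large weights across the two groups and preserves 0.9 + 0.8 + 0.7 + 0.6 = 3.0.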




Updated: 2024-01-12