SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs
ACM Transactions on Architecture and Code Optimization (IF 1.6). Pub Date: 2024-01-19. DOI: 10.1145/3632957
Yunping Zhao, Sheng Ma, Hengzhu Liu, Libo Huang, Yi Dai

Deep Neural Networks (DNNs) have achieved great progress in academia and industry, but they have become increasingly computation- and memory-intensive as network depth grows. Previous designs have sought breakthroughs at both the software and hardware levels to mitigate these challenges. At the software level, neural network compression techniques effectively reduce network scale and energy consumption; however, conventional compression algorithms are complex and energy-intensive. At the hardware level, improvements in semiconductor processes have effectively reduced power and energy consumption; however, the traditional von Neumann architecture struggles to reduce power consumption further, due to the memory wall and the end of Moore's law. To overcome these challenges, DNN machines based on spintronic devices have emerged, offering non-volatility, ultra-low power, and high energy efficiency. Yet no spin-based design has achieved innovation at both the software and hardware levels; in particular, there has been no systematic study of a spin-based DNN architecture for deploying compressed networks.

In our study, we present SAC, an ultra-efficient Spin-based Architecture for Compressed DNNs, which substantially reduces both power and energy consumption. Specifically, we propose a One-Step Compression (OSC) algorithm that reduces computational complexity with minimal accuracy loss. We also propose a spin-based architecture that delivers better performance for the compressed network. Furthermore, we introduce a novel computation flow that enables the reuse of activations and weights. Experimental results show that our approach reduces the computational complexity of the compression algorithm from 𝒪(Tk³) to 𝒪(k² log k), and achieves a 14×–40× compression ratio. Furthermore, our design attains a 2× improvement in power efficiency and a 5× improvement in computational efficiency compared to Eyeriss. Our models are available at an anonymous link: https://bit.ly/39cdtTa.
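As a rough sanity check on the claimed complexity reduction, the snippet below evaluates both asymptotic cost terms numerically. The concrete values T = 100 (iterations of a conventional iterative compression pipeline) and k = 512 (a representative layer dimension) are assumptions chosen for illustration, not figures from the paper.

```python
import math

# Illustrative parameters -- assumed for this sketch, not taken from the paper:
# T: iterations a conventional iterative compression pipeline would run,
# k: a representative layer dimension.
T = 100
k = 512

conventional = T * k**3            # O(T k^3): conventional compression cost
one_step = k**2 * math.log2(k)     # O(k^2 log k): One-Step Compression (OSC) cost

print(f"conventional: ~{conventional:.2e} ops")
print(f"one-step:     ~{one_step:.2e} ops")
print(f"ratio:        ~{conventional / one_step:.1e}x")
```

The abstract also highlights a computation flow that reuses activations and weights. As a generic illustration of operand reuse in a tiled loop nest (not the paper's actual dataflow), the sketch below fetches each weight tile once and reuses it across the whole activation batch, while each activation tile stays resident across all output rows that consume it:

```python
import numpy as np

def tiled_matmul_reuse(W, X, tile=64):
    """Tiled Y = W @ X. Each weight tile is fetched once and reused across
    the whole batch; each activation tile is reused across all output rows."""
    M, K = W.shape
    _, B = X.shape
    Y = np.zeros((M, B))
    for k0 in range(0, K, tile):                   # activation tile stays resident
        x_tile = X[k0:k0 + tile, :]
        for m0 in range(0, M, tile):               # weight tile fetched once here
            w_tile = W[m0:m0 + tile, k0:k0 + tile]
            Y[m0:m0 + tile, :] += w_tile @ x_tile  # reused across all B columns
    return Y

# Quick check against the reference result.
W = np.random.rand(128, 256)
X = np.random.rand(256, 8)
assert np.allclose(tiled_matmul_reuse(W, X), W @ X)
```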



