Energy-Efficient Brain Floating Point Convolutional Neural Network Using Memristors
IEEE Transactions on Electron Devices (IF 3.1) · Pub Date: 2024-04-02 · DOI: 10.1109/ted.2024.3379953
Shao-Qin Tong, Han Bao, Jian-Cong Li, Ling Yang, Hou-Ji Zhou, Yi Li, Xiang-Shui Miao

In this article, a memristor-based convolutional neural network (CNN) is implemented that, for the first time, achieves both brain floating point (BF16) processing accuracy and high energy efficiency for cloud artificial intelligence (AI) acceleration. A low-cost in-memory floating-point (FP) computing arithmetic is developed using an approximate computing technique, which balances the efficiency and accuracy requirements of FP-CNN applications. For further optimization, we investigate the impact of nonideal effects at the device and array levels (including device variation, line resistance, and array parasitic capacitance) on analog computing. Based on these findings, a bit-weight slicing technique is employed for highly efficient and accurate FP computing within the memristor crossbar array. Meanwhile, an in-memory convolutional operating method is proposed to further reduce the hardware overhead of deploying large-scale CNNs for complex datasets. Combining these strategies, we evaluate the performance of memristor-implemented BF16-CNNs with the VGG-16 network on the CIFAR-10 dataset, obtaining 85.47% classification accuracy with an energy efficiency of 1.987 TFLOPS/W and an area efficiency of 20.9895 GFLOPS/mm².
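The abstract does not disclose implementation details, so the following is only a minimal Python sketch of the general idea behind BF16 bit-weight slicing with exponent-aligned approximate accumulation on a multi-level memristor crossbar. The function names (to_bf16, decompose, crossbar_dot), the 2-bit-per-cell slicing, and the block-exponent alignment are illustrative assumptions, not the authors' actual scheme.

```python
import numpy as np

BF16_MANT_BITS = 7   # bfloat16: 1 sign, 8 exponent, 7 mantissa bits
SLICE_BITS = 2       # assumption: 4-level (2-bit) memristor cells
N_SLICES = 4         # 8-bit significand (implicit 1 + 7) -> four 2-bit slices

def to_bf16(x):
    """Truncate float32 to bfloat16 by zeroing the low 16 bits
    (truncation for simplicity; real hardware typically rounds)."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

def decompose(x):
    """Split a bf16 value into (sign, unbiased exponent, 8-bit significand)."""
    bits = int(np.asarray(x, dtype=np.float32).view(np.uint32))
    sign = -1.0 if (bits >> 31) else 1.0
    exp = ((bits >> 23) & 0xFF) - 127
    mant = ((bits >> 16) & 0x7F) | 0x80   # restore the implicit leading 1
    return sign, exp, mant

def crossbar_dot(w, a):
    """Approximate BF16 dot product with bit-weight slicing.

    Each weight's significand is stored slice-by-slice (SLICE_BITS per
    cell); significands are aligned to the block's maximum exponent
    before an integer multiply-accumulate, which is a common
    approximate-FP trick: bits shifted out are dropped, giving a small,
    bounded accuracy loss."""
    terms = [decompose(to_bf16(wi)) for wi in w]
    e_max = max(e for _, e, _ in terms)
    acc = 0.0
    for (s, e, m), ai in zip(terms, a):
        m_aligned = m >> min(e_max - e, 8)   # exponent alignment (lossy)
        # Per-slice analog MACs, recombined digitally with bit weights.
        partial = sum(
            ((m_aligned >> (SLICE_BITS * k)) & ((1 << SLICE_BITS) - 1))
            * (1 << (SLICE_BITS * k)) * ai
            for k in range(N_SLICES))
        acc += s * partial
    return acc * 2.0 ** (e_max - BF16_MANT_BITS)

w = [0.75, -1.5, 2.25, 0.125]
a = [1.0, 0.5, -0.25, 2.0]
print(crossbar_dot(w, a), np.dot(w, a))   # approximate vs. exact: both -0.3125
```

For this input the exponent shifts lose no mantissa bits, so the approximate and exact results coincide; with wider exponent ranges in a block, the dropped low-order bits are the source of the accuracy/efficiency trade-off the abstract refers to.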

Updated: 2024-04-02