Data clustering for efficient approximate computing, Design Automation for Embedded Systems



Data clustering for efficient approximate computing
Design Automation for Embedded Systems (IF 1.4), Pub Date: 2019-11-09, DOI: 10.1007/s10617-019-09228-z
Michael G. Jordan, Marcelo Brandalero, Guilherme M. Malfatti, Geraldo F. Oliveira, Arthur F. Lorenzon, Bruno C. da Silva, Luigi Carro, Mateus B. Rutzig, Antonio Carlos S. Beck

Given the saturation of single-threaded performance improvements in general-purpose processors, novel architectural techniques are required to meet emerging demands. In this paper, we propose a generic acceleration framework for approximate algorithms that replaces function execution with table look-ups in dedicated memories. At compile-time, a strategy based on the K-Means clustering algorithm learns mappings from arbitrary function inputs to frequently occurring outputs. At run-time, these learned values are fetched from dedicated look-up tables and the best result is selected using the Nearest-Centroid Classifier, which is implemented in hardware. The proposed approach improves over the state-of-the-art neural acceleration solution, with nearly 3× better performance, energy reductions ranging from \(18.72\%\) to \(90.99\%\), and \(17\%\) area savings under similar levels of quality, thus opening new opportunities for performance harvesting in approximate accelerators.
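The two phases described in the abstract can be illustrated with a minimal software sketch. It is not the authors' implementation: the kernel `target`, the sample counts, the table size `k=32`, and the choice to store the mean output of each cluster are all assumptions made for illustration, and the paper performs the nearest-centroid selection in hardware rather than in software.

```python
import math
import random


def target(x, y):
    # Hypothetical stand-in for an error-tolerant kernel (not from the paper).
    return math.sin(x) * math.cos(y)


def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length tuples.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(centroids, point):
    # Nearest-centroid classification: index of the closest centroid.
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], point))


def kmeans(points, k, iters=10, seed=0):
    # Plain Lloyd's algorithm over the sampled function inputs.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        for i, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(g) for c in zip(*g))
    return centroids


def build_table(func, samples, k):
    # "Compile-time" phase: cluster the inputs, then store one output per cluster.
    # Assumption: the stored value is the mean output of the cluster's samples.
    centroids = kmeans(samples, k)
    sums, counts = [0.0] * k, [0] * k
    for p in samples:
        i = nearest(centroids, p)
        sums[i] += func(*p)
        counts[i] += 1
    table = [sums[i] / counts[i] if counts[i] else func(*centroids[i])
             for i in range(k)]
    return centroids, table


def approx(centroids, table, point):
    # "Run-time" phase: skip the function call and read the table entry instead.
    return table[nearest(centroids, point)]


rng = random.Random(1)
samples = [(rng.uniform(0, math.pi), rng.uniform(0, math.pi)) for _ in range(1000)]
centroids, table = build_table(target, samples, k=32)

held_out = [(rng.uniform(0, math.pi), rng.uniform(0, math.pi)) for _ in range(200)]
mean_abs_err = sum(abs(approx(centroids, table, p) - target(*p))
                   for p in held_out) / len(held_out)
print(f"mean absolute error over held-out inputs: {mean_abs_err:.3f}")
```

In this sketch, increasing `k` trades a larger table for lower approximation error, which mirrors the quality-versus-cost trade-off that an approximate accelerator has to balance.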


Updated: 2019-11-09
