Coarse-to-Fine: A hierarchical DNN inference framework for edge computing
Future Generation Computer Systems (IF 7.5). Pub Date: 2024-03-16, DOI: 10.1016/j.future.2024.03.009
Zao Zhang, Yuning Zhang, Wei Bao, Changyang Li, Dong Yuan

Deep neural networks (DNNs) have grown increasingly deep in recent years to achieve higher inference accuracy; however, deploying such deep networks in edge-computing environments is challenging. Current methods for accelerating convolutional neural network (CNN) inference focus on trading off accuracy against latency under an assumed uniform data distribution, ignoring the impact of the data distributions found in real-world deployments. To address this, we propose the Coarse-to-Fine (C2F) framework, comprising a C2F model and a corresponding C2F inference architecture, to better exploit distributional differences in the edge environment. The C2F model is derived by adapting existing CNNs: the original network is deconstructed into multiple smaller models, accepting a modest increase in memory consumption in exchange for faster inference without sacrificing accuracy. The C2F architecture then deploys these models judiciously across complex edge environments, reducing both inference cost and memory consumption. Experiments on the CIFAR dataset with different backbone networks show that the C2F framework can simultaneously reduce latency and improve accuracy in complex edge environments.
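A minimal sketch of how such a coarse-to-fine pipeline could look at inference time, assuming a coarse classifier over CIFAR-100's 20 superclasses and one small fine classifier per superclass; the tiny backbone, the confidence-threshold routing rule, and all names (make_small_cnn, c2f_infer, threshold) are illustrative assumptions, not the paper's actual design:

```python
# Hypothetical coarse-to-fine inference sketch; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_SUPERCLASSES = 20   # CIFAR-100 coarse labels (assumed task)
CLASSES_PER_SUPER = 5   # fine labels within each superclass

def make_small_cnn(num_outputs: int) -> nn.Module:
    """Placeholder small backbone; the paper derives its models from larger CNNs."""
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, num_outputs),
    )

coarse_model = make_small_cnn(NUM_SUPERCLASSES)
fine_models = nn.ModuleList(
    make_small_cnn(CLASSES_PER_SUPER) for _ in range(NUM_SUPERCLASSES)
)

@torch.no_grad()
def c2f_infer(x: torch.Tensor, threshold: float = 0.9):
    """Run the coarse model first; invoke a fine model only when confident.

    Returns (superclass, fine_class or None). The confidence-threshold
    routing rule is an assumption, not the paper's exact policy, and real
    designs may share early layers between the coarse and fine models.
    """
    coarse_probs = F.softmax(coarse_model(x), dim=1)
    conf, super_idx = coarse_probs.max(dim=1)
    super_idx = int(super_idx.item())
    if conf.item() < threshold:
        # Low coarse confidence: stop at the coarse prediction to save latency.
        return super_idx, None
    fine_logits = fine_models[super_idx](x)
    return super_idx, int(fine_logits.argmax(dim=1).item())

# Example: a single CIFAR-sized 32x32 RGB image as a batch of one.
image = torch.randn(1, 3, 32, 32)
print(c2f_infer(image))
```

Because only one small fine model runs per input (and sometimes none), the average per-image compute can fall below that of a single monolithic CNN, at the cost of holding several small models in memory, which matches the trade-off described in the abstract.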
