Multi-exit DNN inference acceleration for intelligent terminal with heterogeneous processors
Sustainable Computing: Informatics and Systems (IF 4.5). Pub Date: 2023-08-22. DOI: 10.1016/j.suscom.2023.100906
Jinghui Zhang, Weilong Xin, Dingyang Lv, Jiawei Wang, Guangxing Cai, Fang Dong

Deep learning vision applications are increasingly deployed on terminal devices. However, as the depth and structural complexity of deep neural networks (DNNs) grow, their performance on computer vision tasks improves, but model inference on computation-resource-constrained intelligent terminal devices frequently fails to meet latency requirements. A commonly adopted acceleration approach is the multi-exit DNN, which reduces latency by allowing inference to terminate at early exits. However, existing methods do not fully exploit the heterogeneous processors (GPU/CPU) on intelligent terminal devices to cooperatively accelerate multi-exit DNN inference in parallel. Moreover, the impact of complex image and video inputs on multi-exit DNNs, and the effect of different power consumption modes on the processors of intelligent terminal devices, remain inadequately explored. To address these issues, we comprehensively consider the computing performance of heterogeneous processors under different power consumption modes as well as the structure and characteristics of multi-exit DNNs, and propose the Collaborative Inference Acceleration mechanism for intelligent terminals with Heterogeneous Processors (CIAHP). CIAHP comprises a DNN computation time prediction model and a multi-exit DNN task allocation algorithm for heterogeneous processors. Our experiments demonstrate that, on complex image samples, CIAHP performs multi-exit DNN inference 2.31× faster than the CPU alone and 1.23× faster than the GPU alone.
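The core idea behind a multi-exit DNN is to attach lightweight classifiers to intermediate layers so that "easy" inputs exit early while "hard" inputs continue to deeper exits. The paper's architecture is not reproduced here; as a minimal illustrative sketch (the layer sizes, exit placement, and confidence threshold below are assumptions, not CIAHP's design), a confidence-thresholded two-exit network in PyTorch could look like this:

```python
# Minimal sketch of a multi-exit DNN with confidence-based early exits.
# Illustrative only: layer sizes, exit placement, and the 0.9 threshold
# are assumptions, not the paper's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiExitCNN(nn.Module):
    def __init__(self, num_classes: int = 10, threshold: float = 0.9):
        super().__init__()
        self.threshold = threshold  # confidence required to take an early exit
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))
        self.exit1 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                   nn.Linear(16, num_classes))
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))
        self.exit2 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                   nn.Linear(32, num_classes))

    @torch.no_grad()
    def forward(self, x):
        # Run the first block; stop early if the sample is confidently classified.
        h = self.block1(x)
        logits1 = self.exit1(h)
        if F.softmax(logits1, dim=1).max().item() >= self.threshold:
            return logits1, 1          # resolved at the first (early) exit
        # Harder samples continue to the deeper, final exit.
        h = self.block2(h)
        return self.exit2(h), 2

model = MultiExitCNN().eval()
logits, exit_id = model(torch.randn(1, 3, 32, 32))
print(f"prediction taken from exit {exit_id}")
```

The per-sample confidence check above assumes batch size 1 to keep the sketch short; batched deployments typically make the exit decision sample by sample.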

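CIAHP's computation time prediction model and allocation algorithm are not detailed in the abstract. As a purely hypothetical sketch of how predicted per-task latencies could drive CPU/GPU co-scheduling, the following greedy longest-processing-time allocator assigns each task to whichever processor would finish it sooner; all predicted latencies are placeholder values:

```python
# Hypothetical greedy allocator for CPU/GPU co-inference. Illustrative only:
# this is a generic LPT-style heuristic, not CIAHP's actual algorithm, and
# the predicted latencies below are placeholder assumptions.
from typing import List, Tuple

def allocate(pred_cpu_ms: List[float],
             pred_gpu_ms: List[float]) -> Tuple[List[int], List[int]]:
    """Assign each task to the processor with the smaller resulting finish
    time, handling the longest tasks first; returns (cpu_tasks, gpu_tasks)."""
    cpu_tasks, gpu_tasks = [], []
    cpu_busy = gpu_busy = 0.0  # accumulated predicted time on each processor
    order = sorted(range(len(pred_cpu_ms)),
                   key=lambda i: max(pred_cpu_ms[i], pred_gpu_ms[i]),
                   reverse=True)
    for i in order:
        if cpu_busy + pred_cpu_ms[i] <= gpu_busy + pred_gpu_ms[i]:
            cpu_tasks.append(i)
            cpu_busy += pred_cpu_ms[i]
        else:
            gpu_tasks.append(i)
            gpu_busy += pred_gpu_ms[i]
    return cpu_tasks, gpu_tasks

# Placeholder predicted latencies (ms) for five frames: the GPU is faster per
# task, yet running both processors in parallel still lowers the makespan.
cpu_ms = [40.0, 38.0, 35.0, 30.0, 28.0]
gpu_ms = [18.0, 17.0, 16.0, 14.0, 13.0]
print(allocate(cpu_ms, gpu_ms))
```

With these placeholder numbers the allocator offloads one task to the otherwise idle CPU, reducing the parallel makespan to 62 ms versus 78 ms for the GPU alone, which mirrors the kind of speedup the paper reports from CPU/GPU cooperation.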

