Cicero: Addressing Algorithmic and Architectural Bottlenecks in Neural Rendering by Radiance Warping and Memory Optimizations
arXiv - CS - Hardware Architecture, Pub Date: 2024-04-18, DOI: arxiv-2404.11852
Yu Feng, Zihan Liu, Jingwen Leng, Minyi Guo, Yuhao Zhu

Neural Radiance Field (NeRF) is widely seen as an alternative to traditional physically-based rendering. However, NeRF has not yet been adopted in resource-limited mobile systems such as Virtual and Augmented Reality (VR/AR) because it is extremely slow: on a mobile Volta GPU, even state-of-the-art NeRF models generally execute at only 0.8 FPS. We show that the main performance bottlenecks are both algorithmic and architectural, and we introduce CICERO to tame both forms of inefficiency. We first introduce two algorithms: one fundamentally reduces the amount of work any NeRF model has to execute, and the other eliminates irregular DRAM accesses. We then describe an on-chip data layout strategy that eliminates SRAM bank conflicts. A pure software implementation of CICERO offers an 8.0x speed-up and 7.9x energy saving over a mobile Volta GPU. When compared to a baseline with a dedicated DNN accelerator, our speed-up and energy reduction increase to 28.2x and 37.8x, respectively, all with minimal quality loss (less than 1.0 dB reduction in peak signal-to-noise ratio).
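
To give a sense of what a quality bound of "less than 1.0 dB" means, the sketch below uses the standard PSNR definition, 10 * log10(MAX^2 / MSE), to convert a PSNR drop into the corresponding growth in mean squared error. It is only an illustration of the metric, not code from the paper; the function names and the example MSE value are hypothetical.

```python
import math


def psnr(mse: float, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error.

    max_value is the largest possible pixel value (1.0 for normalized images).
    """
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_psnr_drop(drop_db: float) -> float:
    """Factor by which MSE grows when PSNR falls by drop_db decibels."""
    return 10.0 ** (drop_db / 10.0)


if __name__ == "__main__":
    # Hypothetical baseline error, chosen only for illustration.
    baseline_mse = 0.001
    ratio = mse_ratio_for_psnr_drop(1.0)
    print(f"baseline PSNR: {psnr(baseline_mse):.2f} dB")
    print(f"PSNR after a 1.0 dB drop: {psnr(baseline_mse * ratio):.2f} dB")
    print(f"MSE growth factor for a 1.0 dB drop: {ratio:.3f}")
```

Under this definition, a 1.0 dB reduction corresponds to the mean squared error growing by a factor of roughly 1.26, independent of the baseline error.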

Updated: 2024-04-19