High Efficiency Deep-learning Based Video Compression
ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.1), Pub Date: 2024-04-23, DOI: 10.1145/3661311
Lv Tang, Xinfeng Zhang

Although deep learning has brought significant improvements to image compression, its advantages have not been fully exploited in video compression, so the performance of deep-learning based video compression (DLVC) remains clearly inferior to that of the hybrid video coding framework. In this paper, we propose a novel network that improves DLVC performance through its most important modules: Motion Process (MP), Residual Compression (RC), and Frame Reconstruction (FR). In MP, we design a split second-order attention and multi-scale feature extraction module to remove warping artifacts in both the multi-scale feature space and the pixel space, which helps reduce distortion in the subsequent stages. In RC, we propose a channel selection mechanism that gradually drops redundant information while preserving informative channels for better rate-distortion performance. Finally, in FR, we introduce a residual multi-scale recurrent network that improves the quality of the current reconstructed frame by progressively exploiting the temporal context between it and several previously reconstructed frames. Extensive experiments on three widely used video compression datasets (HEVC, UVG, and MCL-JCV) demonstrate the superiority of the proposed approach over state-of-the-art methods.
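
The abstract does not say how the channel selection in RC is implemented, so the following is only a minimal sketch of the general idea of dropping low-information channels before coding. It makes several assumptions that are not from the paper: channel importance is approximated by mean squared activation (the paper's mechanism is presumably learned), and the names `channel_energy`, `select_channels`, `apply_mask`, and `keep_ratio` are hypothetical.

```python
import math
from typing import List

Channel = List[List[float]]  # one 2-D feature map (rows of activations)


def channel_energy(channel: Channel) -> float:
    """Mean squared activation; an assumed stand-in for a learned importance score."""
    values = [v for row in channel for v in row]
    return sum(v * v for v in values) / len(values)


def select_channels(features: List[Channel], keep_ratio: float) -> List[bool]:
    """Return a keep/drop mask that retains the highest-energy channels."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, math.ceil(keep_ratio * len(features)))
    # Rank channel indices from most to least energetic.
    ranked = sorted(
        range(len(features)),
        key=lambda i: channel_energy(features[i]),
        reverse=True,
    )
    kept = set(ranked[:n_keep])
    return [i in kept for i in range(len(features))]


def apply_mask(features: List[Channel], mask: List[bool]) -> List[Channel]:
    """Zero out dropped channels; kept channels pass through unchanged."""
    return [
        ch if keep else [[0.0] * len(row) for row in ch]
        for ch, keep in zip(features, mask)
    ]


# Four tiny 1x2 "channels" of a residual feature tensor.
features = [
    [[0.1, 0.1]],
    [[2.0, 2.0]],
    [[0.0, 0.0]],
    [[1.0, 1.0]],
]
mask = select_channels(features, keep_ratio=0.5)
pruned = apply_mask(features, mask)
```

In this toy setting, lowering `keep_ratio` trades reconstruction fidelity for fewer non-zero channels to encode, which is the rate-distortion trade-off the abstract refers to. Calling `select_channels` repeatedly with decreasing ratios would approximate a gradual drop, though that staging is also an assumption.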




Updated: 2024-04-23