APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding
arXiv - CS - Sound, Pub Date: 2024-02-16, DOI: arxiv-2402.10533
Yang Ai, Xiao-Hang Jiang, Ye-Xin Lu, Hui-Peng Du, Zhen-Hua Ling
This paper introduces APCodec, a novel neural audio codec targeting high
waveform sampling rates and low bitrates, which integrates the strengths of
parametric codecs and waveform codecs. APCodec encodes and decodes audio by
handling the amplitude and phase spectra concurrently as parametric audio
features, as parametric codecs do.
It is composed of an encoder and a decoder with the modified ConvNeXt
v2 network as the backbone, connected by a quantizer based on the residual
vector quantization (RVQ) mechanism. The encoder compresses the audio amplitude
and phase spectra in parallel, amalgamating them into a continuous latent code
at a reduced temporal resolution. This code is subsequently quantized by the
quantizer. Ultimately, the decoder reconstructs the audio amplitude and phase
spectra in parallel, and the decoded waveform is obtained by inverse short-time
Fourier transform. To ensure the fidelity of the decoded audio, as waveform
codecs do, spectral-level loss, quantization loss, and generative adversarial
network (GAN) based loss are collectively employed for training the APCodec. To
support low-latency streamable inference, we employ feed-forward layers and
causal convolutional layers in APCodec, incorporating a knowledge distillation
training strategy to enhance the quality of decoded audio. Experimental results
confirm that our proposed APCodec can encode 48 kHz audio at a bitrate of just 6
kbps, with no significant degradation in the quality of the decoded audio. At
the same bitrate, our proposed APCodec also demonstrates superior decoded audio
quality and faster generation speed compared to well-known codecs, such as
SoundStream, Encodec, HiFi-Codec and AudioDec.
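
The two signal-level ideas in the abstract, treating amplitude and phase as separate spectra and quantizing with RVQ, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the learned ConvNeXt v2 encoder and decoder are omitted, the function names are hypothetical, the codebooks are hand-picked toy values rather than trained ones, and the use of log-amplitude with a small epsilon is an assumption. In APCodec the quantizer operates on the fused latent code, not on raw spectra.

```python
import cmath
import math

EPS = 1e-9  # assumed floor to keep log() finite for silent bins


def split_amplitude_phase(spectrum):
    """Split complex STFT bins into log-amplitude and wrapped phase."""
    log_amp = [math.log(abs(z) + EPS) for z in spectrum]
    phase = [cmath.phase(z) for z in spectrum]  # values in [-pi, pi]
    return log_amp, phase


def merge_amplitude_phase(log_amp, phase):
    """Rebuild complex bins; an inverse STFT would follow this step."""
    return [cmath.rect(math.exp(a), p) for a, p in zip(log_amp, phase)]


def nearest(vec, codebook):
    """Index of the codeword with the smallest squared distance to vec."""
    def dist(k):
        return sum((v - c) ** 2 for v, c in zip(vec, codebook[k]))
    return min(range(len(codebook)), key=dist)


def rvq_encode(vec, codebooks):
    """Residual VQ: each stage quantizes what the previous stages missed."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        k = nearest(residual, codebook)
        indices.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return indices


def rvq_decode(indices, codebooks):
    """Sum the selected codeword from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for k, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[k])]
    return out


if __name__ == "__main__":
    bins = [complex(1.0, 2.0), complex(-0.5, 0.25)]
    log_amp, phase = split_amplitude_phase(bins)
    print(merge_amplitude_phase(log_amp, phase))

    # Toy two-stage codebooks: a coarse stage, then a fine stage.
    toy_codebooks = [
        [[0.0, 0.0], [10.0, 10.0]],
        [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    ]
    codes = rvq_encode([10.0, 11.0], toy_codebooks)
    print(codes, rvq_decode(codes, toy_codebooks))
```

Only the integer indices from `rvq_encode` would need to be transmitted, so under this scheme the bitrate is set by the number of stages, the codebook sizes, and the latent frame rate.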
Updated: 2024-02-19