KLANN: Linearising Long-Term Dynamics in Nonlinear Audio Effects Using Koopman Networks
IEEE Signal Processing Letters (IF 3.9), Pub Date: 2024-04-16, DOI: 10.1109/lsp.2024.3389465
Ville Huhtala, Lauri Juvela, Sebastian J. Schlecht

In recent years, neural network-based black-box modeling of nonlinear audio effects has improved considerably. Current convolutional and recurrent models can capture audio effects with long-term dynamics, but they require many parameters, which increases processing time. In this letter, we propose KLANN, a Koopman-Linearised Audio Neural Network structure that uses a nonlinear mapping to lift a one-dimensional signal (mono audio) into a high-dimensional, approximately linear state-space representation, and then uses differentiable biquad filters to predict linearly within the lifted state space. Results show that the proposed models match the performance of state-of-the-art neural models while having a more compact architecture with interpretable components and roughly ten times fewer parameters.
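
The lift-then-filter idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`lift`, `biquad`, `make_params`, `klann_like`), the tiny tanh lifting network, the random untrained weights, and the final linear read-out are all assumptions made for illustration. It only shows the general shape of the computation: a per-sample nonlinear map lifts mono audio into several channels, and each channel then evolves under a purely linear second-order (biquad) filter.

```python
import numpy as np


def lift(x, w1, b1, w2, b2):
    """Hypothetical nonlinear lifting: map each mono sample to K channels."""
    h = np.tanh(x[:, None] * w1[None, :] + b1)  # (N, H) hidden activations
    return h @ w2 + b2                          # (N, K) lifted signal


def biquad(x, b, a):
    """Direct-form I biquad. b = (b0, b1, b2), a = (1, a1, a2)."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    x1 = x2 = y1 = y2 = 0.0
    for n in range(len(x)):
        y[n] = b[0] * x[n] + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x[n]
        y2, y1 = y1, y[n]
    return y


def make_params(num_channels=4, hidden=8, seed=0):
    """Random, untrained parameters (assumption: for illustration only)."""
    rng = np.random.default_rng(seed)
    # Place each biquad's pole pair inside the unit circle so it is stable.
    r = rng.uniform(0.5, 0.95, num_channels)
    theta = rng.uniform(0.0, np.pi, num_channels)
    a = np.stack([np.ones(num_channels), -2.0 * r * np.cos(theta), r**2], axis=1)
    return {
        "w1": rng.normal(size=hidden),
        "b1": rng.normal(size=hidden),
        "w2": rng.normal(size=(hidden, num_channels)),
        "b2": rng.normal(size=num_channels),
        "b": rng.normal(size=(num_channels, 3)),
        "a": a,
        "w_out": rng.normal(size=num_channels) / num_channels,
    }


def klann_like(x, p):
    """Lift the signal, filter each lifted channel linearly, mix back to mono."""
    z = lift(x, p["w1"], p["b1"], p["w2"], p["b2"])
    filtered = np.stack(
        [biquad(z[:, k], p["b"][k], p["a"][k]) for k in range(z.shape[1])],
        axis=1,
    )
    return filtered @ p["w_out"]


if __name__ == "__main__":
    sr = 44100
    x = np.sin(2 * np.pi * 440.0 * np.arange(1024) / sr)  # test tone
    y = klann_like(x, make_params())
    print(y.shape)
```

In this sketch all nonlinearity sits in the memoryless `lift` step, while all memory sits in the biquads, whose coefficients can be read as ordinary filter poles and zeros. In a trained model these parameters would be learned by gradient descent rather than drawn at random.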
Updated: 2024-04-16