KLANN: Linearising Long-Term Dynamics in Nonlinear Audio Effects Using Koopman Networks, IEEE Signal Processing Letters


IEEE Signal Processing Letters (IF 3.9), Pub Date: 2024-04-16, DOI: 10.1109/lsp.2024.3389465
Ville Huhtala, Lauri Juvela, Sebastian J. Schlecht

In recent years, neural network-based black-box modeling of nonlinear audio effects has improved considerably. Current convolutional and recurrent models can capture audio effects with long-term dynamics, but they require many parameters, which increases processing time. In this letter, we propose KLANN, a Koopman-Linearised Audio Neural Network structure that uses a nonlinear mapping to lift a one-dimensional signal (mono audio) into a high-dimensional, approximately linear state-space representation, and then uses differentiable biquad filters to predict linearly within the lifted state space. Results show that the proposed models match the performance of state-of-the-art neural models while having a more compact architecture with ten times fewer parameters and interpretable components.
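
The lift-then-filter idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The `tanh` lifting, the one-pole filter design, the function names, and all coefficient values are assumptions chosen for illustration, and nothing here is trained, whereas the letter describes differentiable biquads that are learned. The sketch only shows the signal flow: a memoryless nonlinear map lifts each mono sample into several channels, each channel passes through a linear second-order (biquad) filter that carries the long-term state, and a linear readout mixes the channels back to one output sample.

```python
import math


def lift(x, gains):
    """Hypothetical nonlinear lifting: one tanh feature per gain (an assumption)."""
    return [math.tanh(g * x) for g in gains]


class Biquad:
    """Second-order IIR filter in transposed direct form II.

    Implements y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2].
    """

    def __init__(self, b0, b1, b2, a1, a2):
        self.b = (b0, b1, b2)
        self.a = (a1, a2)
        self.s1 = 0.0  # first internal state
        self.s2 = 0.0  # second internal state

    def step(self, x):
        b0, b1, b2 = self.b
        a1, a2 = self.a
        y = b0 * x + self.s1
        self.s1 = b1 * x - a1 * y + self.s2
        self.s2 = b2 * x - a2 * y
        return y


def one_pole_lowpass(pole):
    """Biquad realising y[n] = (1 - pole) * x[n] + pole * y[n-1]."""
    return Biquad(1.0 - pole, 0.0, 0.0, -pole, 0.0)


def process(signal, gains, filters, readout):
    """Lift each sample, filter each lifted channel linearly, then mix.

    The filters are stateful, so calling this twice continues from the
    state left by the previous call.
    """
    out = []
    for x in signal:
        z = lift(x, gains)
        filtered = [f.step(zi) for f, zi in zip(filters, z)]
        out.append(sum(w * v for w, v in zip(readout, filtered)))
    return out


# Illustrative values only: three lifted channels with increasing memory.
gains = [0.5, 2.0, 8.0]
filters = [one_pole_lowpass(p) for p in (0.0, 0.9, 0.99)]
readout = [0.5, 0.3, 0.2]
signal = [math.sin(2.0 * math.pi * 110.0 * n / 44100.0) for n in range(64)]
output = process(signal, gains, filters, readout)
```

In this arrangement all memory lives in the linear filters, whose poles can be read off directly; that is one way to understand the abstract's claim of interpretable components, although the actual KLANN layer layout may differ.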


Updated: 2024-04-16
