Data is Overrated: Perceptual Metrics Can Lead Learning in the Absence of Training Data
arXiv - CS - Sound, Pub Date: 2023-12-06, DOI: arxiv-2312.03455
Tashi Namgyal, Alexander Hepburn, Raul Santos-Rodriguez, Valero Laparra, Jesus Malo

Perceptual metrics are traditionally used to evaluate the quality of natural signals, such as images and audio. They are designed to mimic the perceptual behaviour of human observers and usually reflect structures found in natural signals. This motivates their use as loss functions for training generative models such that models will learn to capture the structure held in the metric. We take this idea to the extreme in the audio domain by training a compressive autoencoder to reconstruct uniform noise, in lieu of natural data. We show that training with perceptual losses improves the reconstruction of spectrograms and re-synthesized audio at test time over models trained with a standard Euclidean loss. This demonstrates better generalisation to unseen natural signals when using perceptual metrics.
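The training setup described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the model is a tiny linear autoencoder on flat vectors, not a compressive autoencoder on spectrograms, and the hypothetical `metric` matrix is only a linear stand-in for a perceptual metric. With `metric=None` the loss reduces to a standard Euclidean (mean squared error) loss.

```python
import numpy as np

def train_on_noise(dim=16, code=4, steps=500, lr=0.05, metric=None, seed=0):
    """Train a linear autoencoder on uniform noise only (illustrative sketch).

    `metric` is a hypothetical (k, dim) matrix standing in for a perceptual
    metric: errors are measured after projecting through it. None means the
    identity, i.e. a plain Euclidean loss.
    """
    rng = np.random.default_rng(seed)
    P = np.eye(dim) if metric is None else np.asarray(metric, dtype=float)
    enc = rng.normal(scale=0.1, size=(dim, code))  # encoder weights
    dec = rng.normal(scale=0.1, size=(code, dim))  # decoder weights
    losses = []
    for _ in range(steps):
        # Training batch is uniform noise; no natural data is used.
        x = rng.uniform(0.0, 1.0, size=(32, dim))
        z = x @ enc                # bottleneck code
        x_hat = z @ dec            # reconstruction
        err = (x_hat - x) @ P.T    # reconstruction error in metric space
        losses.append(float(np.mean(err ** 2)))
        # Manual gradients of the mean squared metric-space error.
        g = 2.0 * (err @ P) / err.size   # dL/dx_hat
        grad_dec = z.T @ g
        grad_enc = x.T @ (g @ dec.T)
        dec -= lr * grad_dec
        enc -= lr * grad_enc
    return enc, dec, losses

if __name__ == "__main__":
    _, _, euclid = train_on_noise()
    # Hypothetical stand-in metric: a local difference filter.
    P = np.eye(16) - 0.5 * np.eye(16, k=1)
    _, _, weighted = train_on_noise(metric=P)
    print(euclid[0], euclid[-1], weighted[0], weighted[-1])
```

To mirror the comparison in the abstract, one would train once per loss and then evaluate both models on held-out natural signals; this sketch covers only the training-on-noise step.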

Updated: 2023-12-07