Paralinguistic and spectral feature extraction for speech emotion classification using machine learning techniques
EURASIP Journal on Audio, Speech, and Music Processing (IF 2.4), Pub Date: 2023-05-15, DOI: 10.1186/s13636-023-00290-x
Tong Liu, Xiaochen Yuan

Emotion plays a dominant role in speech: the same utterance spoken with different emotions can carry a completely different meaning. The ability to express a variety of emotions while speaking is also a typical human characteristic. Accordingly, there is a trend toward developing advanced speech emotion classification algorithms to enhance interaction between computers and humans. This paper proposes a speech emotion classification approach based on paralinguistic and spectral feature extraction. Mel-frequency cepstral coefficients (MFCC) are extracted as the spectral features, and openSMILE is employed to extract the paralinguistic features. Two machine learning techniques, a multi-layer perceptron classifier and support vector machines, are each applied to the extracted features to classify the speech emotions. We conducted experiments on the Berlin database to evaluate the performance of the proposed approach. Experimental results show that the proposed approach achieves satisfactory performance. Comparisons are conducted under clean and noisy conditions, and the results indicate that the proposed scheme performs better.
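
To make the spectral feature concrete, the following is a minimal sketch of how MFCCs can be computed for a single audio frame. It is not the authors' implementation: the function names, the 256-sample frame, the 16 kHz sample rate, the 20 mel filters, and the 13 coefficients are all illustrative assumptions, and a naive DFT is used in place of an FFT so that the example depends only on the Python standard library.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula: roughly linear below 1 kHz, logarithmic above.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for filters that capture no energy.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Usage: one 256-sample frame of a synthetic 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
```

In a full pipeline, per-frame coefficients like these would typically be summarized over an utterance (for example by their mean and variance) before being passed to a classifier such as a multi-layer perceptron or a support vector machine. How the paper aggregates its features is not stated in the abstract.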

Updated: 2023-05-15