Academia.eduAcademia.edu

Minimal feature set based classification of emotional speech

Abstract

This paper proposes the use of a minimum number of formant and bandwidth features for efficient classification of the neutral and six basic emotions in two languages. Such a minimal feature set facilitates fast and real time recognition of emotions which is the ultimate goal of any speech emotion recognition system. The investigations were done on emotional speech databases developed by the authors in English as well as Malayalam-a popular Indian language. For each language, the best features were identified by the KMeans, K-nearest neighbor and Naive Bayes classification of individual formants and bandwidths, followed by the artificial neural networks classification of the combination of the best formants and bandwidths. Whereas an overall emotion recognition accuracy of 85.28 % was obtained for Malayalam, based on the values of the first four formants and bandwidths, the recognition accuracy obtained for English was 86.15%, based on a feature set of the four formants and the first and fourth bandwidths, both of which are unprecedented. These results were obtained for elicited emotional speech of females and with statistically preprocessed formants and bandwidth values. Reduction in the number of emotion classes resulted in a striking increase in the recognition accuracy.