2013
Abstract- Emotion-based speaker identification is the process of automatically identifying a speaker's emotion from features extracted from speech waves. This paper presents experiments on building and testing a speaker-emotion identification system for Hindi speech using Mel-Frequency Cepstral Coefficients and Vector Quantization techniques. We collected voice samples of Hindi sentences in four basic emotions to study speaker-emotion identification, and found that the proposed emo-voice model achieves 73% accuracy in identifying the speaker's emotion out of the 93% of total speech samples accepted by the system.
International Journal of Computer Science Issues
This paper describes a text-independent, closed-set speaker identification system that identifies the speaker along with the emotional expression (emo-voice model) of a particular speech sentence. The system is evaluated on recorded sample sentences of native Hindi speakers in five basic emotions. Spectral features, namely Mel-frequency cepstral coefficients, have been used to implement emo-voice models with Vector Quantization and Gaussian Mixture Modeling techniques for selected sample sentences in MATLAB. The VQ model trained with the K-means algorithm achieves as much as 82.7% speaker identification with correct emotion, whilst the GMM model trained with the EM algorithm achieves 87.9% speaker identification with correct emotion. The statistical approach of emo-voice models could extend the application field of voiceprint recognition technology.
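The VQ half of the approach above can be sketched briefly. The following is a minimal numpy-only illustration, not the paper's implementation: a K-means codebook is trained per speaker-emotion class on MFCC-style feature frames, and a test utterance is assigned to the codebook with the lowest average distortion. All function names and the codebook size k=8 are assumptions for the sketch; real systems would extract MFCCs from audio first.

```python
import numpy as np

def train_codebook(features, k=8, iters=20, seed=0):
    """Train a VQ codebook with plain K-means on (n_frames, n_dims) features."""
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), k, replace=False)].copy()
    for _ in range(iters):
        # Assign each frame to its nearest codeword.
        d = np.linalg.norm(features[:, None] - codebook[None], axis=2)
        labels = d.argmin(axis=1)
        # Move each codeword to the centroid of its assigned frames.
        for j in range(k):
            if np.any(labels == j):
                codebook[j] = features[labels == j].mean(axis=0)
    return codebook

def avg_distortion(features, codebook):
    """Mean distance from each frame to its nearest codeword."""
    d = np.linalg.norm(features[:, None] - codebook[None], axis=2)
    return d.min(axis=1).mean()

def identify(features, codebooks):
    """Pick the speaker-emotion label whose codebook fits the frames best."""
    return min(codebooks, key=lambda lbl: avg_distortion(features, codebooks[lbl]))
```

In use, one codebook would be trained per (speaker, emotion) pair from that pair's training utterances, and `identify` returns the pair whose codebook yields minimum average distortion on the test frames.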
International Journal of Electrical and Computer Engineering (IJECE), 2020
In the last couple of years, emotion recognition has proven its significance in the areas of artificial intelligence and human-machine communication. Emotion recognition can be performed from speech or from images (facial expressions); this paper deals with speech emotion recognition (SER) only. An emotional speech database is essential for emotion recognition. In this paper we propose an emotional database developed in Gujarati, one of the official languages of India. The proposed speech corpus covers six emotional states: sadness, surprise, anger, disgust, fear and happiness. To observe the effect of different emotions, the proposed Gujarati speech database is analysed using standard speech parameters such as pitch, energy and MFCC in MATLAB.
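Two of the parameters mentioned above, energy and pitch, are simple to compute on framed audio. The following is a minimal numpy sketch, not the paper's MATLAB code: short-time energy per frame, and a crude pitch estimate from the autocorrelation peak within the typical voice range. Frame and hop lengths (25 ms / 10 ms at 16 kHz) and the function names are assumptions for illustration.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Slice a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return (frames ** 2).sum(axis=1)

def autocorr_pitch(frame, fs=16000, fmin=60, fmax=400):
    """Crude pitch estimate: lag of the autocorrelation peak in the voice range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags 0..N-1
    lo, hi = fs // fmax, fs // fmin
    lag = lo + np.argmax(ac[lo:hi])
    return fs / lag
```

MFCCs build on the same framing step, adding a mel filterbank and a DCT; pitch estimation in practice uses more robust methods (e.g. with voicing decisions), but the autocorrelation peak conveys the core idea.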
Communications in Computer and Information Science, 2009
In this paper, we introduce a speech database for analyzing the emotions present in speech signals. The database is recorded in Telugu by professional artists from All India Radio (AIR), Vijayawada, India. The speech corpus was collected by simulating eight different emotions using neutral (emotion-free) statements. The database is named the Indian Institute of Technology Kharagpur Simulated Emotion Speech Corpus (IITKGP-SESC). It will be useful for characterizing the emotions present in speech. Further, emotion-specific knowledge present in speech at different levels can be acquired by developing emotion-specific models using features from the vocal tract system, the excitation source, and prosody. This paper describes the design, acquisition, post-processing and evaluation of the proposed speech database (IITKGP-SESC). The quality of the emotions in the database is evaluated using subjective listening tests. Finally, statistical models are developed using prosodic features, and the discrimination of the emotions is assessed by classifying emotions with the developed statistical models.
Speech is the most important bio-signal that human beings produce and perceive. Speaking is the process of converting discrete phonemes into a continuous acoustic signal. Speech is the most natural and desirable method of human communication, carrying information along with intentions and emotions. It is a time-continuous signal containing information about the message, speaker attitude, language, accent, dialect and emotion. Many psychological studies suggest that only 10% of human life is completely unemotional; the rest involves emotion. This paper surveys the speech features used for analyzing emotional utterances of human beings, examined using speech-analysis software (PRAAT) and MATLAB. Emotional analysis helps in studying what emotions are and how speech-feature properties change across the emotional states of a human being and across languages.
International Journal of Information Technology, 2018
This paper presents a study of the perceptual evaluation of emotions expressed in Hindi speech and their acoustic-prosodic correlates. Six emotions, i.e. Neutral, Happiness, Sad, Fear, Anger and Surprise, were selected for the present study. For this purpose, a database of fifteen continuous sentences and isolated words, i.e. the Hindi digits 'शून्य, एक, दो, तीन, चार, पाँच, छः, सात, आठ, नौ', spoken by five males and five females and repeated three times by each speaker, was created. Understanding how acoustic features differ between emotions is important for computer recognition and classification. The acoustic features that change with emotion were analyzed using the PRAAT speech-processing software tool. A human perception experiment shows that overall recognition of the emotions is about 60% for continuous sentences and 53% for isolated words. It is found that anger has the highest intensity, followed by neutral, happiness, surprise, sad and fear; some differences were observed in the case of continuous sentences. The dynamic changes in pitch and intensity in these utterances have been analyzed.
2015
Emotions play an important role in expressing feelings, as they tend to make people act differently. Determining a speaker's emotion is less complicated when facing him or her than from the voice alone, as in a telephone conversation. However, it would be a great achievement to detect the emotion with which a speaker is speaking just by listening to the voice. This project is a small step towards that goal: we focus on determining emotions from recorded speech and developing a prototype system. The ability to detect human emotion from speech will be a great addition to the field of human-robot interaction. The aim of the work is to build an emotion recognition system using Mel-frequency cepstral coefficients (MFCC) and a Gaussian mixture model (GMM). Four emotional states, happy, sad, angry and neutral, are taken for classification. We considered only 10 speakers, 7 male and 3 female, all belonging to the upper Assam region and all speaking in the same accent. The experiments are performed for the speaker-dependent, text-independent case only.
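The GMM side of an MFCC+GMM recognizer like the one above can be sketched compactly: fit one diagonal-covariance GMM per emotion with EM, then label a test utterance with the emotion whose model gives the highest average per-frame log-likelihood. This is a bare-bones numpy illustration under stated assumptions (k=2 components, fixed iteration count, my own function names), not the authors' system.

```python
import numpy as np

def fit_gmm(X, k=2, iters=50, seed=0):
    """Fit a diagonal-covariance GMM to X (n, d) with a bare-bones EM loop."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)].copy()
    var = np.ones((k, d)) * X.var(axis=0)
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities from per-component Gaussian log-densities.
        logp = (-0.5 * (((X[:, None] - mu[None]) ** 2) / var[None]
                        + np.log(2 * np.pi * var[None])).sum(axis=2)
                + np.log(w)[None])
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances from responsibilities.
        nk = r.sum(axis=0) + 1e-9
        w = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ (X ** 2)) / nk[:, None] - mu ** 2 + 1e-6
    return w, mu, var

def gmm_loglik(X, params):
    """Average per-frame log-likelihood of X under the fitted GMM."""
    w, mu, var = params
    logp = (-0.5 * (((X[:, None] - mu[None]) ** 2) / var[None]
                    + np.log(2 * np.pi * var[None])).sum(axis=2)
            + np.log(w)[None])
    m = logp.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return (m.squeeze(1) + np.log(np.exp(logp - m).sum(axis=1))).mean()
```

Classification is then `max(models, key=lambda emo: gmm_loglik(test_frames, models[emo]))` over the per-emotion models; production systems typically use more components and a library EM implementation.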
International Journal of Computer Applications, 2018
A modern development in technology is Speech Emotion Recognition (SER). SER, in partnership with Human-Machine Interaction (HMI), has advanced machine intelligence. An emotion-precise HMI is designed by integrating speech processing with a machine learning algorithm, shaped into an automated, smart and secure application for detecting emotions in household as well as commercial applications. This project presents a study of distinguishing emotions in acoustic speech recognition (ASR) using k-nearest neighbours (K-NN), a machine learning (ML) technique. The most significant paralinguistic information is obtained from spectral features, i.e. Mel-frequency cepstrum coefficients (MFCC). The main processing stages are feature extraction, feature selection, and classification of emotions. A customized dataset consisting of a speech corpus of simulated emotion samples in the Sanskrit language is used to classify emotions into different classes, i.e. happy, sad, excitement, fear, anger and disgust. The emotions are classified using a K-NN algorithm over 2 separate models, based on soft and high-pitched voice. Models 1 and 2 achieved about 72.95% and 76.96% recognition accuracy, respectively.
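The K-NN classification step described above is among the simplest in machine learning: a feature vector takes the majority label of its k nearest training vectors. A minimal numpy sketch, with assumed function names and Euclidean distance (the paper does not specify its distance metric):

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=5):
    """Classify one MFCC-style feature vector by majority vote of its k
    nearest training vectors under Euclidean distance."""
    d = np.linalg.norm(train_X - query, axis=1)   # distance to every training vector
    nearest = np.argsort(d)[:k]                   # indices of the k closest
    votes = Counter(train_y[i] for i in nearest)  # count their labels
    return votes.most_common(1)[0][0]
```

In an SER pipeline, `train_X` would hold per-utterance MFCC summary vectors and `train_y` the emotion labels; an odd `k` avoids two-way ties.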
2013
Abstract— This paper presents the results of investigations into speech emotion recognition in Hindi using only the first four formants and their bandwidths. The research was done on a female speech database of nearly 1600 utterances comprising neutral, happiness, surprise, anger, sadness, fear and disgust as the elicited emotions. The best of the statistically preprocessed formant and bandwidth features were first identified by K-Means, K-Nearest Neighbour and Naive Bayes classification of individual features. This was followed by artificial neural network classification based on the combination of the best formants and bandwidths. The highest overall emotion recognition accuracy obtained by the ANN method was 97.14%, based on the first four formant and bandwidth values. A striking increase in recognition accuracy was observed when the number of emotion classes was reduced from seven. The results presented in this paper have not been reported so far for...
In this work, emotion recognition from speech data is studied. Today, applications and studies of speech focus on text and/or context, but this approach often misses the main thing people wanted to say; for example, metaphors are commonly used to describe a situation with the opposite meaning. Emotion recognition from speech can be performed in three steps: filtering, analysis, and determination of the speaker's emotion. First, if speech is not recorded in an isolated environment, the data is filtered to remove noise produced by other speakers or environmental sounds in the conversation. Second, magnitude changes in the frequency spectrum of the speech signal are analyzed; since spontaneous speech comes with no transcript, it must be analyzed rapidly for its sentimental elements. According to the developed method, distinct magnitude fluctuations in a conversation indicate emotional changes. Third, the speaker's emotions are classified using commonly used methods such as neural networks or decision trees. The first purpose of this work is to find the main message in people's speech from intonation; the second is to enable people and computers to communicate naturally by speaking.
International Journal of Image, Graphics and Signal Processing, 2014
Speech processing has developed into one of the vital application areas of digital signal processing. Speaker recognition is the process of automatically identifying who is speaking based on individual characteristics contained in speech waves. This makes it possible to use the speaker's voice to verify identity and control access to services, for example voice dialing, information services, voice mail, and security control for confidential data. This paper reviews speaker recognition and emotion recognition based on the past ten years of research. So far, work has covered both text-independent and text-dependent speaker recognition. Many prosodic features of the speech signal convey the emotion of a speaker; a detailed study of these issues is presented in this paper.