Academia.edu

MFCC (Mel Frequency Cepstral Coefficient)

Mel Frequency Cepstral Coefficient (MFCC) is a feature extraction technique used in speech and audio processing. It represents the short-term power spectrum of sound, capturing the perceptual characteristics of human hearing by mapping frequencies onto the Mel scale, which approximates the way humans perceive pitch.
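The Mel mapping mentioned above can be illustrated with a minimal sketch. It assumes the commonly used formula mel = 2595 * log10(1 + f / 700), which the description does not specify; other variants of the Mel scale exist. The function names (`hz_to_mel`, `mel_to_hz`, `mel_center_frequencies`) are hypothetical and chosen only for illustration. The sketch shows how filter-bank centers spaced evenly in mels end up closer together at low frequencies and farther apart at high frequencies.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mels (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel under the same assumed formula."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(f_min: float, f_max: float, n_filters: int) -> list:
    """Center frequencies in Hz of n_filters bands spaced evenly in mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_filters + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Ten illustrative bands between 0 Hz and 8 kHz (values chosen arbitrarily).
    for center in mel_center_frequencies(0.0, 8000.0, 10):
        print(f"{center:8.1f} Hz")
```

In a full MFCC pipeline these centers would typically define triangular filters applied to each frame's power spectrum, followed by a logarithm and a discrete cosine transform; those steps are omitted here.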
Mel Frequency Cepstral Coefficient (MFCC) is a very common and efficient technique for signal processing. This paper presents a new application of MFCC: hand gesture recognition. The objective of using MFCC for hand... more
One of the crucial aspects of environmental protection is continuous monitoring of the environment. A specific aspect is estimating bird species populations. This is particularly important for bird species in danger of extinction.... more
Birds are a reflection of environmental health as pollution and climate change affect biodiversity. Experts in ecology and machine learning stand to benefit the most from large-scale monitoring of biodiversity. Today, convolutional neural... more
Artificial Neural Networks in MATLAB: a study guide / V.K. Tytyuk, V.V. Busher, O.P. Cherny; KNU, Krivoy Rog, 2025. 98 pp. Gives a brief overview of working in the MATLAB 2024 system and a brief theoretical introduction to the course... more
A solution is proposed to the problem of recognizing the vowels A-I-U using a Bayesian classifier. This is an interesting problem to address because it forms the basis for implementing systems for automatic transcription of... more
Laughter is a common vocal response in social interaction that can affect the characteristics of an individual's voice. This study aims to compare voice signal characteristics before and after laughing. The research subjects consisted of... more
The paper is concerned with various algorithms used for the analysis based on the Discrete Wavelet Transform (DWT) of (non)stationary regimes involving almost sinusoidal waves and for data communication applications respectively.... more
I thank and dedicate this thesis to: My wife, for her unconditional love and support at all times and above all for the completion of this work. My parents, for being at all times a source of strength and an example in life and for having instilled in me... more
This project studies the possibility of implementing a biometric authentication application on mobile devices. A brief review has been carried out of the biometric recognition methods that exist today and the... more
In this paper we report an experiment carried out on a recently collected speaker recognition database, the Arunachali Language Speech Database (ALS-DB), to make a comparative study of the performance of acoustic and prosodic features for... more
In recent years, emotion recognition from speech has been one of the hottest research topics in the field of Human Computer Interaction. Much research is under way on various languages, but for the Bengali language it is still at a very early stage.... more
Abstract. Guaraní is a language spoken by around eight million people. In Paraguay, about 25% of the population speaks only Guaraní. With the growing number of devices that include personal assistants... more
Identifying characteristics of articulatory impairment in speech motor disorders is complicated due to the time-consuming nature of kinematic measures. The goal is to explore whether analysing the acoustic signal in terms of total squared... more
Category (2). Problems in voice production can appear due to functional disorders and laryngeal pathologies. The presence of laryngeal pathologies can cause significant changes in the vibrational patterns of the vocal folds and it is... more
An Online Social Network (OSN) is a social media application that enables public communication and information sharing. However, fake accounts on OSNs can spread false information from unknown sources. This is a task that... more
The hotline service of Universitas Pembangunan Nasional "Veteran" Yogyakarta is a service that anyone can use. Lecturers and staff use the service to share information with departments located in... more
A pattern can be characterized by more or less rich and varied pieces of information from different features. The fusion of these different sources of information can provide an opportunity to develop a more efficient biometric system which is... more
Speech is the most common form of communication, and the need of the hour is a robust speech recognition system. This paper aims to present an algorithm to design a continuous speech recognition system. The recognition of the speech... more
This document develops some specific methods for building a classifier capable of identifying adult and calf manatees from different audio recordings. The duration of the vocalizations was evaluated with a method of... more
In this article, we investigate the performance of different types of acoustic features in an automatic speech recognition (ASR) system in a telephone environment. Specifically, we explore two different alternatives for the design... more
This article presents the development of a speaker-independent isolated-word recognition system for commanding a wheelchair. Each word is encoded using linear prediction and real cepstrum techniques, and... more
In this study, a speech-to-text system for homophone phrases in Indonesian was designed using Mel Frequency Cepstral Coefficient (MFCC) feature extraction. Feature extraction results were classified by comparing the two... more
The advent of modern communications and the low cost of some kinds of devices have resulted in a desire to equip elderly peoples' homes with sensors to monitor their activities and be forewarned of abnormal situations. In such an... more
by Awais Khan and 1 more
The use of face masks has increased dramatically since the COVID-19 pandemic started in order to curb the spread of the disease. Additionally, breakthrough infections caused by the Delta and Omicron variants have further increased the... more
In this paper, we focused on automation of Dunstan Baby Language. This system uses MFCC as feature extraction and codebook as feature matching. The codebook of clusters is made from the proceeds of all the baby's cries data, by using the... more
English fricatives’ pronunciation is problematic to common people especially Indo-European speakers such as Norwegian, French, Slovakian, Slovenian, Bulgarian, Portuguese, Czech, Spanish, Dutch, and Greek in the research. This research is... more
This work presents a speaker-dependent isolated-word recognition system. Each word is encoded using linear prediction and real cepstrum techniques, while the classification stage is performed... more
Speaker recognition is the identification of the person who is speaking by characteristics of their voice, also called "voice recognition". The components of speaker recognition include Speaker Identification (SI) and Speaker... more
Crimes committed by offenders leave behind at least some digital evidence in the form of voice recordings produced from telephone conversations; algorithms for analyzing recorded voices are many... more
The ECG signal is one of the most important biomedical signals for diagnosing different pathologies, and for that reason it is necessary to look for new methods of compressing the signal in search of fast transmission and best... more
Emotion recognition (ER) from speech signals is a robust approach since it cannot be imitated like facial expression or text based sentiment analysis. Valuable information underlying the emotions are significant for human-computer... more
Results of assessing personnel competencies, obtained with a neural network approach, are presented. The term "competencies" is used in many scientific disciplines and practical applications. In personnel management, a competency represents... more
This article describes the speaker recognition system implemented by I3A for the NIST 2008 evaluation. Two basic systems are available: GMM-UBM likelihood ratio and GMM-SVM. The signals provided by NIST... more
In this paper, we present an approach to develop an automatic speech recognition (ASR) system of Urdu isolated words. Our experimentation is based on a medium vocabulary speech corpus of Urdu, consisting of 250 words. We develop our... more
Indonesia is a large country with great cultural and ethnic diversity, and it therefore has many languages or dialects that differ from region to region. A wide range of research in speech signal processing has been developed.... more
In this work, a 5-state left-to-right HMM-based Bangla isolated-word speech recognizer has been developed. To train and test the recognizer, a small corpus of various sampling frequencies has been developed in noisy as well as the... more
A phoneme is one of a small number of basic sounds in any spoken human language. Much research has been conducted in ASR over the last few decades. Phoneme recognition with 40 native speakers is taken into consideration with... more
This study aims to analyze the characteristics of Japanese sibilant sounds through acoustic analysis. The data were sound samples from 16 respondents, Javanese-speaking learners of Japanese at the... more
In modern times there are several methods of communication among people and even with our machines; these allow us to create ties through person-to-person contact or via an interface for human-machine interaction. These techniques have been... more
There are a number of accent-differentiation applications that detect the different accents in assorted languages. Most previous studies are based on the English language and different languages throughout... more
The aim of this communication is to present the activities carried out since November 1994 within the project "Speaker Recognition in Telephony", funded by the European Community under the programme "European Cooperation in... more