MFCC (Mel Frequency Cepstral Coefficient) Research Papers

2025, Signal & Image Processing : An International Journal

Mel Frequency Cepstral Coefficient (MFCC) is a common and efficient technique for signal processing. This paper presents a new application of MFCC: hand gesture recognition. The objective of using MFCC for hand...

Towards the Automatic Acoustical Avian Monitoring System

by Robert Wielgat

2025, Science, Technology and Innovation

One of the crucial aspects of environmental protection is continuous monitoring of the environment. A specific aspect is estimating bird species populations. It is particularly important for bird species in danger of extinction....

Classification of Bird Sound Using High-and Low-Complexity Convolutional Neural Networks

by Aymen S A A D Abdalameer

2025

Birds are a reflection of environmental health as pollution and climate change affect biodiversity. Experts in ecology and machine learning stand to benefit the most from large-scale monitoring of biodiversity. Today, convolutional neural...

Artificial Neural Networks in Matlab / ИСКУССТВЕННЫЕ НЕЙРОННЫЕ СЕТИ В MATLAB

by Victor Busher

2025, ANN MATLAB

Artificial Neural Networks in MATLAB: a textbook / V.K. Tytyuk, V.V. Busher, O.P. Cherny; KNU, Krivoy Rog, 2025. 98 pp. Gives a brief overview of working in MATLAB 2024 and a short theoretical introduction to the course...

Reconocimiento Vocales AIU usando Clasificador Bayesiano

by Juan M Fonseca-Solís

2025

A solution is proposed to the problem of recognizing the vowels A-I-U using a Bayesian classifier. This is an interesting problem to tackle because it forms the basis for implementing systems for the automatic transcription of...

PERBANDINGAN SUARA SEORANG SEBELUM TERTAWA DAN SESUDAH TERTAWA

by TATA WIDYAWASIH

2024, Teknik Biomedis Institut Teknologi Sumatera

Laughter is a common vocal response in social interaction that can affect the characteristics of an individual's voice. This study aims to compare the characteristics of the voice signal before and after laughing. The research subjects consisted of...

Hybrid Wavelet-Based Algorithms with Fast Reconstruction Features

by Ileana Nicolae

2024

The paper is concerned with various algorithms for Discrete Wavelet Transform (DWT) based analysis of (non)stationary regimes involving almost sinusoidal waves, and for data communication applications, respectively....

Desarrollo De Un Sistema De Reconocimiento De Voz Para El Control De Dispositivos Utilizando Mixturas Gaussianas

by Daniel Escalante

2024

I thank and dedicate this thesis to: My wife, for her love and unconditional support at all times, and above all for the completion of this work. My parents, for being at all times a source of strength and an example for life, and for having forged in me...

Procesado digital de voz para el reconocimiento del hablante aplicado a dispositivos móviles

by Daniel Moral Bárcena

2024

This project studies the feasibility of implementing a biometric authentication application on mobile devices. A brief review has been carried out of the biometric recognition methods in use today and the...

Speaker Verification Using Acoustic and Prosodic Features

by Kshirod Sarmah

2024, Advanced Computing: An International Journal

In this paper we report an experiment carried out on a recently collected speaker recognition database, the Arunachali Language Speech Database (ALS-DB), to make a comparative study of the performance of acoustic and prosodic features for...

Emotion recognition from isolated Bengali speech

by Nahid Sultan

2024, Journal of theoretical and applied information technology

In recent times, emotion recognition from speech has been one of the hottest research topics in the field of Human Computer Interaction. Much research is under way on various languages, but for the Bengali language it is still at a very early stage....

Eñe’˜e: Sistema de reconocimiento automático del habla en Guaraní

by Diego Pinto-Roa

2024

Abstract. Guaraní is a language spoken by around eight million people. In Paraguay, about 25% of the population speaks only Guaraní. With the growing number of devices that include personal assistants...

Analysing spectral changes over time to identify articulatory impairments in dysarthria

by Michaela Pernon

2024, The Journal of the Acoustical Society of America

Identifying characteristics of articulatory impairment in speech motor disorders is complicated due to the time-consuming nature of kinematic measures. The goal is to explore whether analysing the acoustic signal in terms of total squared...

Compresión de audio en el espacio transformado

by Oscar Bria

2024

Automatic detection of laryngeal pathologies using cepstral analysis in Mel and Bark scales

by Julian Arias-Londoño

2024

Category (2). Problems in voice production can appear due to functional disorders and laryngeal pathologies. The presence of laryngeal pathologies can cause significant changes in the vibrational patterns of the vocal folds and it is...

Re-Fake: Klasifikasi Akun Palsu di Sosial Media Online menggunakan Algoritma RNN

by eka wanda

2024, Prosiding Seminar Nasional Sains Teknologi dan Inovasi Indonesia (SENASTINDO)

An Online Social Network (OSN) is a social media application that enables public communication and information sharing. However, fake accounts on an OSN can spread false information from unknown sources. This is a task that...

APLIKASI PENGENALAN PENUTUR PADA IDENTIFIKASI SUARA PENELEPON MENGGUNAKAN MEL-FREQUENCY CEPSTRAL COEFFICIENT DAN VECTOR QUANTIZATION (Studi Kasus : Layanan Hotline Universitas Pembangunan Nasional “Veteran” Yogyakarta)

by Muhammad Munawir Rasyid

2024, Telematika

The hotline service of Universitas Pembangunan Nasional “Veteran” Yogyakarta is a service that can be used by anyone. It is used by lecturers and staff to share information with departments located in...

Efficient and Robust Multimodal Biometric System for Feature Level Fusion (Speech and Signature)

by Gaganpreet Kaur

2024, International Journal of Computer Applications

A pattern can be characterized by more or less rich and varied pieces of information from different features. Fusing these different sources of information provides an opportunity to develop a more efficient biometric system which is...

Continuous Hindi Speech Recognition in Real Time Using NI LabVIEW

by Shabana Urooj

2024, Advances in Intelligent Systems and Computing

Speech is the most common form of communication, and the need of the hour is a robust speech recognition system. This paper aims to present an algorithm to design a continuous speech recognition system. The recognition of the speech...

Fig. 5 MFCC generation, log of filter bank energies and applying it to DCT
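The pipeline named in this caption (filter bank energies, their logarithm, then a DCT) can be illustrated with a minimal NumPy sketch. This is not the paper's LabVIEW implementation: the function names, the 26-filter and 13-coefficient defaults, the 512-point FFT, the Hamming window, and the common 2595·log10(1 + f/700) mel formula are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge, equal to 1 at the center bin
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(x)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ x

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13, n_fft=512):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2 / n_fft
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    return dct_ii(log_energies, n_coeffs)
```

Calling `mfcc_frame(frame, 16000)` on a 25 ms frame (400 samples at 16 kHz) returns a vector of 13 coefficients; a full front end would repeat this over overlapping frames.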

Clasificador automático de clase (adulto-cría) mediante características distintivas en vocalizaciones de manatíes

by hazel pacheco

2024, Revista de Iniciación Científica

This document develops some specific methods for building a classifier able to distinguish adult manatees from calves in different audio recordings. The duration of the vocalizations was evaluated with a method of...

Comparación de diversas parametrizaciones para reconocimiento de habla robusto en entorno telefónico

by Ricardo Córdoba

2023

In this article, we investigate the performance of different types of acoustic features in an automatic speech recognition system (SRAH, from its Spanish acronym) in a telephone environment. Specifically, we explore two different alternatives for the design...

Prototipo de silla de ruedas comandada por voz empleando hmm en un ambiente controlado (Prototype of wheelchair commanded by voice using HMM in a controlled environment)

by DIEGO FERNANDO MENDEZ MEDINA

2023, Ingeniería Investigación y Desarrollo

This article presents the development of a speaker-independent isolated-word recognition system for commanding a wheelchair. Each word is encoded using linear prediction and real cepstrum techniques, and...

Speech to text for Indonesian homophone phrase with Mel Frequency Cepstral Coefficient

by novy nur

2023, 2016 International Conference on Computational Intelligence and Cybernetics

In this study, a speech-to-text system for homophone phrases in Indonesian was designed using Mel Frequency Cepstral Coefficient (MFCC) feature extraction. Feature extraction results were classified by comparing the two...

Primary investigation of sound recognition for a domotic application using support vector machines

by Jerome Boudy

2023, HAL (Le Centre pour la Communication Scientifique Directe)

The advent of modern communications and the low cost of some kinds of devices have resulted in a desire to equip elderly peoples' homes with sensors to monitor their activities and be forewarned of abnormal situations. In such an...

Toward Realigning Automatic Speaker Verification in the Era of COVID-19

by Awais Khan and

2023

The use of face masks has increased dramatically since the COVID-19 pandemic started in order to curb the spread of the disease. Additionally, breakthrough infections caused by the Delta and Omicron variants have further increased the importance of wearing a face mask, even for vaccinated individuals. However, the use of face masks also induces attenuation in speech signals, and this change may impact speech processing technologies, e.g., automated speaker verification (ASV) and speech to text conversion. In this paper we examine Automatic Speaker Verification (ASV) systems against speech samples in the presence of three different types of face mask: surgical, cloth, and filtered N95, and analyze the impact on acoustics and other factors. In addition, we explore the effect of different microphones, and distance from the microphone, and the impact of face masks when speakers use ASV systems in real-world scenarios. Our analysis shows a significant deterioration in performance when an ASV system encounters different face masks, microphones, and variable distance between the subject and microphone. To address this problem, this paper proposes a novel framework to overcome performance degradation in these scenarios by realigning the ASV system. The novelty of the proposed ASV framework is as follows: first, we propose a fused feature descriptor by concatenating the novel Ternary Deviated overlapping Patterns (TDoP), Mel Frequency Cepstral Coefficients (MFCC), and Gammatone Cepstral Coefficients (GTCC), which are used by both the ensemble learning-based ASV and anomaly detection system in the proposed ASV architecture. Second, this paper proposes an anomaly detection model for identifying vocal samples produced in the presence of face masks. Next, it presents a Peak Norm (PN) filter to approximate the signal of the speaker without a face mask in order to boost the accuracy of ASV systems. Finally, the features of filtered samples utilizing the PN filter and samples without face masks are passed to the proposed ASV to test for improved accuracy. The proposed ASV system achieved an accuracy of 0.99 and 0.92, respectively, on samples recorded without a face mask and with different face masks. Although the use of face masks affects the ASV system, the PN filtering solution overcomes this deficiency by up to 4%. Similarly, when exposed to different microphones and distances, the PN approach enhanced system accuracy by up to 7% and 9%, respectively. The results demonstrate the effectiveness of the presented framework against an in-house prepared, diverse Multi Speaker Face Masks (MSFM) dataset (IRB No. FY2021-83), consisting of samples of subjects taken with a variety of face masks and microphones, and from different distances.

Infant Cries Identification by Using Codebook as Feature Matching, and MFCC as Feature Extraction

by agus buono

2023

In this paper, we focus on automating Dunstan Baby Language. The system uses MFCC for feature extraction and a codebook for feature matching. The codebook of clusters is built from all of the baby cry data, by using the...
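Codebook-based feature matching of this kind can be sketched as follows. The abstract is cut off before it names the clustering method, so the plain k-means used here is an assumption, as are the function names, the Euclidean distance, and the class labels; the sketch only illustrates the general idea of one codebook per class and classification by lowest average distortion.

```python
import numpy as np

def train_codebook(features, n_codewords, n_iters=20, seed=0):
    """Cluster feature vectors (rows) into a codebook with plain k-means.

    k-means is an assumption here; the source abstract does not name its method.
    """
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), n_codewords, replace=False)
    codebook = features[start].astype(float)
    for _ in range(n_iters):
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for j in range(n_codewords):
            members = features[nearest == j]
            if len(members) > 0:           # keep the old codeword if a cluster empties
                codebook[j] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average distance from each feature vector to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()

def classify(features, codebooks):
    """Return the label whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda label: distortion(features, codebooks[label]))
```

In use, `codebooks` would be a dictionary mapping each (hypothetical) cry label to the codebook trained on MFCC vectors of that class, and `classify` would be called with the MFCC vectors of an unseen recording.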

The Difficulty of Pronouncing English Fricatives by Speakers of Indo-European Language

by Cristine Natalia

2023

Pronouncing English fricatives is problematic for many people, especially speakers of the Indo-European languages covered in this research: Norwegian, French, Slovak, Slovenian, Bulgarian, Portuguese, Czech, Spanish, Dutch, and Greek. This research is...

Implementación De Un Reconocedor De Palabras Aisladas Dependiente Del Locutor

by Roberto Carrillo

2023, Revista Facultad de Ingeniería - Universidad de Tarapacá

This paper presents a speaker-dependent isolated-word recognition system. Each word is encoded using linear prediction and real cepstrum techniques, while the classification stage is carried out...

Analysing the Performance of Speaker Verification Task using Different Features

by Dr. L. Kavitha, Physics

2023, International Journal of Computer Applications

Speaker recognition, also called "voice recognition", is the identification of the person who is speaking from the characteristics of their voice. The components of speaker recognition include Speaker Identification (SI) and Speaker...

Teknik Audio Forensik Dengan Metode Minkowski Untuk Pengenalan Rekaman Suara Pelaku Kejahatan

by Muhamad Azwar

2023, Cyber Security dan Forensik Digital

Criminal acts usually leave behind at least some digital evidence in the form of voice recordings produced by telephone conversations; many algorithms for analyzing recorded voice...

Compresión de la señal electrocardiográfica (ECG)

by giovanni ruiz

2023, UMBral Científico

The ECG signal is one of the most important biomedical signals for diagnosing different pathologies, and for that reason it is necessary to look for new methods of compressing the signal in search of fast transmission and best...

An Extended Variational Mode Decomposition Algorithm Developed Speech Emotion Recognition Performance

by David HASON RUDD

2023, Springer, Cham

Emotion recognition (ER) from speech signals is a robust approach since it cannot be imitated like facial expression or text-based sentiment analysis. Valuable information underlying the emotions is significant for human-computer...

Нейросетевая оценка компетенций персонала

by Михаил Кричевский

2023, Russian Journal of Labor Economics

The paper presents results of personnel competency assessment obtained with a neural network approach. The term "competencies" is used in many scientific disciplines and practical applications. In personnel management, a competency is a formally described set of requirements for an employee's professional qualities. The importance of competencies is emphasized in the international ISO 9001 standards and in the models of national and transnational product quality awards. However, no definitive view on how to assess competencies has yet been formed. Various methods and techniques exist that yield an assessment of employee competencies from one standpoint or another, and these most often come down to a subjective approach. This work proposes using artificial neural networks to assess competencies, shows that they make it possible to classify employees by competency level, and presents results of simulating the neural network with Simulink. FUNDING. The research was carried out with the financial support of the Russian Foundation for Basic Research (RFBR) under research project No. 18-010-00338A. KEYWORDS: employee competency, competency assessment, neural network, simulation of the assessment system.

Efficient and Robust Multimodal Biometric System for Feature Level Fusion (Speech and Signature)

by Dapinder Kaur

2023, International Journal of Computer Applications

A pattern can be characterized by more or less rich and varied pieces of information from different features. Fusing these different sources of information provides an opportunity to develop a more efficient biometric system which is...

Experiencia del I3A en la Evaluación de Reconocimiento de Locutor NIST 2008

by Jesus Villalba

2023

This article describes the speaker recognition system implemented by I3A for the NIST 2008 evaluation. Two basic systems are available: GMM-UBM likelihood ratio and GMM-SVM. The signals provided by NIST...

Automatic Urdu Speech Recognition using Hidden Markov Model

by Md.hazrat Ali

2023, 2016 International Conference on Image, Vision and Computing (ICIVC)

In this paper, we present an approach to developing an automatic speech recognition (ASR) system for isolated Urdu words. Our experimentation is based on a medium-vocabulary speech corpus of Urdu, consisting of 250 words. We develop our...

Pengenalan Dialek Bahasa Daerah di Pulau Jawa menggunakan Metode Mel-Frequency Cepstral Coefficients dan Adaptive Network-based Fuzzy Inference System

by Latiful hayat

2023, Jurnal Riset Rekayasa Elektro

Indonesia is a large country with great cultural and ethnic diversity, so each region has many different languages or dialects. Much research in speech signal processing has been developed....

Effects of Filter Numbers and Sampling Frequencies on the Performance of MFCC and PLP based Bangla Isolated Word Recognition System

by Md. Shaikh Abrar Kabir

2023, International Journal of Image, Graphics and Signal Processing

In this work, a 5-state left-to-right HMM-based Bangla isolated-word speech recognizer has been developed. To train and test the recognizer, a small corpus at various sampling frequencies has been developed in noisy as well as the...

APLIKASI PENGENALAN PENUTUR PADA IDENTIFIKASI SUARA PENELEPON MENGGUNAKAN MEL-FREQUENCY CEPSTRAL COEFFICIENT DAN VECTOR QUANTIZATION (Studi Kasus : Layanan Hotline Universitas Pembangunan Nasional “Veteran” Yogyakarta)

by Muhammad Rasyid

2023, Telematika

The hotline service of Universitas Pembangunan Nasional “Veteran” Yogyakarta is a service that can be used by anyone. It is used by lecturers and staff to share information with departments located in...

Bangla Phoneme Recognition: Probabilistic Approach

by Md. Shafiul Alam Chowdhury

2023

A phoneme is one of a small set of basic sounds in any spoken human language. Much research has been conducted in ASR over the last few decades. Phoneme recognition with 40 native speakers is considered with...

Re-Fake: Klasifikasi Akun Palsu di Sosial Media Online menggunakan Algoritma RNN

by Putra Wanda

2023, Prosiding Seminar Nasional Sains Teknologi dan Inovasi Indonesia (SENASTINDO)

An Online Social Network (OSN) is a social media application that enables public communication and information sharing. However, fake accounts on an OSN can spread false information from unknown sources. This is a task that...

Emotion recognition from isolated Bengali speech

by 356 MD. SABBIR HOSSAIN

2023, Journal of theoretical and applied information technology

In recent times, emotion recognition from speech has been one of the hottest research topics in the field of Human Computer Interaction. Much research is under way on various languages, but for the Bengali language it is still at a very early stage....

Emotion recognition from isolated Bengali speech

by Md Sabbir Hossain

2023, Journal of theoretical and applied information technology

In recent times, emotion recognition from speech has been one of the hottest research topics in the field of Human Computer Interaction. Much research is under way on various languages, but for the Bengali language it is still at a very early stage....

Acoustic Analysis of Japanese Sibilant Sounds for Japanese Language Learners at Indonesia, whose mother tongue is Javanese

by Jurnal Chie

2023, CHI'E Jurnal Pendidikan Bahasa Jepang (Journal of Japanese Learning and Teaching)

This study aims to analyze the characteristics of Japanese sibilant sounds through acoustic analysis. The data were sound samples from 16 respondents, Javanese-speaking learners of Japanese at the...

Detección de signos respiratorios patológicos en poblaciones avícolas productivas mediante procesamiento digital de señales acústicas

by Cristian Kuhn

2023

Algoritmo de Identificación de Patrones del Idioma Español, a Través de Señales de Habla Sub-Vocal Utilizando Transformada Wavelet e Inteligencia Artificial

by Oscar Daniel

2023

In modern times there are several methods of communication among people and even with our machines; these allow us to create ties between people through direct contact or via an interface for human-machine interaction. These techniques have been...

Desarrollo De Un Sistema De Reconocimiento De Voz Para El Control De Dispositivos Utilizando Mixturas Gaussianas

by daniel escalante

2023

I thank and dedicate this thesis to: My wife, for her love and unconditional support at all times, and above all for the completion of this work. My parents, for being at all times a source of strength and an example for life, and for having forged in me...

Bangla Speaker Accent Variation Detection by MFCC Using Recurrent Neural Network Algorithm: A Distinct Approach

by RAKIBUL ISLAM

2023, Innovations in Computer Science and Engineering

A number of accent differentiation applications detect the different accents in assorted languages. Most previous studies are based on the English language and different languages throughout...

Obtención de sonogramas con los formantes realzados usando el método de predicción lineal (LPC)

by Jesus Bernal

2023

Reconocimiento del locutor en telefonia: actividades del proyecto europeo COST 250

by Luciano Rodríguez

2023

The aim of this paper is to present the activities carried out since November 1994 within the project “Speaker Recognition in Telephony”, funded by the European Community under the programme “European Cooperation in...
