2014, 2014 5th International Conference - Confluence The Next Generation Information Technology Summit (Confluence)
The classical front-end analysis in speech recognition is a spectral analysis that parameterizes the speech signal into feature vectors. This paper proposes a voice recognition model that automatically classifies and recognizes a voice signal with background noise. The model uses the spectrogram, pitch period, short-time energy, zero-crossing rate, mel frequency scale, and cepstral coefficients to calculate feature vectors. k-Nearest Neighbor (k-NN) classification is used to classify and recognize the real-time input signal. The analytic hierarchy process (AHP) is used to decide the weighting of the different features.
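As an illustration of the classification step, here is a minimal sketch of k-NN over weighted feature vectors. The feature values, speaker labels, and weights are hypothetical: the abstract does not report the AHP-derived weights, so fixed illustrative weights stand in for them, and a weighted Euclidean distance is assumed.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def knn_classify(query, training, weights, k=3):
    """Return the majority label among the k nearest training vectors.

    training: list of (feature_vector, label) pairs.
    """
    ranked = sorted(training, key=lambda item: weighted_distance(query, item[0], weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-normalized features: (pitch period, energy, ZCR).
TRAINING = [
    ((0.20, 0.80, 0.10), "speaker_a"),
    ((0.25, 0.75, 0.15), "speaker_a"),
    ((0.22, 0.82, 0.12), "speaker_a"),
    ((0.70, 0.30, 0.60), "speaker_b"),
    ((0.75, 0.35, 0.55), "speaker_b"),
    ((0.72, 0.28, 0.62), "speaker_b"),
]

# Assumed feature weights (not taken from the paper); they sum to 1.
WEIGHTS = (0.5, 0.3, 0.2)
```

Calling `knn_classify((0.21, 0.79, 0.11), TRAINING, WEIGHTS)` votes among the three nearest stored vectors. Larger weights make a feature count more in the distance, which is the role the abstract assigns to the AHP weighting.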
Speech recognition is the process of automatically recognizing a person's spoken words based on the information content of the speech signal. Many reviews and surveys have covered voice feature extraction techniques, but most have not reviewed the techniques exhaustively and empirically. This paper provides an empirical review, with the relevant algorithmic calculations, of each feature extraction technique for voice recognition, and discusses the techniques and systems that make it possible for computers to accept voice as input. It shows the major developments in the field of voice analytics and gives detailed information on the three main feature extraction techniques: Linear Predictive Coding (LPC), Mel-frequency cepstral coefficients (MFCCs), and the RASTA filtering technique. The objective of this paper is to summarize the feature extraction techniques used in speech recognition systems and to provide an empirical value for each technique. The words "voice" and "speech" are used interchangeably in this context.
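MFCC extraction rests on the mel frequency scale. The sketch below uses one commonly cited Hz-to-mel formula, m = 2595 log10(1 + f/700); the abstract does not say which variant the paper uses, so treat the formula and the function names as assumptions.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (a common formula; variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(f_low, f_high, n_filters):
    """Center frequencies in Hz of n_filters points equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

Equal steps in mels become progressively wider steps in Hz, so a mel filter bank resolves low frequencies more finely than high ones.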
Journal of Signal and Information Processing, 2014
This paper presents an expert system for security based on biometric human features that can be obtained without any contact with the registering sensor. These features are extracted from the human voice, so the system is called a Voice Recognition System (VRS). The proposed system combines three stages: signal pre-processing, feature extraction using the Wavelet Packet Transform (WPT), and feature matching using Artificial Neural Networks (ANNs). The feature vectors are formed in two steps: first, the speech signal is decomposed at level 7 with the Daubechies 20-tap (db20) wavelet; second, the energy corresponding to each WPT node is calculated, and these energies are collected to form a feature vector. A feature vector of 128 values for each speaker was fed to the Feed Forward Back-propagation Neural Network (FFBPNN). The data used in this paper are drawn from the English Language Speech Database for Speaker Recognition (ELSDSR), which consists of audio files for training and other files for testing. The performance of the proposed system is evaluated using the test files. The results show that the rate of correct recognition is about 100% for the training files and 95.7% for one testing file per speaker from the ELSDSR database. The proposed method performed better than the well-known Mel Frequency Cepstral Coefficient (MFCC) and Zak transform methods.
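A full wavelet packet decomposition at level 7 yields 2^7 = 128 terminal nodes, so taking one energy per node gives 128 features. The sketch below illustrates that idea with the orthonormal Haar wavelet substituted for db20 to keep the code short; it is not the paper's filter, and the function names are hypothetical. Nodes are returned in natural decomposition order, not frequency order.

```python
import math

def haar_split(signal):
    """One orthonormal Haar analysis step: (approximation, detail) halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wpt_energy_features(signal, level):
    """Energy of every wavelet packet node at the given level (2**level values)."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            # Unlike the plain wavelet transform, both halves are split again.
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]
```

Because the Haar step is orthonormal, the node energies add up to the energy of the input signal, which is a convenient sanity check on the feature vector.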
International Journal of Trend in Scientific Research and Development, 2018
Speech processing is one of the important application areas of digital and analog signal processing. It is used in real-world processing of human language, such as human-computer interface systems for home, industry, and medicine. Speech is the most common means of communication because the information it carries plays the fundamental role in conversation. Speech recognition converts an acoustic signal, captured by a microphone or a telephone, into a set of words. Research fields within speech processing include speech recognition, speaker recognition, speech synthesis, and speech coding. Speech recognition is the process of automatically recognizing a person's spoken words based on the information content of the speech signal. This paper gives a brief study of Automatic Speech Recognition and discusses the various classification techniques that have been developed in this wide area of speech processing. Its objective is to study some of the well-known methods that are widely used in the several stages of a speech recognition system.
International Journal of Computer Applications
This paper introduces a novel methodology for processing a wave file and creating a feature array for it; this array can later be used to recognize the voice file. A set of experiments is performed to show that the calculated feature array is unique, that is, that the feature array created for a given wave file does not match the feature array of any other wave file. The proposed methodology reduces the effort of voice recognition by minimizing both the time needed to create the feature array and the size of the calculated array.
The paper presents the design of a speech recognition system that uses preprocessing, feature extraction, and classification stages. In the preprocessing stage, de-noising is applied to obtain speech data without noise. In the feature extraction stage, Linear Predictive Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), and spectrogram methods are used to extract the features of the word. Neural Networks (NN) are used to classify the spoken words into different patterns, so that the system can recognize unknown spoken words according to these patterns. A set of spoken words is used in the simulation of the system. Comparative results for the system are provided using the feature extraction methods mentioned above.
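Of the feature extraction methods named above, LPC is compact enough to sketch. The example assumes the autocorrelation method solved with the Levinson-Durbin recursion, a standard way to compute LPC coefficients; the abstract does not say which formulation the paper uses, and the function names are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation values r[0..max_lag] of a finite frame."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (coeffs, error) where the prediction is
    x_hat[n] = sum(coeffs[k - 1] * x[n - k] for k in 1..order).
    """
    if r[0] == 0:
        raise ValueError("frame has zero energy")
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def lpc(signal, order):
    """LPC coefficients and residual prediction error for one frame."""
    return levinson_durbin(autocorrelation(signal, order), order)
```

For a decaying exponential such as `[0.9 ** n for n in range(200)]`, each sample is 0.9 times the previous one, so the first coefficient comes out close to 0.9 and the second close to zero.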
2017
Recognizing the speaker can simplify the task of translating speech in systems that have been trained on specific persons' voices, or it can be used to authenticate or verify the identity of a speaker as part of a security process. This work discusses the implementation of an enhanced speaker recognition system using MFCC and the LBG algorithm. MFCC has been used extensively for speaker recognition. This work augments the existing work with vector quantization and classification using the Linde-Buzo-Gray (LBG) algorithm. A complete test system has been developed in MATLAB; it can be used for real-time testing because it can take input directly from the microphone. The design can therefore be translated into hardware that meets the necessary real-time processing prerequisites. The system has been tested using the VidTIMIT database and the performance metrics False Acceptance Rate (FAR), True Acceptance Rate (TAR), and False Rejection Rate (FRR). The system has been...
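The LBG algorithm builds a vector quantization codebook by repeatedly splitting codewords and refining them with nearest-neighbor clustering. The following is a minimal sketch under stated assumptions: plain Python lists stand in for MFCC vectors, the split factor `eps` and the fixed iteration count are illustrative choices, and the multiplicative split assumes the data are not centered exactly at the origin. It is not the MATLAB system described above.

```python
def _dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg_codebook(vectors, size, eps=0.01, iterations=20):
    """Grow a codebook by repeated splitting; size should be a power of two."""
    codebook = [_mean(vectors)]
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[c * (1 + s) for c in word]
                    for word in codebook for s in (eps, -eps)]
        for _ in range(iterations):
            clusters = [[] for _ in codebook]
            for v in vectors:
                nearest = min(range(len(codebook)),
                              key=lambda i: _dist2(v, codebook[i]))
                clusters[nearest].append(v)
            # Keep the old codeword when its cluster is empty.
            codebook = [_mean(c) if c else w
                        for c, w in zip(clusters, codebook)]
    return codebook

def distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    return sum(min(_dist2(v, w) for w in codebook) for v in vectors) / len(vectors)
```

In a speaker recognition setting, one codebook would typically be trained per enrolled speaker, and a test utterance assigned to the speaker whose codebook gives the lowest `distortion`.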
Science Journal of Circuits, Systems and Signal Processing
This paper presents a voiced/unvoiced classification algorithm for noisy speech signals based on two acoustic features of the speech signal. Short-time energy and short-time zero-crossing rate are among the most distinctive time-domain features for classifying a speech signal into voiced and unvoiced segments. A new idea is developed in which frame-by-frame processing is performed on the narrowband speech signal using its spectrogram image. Two time-domain features, short-time energy (STE) and short-time zero-crossing rate (ZCR), are used to classify the voiced and unvoiced parts. In the first stage, each frame of the spectrogram under analysis is divided into three separate sub-bands and their short-time energy ratio pattern is examined. An energy ratio pattern-matching look-up table is then used to classify the voicing activity. This method successfully classifies patterns 1 through 4 but fails on the remaining patterns in the look-up table. The remaining patterns are therefore resolved in a second stage, where the frame-wise short-time average zero-crossing rate is compared with a threshold value. In this study, the threshold value is calculated from the short-time average zero-crossing rate of white Gaussian noise (wGn). The accuracy of the proposed method is evaluated using both male and female speech waveforms under different signal-to-noise ratios (SNRs). Experimental results show that the proposed method achieves better accuracy than the conventional methods in the literature.
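The two time-domain features can be sketched as follows. This is a simplified single-stage rule (high energy and low ZCR means voiced), not the paper's two-stage sub-band look-up method; the thresholds and function names are assumptions chosen for illustration, whereas the paper derives its ZCR threshold from white Gaussian noise.

```python
def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def classify_frame(frame, energy_threshold, zcr_threshold):
    """Label a frame 'voiced' when energy is high and ZCR is low, else 'unvoiced'."""
    if (short_time_energy(frame) >= energy_threshold
            and zero_crossing_rate(frame) <= zcr_threshold):
        return "voiced"
    return "unvoiced"
```

A 100 Hz sinusoid sampled at 8 kHz has high energy and few zero crossings, so it is labelled voiced; a low-amplitude signal that changes sign at every sample is labelled unvoiced.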
Journal 4 Research - J4R Journal, 2017
A voice recognition system converts the human voice into a signal that machines can understand. Once this is achieved, the machine can be made to work as desired. The machine could be a computer, a typewriter, or even a robot. There are also systems in which the machine "speaks" the recorded word, but those are outside the scope of this paper; here, only the human is expected to talk. Furthermore, the voice recognition systems described here can be used for projects only.
International Journal of Computer Applications, 2015
Controlling devices by voice has always intrigued mankind. Today, after intense research, speech recognition systems have made a niche for themselves and can be seen in many walks of life. The accuracy of speech recognition systems remains one of the most important research challenges, because of factors such as noise, speaker variability, language variability, vocabulary size, and domain. The design of a speech recognition system requires careful attention to issues such as the types of speech classes and speech representation, speech preprocessing stages, feature extraction techniques, databases, and performance evaluation. This paper presents the advances made and highlights the pressing problems for speech recognition systems. It also divides the system into a front end and a back end for better understanding and representation of each part of a speech recognition system.
Voice recognition is the identification of a speaker on the basis of the characteristics of the voice. To achieve this, features of speech patterns that differ between individuals are used. This paper discusses speaker recognition systems. Implementing a speaker's voice recognition system in MATLAB makes it possible to use voice in real-life applications. The paper provides a brief review of the different DSP-based techniques applied to speech recognition.