Voice recognition has become one of the most important tools of the modern generation and is widely used in many fields for many purposes. The past decade has seen dramatic progress in voice recognition technology, to the extent that systems and high-performance algorithms have become accessible. Voice recognition system performance is commonly specified in terms of speed and accuracy; recognition accuracy is the most important and straightforward measure. This research was proposed to review several voice recognition algorithms in terms of detection accuracy and processing overhead, to identify the algorithm that gives the best trade-off between processing cost (speed, power) and accuracy, and to implement and verify the chosen algorithm using MATLAB. Ten words were spoken in isolation by four speakers, male and female, using MATLAB as the simulation environment. These words were used as reference signals to train the algorithms, and in the evaluation phase all algorithms were subjected to the same test criteria. In the simulation results, the Wiener filter algorithm outperformed the other four algorithms on all performance measures and on power requirement, with moderate algorithmic complexity and good prospects for hardware implementation. The Wiener filter algorithm scored accuracies of 100%, 5%, and 50% for test cases i, ii, and iii respectively, with a recognition time of 695-867 ms and an estimated power of 750-885 µW.
Voice recognition has evolved into one of the most essential instruments of the contemporary period, with applications in a wide range of industries. Voice recognition technology has advanced dramatically in the last decade, to the point that systems and high-performance algorithms are now widely available. Speech recognition system performance is most often described in terms of speed and accuracy, and recognition accuracy is the most essential and simplest measure. The goal of this paper was to compare several speech recognition algorithms in terms of detection accuracy and processing overhead, to find the best trade-off between processing cost (speed, power) and accuracy, and to implement and verify the chosen algorithm using MATLAB. Ten words, given as "yes" and "no", were spoken in isolation by male speakers using MATLAB as the simulation environment. These words were used as reference signals to train the algorithms, and in the evaluation phase all algorithms were subjected to the same test criteria. In the simulation results, the Wiener filter algorithm outperforms the other algorithms on all performance measures and on power requirement, with moderate algorithmic complexity and good prospects for hardware implementation.
The aim of this thesis is to investigate speech recognition algorithms. The author programmed and simulated two speech recognition systems in MATLAB. The first is based on the shape of the cross-correlation plot. The second uses the Wiener filter. The simulations use a microphone to record spoken words. When the program runs, MATLAB asks the user to record three words. The first and second recordings are different words, which serve as the reference signals. The third recording repeats one of the first two words. Each recording is sampled and stored in MATLAB, which then decides, according to the programmed algorithm, which of the two reference words was spoken in the third recording. The author invited people from different countries to test the systems. The simulation results show that both systems work well when the two reference recordings and the third recording come from the same person, and that both have defects when the reference recordings and the third recording come from different people. If the testing environment is quiet enough and the same person makes all three recordings, the probability of successful recognition approaches 100%. The designed systems therefore work well for basic speech recognition.
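The two-reference decision described above can be sketched as follows. This is a minimal illustration, not the thesis's MATLAB code: it assumes the decision uses only the peak of the normalized cross-correlation, whereas the thesis relies on the shape of the cross-correlation plot, and the function names `normalized_xcorr_peak` and `classify` are hypothetical.

```python
import math

def normalized_xcorr_peak(x, y):
    """Peak magnitude of the cross-correlation of x and y over all lags,
    normalized by the signal energies so that identical shapes score 1.0."""
    energy = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if energy == 0:
        return 0.0
    best = 0.0
    # Slide y across x, covering every lag with at least one overlapping sample.
    for lag in range(-(len(y) - 1), len(x)):
        s = 0.0
        for i, yv in enumerate(y):
            j = i + lag
            if 0 <= j < len(x):
                s += x[j] * yv
        best = max(best, abs(s) / energy)
    return best

def classify(test, ref_a, ref_b):
    """Return 'A' or 'B' for the reference that correlates more strongly
    with the test recording (assumed decision rule)."""
    peak_a = normalized_xcorr_peak(test, ref_a)
    peak_b = normalized_xcorr_peak(test, ref_b)
    return "A" if peak_a >= peak_b else "B"
```

Because the score is normalized by energy, a quieter or time-shifted repetition of a reference word still scores close to 1.0. A different speaker changes the waveform itself, which is consistent with the failures reported across speakers.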
Speech is the most natural means of communication among human beings, and speech processing and recognition have been intensive areas of research for the last five decades. Since speech recognition is a pattern recognition problem, classification is an important part of any speech recognition system. In this work, a speech recognition system is developed for recognizing speaker-independent spoken digits in Malayalam. Voice signals are sampled directly from the microphone. The proposed method is implemented for 1000 speakers uttering 10 digits each. Since the speech signals are affected by background noise, the noise is removed using a wavelet denoising method based on soft thresholding. Features are extracted using Discrete Wavelet Transforms (DWT), which are well suited to non-stationary signals like speech because of their multiresolution, multi-scale analysis characteristics. Speech recognition is a multiclass classification problem, so the resulting feature vectors are classified using three classifiers that can handle multiple classes: Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Naive Bayes. During the classification stage, the classifiers are trained on feature vectors with known labels and then tested on the test data set. The performance of the classifiers is evaluated by recognition accuracy. All three methods produced good recognition accuracy: DWT with ANN reached 89%, DWT with SVM reached 86.6%, and DWT with Naive Bayes reached 83.5%. ANN is found to be the best of the three methods.
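Wavelet denoising by soft thresholding can be illustrated with a short sketch. The abstract does not name the wavelet family, decomposition depth, or threshold rule, so the one-level Haar transform and the fixed threshold `t` below are assumptions chosen for brevity, and all function names are hypothetical.

```python
import math

def soft_threshold(c, t):
    """Shrink coefficient c toward zero by t; magnitudes below t become zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from both coefficient sets."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def denoise(x, t):
    """Soft-threshold the detail coefficients and reconstruct the signal."""
    approx, detail = haar_dwt(x)
    return haar_idwt(approx, [soft_threshold(d, t) for d in detail])
```

Small detail coefficients, which mostly carry noise, are set to zero, while larger ones are kept but reduced by `t`. That reduction is what distinguishes soft from hard thresholding.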
International Journal of Computer Applications, 2017
Speech is the primary mode of communication among human beings. Communication between a human and a computer is referred to as the human-computer interface, and speech can be used to communicate with a computer. Speech recognition research is becoming increasingly active, and researchers today are working to extend what computers can do with spoken words. This paper classifies the algorithms through which an uttered word can be converted into a computer-intelligible form. The challenges in speech recognition are enumerated and analyzed for the most popular recognition techniques in use today. The analysis ends with a brief description of some applications of speech recognition.
2017
Recognizing the speaker can simplify the task of translating speech in systems that have been trained on specific people's voices, or it can be used to authenticate or verify the identity of a speaker as part of a security process. This work discusses the implementation of an enhanced speaker recognition system using MFCC and the LBG algorithm. MFCC has been used extensively for speaker recognition. This work augments existing work by using vector quantization and classification with the Linde-Buzo-Gray algorithm. A complete test system has been developed in MATLAB, which can be used for real-time testing because it takes input directly from the microphone. The design can therefore be translated into hardware that meets the necessary real-time processing prerequisites. The system has been tested on the VidTIMIT database using the performance metrics of False Acceptance Rate (FAR), True Acceptance Rate (TAR) and False Rejection Rate (FRR). The system has been...
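The three metrics named above follow directly from counts of accepted and rejected trials. The sketch below uses the standard definitions and assumes a verification system that accepts a trial when its match score is at or above a threshold; the scoring convention and the function name `verification_rates` are illustrative assumptions, not details from the paper.

```python
def verification_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR, TAR) for a system that accepts when score >= threshold.

    FAR: fraction of impostor trials wrongly accepted.
    FRR: fraction of genuine trials wrongly rejected.
    TAR: fraction of genuine trials correctly accepted, equal to 1 - FRR.
    """
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr, 1.0 - frr
```

Raising the threshold lowers FAR at the cost of a higher FRR, so the three numbers are only meaningful when reported at a stated operating point.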
Automatic voice recognition is a computerized speech-to-text process in which voice is usually recorded with acoustic microphones that capture air pressure changes. Such air-transmitted voice signals are prone to two kinds of problems, relating to robustness and applicability. The former means that mixing speech signals with ambient noise usually degrades the performance of an automatic voice recognition system. The latter means that speech can easily be overheard on an air transmission channel, which often results in loss of privacy or annoyance to other people.
Journal of Signal and Information Processing, 2014
In this paper, an expert system for security is presented, based on biometric human features that can be obtained without any contact with the registering sensor. These features are extracted from the human voice, so the system is called a Voice Recognition System (VRS). The proposed system combines three stages: signal pre-processing, feature extraction using the Wavelet Packet Transform (WPT), and feature matching using Artificial Neural Networks (ANNs). The feature vectors are formed in two steps: first, the speech signal is decomposed at level 7 with the Daubechies 20-tap (db20) wavelet; second, the energy of each WPT node is calculated, and these energies are collected to form a feature vector. A 128-element feature vector for each speaker was fed to the Feed Forward Back-propagation Neural Network (FFBPNN). The data used in this paper are drawn from the English Language Speech Database for Speaker Recognition (ELSDSR), which consists of audio files for training and other files for testing. The performance of the proposed system is evaluated using the test files. The results showed a correct recognition rate of about 100% for the training files and 95.7% for one testing file per speaker from the ELSDSR database. The proposed method performed better than the well-known Mel Frequency Cepstral Coefficient (MFCC) and the Zak transform.
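A full wavelet packet decomposition to level 7 has 2^7 = 128 terminal nodes, which matches the 128 features mentioned above. The sketch below computes one energy per terminal node. It substitutes Haar filters for the paper's db20 wavelet to stay short and dependency-free, so it illustrates the structure of the feature vector rather than reproducing the paper's values; the function names are hypothetical.

```python
import math

def haar_split(x):
    """Split an even-length sequence into low-pass and high-pass halves (Haar)."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def wpt_energies(x, level):
    """Energy of each terminal node of a full wavelet packet tree.

    Unlike the plain DWT, every node (not just the low-pass one) is split
    again at each level, giving 2**level terminal nodes in tree order.
    """
    if len(x) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    nodes = [list(x)]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_split(node)]
    return [sum(c * c for c in node) for node in nodes]
```

For example, `wpt_energies(signal, 7)` on a 1024-sample frame returns a list of 128 energies. Because the Haar transform is orthonormal, the energies sum to the energy of the input frame.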
This paper reviews research carried out over the last decade in the area of Automatic Speech Recognition (ASR) and discusses the major themes and advances of that period, in order to show the outlook of the technology and give an appreciation of the fundamental progress achieved in this important area of speech communication. After years of research and development, the accuracy of automatic speech recognition remains an important research challenge because of factors such as variation of context, environmental conditions, speaker variation and poor-quality audio. The design of a speech recognition system requires careful attention to the following issues: definition of the various types of speech classes, speech representation, techniques, databases and performance evaluation. The history and challenges of speech recognition systems, and the techniques proposed in various research works to address these challenges, are presented in chronological order. The objective of this paper is to compare and summarize well-known approaches used in the various steps of a speech recognition system.
IJCSIT, 2019
Speech recognition techniques are among the most important modern technologies. Many different systems have been developed, differing in the methods used for feature extraction and classification. Voice recognition includes two areas, speech recognition and speaker recognition; this research is confined to speech recognition. The research proposes to improve the performance of single-word recognition systems with an algorithm that combines more than one feature extraction technique with a modified neural network, and studies the effect on recognition and the effect of noise on the proposed system. Four speech recognition systems were studied. The first system adopted the MFCC algorithm to extract features. The second adopted the PLP algorithm, while the third combined the two previous algorithms with the zero-crossing rate. In the fourth system, the neural network used for classification was modified and the error rate was determined. The impact of noise on these systems was also studied. The results were compared in terms of the recognition rate and the training time of the neural network for each system separately, with the proposed system reaching a recognition rate of up to 98%.
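The zero-crossing rate, assumed here to be the rate feature that the third system combines with MFCC and PLP, is simple enough to show in a few lines. This is a generic per-frame definition; the paper's frame length and exact normalization are not given, and the function name is hypothetical.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ.

    Samples equal to zero are treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```

Unvoiced sounds such as fricatives tend to give a high rate and voiced sounds a low one, which is why this inexpensive measure is often appended to spectral features.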
Speech is the most prominent and primary mode of communication among human beings. Communication between a human and a computer is called the human-computer interface, and speech has the potential to be an important mode of interaction with computers. This paper gives an overview of the major technological perspectives and an appreciation of the fundamental progress of speech recognition, and also gives an overview of the techniques developed at each stage of speech recognition. It helps in choosing a technique by setting out the relative merits and demerits of each. A comparative study of the different techniques is carried out stage by stage. The paper concludes with a decision on the future direction for developing techniques in human-computer interface systems using different languages.
International Journal of Electrical & Computer Sciences …
Journal of Advanced Sciences and Engineering Technologies (JASET), 2018
International Journal of Computer Applications, 2015
International Journal of Electrical and Computer Engineering (IJECE), 2020
INTERNATIONAL JOURNAL OF ADVANCE RESEARCH, IDEAS AND INNOVATIONS IN TECHNOLOGY