International Conference on Communication and Signal Processing, April 6-8, 2016, India
Speech to Text Conversion for Multilingual Languages
Yogita H. Ghadage, Sushama D. Shelke
Abstract—The current work presents a multilingual Speech-To-Text conversion system. Conversion is based on the information contained in the speech signal. Speech is the most natural and important form of communication for human beings. A Speech-To-Text (STT) system takes a human speech utterance as input and produces a string of words as output. The objective of this system is to extract, characterize and recognize the information carried by the speech. The proposed system is implemented using the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction technique, with the Minimum Distance Classifier and Support Vector Machine (SVM) methods for speech classification. Speech utterances are pre-recorded and stored in a database, which is divided into two parts: training and testing. Samples from the training database are passed through the training phase and their features are extracted. Combining the features of each sample forms a feature vector, which is stored as a reference. A sample to be tested from the testing part is given to the system and its features are extracted. The similarity between these features and the reference feature vectors is computed, and the words having maximum similarity are given as output. The system is developed in the MATLAB (R2010a) environment.

Index Terms—Mel Frequency Cepstral Coefficients (MFCC), Minimum Distance Classifier, Speech Recognition, Speech-To-Text (STT), Support Vector Machine (SVM).

I. INTRODUCTION

Speech recognition is the procedure of extracting essential information from an input speech signal in order to make an accurate decision about the corresponding text. The speech signal conveys very rich information, such as speaker information and linguistic information, which has inspired many researchers to develop systems that automatically process speech, e.g. speech enhancement, speech synthesis, speech compression, speaker recognition, speech recognition and verification. Speech recognition can be further classified as speaker dependent and speaker independent [1]. With the help of a speech recognition mechanism a computer can follow human voice commands and understand human languages, i.e. speech recognition acts as a good interface for human-computer interaction.

Generally, today's speech recognition technologies are designed for the English language, so illiterate rural communities and educationally under-privileged people are kept away from computer technology. If processing in the native language is made possible, i.e. if computer technologies can understand the native language, then computers will become easy to use for illiterate people, people from rural communities and the educationally under-privileged. Marathi is the native language of Maharashtra. In day-to-day life we use English words while speaking, i.e. most of the time we mix English with the native language. The authors have therefore designed a multilingual Speech-To-Text conversion system in which Marathi, English and Marathi-English mixed speech are the focus. The objective of the proposed system is to design and implement a Speech-To-Text conversion system for the Marathi, English and Marathi-English mixed languages. The system has been developed for a small database which contains 10 Marathi sentences, 3 English sentences and 2 mixed sentences. This work is based on MFCC, SVM and the Minimum Distance Classifier [2].

The outline of the paper is as follows. Section II gives a brief overview of the system. Section III describes the speech database. Section IV explains MFCC feature extraction. Section V covers pattern classification. Section VI describes the experimental setup. Section VII discusses the results. Section VIII concludes the paper, and Section IX outlines future work.

Yogita H. Ghadage is with the Electronics and Telecommunication Engineering Department, NBN Sinhgad School of Engineering, Pune (e-mail: ghadagehyogita@[Link]).

Sushama D. Shelke is with the Electronics and Telecommunication Engineering Department, NBN Sinhgad School of Engineering, Pune (e-mail: [Link]@[Link]).

II. SYSTEM OVERVIEW

Fig. 1. Block diagram of system
Fig. 1 shows the block diagram of the Speech-To-Text conversion system. The system operation is divided into two phases, i.e. training and testing. First, in the training phase, speech utterances of each sentence are recorded. The speech signal is preprocessed and segmented into words, and for each word acoustic features are extracted using the MFCC method. The features of each word form a feature vector, which is stored for reference. In the testing phase the speech utterance to be tested is preprocessed, segmented into words, and features are extracted for each word. These features are compared with the reference feature vectors stored during the training phase, using a combination of SVM and the Minimum Distance Classifier. The word having the minimum difference is given as the recognized word.

III. SPEECH DATABASE

The database is the crucial point in an automatic Speech-To-Text conversion system; for any automatic speech recognition system the first step is to configure the database. The proposed system is implemented on a self-generated database [3]. The whole database is divided into two parts: a training database and a testing database. Its composition is:

Marathi language sentences: 10
English language sentences: 3
Marathi-English mixed sentences: 2
Total sentences: 15
Speakers: 4 (2 male, 2 female)

A. Training Database

The training database contains speech utterances recorded by 4 different users for the 10 Marathi sentences, 3 English sentences and 2 mixed sentences. Each sentence is uttered 10 times by each user, i.e. 40 utterances of each sentence are used to train the system, giving a total of 600 samples. The sentences used in the formation of the database are listed in Tables I, II and III.

TABLE I: MARATHI DATABASE

TABLE II: ENGLISH DATABASE

TABLE III: MARATHI-ENGLISH MIX DATABASE
B. Testing Database

The testing database also contains speech utterances recorded by 4 different users for the 10 Marathi, 3 English and 2 English-Marathi mixed sentences. Each sentence is uttered 20 times by each user, i.e. a total of 1200 samples are used to test the system. Each sample in the training and testing databases is recorded at a sampling frequency of 8 kHz.

IV. MFCC FEATURE EXTRACTION

In any automatic speech recognition system the first and most important step is to extract features, i.e. to identify the components of the speech signal that are useful for identifying the linguistic content, and to discard everything else that carries information such as background noise and emotion [4]. The two main purposes of feature extraction are: first, to compress the input speech signal into features, and second, to obtain features that are insensitive to speech variations, robust to changes in environmental conditions and independent of the speaker.
Fig. 2. Block diagram of MFCC

The steps of MFCC feature extraction are as follows.

A. Pre-emphasis

Pre-emphasis is applied to spectrally flatten the input speech signal. A first-order high-pass FIR filter is used to pre-emphasize the higher frequency components.
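The pre-emphasis step can be written in a single line. The following Python sketch is illustrative only (the paper's own implementation is in MATLAB), and the filter coefficient 0.97 is an assumed conventional value that the paper does not specify.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass FIR pre-emphasis: y[n] = x[n] - alpha * x[n-1]."""
    # alpha = 0.97 is a conventional choice; the paper does not state
    # the coefficient actually used.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])
```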
B. Framing

An audio signal is constantly changing, so to simplify analysis we assume that on short time scales the signal does not change much, and we therefore frame it into 20-40 ms frames. A Hamming window applied to each frame attenuates information at the start and end of the frame, so to reincorporate this information into the extracted features the frames are overlapped [5].
C. Windowing

Windowing is performed to avoid or reduce the unwanted discontinuities in the speech segment and the spectral distortion introduced by the framing process. The Hamming window is most commonly used in speech recognition.
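Framing and Hamming windowing can be combined in one routine, as in the sketch below. The paper specifies 20-40 ms frames with overlap at 8 kHz; the exact frame length (25 ms) and hop (10 ms) used here are assumptions.

```python
import numpy as np

def frame_and_window(signal, fs=8000, frame_ms=25, hop_ms=10):
    """Split a signal into overlapping frames and apply a Hamming window.

    Assumes the signal is at least one frame long; frame and hop lengths
    are assumed values within the paper's stated 20-40 ms range.
    """
    frame_len = int(fs * frame_ms / 1000)   # samples per frame
    hop_len = int(fs * hop_ms / 1000)       # samples between frame starts
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop_len: i * hop_len + frame_len]
                       for i in range(n_frames)])
    return frames * window                  # one windowed frame per row
```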
D. Discrete Fourier Transform (DFT)

Spectral estimation is done with the DFT, and the FFT is a very efficient algorithm for implementing it. The magnitude frequency response of each frame is obtained after FFT execution: the spectral coefficients of the speech frames are complex numbers containing both magnitude and phase information. The phase information is usually discarded for speech recognition and only the magnitude of the spectral coefficients is retained.
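Discarding the phase amounts to keeping only the FFT magnitude, as in this short sketch; the FFT length of 256 points is an assumption, since the paper does not state one.

```python
import numpy as np

def magnitude_spectrum(frames, nfft=256):
    """N-point FFT of each windowed frame, keeping only the magnitude.

    The phase is discarded, as described above; nfft = 256 is an
    assumed FFT length, not stated in the paper.
    """
    return np.abs(np.fft.rfft(frames, n=nfft, axis=1))
```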
E. Mel Frequency Filtering

Normally each tone with an actual frequency $f$ is measured in Hz. For speech signals, the ability of the human ear to perceive the frequency content of sounds does not follow a linear scale, so for every tone a subjective pitch is measured on a scale called the Mel scale. Below 1000 Hz the Mel scale is approximately linear in frequency, and above 1000 Hz it is logarithmic. The following formula transforms a given linear frequency $f$ in Hz into the corresponding Mel frequency:

$$\mathrm{Mel}(f) = 2595 \cdot \log_{10}\left(1 + \frac{f}{700}\right) \qquad (1)$$

Just as Mel filtering approximates the non-linear frequency characteristics of the human auditory system, the logarithm is used to approximate the loudness non-linearity, i.e. the relationship between the human perception of loudness and the sound intensity. Multiplications in the frequency domain become simple additions after the logarithm.

The log Mel filter bank coefficients are computed from the filter outputs as:

$$s(m) = 20 \log_{10}\left(\sum_{k=0}^{N-1} |X(k)| \, H_m(k)\right), \quad 0 \le m \le M \qquad (2)$$

where $M$ is the number of Mel filters (20 to 40), $X(k)$ is the $N$-point FFT of a specific window frame of the input speech signal, and $H_m(k)$ is the transfer function of the $m$-th Mel filter.

F. Discrete Cosine Transform (DCT)

To transform the Mel coefficients back to the time domain, a discrete cosine transform is performed. The result of this step is called the Mel Frequency Cepstral Coefficients (MFCC). The inverse Fourier transform of the log magnitude of the Fourier transform of a signal is called the cepstrum; since the log Mel filter bank coefficients are real and symmetric, the inverse Fourier transform can be replaced by a DCT to generate the cepstral coefficients. The smooth spectral shape, i.e. the vocal tract shape, is represented by the lower-order cepstral coefficients, while the excitation information is represented by the higher coefficients. The cepstral coefficients are the DCT of the $M$ filter outputs:

$$c(n) = \sum_{m=0}^{M-1} s(m) \cos\left(\frac{\pi n (m + 1/2)}{M}\right) \qquad (3)$$

Typically the first 13 cepstral coefficients are used. The biggest advantage of the MFCC coefficients is that they are less correlated than the log Mel filter bank coefficients.
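A minimal sketch of Eqs. (1)-(3) is given below. The filter-bank construction (triangular filters with centres equally spaced on the Mel scale) is the standard approach and an assumption here; n_filters = 26 is an assumed value within the paper's stated range of 20-40, while keeping 13 coefficients follows the paper.

```python
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)        # Eq. (1)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)      # inverse of Eq. (1)

def mfcc_from_spectrum(mag_frames, fs=8000, nfft=256, n_filters=26, n_ceps=13):
    """Log Mel filter-bank energies (Eq. 2) followed by a DCT (Eq. 3)."""
    # Triangular filters with centre frequencies equally spaced on the Mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((nfft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_filters, nfft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, left:centre] = (np.arange(left, centre) - left) / max(centre - left, 1)
        fbank[m - 1, centre:right] = (right - np.arange(centre, right)) / max(right - centre, 1)
    energies = mag_frames @ fbank.T                                 # filter outputs
    log_energies = 20.0 * np.log10(np.maximum(energies, 1e-10))     # Eq. (2)
    # scipy's DCT-II equals twice the sum in Eq. (3), hence the factor 0.5.
    return 0.5 * dct(log_energies, type=2, axis=1)[:, :n_ceps]      # Eq. (3)
```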
V. PATTERN CLASSIFICATION

A. Minimum Distance Classifier (MDC)

In speech recognition or STT conversion there are mainly two phases: the training phase and the testing phase. For classification, the zero crossing points (ZCPs) corresponding to the different words are precomputed during the training phase and stored as reference ZCPs [6].

The minimum distance classifier computes the Euclidean distance between the zero crossing points of the uttered word and the zero crossing points of the words in the database. The word having the least Euclidean distance is declared as the uttered word. The squared Euclidean distance is given as:

$$d^2(x, p) = \sum_{i=1}^{k} (x_i - p_i)^2 \qquad (4)$$

where $x$ and $p$ are ZCP vectors, i.e. $x$ is the ZCP vector of the uttered word, $p$ is the ZCP vector of a word from the database, and $i$ varies from 1 to $k$ (the number of ZCPs of a particular word). The sum of the squares of the differences between the individual zero crossing points gives the distance between the uttered word and every word in the database, and the word in the database with the least distance is declared as the uttered word.
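The classifier reduces to an argmin over the distances of Eq. (4). This sketch assumes all ZCP vectors share the same length k; the function and argument names are hypothetical.

```python
import numpy as np

def classify_min_distance(zcp_test, zcp_refs, labels):
    """Minimum Distance Classifier over zero-crossing-point vectors.

    zcp_test : ZCP feature vector of the uttered word (length k).
    zcp_refs : one reference ZCP vector per row, one row per trained word.
    labels   : the word label associated with each reference row.
    """
    d2 = np.sum((zcp_refs - zcp_test) ** 2, axis=1)   # squared distances, Eq. (4)
    return labels[int(np.argmin(d2))]                 # word with the least distance
```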
B. Support Vector Machine (SVM)

SVM is one of the most effective methods of pattern classification. SVMs use linear and non-linear separating hyper-planes for data classification: the input is first mapped into a high-dimensional space, and a hyper-plane is then used to distinguish the classes.

The kernel inducing the inner product in the high-dimensional mapping is a crucial aspect of applying SVMs successfully, i.e. a high-dimensional feature space is implicitly introduced by a computationally efficient kernel mapping, and in this feature space the SVM finds a separating surface with a large margin between the training samples of the two classes. A large margin implies a better generalization ability. SVM is a discriminative approach and can classify any fixed-length data vectors, but it cannot readily be applied to tasks involving variable-length data.

The support vector classifier uses the function:

$$f(x) = \alpha^{T} K_s(x) + b \qquad (5)$$

where $K_s(x) = [k(x, s_1), \ldots, k(x, s_d)]^{T}$ is a vector of kernel functions evaluated at the support vectors $s_1, \ldots, s_d$, which are usually a subset of the training data.

The classification rule is defined as:

$$q(x) = \begin{cases} 1 & \text{for } f(x) \ge 0 \\ 2 & \text{for } f(x) < 0 \end{cases} \qquad (6)$$

The multiclass classification function and rule are defined as:

$$f_y(x) = \alpha_y^{T} K_s(x) + b_y, \quad y \in Y \qquad (7)$$

$$q(x) = \arg\max_{y \in Y} f_y(x) \qquad (8)$$
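Given trained parameters, the multiclass decision of Eqs. (7)-(8) is a matrix-vector product followed by an argmax, as sketched below. The paper does not state which kernel was used, so the RBF kernel and its gamma value here are assumptions, as are all names.

```python
import numpy as np

def rbf_kernel(x, s, gamma=0.1):
    # Assumed RBF kernel, purely for illustration; the paper does not
    # specify the kernel actually used.
    return np.exp(-gamma * np.sum((x - s) ** 2))

def svm_multiclass_decision(x, support_vectors, alphas, biases):
    """Evaluate the multiclass rule of Eqs. (7)-(8).

    support_vectors : (d, n_features) array of support vectors s_1..s_d.
    alphas          : (n_classes, d) weight vectors, one row per class y.
    biases          : (n_classes,) bias terms b_y.
    """
    k_s = np.array([rbf_kernel(x, s) for s in support_vectors])  # K_s(x), Eq. (5)
    scores = alphas @ k_s + biases                               # f_y(x), Eq. (7)
    return int(np.argmax(scores))                                # winning class, Eq. (8)
```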
C. SVM-MDC Combination

The proposed system uses the combination of SVM and MDC for classification. It translates large-class problems into small-class problems, i.e. multiclass problems are converted into binary problems, which makes them easier to solve. MDC is mainly used for coarse tuning, and SVM performs the fine tuning.

VI. EXPERIMENTAL SETUP

The system is trained with the training database, and the recorded speech utterances stored in the test database are used to test the system. All utterances are recorded at a sampling frequency of 8 kHz, and the sentence durations range from 3 s to 5 s. The input speech signal is given to the MFCC stage, which converts it into feature vectors, and the minimum distance classifier and support vector machine techniques are used for classification [7]-[9].

The trained speech samples are saved as reference models in a database. Each segmented speech sample of the test speech signal is then passed over the reference models and the minimum distance is computed. Each word is recognized using the minimum distance and the SVM model. The whole system is implemented and tested in MATLAB software.
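One plausible reading of the coarse-to-fine hand-over is sketched below: MDC shortlists the closest reference words and the SVM decides among them. The paper does not describe the hand-over in this much detail, so the shortlist size and the svm_decide interface are assumptions.

```python
import numpy as np

def recognize_word(features, zcp_refs, labels, svm_decide, shortlist=3):
    """Coarse-to-fine classification: MDC shortlists candidates, SVM decides.

    svm_decide(features, candidate_indices) is assumed to return the index
    of the winning candidate; shortlist = 3 is likewise an assumed value.
    """
    d2 = np.sum((zcp_refs - features) ** 2, axis=1)   # MDC: coarse tuning
    candidates = np.argsort(d2)[:shortlist]           # closest reference words
    winner = svm_decide(features, candidates)         # SVM: fine tuning
    return labels[winner]
```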
VII. RESULTS

Fig. 3. Speech waveform of 'Tu kuthe aahes'

Fig. 4. Power spectral density of the signal 'Tu kuthe aahes'
Fig. 5. MFCC filter weights

Fig. 6. MFCC discrete cosine transform matrix

TABLE IV: MARATHI-ENGLISH MIX DATABASE

VIII. CONCLUSION

An accuracy of 93.625% for the Marathi language is achieved by the proposed system using MFCC for feature extraction and the Minimum Distance Classifier and SVM combination for classification. The proposed system achieves higher accuracy than a system using the MFCC feature extraction technique with a CDHMM classifier, which gives an accuracy of 88.80% for the Marathi language. The proposed system achieves 91.6667% accuracy for English and 90.625% accuracy for the English-Marathi mixed language.

IX. FUTURE WORK

This connected-word speech recognition system has been developed for speaker-independent Marathi, English and English-Marathi mixed languages. The work may be extended to other regional languages and towards real-time connected-word speech recognition for multilingual speech.
REFERENCES

[1] Priyanka P. Patil, Sanjay A. Pardeshi, "Marathi Connected Word Speech Recognition System," IEEE First International Conference on Networks & Soft Computing, pp. 314-318, Aug. 2014.
[2] [Link], [Link], "Speech Recognition by Machine: A Review," International Journal of Computer Science and Information Security, vol. 6, no. 3, 2009.
[3] Mathias De Wachter, Mike Matton, Kris Demuynck, Patrick Wambacq, "Template Based Continuous Speech Recognition," IEEE Trans. on Audio, Speech & Language Processing, vol. 15, issue 4, pp. 1377-1390, May 2007.
[4] Vikram.C.M., [Link], "Phoneme Independent Pathological Voice Detection Using Wavelet Based MFCCs, GMM-SVM Hybrid Classifier," IEEE International Conference on Advances in Computing, Communications and Informatics, pp. 929-934, Aug. 2013.
[5] [Link], [Link], Abhishek Karan and [Link], "PSOC based isolated speech recognition system," IEEE International Conference on Communication and Signal Processing, pp. 693-697, April 3-5, 2013, India.
[6] Taabish Gulzar, Anand Singh, Dinesh Kumar Rajoriya and Najma Farooq, "A Systematic Analysis of Automatic Speech Recognition: An Overview," International Journal of Current Engineering and Technology, vol. 4, no. 3, June 2014.
[7] Santosh V. Chapaneri, "Spoken Digits Recognition using Weighted MFCC and Improved Features for Dynamic Time Warping," International Journal of Computer Applications, vol. 40, no. 3, Feb. 2012.
[8] Rashmi C. R., "Review of Algorithms and Applications in Speech Recognition System," International Journal of Computer Science and Information Technologies, vol. 5(4), pp. 5258-5262, 2014.
[9] Shivanker Dev Dhingra, Geeta Nijhawan, Poonam Pandit, "Isolated Speech Recognition Using MFCC and DTW," International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, vol. 2, issue 8, Aug. 2013.