
International Conference On Emanations in Modern Technology and Engineering (ICEMTE-2017) ISSN: 2321-8169 Volume: 5 Issue: 3 254 - 257

This document describes a voice-controlled robotic vehicle that can be remotely operated using speech commands. The system uses a standalone speech recognition circuit (SRC) that is separate from the robot's main CPU, allowing voice commands to be recognized without using the robot's processing power. The SRC can be programmed to recognize specific words as commands to control the robot's movement, such as moving forward, backward, left, or right. The system aims to provide a basic menu-driven voice control interface for the robot. It analyzes speech signals and compares them to a database of trained commands using techniques like dynamic time warping and mel frequency cepstral coefficients to recognize commands and control the robot.

International Conference On Emanations in Modern Technology and Engineering (ICEMTE-2017) ISSN: 2321-8169

Volume: 5 Issue: 3 254 - 257


____________________________________________________________________________________________________________________

Voice Controlled Robotic Vehicle with Long Distance Speech Recognition

Krishnendu S. Nair
Department of Computer Engineering
Pillai College of Engineering, New Panvel

Shekhar Mane
Department of Electronics & Telecommunication
Bvcoe, Kharghar

Abstract- Speech is an ideal method for robotic control and communication. The speech recognition circuit we outline functions independently of the robot's main intelligence, its central processing unit (CPU). This is an advantage because word recognition consumes none of the robot's main CPU processing power. The CPU need only poll the speech circuit's recognition lines occasionally to check whether a command has been issued to the robot. This can be improved further by connecting the recognition line to one of the robot's CPU interrupt lines, so that a recognised word raises an interrupt and lets the CPU know that a recognised word has been spoken. Another advantage of this stand-alone speech-recognition circuit (SRC) is its programmability: the SRC can be programmed and trained to recognise the specific words the user wants recognised, and it interfaces easily to the robot's CPU. At its most basic level, speech recognition allows the user to perform parallel tasks while continuing to work with the computer or appliance. The algorithm to be used is the Forward algorithm or the Viterbi algorithm. Given the model parameters, the Forward algorithm computes the probability of a given output sequence. Given the model parameters and an output sequence, the Viterbi algorithm finds the hidden state sequence with the maximum probability of producing that output sequence.
Index terms- mobile robot, smartphone, remote control, interfacing circuit.
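
The two algorithms named in the abstract can be illustrated on a toy hidden Markov model. The sketch below is not the authors' implementation: the two hidden states ("silence", "speech"), the two observation symbols ("low", "high" frame energy) and all probabilities are hypothetical values chosen only to make the example runnable.

```python
# Toy HMM; every state name, symbol and probability below is an assumed,
# illustrative value and does not come from the paper.
STATES = ("silence", "speech")
START = {"silence": 0.6, "speech": 0.4}
TRANS = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.4, "speech": 0.6},
}
EMIT = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.2, "high": 0.8},
}


def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(obs | model) by summing over all hidden state paths."""
    # alpha[s] = probability of the observations so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (path, probability) of the single most likely hidden state path."""
    # best[s] = (probability, path) of the best path so far ending in state s
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        step = {}
        for s in states:
            prob, path = max(
                ((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                key=lambda t: t[0],
            )
            step[s] = (prob * emit_p[s][o], path + [s])
        best = step
    prob, path = max(best.values(), key=lambda t: t[0])
    return path, prob
```

Calling `forward(["low", "high", "high"], STATES, START, TRANS, EMIT)` gives the total probability of that observation sequence, while `viterbi` with the same arguments returns the most probable state sequence and its probability. The Viterbi probability can never exceed the Forward probability, since it counts one path where the Forward algorithm sums over all of them.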

__________________________________________________*****_________________________________________________

I. INTRODUCTION

The purpose of this project is to build a robotic car that can be controlled using voice commands. Systems of this kind are generally known as Speech Controlled Automation Systems (SCAS), and our system is a prototype of one. We are not aiming to build a robot that can recognize a large number of words. Our basic idea is to develop a menu-driven control for the robot, where the menu is voice driven. The robot is controlled using the following voice commands, which cover its basic tasks: move forward, move backward, move left and move right.
The greatest advantage of using a mobile phone to remotely control a robot is location independence. A number of existing articles study the development of communication models between a cellular phone and a robot. Smartphones are a powerful platform for automated remote control of robots, and a contemporary smartphone possesses many auxiliary features.
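
The voice-driven menu described above can be sketched as a lookup from recognized phrases to motion commands. This is a minimal illustration, not the project's code: the names `Motion`, `MENU` and `dispatch` are hypothetical, and the recognizer is assumed to deliver each command as a text phrase.

```python
from enum import Enum


class Motion(Enum):
    """The four basic tasks named in the introduction."""
    FORWARD = "move forward"
    BACKWARD = "move backward"
    LEFT = "move left"
    RIGHT = "move right"


# Voice menu: recognized phrase -> motion (hypothetical structure).
MENU = {m.value: m for m in Motion}


def dispatch(phrase):
    """Return the Motion for a recognized phrase, or None if it is not on the menu."""
    normalized = " ".join(phrase.lower().split())  # ignore case and extra spaces
    return MENU.get(normalized)
```

Restricting the recognizer output to a fixed menu means any phrase outside the four commands is ignored rather than misinterpreted, which suits a robot that is not meant to recognize a large vocabulary.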

II. DESIGN OF ROBOT SPEECH COMMAND CONTROL SYSTEM

Fig I. Command control diagram

The process of speech command control is complicated. When the operator speaks to the robot, the voice is captured by the microphone and passed to the signal processing module. Spoken commands are processed into a structure of features, which may include signal characteristics such as energy or frequency response. The features are analysed and compared with the data in a database. The database is obtained through signal analysis; that stage can be called "training" of the speech data. The recognized commands are passed to the control module, which is separate from the speech recognition module. The control module processes the commands it receives from the speech recognition module and instructs the robot to take the corresponding actions.
There are seven basic commands: go forward, go backward, turn left, turn right, stop, extend and reposition. All commands are given to the control module. A while loop keeps checking the input command and compares it with the seven basic commands. If the command is one of them, the control module makes the robot take the corresponding action. The turn left command means the robot turns left 45 degrees and then goes straight at a constant velocity of 0.05 m/s; the same applies to the turn right command. Extend and reposition refer to the two positions of the manipulator. Go forward and go backward mean the robot goes straight at a constant velocity of 0.05 m/s. Stop means the robot halts without taking any further action.

Fig II. Working

III. VOICE COMMAND RECOGNITION

A speech signal varies with time. We usually process the speech signal over a very short interval, for instance 20 ms; over such a short interval the signal can be considered invariant, and this is the basis of speech signal processing.
The process of our speech recognition is to extract features from an acoustic signal and then recognise it. The feature extraction step involves Mel Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstral Coefficients (LPCC). MFCC achieves higher recognition accuracy than LPCC. The recognition stage can be achieved by many processes, such as Dynamic Time Warping (DTW), which is based on pattern comparison, Hidden Markov Modelling (HMM), which is based on a statistical model, and Neural Networks (NN).
In small-vocabulary applications, speech recognition based on pattern matching is more convenient and efficient than the other algorithms. The simpler the control commands are, the more intelligent the robot must be. Simple isolated-word speech recognition can give the highest recognition accuracy in a shorter time while requiring less powerful hardware. So DTW is appropriate for small-vocabulary, real-time operation. In speech recognition using DTW, features which represent the voice are extracted and then compared with the data in the database. The database is obtained through signal analysis; that stage can be called "training" of the speech data. Since our system needs real-time operation and our commands are simple, we select DTW, which is based on pattern comparison, as the speech recognition technique, and we extract MFCC as the feature of the speech command.
In our system, the process of speech recognition can be divided into five parts:
1. Acquire voice signals through a microphone and perform analog-to-digital conversion through the sound card
2. A series of pre-processing signal analysis steps
3. Mel Frequency Cepstral Coefficients (MFCC) calculation [1]

IV. EXISTING SYSTEM
Existing voice-controlled robots use a wide range of techniques for feature extraction. Any one of the techniques can be used to develop the system. The techniques are as follows:

A. USING 8951 MICROCONTROLLER AND RF MODULATOR

The main objective of the project is to control the robotic vehicle in a desired position, remotely through user voice commands, by attaching a speech-recognition module to the microcontroller unit and using RF communication. The proposed system consists of two blocks, a transmitter and a receiver; both use a microcontroller of the 8951 family and a battery as the power source. This project also includes a laser beam to defuse bombs from a distance if required. An RF transmitter module is connected to the transmitter unit through an encoder device. A voice-recognition module and a set of push-button switches are interfaced to the microcontroller for giving the input. The commands are sent from the voice module or the push-button switches to the receiver to control the movement of the robot in the forward, backward, left or right direction. An RF receiver is connected at the receiver end through a decoder device. The two motors are interfaced to the microcontroller through a motor-driver IC and are used to run the robotic vehicle or change its direction. The robot is controlled by voice or push buttons: the commands are sent by the transmitter and, based on these commands, the receiver controls the direction of the robot. A laser is mounted on the robot's body; it is operated by the microcontroller output in response to the appropriate signal from the transmitting end.

B. VOICE CONTROLLED ROBOT USING ANDROID MOBILE BLUETOOTH
A Voice Controlled Robot (VCR) is a mobile robot whose motions can be controlled by the user by giving specific voice commands. The speech recognition software running on an Android mobile is capable of identifying the different voice commands, such as 'Forward', 'Stop', 'Left', 'Right' and 'Back', issued by a user. The working mechanism of the robot is
based on the information passed from the phone to the robot using a headset cable. To use online mode, common commands such as forward, backward, left and right can be used directly; to use offline mode, the commands below must be used.

Commands for online mode
go = forward
back = back
left
right
stop
These commands are easily understood by the Google server for voice recognition input, so we used them.

C. CONTROL ROBOT BY USING MOBILE DTMF TONE (TOUCHPAD) AND ATmega16 MICROCONTROLLER
In this project, the robot is controlled by a mobile phone that makes a call to the mobile phone attached to the robot. In the course of a call, if any button is pressed, a tone corresponding to the button pressed is heard at the other end of the call. This tone is called a 'dual-tone multi-frequency' (DTMF) tone. The robot perceives this DTMF tone with the help of the phone mounted on the robot. The received tone is processed by the ATmega16 microcontroller with the help of the DTMF decoder MT8870. The decoder decodes the DTMF tone into its equivalent binary number, and this binary number is sent to the microcontroller. The microcontroller is pre-programmed to take a decision for any given input and outputs its decision to the motor drivers in order to drive the motors for forward or backward motion or a turn.
The mobile that makes a call to the mobile phone mounted on the robot acts as a remote. So this simple robotic project does not require the construction of receiver and transmitter units.
DTMF signalling is used for telephone signalling over the line in the voice-frequency band to the call switching centre. The version of DTMF used for telephone tone dialling is known as 'Touch-Tone'.

D. CONTROL ROBOT USING PLAYSTATION CONTROLLER
The PlayStation's game port uses a rather sophisticated protocol built on top of a very simple and common serial interface: SPI. This synchronous serial interface uses four lines: a clock (sent by the Arduino), a data input (called MISO), a data output (MOSI) and a select (sometimes called SS or ATT). This interface is byte oriented and a basic transfer consists of an exchange of eight bits. There are several parameters that need to be agreed upon before a successful link can be established (speed, data order, clock polarity and active clock edge). However, you don't need to worry about these too much as the library sets up the Arduino to match the format used by the PlayStation.
There is one difference between the Arduino and a PlayStation: while the Arduino runs at 5 V, the PlayStation (and its peripherals) are designed to run at 3.3 V. Some people have successfully run some of these game pads at 5 V, however I am not sure how reliable such a setup would be for the long term. As the drawing below shows, I decided to use a special chip (the TXS0104, available from Digi-Key) to adjust the logic levels between the Arduino and the PlayStation plug. This chip is very easy to use: just power VCCb at 5 V and VCCa at 3.3 V and you are done! Please note that a previous version of this page had VCCa and VCCb reversed. Thanks to Joonas for pointing out my error. Each of the four channels is fully bi-directional without a need to select the direction. [6]

V. SYSTEM ANALYSIS

A. FEASIBILITY ANALYSIS
A feasibility study is a major factor that contributes to the analysis of the system. The decision of the system analyst whether or not to design a particular system depends on its feasibility study. The feasibility study of this system is divided into the following three areas. All projects are feasible given unlimited resources and infinite time. It is both necessary and prudent to evaluate the feasibility of the project at the earliest possible time. Feasibility and risk analysis are related in many ways: if project risk is great, feasibility is reduced. The feasibility areas listed below are equally important.

1) Economic Feasibility: This is concerned with the cost incurred for development and implementation of the system, the maintenance of the system and the benefits derived from it. The hardware and software required for the system are already available. Here we examine the cost of developing the system with regard to what the organization can afford. The only cost involved is for coding, implementing and maintaining the system. Hence the system is economically feasible.

2) Technical Feasibility: The firm has to purchase a machine with a Pentium processor or higher. The computer must be running Windows XP or any higher version of Windows. As the hardware and software for developing the system are already available, the system is technically feasible. The only concern is on which system the software is developed and on which it will be implemented. The proposed system is developed in KEIL µVISION and ECLIPSE and will be implemented on Android 4.0 or above. The project is beneficial only if it can provide successful and accurate access to the users.

3) Operational Feasibility: There are two aspects to operational feasibility. One aspect is technical information and the other is acceptance. Technical information determines whether a system can provide correct results, and acceptance involves the users' acceptance of the computer system. Knowing that the system can provide easy and accurate access to a robotic vehicle, users will not hesitate to use the system for real situations in their daily routine. The current system also provides options for speech recognition to control the bot, but it is less accessible and has a smaller coverage area. Thus the system that is going to be developed will be highly accurate and can process the voice signals at a much faster rate. With better algorithms, the software is assured to give better results without compromising on quality or accessibility.
B. REQUIREMENT ANALYSIS

1) User interfaces: The user interface is simple and efficient enough to set up the user's voice. Apart from this, the user interface need not be used, as the application runs in the background.

2) Hardware interface: Any smartphone running Android 4.0 or above.
Processor above 500 MHz and 512 MB of RAM.
Internal memory with at least 100 MB free storage.
Steel chassis with mobile holder.

3) Performance requirements: The maximum satisfactory time to respond to the voice and access the bot should be less than a second. Response time is measured from the time the user speaks to the phone to the time the vehicle starts its movement. It is the user's subjective wait time.
A user suffering from a tracheal infection might not be able to access the vehicle and might have to resort to basic techniques of accessing the vehicle, such as an RF module or Bluetooth control.

4) Security requirement: It has to be ensured that the information saved in the speech recognition app is not tampered with by another program, software or virus, intentionally or unintentionally.

C. SOFTWARE QUALITY ATTRIBUTES

Quality attributes are the overall factors that affect run-time behaviour, system design and user experience. They represent the areas of concern that have the potential for application-wide impact across layers and tiers. Some of these attributes are related to the overall system design, while others are specific to run-time, design-time or user-related issues. The extent to which the application possesses a desired combination of quality attributes such as usability, performance, reliability and security indicates the success of the design and the overall quality of the software application. When designing applications to meet any of the quality attribute requirements, it is necessary to consider the potential impact on other requirements. One must analyze the tradeoffs between multiple quality attributes. The importance or priority of each quality attribute differs from system to system.

1) Functionality : Recognising the voice commands, processing the commands according to the movements and passing them to the robotic vehicle.
2) Reliability : Maturity and accuracy.
3) Usability : Fast processing and moving the robotic vehicle according to command.
4) Efficiency : Accurate matching and quick processing.
5) Maintainability : None except for factory updates.

D. SYSTEM ANALYSIS
After analysing the requirements of the task to be performed, the next step is to analyse the problem and understand its context. The first activity in this phase is studying the existing system and the other is understanding the requirements and domain of the new system. Both activities are equally important, but the first serves as the basis for the functional specifications and then the successful design of the proposed system. Understanding the requirements of a new system is more difficult and requires creative thinking; understanding the existing system is also difficult, and improper understanding of the present system can lead to diversion from the solution.

VI. CONCLUSION

In this paper, a word speech recognition system was proposed to control the vehicle, making the proposed technique more efficient in real-time control operation. The hardware used for making the vehicle in our proposed project is also presented, along with the working of the Android app that controls the robot. The communication channel that carries the signal is of two types: a cellular internet connection for word recognition and, for a clear connection to the robot, a digital connection via Bluetooth instead of an analog connection.

REFERENCES
[1] Agarwal, A., Wardhan, K., Mehta, P.: JEEVES - A Natural Language Processing Application for Android. http://www.slideshare.net (2012).
[2] Android: Android Operating System, Wikipedia. http://en.wikipedia.org/wiki/Android OS.
[3] Jeong, H.D., Ye, S.K., Lim, J., You, I., Hyun, W., Song, H.K.: A Remote Computer Control System Using Speech Recognition Technologies of Mobile Devices. In: The Seventh International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing: Future Internet and Next Generation Networks (FINGNet-2013). pp. 595–600. Taichung, Taiwan (2013).
[4] Knight, W.: Where Speech Recognition Is Going. MIT Technology Review, technologyreview.com (2012).
[5] Lee, C.Y., An, B., Ahn, H.Y.: Android based Local SNS. Institute of Webcating, Internet Television and Telecommunication 10(6), 93–98 (2010).
[6] Tan, Z.H., Varga, I.: Network, Distributed and Embedded Speech Recognition: An Overview. Advances in Patterns Recognition (2008).
[7] Wikipedia: http://en.wikipedia.org/wiki/Speech recognition.
[8] Byoung-Kyun Shim; Kwang-wook Kang; Woo-Song Lee; Jong-Baem Won; Sung-Hyun Han: An intelligent control of mobile robot based on voice command. IEEE Conference Publications.
