Phonetic matching plays an important role in multilingual information retrieval, where data is maintained in multiple languages and users need information in their native language, which may differ from the language in which the data is stored. In such an environment, we need a system that matches strings phonetically in every case. Once strings match, we can retrieve the information irrespective of language. In this paper, we propose an approach that matches strings in Hindi, in Marathi, or across the two languages in order to retrieve information. We compared the proposed method with the Soundex and Q-gram methods and obtained better results than both.
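As a point of reference for the Q-gram baseline mentioned above, a q-gram comparison scores two strings by the overlap of their character n-grams. The sketch below is a minimal, generic version with an assumed bigram size, padding character, and Dice-style scoring; it is not the exact baseline configuration used in the paper.

```python
# A minimal Q-gram (bigram) similarity sketch. The padding character,
# gram size, and Dice-style scoring are illustrative choices, not the
# exact baseline used in the paper.

def qgrams(text, q=2, pad="#"):
    """Return the multiset of q-grams of a padded string."""
    padded = pad * (q - 1) + text + pad * (q - 1)
    return [padded[i:i + q] for i in range(len(padded) - q + 1)]

def qgram_similarity(a, b, q=2):
    """Dice-style overlap of the two q-gram multisets (0.0 to 1.0)."""
    grams_a, grams_b = qgrams(a, q), qgrams(b, q)
    common, remaining = 0, list(grams_b)
    for g in grams_a:
        if g in remaining:
            remaining.remove(g)
            common += 1
    return 2.0 * common / (len(grams_a) + len(grams_b))

# Example: two Devanagari spellings of the same name.
print(qgram_similarity("कविता", "कवीता"))   # high overlap despite the vowel change
```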
2014
In a system with a large database, there has always been the problem that names may not be spelled correctly or may not be spelled in the way one expects, so the data in the database degrades. In this case it is necessary to find the duplicates and merge them into a single entity. One problem in doing so is deciding how the strings should be compared: rather than looking for an exact match, approximate string matching is preferable. One such technique is phonetic matching, which compares names based on the pronunciation of the words. Similar-sounding words can be retrieved from a large database using different phonetic matching algorithms, the best known of which is the Soundex algorithm. Phonetic matching is needed when many people from different cultures come together: they either speak with different pronunciations or have different writing habits. This scenario is very common in India, as we have many different languages.
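For reference, the classic English Soundex encoder mentioned above can be sketched in a few lines: keep the first letter, map the remaining consonants to digit classes, collapse repeated codes, and pad or truncate to four characters. This is a simplified illustration (the special H/W separator rule is omitted), not the variant used by any particular paper.

```python
# A minimal sketch of classic (English) Soundex: keep the first letter,
# map remaining consonants to digit classes, drop repeats and vowels,
# and pad/truncate to four characters.

SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}

def soundex(name):
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    first, code, prev = name[0], [], SOUNDEX_CODES.get(name[0], "")
    for ch in name[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            code.append(digit)
        # Vowels (and, in this simplified version, H and W) reset the
        # previous digit, so repeated codes separated by a vowel survive.
        prev = digit
    return (first + "".join(code) + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))   # both R163
print(soundex("Ashcraft"))  # A226 here; the full H/W rule set gives A261
```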
International Journal of …
In a search engine, query optimization plays a major role in returning relevant results. User queries mostly contain named entities; not only names but many other words are frequently used as search criteria in information retrieval and identity matching systems in Odia. Names normally have several variations, and these variations and errors make exact string matching problematic. If all the variations are matched approximately, the results can be more relevant. In this paper we put forward an automatic approximate matching technique by which all variations sharing the phonetic code of the query word can be found, giving better results. Our algorithm is based on a phonemic encoding of the query words, which yields more relevant results for the desired search.
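Retrieval with a phonemic code of this kind usually amounts to indexing every stored name under its code, so that all spelling variants of a query collapse into one bucket. The sketch below shows that lookup pattern with a deliberately trivial placeholder encoder; the Odia-specific encoding rules of the paper are not reproduced here.

```python
# Sketch of code-bucket retrieval: every stored name is indexed under its
# phonetic code, so a query matches all names that encode the same way.
# `toy_code` is a placeholder encoder (first letter + length class), NOT
# the Odia phonemic encoding described in the paper.
from collections import defaultdict

def toy_code(name):
    name = name.strip().lower()
    return (name[:1] + str(min(len(name), 9))) if name else ""

class PhoneticIndex:
    def __init__(self, encoder=toy_code):
        self.encoder = encoder
        self.buckets = defaultdict(set)

    def add(self, name):
        self.buckets[self.encoder(name)].add(name)

    def lookup(self, query):
        return sorted(self.buckets.get(self.encoder(query), set()))

index = PhoneticIndex()
for name in ["saurav", "sourav", "sourabh", "gaurav"]:
    index.add(name)
print(index.lookup("sourav"))   # names that share the query's code
```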
Global Journal of Enterprise Information System, 2017
In any digitization program, reproducing handwritten demographic data is a challenging job, particularly for records from previous decades. Nowadays, digitizing an individual's past records has become essential: in areas such as financial inclusion, border security, driving licenses, passport issuance, weapon licenses, banking, health care, and social welfare benefits, the individual's earlier case history is a mandatory part of the decision-making process. Documents are scanned and stored in a systematic way; each scanned document is tagged with a proper key and is retrieved with the help of that key for data entry through a software program or package. The difficulty is that the data, particularly critical personal data such as name and father's name, may not be legible, so data entry operators type the characters as per their understanding. ...
2019
Searching a top-10 search engine for "ગાંધીજી" or "ગન્ધીજી" gives results that differ surprisingly widely from one engine to another, even though both strings are correct in the Gujarati language. A string similarity algorithm is therefore useful for text mining applications when generating an index, saving both space and time. Basic string similarity compares the strings character by character, but this may not give accurate results for the highly rich Gujarati language, whose writing styles vary in matras, reph, vatu, and diacritics on simple and compound letters. GUJSIM (GUJarati SIMilarity) is a hybrid algorithm for string similarity in Gujarati. Here, the author compares 70 string pairs and shows that GUJSIM gives the best percentage score. The algorithm also helps reduce the size of an index built on unique strings.
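One common way to make character-level comparison less sensitive to matras and diacritics is to blend a raw character-sequence score with a score computed after combining marks are stripped. The sketch below illustrates that idea in generic Unicode terms; it is not the published GUJSIM procedure, and the mark-stripping step and the 0.6/0.4 weighting are assumptions for illustration.

```python
# Illustrative hybrid similarity for a matra/diacritic-rich script:
# blend a raw character-sequence score with a score computed after
# combining marks (matras, anusvara, virama, etc.) are stripped.
# This is NOT the published GUJSIM procedure; the weighting and the
# mark-stripping step are assumptions.
import unicodedata
from difflib import SequenceMatcher

def strip_marks(text):
    """Drop Unicode combining marks (categories Mn/Mc/Me), keeping base letters."""
    return "".join(ch for ch in unicodedata.normalize("NFD", text)
                   if not unicodedata.category(ch).startswith("M"))

def seq_ratio(a, b):
    return SequenceMatcher(None, a, b).ratio()

def hybrid_similarity(a, b, base_weight=0.6):
    return (base_weight * seq_ratio(strip_marks(a), strip_marks(b))
            + (1 - base_weight) * seq_ratio(a, b))

# The two spellings score as fairly similar even though the nasal is
# written differently (anusvara vs. explicit conjunct).
print(hybrid_similarity("ગાંધીજી", "ગન્ધીજી"))
```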
Proc. V International Conf. Natural Language Processing (KBCS 2004), 2004
This paper proposes a novel approach to multilingual query processing: a phonetic-distance-based measure for searching proper-name data in Indian-language scripts. The system allows queries in the language of the user's choice, and a cross-lingual search is conducted with the query in one language and the documents being searched in another. Grapheme-to-phoneme converters convert the user's query into an intermediate, language-independent common ground (CG) representation. A dynamic time warping algorithm, in which the substitution cost is based on a weighted phonetic distance measure, is used to match and rank the query results. A phoneme-to-grapheme converter then converts the search results from the CG representation back into the user's query language. We also discuss in detail the various issues particular to cross-lingual search on proper-name data and address them using the proposed approach.
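The matching step can be pictured as dynamic time warping over two phoneme sequences, with the local cost taken from a phonetic distance table. The sketch below uses a tiny, invented distance table and plain Latin phone labels rather than the paper's CG representation or learned weights; it illustrates the shape of the algorithm, not its exact costs.

```python
# Sketch of DTW over phoneme sequences with a weighted phonetic
# substitution cost. The tiny distance table is a placeholder assumption,
# not the paper's common-ground (CG) representation or learned weights.

PHONE_DISTANCE = {
    ("p", "b"): 0.2, ("t", "d"): 0.2, ("k", "g"): 0.2,   # voicing pairs: cheap
    ("a", "aa"): 0.1, ("i", "ii"): 0.1,                  # vowel length: cheaper
}

def phone_cost(x, y):
    if x == y:
        return 0.0
    return PHONE_DISTANCE.get((x, y), PHONE_DISTANCE.get((y, x), 1.0))

def dtw(seq_a, seq_b):
    """Classical DTW: cumulative cost of the best warping path."""
    n, m = len(seq_a), len(seq_b)
    acc = [[float("inf")] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = phone_cost(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

# Two plausible phonetizations of the same name from different scripts rank
# closer than an unrelated name (the ranking, not the absolute value, matters).
print(dtw(["g", "ii", "t", "aa"], ["g", "i", "t", "a"]))       # small
print(dtw(["g", "ii", "t", "aa"], ["m", "o", "h", "a", "n"]))  # larger
```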
2012
This paper introduces a set of Japanese phonetic matching functions for the open-source relational database PostgreSQL. Phonetic matching allows a search system to locate approximate strings according to the sound of a term; this sort of approximate string matching is often referred to as fuzzy string matching in the open-source community. The approach has been well studied for English and other European languages, and open-source packages for those languages are readily available.
2005
We present a phonetic encoding for Bangla that can be used by spelling checkers, transliteration, name searching applications, and cross-lingual information retrieval to drastically improve quality. The complex, and often inconsistent, rules of Bangla words present a significant challenge in producing a proper phonetic code. We propose a phonetic encoding for Bangla that takes into account the various context-sensitive rules, including those involving the large repertoire of conjuncts in Bangla.
2016 Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA), 2016
Automatic speech recognition (ASR) and text-to-speech (TTS) are two prominent areas of research in human-computer interaction nowadays. A set of phonetically rich sentences is important for developing these two interactive modules of HCI; essentially, the set has to cover all possible phone units, distributed uniformly. Selecting such a set from a big corpus while maintaining similarity of phonetic characteristics is still a challenging problem. The major objective of this paper is to devise a criterion for selecting a set of sentences that covers all phonetic aspects of a corpus with as small a size as possible. First, the paper presents a statistical analysis of Hindi phonetics based on structural characteristics. A two-stage algorithm is then proposed to extract phonetically rich sentences with a high variety of triphones from the EMILLE Hindi corpus. The algorithm uses a distance-measuring criterion to select each sentence so as to improve the triphone distribution. Moreover, a special preprocessing method is proposed to score each triphone by its inverse probability in order to speed up the algorithm. The results show that the approach efficiently builds a uniformly distributed, phonetically rich corpus with an optimal number of sentences.
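The core of such a selection procedure can be sketched as a greedy loop: score each remaining sentence by the rarity (inverse probability) of the triphones it would add, pick the best, update the coverage, and repeat until enough triphones are covered. The scoring formula and stopping rule below are simplified assumptions, not the paper's exact two-stage criterion.

```python
# Greedy sketch of phonetically-rich sentence selection: prefer sentences
# whose uncovered triphones are rare in the corpus (inverse-probability
# scoring). Simplified assumptions; not the paper's exact two-stage method.
from collections import Counter

def triphones(phones):
    return [tuple(phones[i:i + 3]) for i in range(len(phones) - 2)]

def select_sentences(corpus, target_coverage=0.95):
    """corpus: list of (sentence_text, phone_sequence) pairs."""
    counts = Counter(t for _, phones in corpus for t in triphones(phones))
    total = sum(counts.values())
    all_tris = set(counts)
    covered, chosen = set(), []
    remaining = list(corpus)
    while remaining and len(covered) < target_coverage * len(all_tris):
        def score(item):
            new = set(triphones(item[1])) - covered
            # rare triphones (low corpus probability) contribute more
            return sum(total / counts[t] for t in new)
        best = max(remaining, key=score)
        if score(best) == 0:
            break
        chosen.append(best[0])
        covered |= set(triphones(best[1]))
        remaining.remove(best)
    return chosen

corpus = [
    ("sentence one", ["s", "a", "n", "t", "e", "n", "s"]),
    ("sentence two", ["t", "u", "p", "a", "t", "h"]),
    ("sentence three", ["s", "a", "n", "d", "h", "i"]),
]
print(select_sentences(corpus))
```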
International Journal of Artificial Intelligence & Applications
Most digital information is accessible only to the minority of people who can read or understand a particular language. A corpus is the basis for developing speech synthesis and recognition systems, and in India almost all speech research and development groups build their own speech corpora for Hindi, the first language of more than 200 million people. The primary goal of this paper is to review the speech corpora created by various institutes and organizations so that scientists and language technologists can recognize the crucial role of corpus development in building ASR and TTS systems. The aim is to bring together all the information related to the recording, volume, and quality of speech data in these corpora to facilitate the work of researchers in speech recognition and synthesis. The paper also describes the development, in our organization, of a medium-size database for a metro rail passenger information system using an HMM-based technique, with the phoneme as the basic speech unit. The resulting database consists of 630 utterances with 12,614 words and 11,572 phoneme tokens covering 38 phonemes, and it covers the maximum possible phonetic context.
String matching is the common way of finding items in a textual database. Because of the way people write some types of text, such as names, plain string matching may not be sufficient, and other mechanisms such as phonetic matching need to be included if we want an efficient matching scheme. Phonetic matching is defined as a method of identifying a set of strings that are likely to sound similar to a given keyword. The problem of writing words (e.g. names) differently is common to all languages, so there is a need for a system that matches terms phonetically regardless of the type of error introduced. Many kinds of errors and variations can be considered; here we refer to typographical and spelling errors, which differ in vowels and in the matching of consonants. In this paper, we present a phonetic algorithm developed for the isiXhosa language that matches terms written in isiXhosa by approximate matching based on their sound. The algorithm is developed on the principles of the Soundex algorithm.
2006
In this research, the concept of traditional Bangla word matching is replaced by partial matching based on pronunciation errors. The authors use specially designed databases and a set of rules to analyze Bangla words; the rules are based on the Bangla pronunciation rules given by the Bangla Academy. As an outcome of this research, it is possible to find a Bangla word in a text even when the query word is spelled differently or misspelled. Bangla vowels are analyzed successfully.
Proceedings of the 5th International Conference on Data Management Technologies and Applications, 2016
Researchers confront major problems while searching for various kinds of data in a large, imprecise database, because entries are not spelled correctly or in the way they were expected to be spelled; as a result, the word being looked for cannot be found. Relying on the pronunciation of words has long been considered one of the effective practices for solving this problem, and the technique used to retrieve words based on sound is known as "phonetic matching". Soundex was the first algorithm proposed, and others such as Metaphone, Caverphone, DMetaphone, and Phonex have also been used for information retrieval in different environments. This paper deals with the analysis and evaluation of different phonetic matching algorithms on several datasets comprising street names of North Carolina and English dictionary words. The analysis shows that there is no single best technique in general: Metaphone performs best for English dictionary words, while NYSIIS performs better for the street-name datasets. Although Soundex has high accuracy in correcting misspelled words compared to the other algorithms, it has lower precision due to more noise in the considered setting. The experimental results lead to some suggestions that would help make databases more concrete and achieve higher data quality.
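Evaluations of this kind typically treat each phonetic algorithm as a retrieval function and score it with precision and recall against a gold list of correct matches for each query. The sketch below shows such a scoring harness; the toy encoder and sample data are placeholders, not the paper's algorithms or datasets.

```python
# Precision/recall harness for comparing phonetic matchers on a test set.
# `encoder` is any function mapping a word to a phonetic code; the sample
# data and the toy length-based encoder are placeholders, not the paper's.

def retrieve(query, lexicon, encoder):
    code = encoder(query)
    return {word for word in lexicon if encoder(word) == code}

def precision_recall(query, gold_matches, lexicon, encoder):
    retrieved = retrieve(query, lexicon, encoder)
    if not retrieved:
        return 0.0, 0.0
    tp = len(retrieved & gold_matches)
    return tp / len(retrieved), tp / len(gold_matches)

def toy_encoder(word):
    return word[0].lower() + str(len(word) // 3)   # placeholder, not Soundex

lexicon = {"smith", "smyth", "smythe", "smart", "stone"}
gold = {"smith", "smyth", "smythe"}
print(precision_recall("smith", gold, lexicon, toy_encoder))
```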
International Journal of Intelligent Information Processing, 2011
This study aims to develop a phonetic similarity measurement method across Asian languages. The method, a cross-language similarity algorithm, aggregates the transcription of language-specific Romanization, the International Phonetic Alphabet, the Soundex algorithm, and Levenshtein distance. To evaluate the proposed algorithm, the study reports an experiment using ninety-two chemical element names in nine different languages. Similarity scores were calculated between a source language and each target language, and a threshold could be drawn that separates the scores in each language into two groups (phonetic and semantic adoption). After evaluating precision, recall, and F-measure, the results show that the proposed methodology successfully differentiates between the phonetic and semantic groups by allocating thresholds in all the Asian languages except Chinese. The results demonstrate that the proposed method has the potential to be applied to cross-language information retrieval and various linguistic studies.
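The aggregation described can be pictured as blending two signals: agreement of phonetic codes and a normalized Levenshtein similarity, with a threshold separating phonetic from semantic adoptions. The sketch below assumes romanized input, equal weights, a stand-in encoder, and an illustrative threshold; the paper's Romanization and IPA steps are not reproduced.

```python
# Sketch of an aggregated similarity: agreement of phonetic codes blended
# with a normalized Levenshtein similarity, then compared to a threshold.
# The stand-in encoder, equal weights, and the 0.7 threshold are assumptions.

def levenshtein(a, b):
    """Plain dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def lev_similarity(a, b):
    longest = max(len(a), len(b)) or 1
    return 1.0 - levenshtein(a, b) / longest

def aggregated_similarity(a, b, code_fn, weight=0.5):
    code_match = 1.0 if code_fn(a) == code_fn(b) else 0.0
    return weight * code_match + (1 - weight) * lev_similarity(a, b)

def toy_code(word):
    return word[:1].upper()   # stand-in for a real Soundex-style encoder

# Names scoring above a threshold (say 0.7) would fall in the "phonetic
# adoption" group; those below it in the "semantic adoption" group.
print(aggregated_similarity("helium", "heliyamu", toy_code))
```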
In spite of South Asia being one of the richest areas in terms of linguistic diversity, South Asian languages have a lot in common. For example, most of the major Indian languages use scripts derived from the ancient Brahmi script, have more or less the same arrangement of the alphabet, are highly phonetic in nature, and are very well organised. We have used this fact to build a computational phonetic model of Brahmi-origin scripts. The phonetic model consists mainly of a model of phonology (including some orthographic features) based on a common alphabet for these scripts, numerical values assigned to the features, a stepped distance function (SDF), and an algorithm for aligning strings of feature vectors. The SDF is used to calculate the phonetic and orthographic similarity of two letters. The model can be used for applications such as spell checking, predicting spelling and dialectal variation, text normalization, finding rhyming words, and identifying cognate words across languages. Some initial experiments have been carried out, and the results seem encouraging.
IJERA
India has 22 officially recognized languages, including Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Malayalam, Manipuri, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Tamil, Telugu, and Urdu, so India clearly faces the language diversity problem. In the age of the Internet, this multiplicity of languages makes sophisticated natural language processing systems even more necessary. In this paper we develop a phonetic dictionary for natural language processing, particularly for Kannada. Phonetics is the scientific study of speech sounds; acoustic phonetics studies the physical properties of sounds and provides a vocabulary for distinguishing one sound from another in quality and quantity. Kannada is one of the major Dravidian languages of India. The language uses forty-nine phonemic letters, divided into three groups: Swaragalu (thirteen letters), Yogavaahakagalu (two letters), and Vyanjanagalu (thirty-four letters), roughly corresponding to the vowels and consonants of English.
International Journal of Advanced Computer Science and Applications, 2020
Semantic coexistence is one reason people adopt a language spoken by others. In such multilingual habitats, different languages share words, typically known as loan words, which appear not only as a principal means of enriching a language's vocabulary but also as a way of creating mutual influence, building stronger relationships, and forming multilingualism. In this context the spoken words are usually common, but their writing scripts vary, or the language may have become a digraphia. In this paper we present the similarities and relatedness between Hindi and Urdu, which are mutually intelligible major languages of the Indian subcontinent. In general, the method modifies edit distance: instead of using the letters of the words, it uses articulatory features from the International Phonetic Alphabet (IPA) to compute a phonetic edit distance. The paper also reports results for the languages considered under the method, quantifying the evidence that Urdu and Hindi are 67.8% similar on average despite the script differences.
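The modification described, replacing symbol identity with articulatory-feature comparison inside edit distance, can be sketched as follows. The tiny feature table, the uniform per-feature weighting, and the example phones are illustrative assumptions, not the IPA feature inventory or weights used in the paper.

```python
# Edit distance where the substitution cost is the fraction of articulatory
# features two phones disagree on. The feature table and phones here are
# illustrative placeholders, not the paper's IPA-based inventory.

FEATURES = {
    # phone: (place, manner, voiced)
    "p": ("bilabial", "stop", False),
    "b": ("bilabial", "stop", True),
    "t": ("dental", "stop", False),
    "d": ("dental", "stop", True),
    "s": ("alveolar", "fricative", False),
    "z": ("alveolar", "fricative", True),
}

def feature_cost(x, y):
    if x == y:
        return 0.0
    fx, fy = FEATURES.get(x), FEATURES.get(y)
    if fx is None or fy is None:
        return 1.0                        # unknown phones: full cost
    diffs = sum(1 for a, b in zip(fx, fy) if a != b)
    return diffs / len(fx)

def phonetic_edit_distance(seq_a, seq_b, gap=1.0):
    n, m = len(seq_a), len(seq_b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * gap
    for j in range(1, m + 1):
        d[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + feature_cost(seq_a[i - 1], seq_b[j - 1]),
                d[i - 1][j] + gap,
                d[i][j - 1] + gap,
            )
    return d[n][m]

# "b"/"p" and "t"/"d" differ only in voicing, so the distance stays small.
print(phonetic_edit_distance(["b", "t"], ["p", "d"]))   # 2/3 under this table
```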
Information Systems for Indian …, 2011
2016
Researchers confront major problems while searching for various kinds of data in large, imprecise databases, because entries are not spelled correctly or in the way they were expected to be spelled; as a result, the word being sought cannot be found. Relying on the pronunciation of words has long been considered one of the effective practices for solving this problem, and the technique used to retrieve words based on sound is known as "phonetic matching". Soundex was the first algorithm developed, and others such as Metaphone, Caverphone, DMetaphone, and Phonex are also used for information retrieval in different environments. This project deals mainly with the analysis and implementation of the newly proposed Meta-Soundex algorithm for English and Spanish, which retrieves suggestions for misspelled words. The Meta-Soundex algorithm addresses the limitations of the Metaphone and Soundex algorithms: it has higher accuracy than both Soundex and Metaphone, and higher precision than Soundex, thus reducing noise in the considered setting. A phonetic matching toolkit is also developed, bundling the different phonetic matching algorithms along with the new Meta-Soundex algorithm for both Spanish and English.
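One plausible reading of a Meta-Soundex style combination is to normalize a word with Metaphone first and then encode the Metaphone key with Soundex, so that Metaphone's consonant handling feeds Soundex's compact digit code. The sketch below chains the `metaphone` and `soundex` functions of the jellyfish library under that assumption; it is not necessarily the authors' exact procedure, and the Spanish-specific handling is omitted.

```python
# A hedged sketch of a Meta-Soundex style encoder: run Metaphone first,
# then Soundex-encode the Metaphone key. This chaining is an assumption
# about how the combination might work, not the paper's exact algorithm,
# and it is English-only (the Spanish handling is omitted).
import jellyfish  # pip install jellyfish

def meta_soundex(word):
    meta_key = jellyfish.metaphone(word)
    return jellyfish.soundex(meta_key) if meta_key else ""

# Compare how plain Soundex, Metaphone, and the chained code group spellings.
for word in ["night", "knight", "nite"]:
    print(word, jellyfish.soundex(word), jellyfish.metaphone(word), meta_soundex(word))
```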