Language in India www. ISSN 1930-2940 19:5 May 2019 Prof. Rajendran Sankaravelayuthan and Dr. G. Vasuki English To Tamil Machine Translation System Using Parallel Corpus

Figure 2.2: Interlingua language system

Figure 2.4: Description of Transfer-Based Machine Translation

[...] step in the translation. The steps performed are shown in Figure 2.4. The major modules in transfer-based MT are as follows.

Fig. 2.6 Simple block diagram of statistical machine translation system

3.4.5 Selection of Time-Span

Language changes with time. So determination of a particular time span is required to capture features of a language within this time span. A corpus attempts to cover a particular period of time with a clear time indicator. Materials published between 1981 and 1995 are included in the MIT corpus on the assumption that the data will sufficiently represent the condition of present-day language and will provide information about the changes taking place within the period.

(Here in this context A = Adjunct, C = Complement, IO = indirect Object, O = Object, S = Subject, V = Verb)

[...] English we can expect a compound sentence as its translation equivalent in Tamil [...]

[...] mechanism of transfer of equative sentences in English into Tamil.

'Kamala is a doctor'

avaL cennai-yil irukkiRaaL
'She is in Chennai'

The table below correlates the question with 'be' verb in English with Tamil.

The following table shows the correspondence between interrogation in English and Tamil.

[...] imperative sense in English and Tamil:

[...] distinction in the imperative forms of verbs too. So, for English you, depending upon [...]

4.2.5. Parallels in co-ordination

The following table depicts the points to be noted while correlating coordination in English to Tamil.

[...] expressing linguistic concepts. So we ignore the emotive and attitudinal sense and try to capture a core aspectual and modal system. That is why we have ignored certain auxiliaries, which are used in Tamil to denote certain attitudinal and non-attitudinal senses. With this aim in mind, the aspectual and modal systems in both languages have been correlated for the purpose of preparing MTA. The following table correlates the TAM system of English with that of Tamil.

The following points have to be noted while transferring the TAM system [...]

Parallels in verb patterns

4.3.4 Parallels in Adverbial Phrase

5.2.3 The Statistical Machine Translation Decoder

Figure 5.3: Statistical Machine Translation Tools

Figure 5.4: Architecture of Statistical Machine Translation system

Table 5.1: Directory Structure of LM Model

Ngram-count

Ngram-count counts the number of n-grams in the corpus. Ngram-count also [...]

The command for generating the language model is given in 5.12.

5.9 Generating Language Model

The variables in the Makefile that need to be changed are shown in Table 5.3.

Table 5.4: Variables in Makefile of GIZA++ to be changed

After changing the Makefile, compilation of Moses is done using the command given in 5.18:

The Makefile in SRILM is changed as shown in Table 5.5.

Table 5.6: Parameters for training Moses

5.11.3 Tuning Moses decoder

20110405-1055/training/Tamil_lm5.lm>& training _new5.out &

Table 5.7: Parameters of [...]

The contents of mert2.out get updated as the script gets executed. Table 5.7 gives the explanation of parameters in tuning Moses.

/home/nakul/moses/mosesdecoder/trunk/scripts/training/moses-

Figure 5.9: Interactive mode of Moses

Figure 5.9 shows the Moses decoder running in an interactive mode. Consider an English sentence 'how are you?' Moses decoder accepted this input in [...]

20110405-1055/training/model/moses.ini

* - Object in between, + - Object after the verb and preposition or adverb

Table 4.3 Types of phrasal verbs with examples

5.15 Beyond Standard Statistical Machine Translation

Table 4.7 Sample output of factor annotator for Tamil

Translated Factors of source word in Target Language (t)
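
The n-gram counting attributed to Ngram-count above can be illustrated with a minimal sketch. This is not SRILM's implementation and does not reproduce its options or output format. The function name count_ngrams, the whitespace tokenization, and the sentence-boundary markers <s> and </s> are assumptions made for illustration only; the sample sentence is the transliterated Tamil example quoted earlier in the text.

```python
from collections import Counter


def count_ngrams(sentences, n):
    """Count n-grams of order n over whitespace-tokenized sentences.

    Hypothetical helper for illustration; it pads each sentence with
    <s> and </s> boundary markers (an assumption, not taken from the source).
    """
    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        # Slide a window of width n across the padded token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Toy corpus: the same sentence twice, so every bigram is seen twice.
corpus = ["avaL cennai-yil irukkiRaaL", "avaL cennai-yil irukkiRaaL"]
bigrams = count_ngrams(corpus, 2)

for ngram, freq in sorted(bigrams.items()):
    print(" ".join(ngram), freq)
```

Counts of this kind are the raw material from which a language model estimates how likely a word is to follow its preceding context; a real toolkit additionally applies smoothing, which this sketch leaves out.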