
Automatic Lexical Text Simplification for Turkish

Ahmet Yavuz Uluslu


ETH Zürich
[email protected]

Abstract

In this paper, we present the first automatic lexical simplification system for the Turkish language. Recent text simplification efforts rely on manually crafted simplified corpora and comprehensive NLP tools that can analyse the target text at both the word and sentence levels. Turkish is a morphologically rich agglutinative language that requires unique considerations, such as the proper handling of inflectional cases. Its status as a low-resource language in terms of available corpora and industrial-strength tools makes the text simplification task harder to approach. We present a new text simplification pipeline based on the pretrained representation model BERT, together with morphological features, to generate grammatically correct and semantically appropriate word-level simplifications.

Keywords: Turkish, automatic text simplification, lexical simplification



1. Introduction

The goal of the lexical simplification task is to replace complex words with simpler alternatives. There are many groups of people that can benefit from this, including children, people with cognitive disabilities and non-native speakers (Paetzold and Specia, 2016; Rello et al., 2013a). The common assumption among linguists is that those who are familiar with the vocabulary of a text can often understand its meaning even if they have problems with the grammatical structures. Automatic lexical simplification can thus become an effective method for making text accessible to different audiences.

Turkish, the most widely spoken language in the Turkic language family, is the official language of Turkey, with 80 million speakers. It is a morphologically rich agglutinative language. Text simplification for Turkish poses a number of challenges owing to the lack of linguistic resources and its dissimilarity to the languages covered so far. Turkish is considered a low-resource language in terms of standard linguistic resources (Cieri et al., 2016). Recently, there has been an initiative by different research groups to release datasets and tools publicly. To name a few, some of the lexical resources necessary for modern text simplification, such as WordNet, have become available (Bakay et al., 2021), and multiple Turkish treebanks were successfully integrated into Universal Dependencies (Türk et al., 2021). However, the lack of parallel corpora for different tasks and domains renders data-driven approaches such as neural text simplification ineffective.

There has not been a comprehensive conceptual study of text simplification in Turkish. Different simplification methods should be proposed for the target audience, and they should be subjected to experimentation to show their effectiveness. Simplification methods can be found insignificant on their own (Rello et al., 2013b) and can be mixed with other techniques to improve text accessibility. We admit that clinical insight which is currently unavailable would be required for our work to have practical importance for people with special needs. More psycholinguistic research is needed to establish what constitutes simple language for different groups. Therefore, we focus on building a general-purpose lexical simplification pipeline for Turkish to lay the foundations for further research.

Figure 1: An example lexical simplification by the LS-BERT pipeline

LS-BERT is a lexical simplification method that generates substitute words with pretrained encoders (Qiang et al., 2021). Our paper builds a similar pipeline and adapts BERTurk (Schweter, 2020) to handle the challenges of Turkish text simplification by using additional features. We present a new, manually constructed dataset for complex word identification. We evaluated our proposed system automatically, and released our code and lexical resources open-source.
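The pipeline decomposes into the three steps detailed in Section 3. As a minimal sketch of the overall control flow only, the following Python skeleton shows how the steps compose; every function body here is an illustrative placeholder of ours, not the authors' implementation:

```python
# Sketch of the three-step lexical simplification pipeline shape.
# All step implementations below are trivial placeholders, not the
# paper's actual methods (those are described in Section 3).

def identify_complex_words(tokens):
    # Placeholder: the paper combines PoS tagging with word frequency.
    return [i for i, w in enumerate(tokens) if len(w) > 12]

def generate_substitutes(tokens, idx):
    # Placeholder: the paper masks the word and queries BERTurk (Sec. 3.2).
    return [tokens[idx]]

def select_substitute(tokens, idx, candidates):
    # Placeholder: the paper ranks by probability, frequency,
    # semantic similarity and a language model feature (Sec. 3.3).
    return candidates[0]

def simplify(sentence):
    tokens = sentence.split()
    for i in identify_complex_words(tokens):
        candidates = generate_substitutes(tokens, i)
        tokens[i] = select_substitute(tokens, i, candidates)
    return " ".join(tokens)

print(simplify("Çevrendekilerle iyi geçinmelisin."))
```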
2. Related Work

Text simplification is the process of simplifying the content of the original text while retaining its meaning and preserving grammaticality. It focuses on the simplification of vocabulary and of the syntactic structures in the text. Early text simplification systems were rule-based, relying on lexical resources such as WordNet and other linguistic databases to substitute a predefined set of complex words with simpler alternatives (Carroll et al., 1998). The major limitation of such an approach was the identification of complex words (Shardlow, 2014). Rule-based systems relied heavily on word frequencies and ignored the context. Synonym replacement also required simplification rules for every word, or general rules that failed to account for different linguistic relationships. Even with the integration of N-gram language models to capture word context, simplification algorithms had a limited understanding of the whole sentence.

With the availability of complex-simple parallel corpora (Coster and Kauchak, 2011), data-driven methods started to produce adequate results. Recent research treated the text simplification task as a monolingual machine translation problem (Tang et al., 2019). Statistical machine translation (SMT) algorithms were the first techniques to be used for text simplification (Wubben et al., 2012). This was followed by developments in neural machine translation (NMT), and researchers started to apply deep learning based machine translation models to the text simplification problem. Wang et al. (2016) built a model based on a long short-term memory (LSTM) encoder-decoder and successfully showed that it was able to learn simplification rules such as sorting, reversing, replacing, removal and substitution of words. The study demonstrated an increased capacity for simplification, and LSTM-based encoder-decoders outperformed their statistical counterparts.

There has been relatively little research on text simplification in Turkish (Torunoglu-Selamet et al., 2016; Özkan and Ercan, 2018). The first study proposed various syntax-level simplification rules but did not cover lexical simplification. Some of the proposed rules, such as paratactic sentence simplification, appear to exist only at a conceptual level, and their practical implications went uncovered. The system should be evaluated on actual complex sentences to assess the robustness of the defined rules with respect to text cohesion (Siddharthan, 2006). The target group of the study does not seem to be defined clearly, and the group names preteens (8-12) and children (0-18) are used interchangeably. The latter study approaches the text simplification problem from the modernisation perspective and trains a statistical machine translation model on a parallel corpus constructed from the original and modernised versions of Turkish classics.

3. Turkish Text Simplification

The lack of parallel data in Turkish limits the applicability of data-driven approaches. However, unsupervised language models may still be employed for low-resource languages, as they only require a large corpus of raw text. BERT-based pretrained language models have been shown to be effective for masked language modeling (Devlin et al., 2018). LS-BERT exploits this to generate suitable simplifications for complex words (Qiang et al., 2021). This method considers the whole sentence context, and it has been shown to generate coherent and cohesive sentences. BERTurk, a community-driven BERT model for Turkish, is available to implement this approach (Schweter, 2020). We create a similar pipeline which consists of the following three steps: complex word identification, substitute generation and substitute selection.

3.1. Complex Word Identification

The most common first step in lexical simplification is to identify which words are considered complex by the target audience (Shardlow, 2013). Complex words may be identified by different features such as word length, syllable count, and word frequency. General-purpose text simplification systems focus on replacing infrequent words with frequent alternatives. The number of syllables and vowels may become important in special situations such as vowel dyslexia (Güven and Friedmann, 2021).

We trained a POS (part-of-speech) tagger on the BOUN Treebank to establish which sentence parts are targeted by the simplification pipeline (Türk et al., 2021). The PoS tagger (Lample et al., 2016) achieved an F1 score of 0.89 on the test set. The complex sentence is first PoS tagged, and only words with a predefined set of tags, namely nouns (NN), adjectives (ADJ), verbs (VB) and adverbs (ADV), are checked for their frequency inside the Turkish section of the wordfreq corpus (Speer et al., 2018). The corpus includes crawled Wikipedia entries, movie subtitles, tweets and web pages.
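As an illustrative sketch of this frequency check, the snippet below uses the open-source wordfreq package; the Zipf threshold and the pre-tagged toy input are our assumptions, not values reported in the paper.

```python
from wordfreq import zipf_frequency  # pip install wordfreq

# Only these PoS tags are candidates for simplification (Sec. 3.1).
CONTENT_TAGS = {"NN", "ADJ", "VB", "ADV"}
# Words at or below this Zipf frequency count as complex; the threshold
# is an illustrative assumption, not a value from the paper.
ZIPF_THRESHOLD = 3.5

def complex_words(tagged_sentence):
    """tagged_sentence: list of (token, tag) pairs from the PoS tagger."""
    flagged = []
    for token, tag in tagged_sentence:
        if tag not in CONTENT_TAGS:
            continue  # only content words are checked
        if zipf_frequency(token.lower(), "tr") <= ZIPF_THRESHOLD:
            flagged.append(token)
    return flagged

# Toy pre-tagged input standing in for the tagger's output.
sentence = [("Kitap", "NN"), ("okumak", "VB"), ("faydalıdır", "ADJ")]
print(complex_words(sentence))
```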
We observed two different conditions under which predefined word lists and naive frequency-based algorithms failed to capture fundamental aspects of the language. We provide sentence examples to illustrate context-awareness and morphological complexity.

Turkish is a morphologically rich agglutinative language. It can produce very complex sentences with only a few words. These words may appear frequently in everyday speech and written language and can therefore go unnoticed by the frequency algorithm, yet non-native speakers tend to have a hard time grasping such unfamiliar constructions. Infrequent and long words are already known to affect readers with dyslexia (Rello et al., 2013a). Recently, morphological complexity in Turkish words has also been shown to affect sentence comprehension in students with dyslexia (Dodur and Miray, 2021).

1. Morphological Complexity:

In: Çevrendekilerle iyi geçinmelisin.
In EN: You should get along well with those around you.

Complex words identified: None

The frequency algorithm does not identify the pronoun 'çevrendekilerle' (A3pl+Pnon+Ins) as a complex word. It is possible to disambiguate the pronoun depending on the sentence context and break it down into two words to reduce morphological complexity. The lexical simplification affects the overall sentence complexity and results in a clear and concise outcome. The syntax of the sentence was also affected by this change, so it may be an overstep depending on the definition of the lexical simplification task.
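A rough way to surface such morphologically loaded forms is to estimate the suffix load of a word. The sketch below is a deliberately crude, self-contained heuristic of ours, with a tiny sample suffix list; a real system would use a proper Turkish morphological analyzer (e.g., a Zemberek-style tool) rather than this toy.

```python
# Crude suffix-load heuristic (ours, for illustration only): repeatedly
# strip suffixes from a small sample list and count how many come off.
SAMPLE_SUFFIXES = ["lerle", "larla", "ler", "lar", "ki", "de", "da", "le", "la"]

def suffix_load(word):
    word = word.lower()
    count = 0
    stripped = True
    while stripped:
        stripped = False
        for suffix in SAMPLE_SUFFIXES:
            # keep at least a three-letter stem to avoid over-stripping
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                count += 1
                stripped = True
                break
    return count

# 'çevrendekilerle' ~ çevre-n-de-ki-ler-le: several stacked suffixes.
print(suffix_load("çevrendekilerle"))
```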
Figure 2: Dependency Visualisation Before Simplification

In: Çevrendekilerle iyi geçinmelisin.
Out: Çevrendeki insanlarla iyi geçinmelisin.

Simplification: Çevrendeki (-lerle) insanlarla

Figure 3: Dependency Visualisation After Simplification

2. Contextual Information:

In: Hak söz söyleyenin dostu az olur.
In EN: S/he who speaks truth has few friends.

Complex words identified: None

The frequency algorithm does not identify the word hak (justice, truth, right) as a complex word. Since the word has several meanings and is repeatedly used in compound verbs (hak etmek, hakkı olmak, hak görmek) and nouns (hak sahibi, miras hakkı), it frequently appears in the corpus. This usage is now considered old-fashioned, and it can be simplified for a certain age group and educational background. It is impossible to identify such words without contextual information.

In: Hak söz söyleyenin dostu az olur.
Out: Doğru söz söyleyenin dostu az olur.

Simplification: Doğru (-Hak)

The complex word identification problem has recently been treated as a sequence labelling task (Gooding and Kochmar, 2019). Data-driven models take word context into account, and they avoid the need for extensive feature engineering to address linguistic complexity. We manually crafted an annotated complex word identification dataset to experiment with sequence models. We followed the annotation guideline of the CWI Shared Task 2018 (Yimam et al., 2018). The author, whose native tongue is Turkish, assumed the target group of preteens proceeding to high school level study, with limited exposure to Arabic- and Persian-rooted words in the Turkish language. 1000 complex sentences from the Bilkent Creative Writing dataset and 2000 complex sentences from Wikipedia were annotated. This is not a complete study or dataset, as the CWI Shared Task included multiple annotators with different assumed roles to construct a corpus of 90,000 sentences (Yimam et al., 2018). We regardless make our data and code open-source for further study.

Total Dataset    Training Data     Test Data
3k Sentences     2650 Sentences    350 Sentences

Table 1: Turkish CWI Dataset

A sequence labelling based word-level BiLSTM model was trained to predict the binary complexity of the words annotated in the dataset. The model's F1 score was 0.64 for the complex word class, and its overlap with the frequency-based algorithm was 67.3%. We have previously explored the differences between the two approaches; however, without comprehensive benchmark data, the statistical results are not robust enough for further analysis.
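For concreteness, a minimal PyTorch sketch of such a word-level BiLSTM tagger follows; the architecture and hyperparameters are illustrative assumptions (the paper follows Lample et al. (2016), whose model also adds a CRF layer that we omit here).

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Word-level BiLSTM for binary complex-word labelling. Hyperparameters
    are illustrative, not those used in the paper."""

    def __init__(self, vocab_size, emb_dim=100, hidden=128, n_labels=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_labels)

    def forward(self, token_ids):               # (batch, seq_len)
        h, _ = self.lstm(self.emb(token_ids))   # (batch, seq_len, 2*hidden)
        return self.out(h)                      # per-token logits

# One illustrative training step on a random toy batch.
model = BiLSTMTagger(vocab_size=10_000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(1, 10_000, (8, 20))  # fake token ids
labels = torch.randint(0, 2, (8, 20))       # 1 = complex word
loss = loss_fn(model(tokens).view(-1, 2), labels.view(-1))
loss.backward()
optimizer.step()
print(float(loss))
```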
3.2. Substitute Generation

The aim of substitute generation is to produce substitute candidates for a complex word. We produce substitute candidates using the pretrained language model BERT (Devlin et al., 2018). BERT is a self-supervised method based on the encoder part of the transformer architecture. The model is trained on two language tasks: masked language modeling and next sentence prediction. Masked language modeling is the objective of predicting a masked word in a sequence given its left and right context. Next sentence prediction is the task of, given a pair of sentences, predicting whether the second sentence is the subsequent sentence in the original document. BERT accomplishes the masked language modelling task by replacing random words with the special token [MASK] during training. In our simplification pipeline, we follow the LS-BERT study (Qiang et al., 2021) and replace the identified complex word with a [MASK] symbol to produce the substitute candidates with BERT. The bi-directional nature of the model allows candidate generation to condition on the whole sentence context.
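A minimal sketch of this masking step with the Hugging Face transformers library and the public BERTurk checkpoint is shown below. Note that the full LS-BERT scheme feeds the original and the masked sentence together as a sentence pair; for brevity this sketch queries only the masked sentence, and the candidate count is our choice.

```python
from transformers import pipeline  # pip install transformers

# Public BERTurk checkpoint (Schweter, 2020).
fill = pipeline("fill-mask", model="dbmdz/bert-base-turkish-cased")

sentence = "Hak söz söyleyenin dostu az olur."
# Mask the identified complex word to elicit substitute candidates.
masked = sentence.replace("Hak", fill.tokenizer.mask_token, 1)

for cand in fill(masked, top_k=10):
    print(f"{cand['token_str']:<15} {cand['score']:.4f}")
```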
3.3. Substitute Selection

Substitute selection is the decision step that filters the candidate substitutions and selects the one that is the simplest choice and best fits the context of the complex word. The candidates are ranked based on BERT prediction probability, word frequency and semantic similarity.

BERT probability distribution:
BERT returns the probability distribution over the vocabulary for the masked word, given the sentence in which the complex word was identified. The results are calculated through the attention mechanism and depend on the sentence context. Therefore, the higher the probability, the more relevant the candidate is for the original sentence. It is possible to rank the candidates accordingly.

Language model feature:
A substitution candidate should fit in the context of the words that come before and after the original term. In non-contextual lexical simplification systems, n-gram language models are implemented to verify grammaticality (Qasmi et al., 2020). The bi-directional nature of BERT already accounts for grammaticality depending on the sentence context. We simply add another ranking measure to evaluate compatibility between the whole sentence and the limited word-frame context. It is possible to mask nearby words back to front for each candidate to calculate the overall loss and rank accordingly.
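A sketch of this loss-based measure is given below, again assuming the BERTurk checkpoint; the window size, the word-level masking and the skip rule for multi-subword tokens are our simplifications, not details specified in the paper.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

NAME = "dbmdz/bert-base-turkish-cased"
tok = AutoTokenizer.from_pretrained(NAME)
mlm = AutoModelForMaskedLM.from_pretrained(NAME).eval()

def context_loss(tokens, idx, candidate, window=2):
    """Average masked-LM loss over the words near position idx after
    placing `candidate` there. Lower is better; window size is ours."""
    words = tokens[:idx] + [candidate] + tokens[idx + 1:]
    target_ids = tok(" ".join(words), return_tensors="pt")["input_ids"]
    losses = []
    for j in range(max(0, idx - window), min(len(words), idx + window + 1)):
        if j == idx:
            continue
        masked = words[:j] + [tok.mask_token] + words[j + 1:]
        enc = tok(" ".join(masked), return_tensors="pt")
        # Skip positions where masking changed the subword segmentation.
        if enc["input_ids"].shape != target_ids.shape:
            continue
        with torch.no_grad():
            logits = mlm(**enc).logits
        pos = (enc["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
        losses.append(torch.nn.functional.cross_entropy(
            logits[0, pos].unsqueeze(0), target_ids[0, pos].unsqueeze(0)).item())
    return sum(losses) / len(losses) if losses else float("inf")

tokens = "Hak söz söyleyenin dostu az olur .".split()
print(context_loss(tokens, 0, "Doğru"))  # candidate replacing 'Hak'
```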
Semantic similarity:
The semantic similarity is calculated as the cosine similarity between the GloVe vector of the original word and that of the candidate substitution.

Frequency comparison:
Frequency-based approaches were covered during complex word identification. We make use of a similar algorithm as a supportive measure in substitute selection. We rank the substitution candidates according to their appearance in the wordfreq corpus (Speer et al., 2018).

The LS-BERT algorithm makes use of an additional measure called the PPDB feature. It analyses a paraphrase corpus to see whether the complex word and the substitution occurred inside a paraphrase pair. The authors conclude that this feature had the least impact on overall performance. We were not able to find a comprehensive paraphrase corpus with over a hundred million examples, so we exclude this feature from our study. The feature scores of the substitution candidates are averaged to calculate the final ranking score. The foremost candidate replaces the complex word if and only if it has a higher frequency and a better loss outcome in the language modelling.
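The aggregation below is a sketch of our reading of this step: each feature scores all candidates, candidates are ranked per feature, and the average rank decides. The feature scores here are invented for illustration, not system outputs.

```python
import numpy as np

def best_candidate(candidates, feature_scores):
    """candidates: list of substitute words.
    feature_scores: dict of feature name -> scores aligned with candidates,
    where higher is always better. Averages per-feature ranks (our reading
    of the paper's averaging; not published code)."""
    total_rank = np.zeros(len(candidates))
    for scores in feature_scores.values():
        # double argsort turns scores into ranks (0 = best per feature)
        total_rank += np.argsort(np.argsort(-np.asarray(scores)))
    return candidates[int(np.argmin(total_rank / len(feature_scores)))]

# Invented scores for three candidates replacing 'hak'.
candidates = ["doğru", "gerçek", "haklı"]
features = {
    "bert_probability": [0.31, 0.22, 0.12],
    "zipf_frequency":   [5.1, 4.7, 4.2],
    "glove_cosine":     [0.71, 0.64, 0.58],
    "lm_fit":           [-1.2, -1.9, -2.4],  # negated context loss
}
print(best_candidate(candidates, features))
```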
4. Evaluation

We could not find any simplified parallel corpus of sentence pairs for Turkish. To evaluate our simplification system, we manually simplified a reserved subset of our CWI dataset that was not included in the training process. The final parallel corpus contained 500 complex sentences and their corresponding lexical simplifications. We adhered to the CWI dataset guidelines and assumed the role of a student with a pre-high school education background. The complex sentences were taken from the same resources: Wikipedia and a university-level Turkish writing corpus. The simplifications were created by the author, whose native language is Turkish and who has a background in linguistics. The simplification pipeline first identified the complex word, and BERT generated the substitute candidates. The ranking algorithm included different features to pick the best candidate. We decided to take the core parts of the algorithm, the probability distribution and the frequency analysis, as our evaluation baseline, and to show the improvement in performance after the addition of each feature.

We evaluate our system outputs using standard evaluation metrics for text simplification: BLEU and SARI (Xu et al., 2016). The BLEU score has recently been disputed for the evaluation of text simplification (Sulem et al., 2018). However, our method is out of scope for the major shortcomings mentioned, such as sentence splitting. We regardless provide the score for comparison with other studies.

Model                 BLEU     SARI
BERT (Prob + Freq)    70.30    35.52
+ Similarity          76.84    37.36
+ LM                  78.25    37.40

Table 2: Results of Automatic Evaluation
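For reference, BLEU and SARI can be computed with off-the-shelf implementations; the snippet below uses sacrebleu and the easse package, which is our choice of tooling (the paper does not name its implementation).

```python
import sacrebleu                    # pip install sacrebleu
from easse.sari import corpus_sari  # pip install easse

orig    = ["Hak söz söyleyenin dostu az olur."]
sys_out = ["Doğru söz söyleyenin dostu az olur."]
refs    = [["Doğru söz söyleyenin dostu az olur."]]  # one reference stream

bleu = sacrebleu.corpus_bleu(sys_out, refs)
sari = corpus_sari(orig_sents=orig, sys_sents=sys_out, refs_sents=refs)
print(f"BLEU: {bleu.score:.2f}  SARI: {sari:.2f}")
```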
5. Conclusion & Future Work

This paper presents the first automatic lexical text simplification system for Turkish. We also present a complex word identification dataset for Turkish, and create a small simplified parallel corpus for benchmarking text simplification tasks. Our model achieves a BLEU score of 78.25 and a SARI score of 37.40 in the automatic evaluation. In future work, we would like to expand our datasets with multiple annotators and address the simplification shortcomings for multi-word expressions and morphologically complex words in Turkish.

6. Bibliographical References

Carroll, J., Minnen, G., Canning, Y., Devlin, S., and Tait, J. (1998). Practical simplification of English newspaper text to assist aphasic readers. In Proceedings of the AAAI-98 Workshop on Integrating Artificial Intelligence and Assistive Technology, pages 7–10. Citeseer.

Cieri, C., Maxwell, M., Strassel, S., and Tracey, J. (2016). Selection criteria for low resource language programs. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 4543–4549.

Coster, W. and Kauchak, D. (2011). Simple English Wikipedia: a new text simplification task. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 665–669.

Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Dodur, S. and Miray, H. (2021). Syntax comprehension skills of Turkish-speaking students with dyslexia. International Journal of Curriculum and Instruction, 13(3):2732–2745.

Gooding, S. and Kochmar, E. (2019). Complex word identification as a sequence labelling task. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1148–1153.

Güven, S. and Friedmann, N. (2021). Vowel dyslexia in Turkish: A window to the complex structure of the sublexical route. PLOS ONE, 16(3):1–39.

Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., and Dyer, C. (2016). Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.

Özkan, E. and Ercan, G. (2018). Modernization of old Turkish texts. In 2018 26th Signal Processing and Communications Applications Conference (SIU), pages 1–4. IEEE.

Paetzold, G. and Specia, L. (2016). Unsupervised lexical simplification for non-native speakers. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30.

Qasmi, N. H., Zia, H. B., Athar, A., and Raza, A. A. (2020). SimplifyUR: Unsupervised lexical text simplification for Urdu. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 3484–3489.

Qiang, J., Li, Y., Zhu, Y., Yuan, Y., Shi, Y., and Wu, X. (2021). LSBert: Lexical simplification based on BERT. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29:3064–3076.

Rello, L., Baeza-Yates, R., Dempere-Marco, L., and Saggion, H. (2013a). Frequent words improve readability and short words improve understandability for people with dyslexia. In IFIP Conference on Human-Computer Interaction, pages 203–219. Springer.

Rello, L., Baeza-Yates, R., and Saggion, H. (2013b). The impact of lexical simplification by verbal paraphrases for people with and without dyslexia. In International Conference on Intelligent Text Processing and Computational Linguistics, pages 501–512. Springer.

Shardlow, M. (2013). A comparison of techniques to automatically identify complex words. In 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Student Research Workshop, pages 103–109.

Shardlow, M. (2014). Out in the open: Finding and categorising errors in the lexical simplification pipeline. In LREC, pages 1583–1590.

Siddharthan, A. (2006). Syntactic simplification and text cohesion. Research on Language and Computation, 4(1):77–109.

Sulem, E., Abend, O., and Rappoport, A. (2018). BLEU is not suitable for the evaluation of text simplification. arXiv preprint arXiv:1810.05995.

Tang, G., Sennrich, R., and Nivre, J. (2019). Understanding neural machine translation by simplification: The case of encoder-free models. arXiv preprint arXiv:1907.08158.

Torunoglu-Selamet, D., Pamay, T., and Eryigit, G. (2016). Simplification of Turkish sentences. In The First International Conference on Turkic Computational Linguistics, pages 55–59.

Wang, T., Chen, P., Rochford, J., and Qiang, J. (2016). Text simplification using neural machine translation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30.

Wubben, S., Krahmer, E., and van den Bosch, A. (2012). Sentence simplification by monolingual machine translation.

Xu, W., Napoles, C., Pavlick, E., Chen, Q., and Callison-Burch, C. (2016). Optimizing statistical machine translation for text simplification. Transactions of the Association for Computational Linguistics, 4:401–415.

Yimam, S. M., Biemann, C., Malmasi, S., Paetzold, G. H., Specia, L., Štajner, S., Tack, A., and Zampieri, M. (2018). A report on the complex word identification shared task 2018. arXiv preprint arXiv:1804.09132.

7. Language Resource References

Bakay, O., Ergelen, O., Sarmis, E., Yildirim, S., Kocabalcioglu, A., Arican, B. N., Ozcelik, M., Saniyar, E., Kuyrukcu, O., Avar, B., and Yıldız, O. T. (2021). Turkish WordNet KeNet.

Schweter, S. (2020). BERTurk - BERT models for Turkish. Zenodo.

Speer, R., Chin, J., Lin, A., Jewett, S., and Nathan, L. (2018). LuminosoInsight/wordfreq: v2.2.

Türk, U., Atmaca, F., Özateş, Ş. B., Berk, G., Bedir, S. T., Köksal, A., Başaran, B. Ö., Güngör, T., and Özgür, A. (2021). Resources for Turkish dependency parsing: Introducing the BOUN treebank and the BoAT annotation tool. Springer.
