Amharic Light Stemmer
Article · September 2020
1 author:
Girma Neshir Alemneh
Addis Ababa Science and Technology University
All content following this page was uploaded by Girma Neshir Alemneh on 24 September 2020.
11/10/2019 Amharic Light Stemmer | OpenReview
Amharic Light Stemmer (/pdf?id=r1edp2VYwH)
Girma Neshir (/profile?email=girma1978%40gmail.com),
Andeas Rauber (/profile?email=rauber%40ifs.tuwien.ac.at), and
Solomon Atnafu (/profile?email=solomon.atnafu%40aau.edu.et)
25 Sep 2019 (modified: 25 Sep 2019) ICLR 2020 Conference Blind Submission Readers: Everyone
Abstract: Stemming is the process of removing affixes (i.e., prefixes, infixes and suffixes) from a word to reduce it to
its base form, which improves the accuracy and performance of information retrieval systems. This paper presents the
reduction of Amharic words to their corresponding stems with the intention of preserving semantic information. The
proposed approach efficiently removes affixes from an Amharic word. While many stemmers exist for dominant languages
such as English, under-resourced languages such as Amharic lack such powerful tool support. In this paper, we design a
rule-based light Amharic stemmer that receives an Amharic word, matches the beginning of the word against the possible
prefixes and its ending against the possible suffixes, and finally checks whether it has an infix. The final result is the
stem if there is any prefix, infix and/or suffix; otherwise the word remains in one of its earlier states. The technique
does not rely on any additional resource (e.g., a dictionary) to verify the generated stem. The performance of the
generated stemmer is evaluated using manually annotated Amharic words. The result is compared with the current
state-of-the-art stemmer for Amharic, showing an increase of 7% in stemmer correctness.
Keywords: Amharic light Stemmer, Affixes, Amharic Sentiment Classification
TL;DR: Amharic Light Stemmer is designed for improving performance of Amharic Sentiment Classification.
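The matching procedure described in the abstract (prefix match, then suffix match, then an infix check, with no dictionary lookup) can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the romanized affix lists, the minimum stem length guard, and all function names are hypothetical, and the paper's actual rule set is not reproduced here.

```python
from typing import Sequence

# Hypothetical, romanized toy affix lists for illustration only;
# they are NOT the rule set used in the paper.
PREFIXES: Sequence[str] = ("ye", "be", "le")
SUFFIXES: Sequence[str] = ("och", "u")
INFIXES: Sequence[str] = ()  # left empty: the abstract lists no infix rules

MIN_STEM_LEN = 2  # assumed guard against stripping a word down to nothing


def strip_prefix(word: str, prefixes: Sequence[str] = PREFIXES) -> str:
    """Remove the longest matching prefix, if enough of the word remains."""
    for p in sorted(prefixes, key=len, reverse=True):
        if word.startswith(p) and len(word) - len(p) >= MIN_STEM_LEN:
            return word[len(p):]
    return word


def strip_suffix(word: str, suffixes: Sequence[str] = SUFFIXES) -> str:
    """Remove the longest matching suffix, if enough of the word remains."""
    for s in sorted(suffixes, key=len, reverse=True):
        if word.endswith(s) and len(word) - len(s) >= MIN_STEM_LEN:
            return word[: -len(s)]
    return word


def strip_infix(word: str, infixes: Sequence[str] = INFIXES) -> str:
    """Remove the first word-internal occurrence of a known infix."""
    for i in sorted(infixes, key=len, reverse=True):
        pos = word.find(i, 1)  # start at 1 so a word-initial match is ignored
        internal = pos != -1 and pos + len(i) < len(word)
        if internal and len(word) - len(i) >= MIN_STEM_LEN:
            return word[:pos] + word[pos + len(i):]
    return word


def light_stem(
    word: str,
    prefixes: Sequence[str] = PREFIXES,
    suffixes: Sequence[str] = SUFFIXES,
    infixes: Sequence[str] = INFIXES,
) -> str:
    """Apply the three steps in the order given in the abstract."""
    word = strip_prefix(word, prefixes)
    word = strip_suffix(word, suffixes)
    return strip_infix(word, infixes)


print(light_stem("yebetoch"))  # bet
print(light_stem("bet"))       # bet (the length guard blocks stripping "be")
```

If no affix matches, the word is returned unchanged, which mirrors the abstract's statement that the word otherwise remains in an earlier state. A real stemmer for Ethiopic script would also need the form standardization step that the reviews mention, which this sketch omits.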
Official Blind Review #2
ICLR 2020 Conference Paper232 AnonReviewer2
24 Oct 2019 (modified: 05 Nov 2019) ICLR 2020 Conference Paper232 Official
Review Readers: Everyone
Experience Assessment: I have read many papers in this area.
Rating: 1: Reject
Review Assessment: Thoroughness In Paper Reading: I read the paper at least twice and used my best
judgement in assessing the paper.
Review Assessment: Checking Correctness Of Experiments: I assessed the sensibility of the experiments.
Review Assessment: Checking Correctness Of Derivations And Theory: I did not assess the derivations or
theory.
Review: The paper proposes a stemmer pipeline for Amharic words. The system consists of a cascade of small
subsystems, each of which was developed to normalize a specific characteristic of Amharic words using
hand-crafted algorithms.
This paper may not be fully relevant to this conference because most of the main methodology does not rely on any
representation learning criteria, but is almost entirely rule-based. I recommend that the authors consider submitting
this paper to a more appropriate conference (e.g., ACL, NAACL, or EMNLP would be the best fit), where the paper
should get more meaningful attention and reviews from researchers in natural language processing.
https://openreview.net/forum?id=r1edp2VYwH 1/3
Presentation errors:
- The paper changes the font of English characters from Section 3.2 onward.
- Figure 1 appears to have insufficient resolution.
Official Blind Review #3
ICLR 2020 Conference Paper232 AnonReviewer3
21 Oct 2019 (modified: 05 Nov 2019) ICLR 2020 Conference Paper232 Official
Review Readers: Everyone
Rating: 1: Reject
Experience Assessment: I have published one or two papers in this area.
Review Assessment: Checking Correctness Of Derivations And Theory: N/A
Review Assessment: Checking Correctness Of Experiments: I assessed the sensibility of the experiments.
Review Assessment: Thoroughness In Paper Reading: I made a quick assessment of this paper.
Review: This paper proposes a technique for Amharic light stemming. As the authors point out, Amharic is a
language with considerable complexity and richness in its morphological/orthographic processes, and effective
stemming is therefore potentially useful for downstream applications. The approach presented is a cascade of
transformations that standardize the form and remove suffixes, prefixes, and infixes. An analysis is offered in terms
of accuracy in recovering the stems, as well as when using the stems to drive a stem-based sentiment analyzer, with
comparison to another existing stemmer.
Although this is a reasonable approach to the stemming problem in Amharic, some suggestions would be:
1) Compare to approaches to root finding that exist for Arabic and Hebrew, which have certain similarities in
morphology and on which much has been published.
2) Dictionary-based sentiment analysis can be a useful application, but demonstration of improvements on several
applications would be far more compelling (e.g., there are Amharic-English parallel corpora available, so using this
as a preprocessing step for a machine translation system would be useful).
Overall though, this paper does not focus on representation learning, and as such probably would be better suited
to a different venue.
Official Blind Review #1
ICLR 2020 Conference Paper232 AnonReviewer1
20 Oct 2019 (modified: 05 Nov 2019) ICLR 2020 Conference Paper232 Official
Review Readers: Everyone
Rating: 1: Reject
Experience Assessment: I do not know much about this area.
Review Assessment: Checking Correctness Of Derivations And Theory: N/A
Review Assessment: Checking Correctness Of Experiments: I carefully checked the experiments.
Review Assessment: Thoroughness In Paper Reading: I read the paper thoroughly.
Review: This paper studies the problem of stemming for morphologically rich languages, specifically for Amharic.
The main contribution is a light stemmer that only removes affixes to the extent that the original semantic
information in the word is kept after stemming. The approach relies on curated prefixes, suffixes and some rules.
Both intrinsic evaluation and extrinsic evaluation (sentiment classification) are done to show the effectiveness of the
stemmer. Overall, the approach is reasonable and the results are good. However, I'm not sure that ICLR is the best
venue for publishing this work, since it mainly focuses on linguistics and text processing; ACL might be a better fit
for this work.
Experiments:
- I did not find a description of the baseline HornMorpho. How is it different from the proposed approach? Does it
use the same text preprocessing as the proposed stemmer?
- It looks like sentiment classification is done by looking up a lexicon. What if a classifier is used? How does
stemming affect it?
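To make the lexicon question above concrete, here is a minimal Python sketch of lexicon-lookup sentiment scoring with and without a stemming step. Everything in it is an assumption for illustration: the romanized toy lexicon, the stand-in `toy_stem` function and its affixes, and the three-way labelling are hypothetical and are not taken from the paper or from HornMorpho.

```python
from typing import Callable, Dict, Iterable

# Hypothetical romanized toy lexicon (stem -> polarity); not data from the paper.
LEXICON: Dict[str, int] = {"tiru": 1, "metfo": -1}


def toy_stem(token: str) -> str:
    """Stand-in stemmer: strips one assumed prefix and one assumed suffix."""
    for prefix in ("ye", "be"):
        if token.startswith(prefix) and len(token) > len(prefix) + 1:
            token = token[len(prefix):]
            break
    for suffix in ("och",):
        if token.endswith(suffix) and len(token) > len(suffix) + 1:
            token = token[: -len(suffix)]
            break
    return token


def lexicon_score(
    tokens: Iterable[str],
    lexicon: Dict[str, int],
    stem: Callable[[str], str] = lambda t: t,  # identity means no stemming
) -> int:
    """Sum the polarity of every token whose (stemmed) form is in the lexicon."""
    return sum(lexicon.get(stem(t), 0) for t in tokens)


def label(score: int) -> str:
    """Map a summed polarity score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tokens = ["yetiru", "bet"]
print(label(lexicon_score(tokens, LEXICON)))            # neutral
print(label(lexicon_score(tokens, LEXICON, toy_stem)))  # positive
```

In this toy case the inflected token misses the lexicon unless it is stemmed first, which is one way stemming can change the outcome of a lookup-based classifier. How a trained classifier would respond to stemming is a separate question that this sketch does not address.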
Style:
- The paper does not strictly follow the ICLR style. The fonts aren't consistent.
- Figure 1 has very low resolution.
- I'd suggest the related work section be broken into several paragraphs.