Text-to-Sign Language System in Macedonian

The document discusses the development of a system to convert Macedonian text to Macedonian Sign Language. It aims to facilitate digital inclusion for deaf communities. The system translates input text into a sequence of Macedonian Sign Language signs. Initial testing showed an average sign error rate of 4.49%. Online testing confirmed these promising results. The system could help ease communication with deaf communities in Macedonia by automatically generating sign language from text.

Towards a System for Converting Text to Sign Language in Macedonian

Conference Paper · September 2021

All content following this page was uploaded by Stefan Spasovski on 28 February 2022.



Towards a System for Converting Text to Sign
Language in Macedonian
Stefan Spasovski¹, Branislav Gerazov¹, Risto Chavdarov¹, Viktorija Smilevska², Aneta Crvenkovska, Tomislav Kartalov¹, Zoran Ivanovski¹, and Toni Bachvarovski³
¹ Faculty of Electrical Engineering and Information Technologies, Ss Cyril and Methodius University in Skopje, Macedonia
² Elementary school “Kuzman Josifovski - Pitu”, Skopje, Macedonia
³ Association for Assistive Technologies “Open the Windows”, Skopje, Macedonia
[email protected], [email protected]

Abstract: The paper presents initial results in the design and development of a system for automatic conversion of text to sign language in Macedonian. The system will be an essential part of a larger system for the automatic generation of Macedonian sign language based on text. This system will facilitate digital inclusion and will ease communication with the Macedonian deaf and hearing impaired community. The system is implemented as a web application which allows input text to be encoded in the equivalent sequence of sign language signs. The initial results show an average sign error rate of 4.49%. Online testing was also organized, and it confirmed these promising results.

Index Terms: natural language processing, assistive technology, text-to-sign language, sign language, deafness

I. INTRODUCTION

Deafness is defined as a condition of extreme hearing loss, i.e. having very little or no hearing at all. The American Speech-Language-Hearing Association (ASHA) defines profound hearing loss as only being able to hear sounds above 90 dB, with severe hearing loss ranging between 71 and 90 dB [1]. Macedonian law places the threshold at 80 dB. The estimated number of deaf and hearing impaired people in Macedonia is around 6000 according to information from 2006 [2], which is 0.3% of Macedonia’s population [3]. This is comparable to the percentage of deaf people in places like the United States (0.38%) [4] and Germany (0.28%) [5].

In today’s high-tech information-rich world, the digital inclusion of people with disabilities is becoming increasingly important. The deaf and hearing impaired are generally able to directly use and interact with computers and smart devices, and can follow traditional visual media such as TV and newspapers. However, they find it hard to read text at speed. This is especially true for those born deaf or hearing impaired, as they have never heard the sounds of phonemes that phonetic orthography transcribes with graphemes in written text. As a consequence, for them phonetic transcription is not much different from logograms, such as the Chinese characters used to write Mandarin and other Asian languages. This problem can be mitigated by offering live sign language (SL) translation, but most TV broadcasters do not offer this service. The problem is even more pronounced in Macedonia, where there are only around 30 certified SL translators.¹
¹ http://www.deafmkd.org.mk/lista-na-tolkuvaci/

One way to ease the digital inclusion of the deaf and hearing impaired is through assistive systems able to automatically convert text to sign language. One example of such a system is the HandTalk App, in which a virtual avatar named Hugo converts text to sign language on the user’s smart device.²
² https://handtalk.me/en/app/

These systems are made up of two essential parts: i) a text-to-SL converter that transforms textual input to a sequence of signs or gestures, and ii) an SL generator that uses the sequence of signs to generate sign language, usually via 3D rendering of an animated character, i.e. an avatar. Sign language differs from spoken language in that it does not support inflection. For example, to create the future tense in Macedonian we add the future particle before the verb. Tense is not formed in that way in sign language. Instead, the speaker uses the infinitive form of the verb together with signs like “later” to signify that they are speaking about a future event. The same holds true for verb conjugation, singular and plural, case and articles.

Although sign languages across the world do share signs, there are still different standardized sign languages, such as American Sign Language (ASL), Italian Sign Language (LIS), Indian Sign Language (ISL), Vietnamese Sign Language (VSL), and Macedonian Sign Language (MSL). The text-to-SL converter can encode the signs in various formats. One famous format that dates back to 1984 is the Hamburg Notation System for Sign Languages (HamNoSys), which encodes signs through a set of pictograms or symbols [6]. As an extension to HamNoSys, the Signing Gesture Mark-up Language (SiGML) describes the symbols using XML tags [7]. This extension allows the storage and use of the transcriptions in computer based systems, such as 3D rendering software, that can be used to generate sign language via an animated avatar.

Text-to-SL systems have been developed for many of the world languages, such as English [8], German [9], Vietnamese
[10], Kurdish [11], Arabic [12], Brazilian Portuguese [13], Punjabi [14], Korean [15], etc. Most systems rely on a simple word-to-sign mapping, i.e. each word token from the input text is looked up in a lexicon of signs, and if no match is found it is spelled out using a sequence of alphabet letter signs [11]. More advanced rule-based systems map the input text grammar to natural sign language grammar [15]. Recently, the application of machine learning has allowed improved performance in these systems, directly applying methods used in the area of Machine Translation [10].

In Macedonia, work has mostly been done on the generation of Macedonian Sign Language via virtual avatars. Koceski and Koceska [16] developed and evaluated a 3D virtual tutor for MSL. Joksimoski et al. [17] presented a 3D visualization system that extensively uses animation and game concepts for accurately generating sign languages using 3D avatars.

Here, we present a text-to-SL system that translates an input Macedonian text into an output sequence of Macedonian Sign Language signs. The system is built on a rule-based algorithm, which analyses input text, comparing the input word tokens to a lexicon of some 200 signs. The performance of the system is evaluated with a set of test sentences and the results show a Sign Error Rate (SER) of 4.49%. We augment this analysis with online testing of the system, which confirms the validity of the initial results. The system can be used as the basis for building a complete system for text based sign language generation in Macedonian.

II. ALGORITHM

A. Sign mappings organization
We organize the data by placing the words and signs in five different files:
• list of signs,
• list of names,
• skip list,
• dictionary of word to sign mappings, and
• dictionary of phrases that map directly to sequences of signs.
In the presented system we have 221 signs.

[Fig. 1. Block diagram of the algorithm for converting text to sign language signs. Blocks, top to bottom: Input text; Text normalization; Main loop; Is it part of a phrase; Is it in skip list; Is it a name; Is it a sign; Is it in dictionary; Similarity check; Is it similar to a sign; Is the root similar to a sign; Append input to unknown list; Append sign to output.]

B. Text preprocessing
Text input is first normalised by converting all upper case characters to lower case and removing all punctuation. We then divide the text string into a list of word tokens. We initialise two empty lists for storing the sign sequence output and unrecognized word tokens, and define a flag to be used when a phrase has been recognized. With that we are ready to begin the translation process.
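
The preprocessing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual code: the function name `preprocess` and the exact set of punctuation characters are hypothetical, since the text only says that all punctuation is removed.

```python
import string

# Assumption: ASCII punctuation plus a few typographic quotation marks;
# the exact character set removed by the real system is not specified.
PUNCTUATION = string.punctuation + "„“”‘’…"


def preprocess(text):
    """Normalise the input and set up the state used by the main loop."""
    # Lower-casing works for Cyrillic as well as Latin characters.
    lowered = text.lower()
    # Remove every punctuation character, then split on whitespace.
    cleaned = lowered.translate(str.maketrans("", "", PUNCTUATION))
    tokens = cleaned.split()
    signs = []            # output sign sequence
    unknown = []          # unrecognized word tokens
    phrase_found = False  # flag used when a phrase has been recognized
    return tokens, signs, unknown, phrase_found


if __name__ == "__main__":
    print(preprocess("Здраво, како си?")[0])
```

Removing punctuation before splitting keeps the tokenizer trivial, at the cost of merging words that are separated only by a punctuation mark and no space.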

C. Main loop
We go through the word tokens in the input list one by one and run them through multiple checks. Firstly, we check if the word token is part of a phrase by concatenating it with the succeeding token from the list. If it is, then we append the sequence of signs corresponding to that phrase, skip the second word in the next iteration and move on from there. Next, if the token is in the skip list it is ignored and the next token is processed. If it is not in the skip list, we check if the input word token is part of the names or signs lists. If found, the word token is appended as is to the output sign sequence. If not, the token is looked up in the dictionary of word to sign mappings, and if found its sign mapping is appended to the output. If the word still has not been found in any of the checks, then we continue with similarity checks.
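
The order of the checks in the main loop can be summarised in a short Python sketch. It is a minimal illustration rather than the authors' implementation: the function name `translate`, the optional `similar` callback standing in for the similarity checks, and the toy English vocabulary are all hypothetical, and phrases are assumed to span exactly two tokens as in the description above.

```python
def translate(tokens, signs, names, skip, word_to_sign, phrases, similar=None):
    """Map word tokens to signs; return (sign sequence, unrecognized tokens)."""
    output, unknown = [], []
    skip_next = False  # set when the current token was consumed by a phrase
    for i, token in enumerate(tokens):
        if skip_next:
            skip_next = False
            continue
        # 1) Phrase check: join the token with its successor.
        if i + 1 < len(tokens):
            phrase = f"{token} {tokens[i + 1]}"
            if phrase in phrases:
                output.extend(phrases[phrase])
                skip_next = True
                continue
        # 2) Skip list: ignore the token.
        if token in skip:
            continue
        # 3) Names and signs are appended as is.
        if token in names or token in signs:
            output.append(token)
            continue
        # 4) Dictionary of word to sign mappings.
        if token in word_to_sign:
            output.append(word_to_sign[token])
            continue
        # 5) Fall back to the similarity checks (hypothetical callback).
        match = similar(token) if similar else None
        if match is not None:
            output.append(match)
        else:
            unknown.append(token)
    return output, unknown


if __name__ == "__main__":
    # Toy English data, for illustration only.
    result = translate(
        "good morning ana i went to the school quickly".split(),
        signs={"i", "go", "school", "hello"},
        names={"ana"},
        skip={"to", "the"},
        word_to_sign={"went": "go"},
        phrases={"good morning": ["hello"]},
    )
    print(result)
```

Using `continue` after each successful check keeps the priority order explicit: a token is handled by the first check that matches and never reaches the later ones.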

D. Similarity checks
The final part of the main loop consists of similarity checks. The system comprises two different similarity checks. They are based on the “gestalt pattern matching” algorithm suggested by Ratcliff and Obershelp in the 1980s [18]. The idea is to find the longest contiguous matching subsequence that contains no “junk” elements. The same procedure is then applied recursively to the pieces of the sequences to the left and to the right of the matching subsequence. We use the implementation of the algorithm in the difflib³ Python library.
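
As a rough illustration of what difflib provides, the sketch below computes a gestalt similarity ratio with `SequenceMatcher` and retrieves close matches with `get_close_matches`. The helper names `similarity` and `candidates` and the example words are hypothetical. The values `n=3` and `cutoff=0.7` mirror the settings reported in this section, but how the system actually calls difflib is an assumption.

```python
from difflib import SequenceMatcher, get_close_matches


def similarity(a, b):
    """Gestalt similarity: 2 * (matched characters) / (len(a) + len(b))."""
    # The first argument is the optional "junk" predicate; None disables it.
    return SequenceMatcher(None, a, b).ratio()


def candidates(token, vocabulary, n=3, cutoff=0.7):
    """Return up to n vocabulary entries scoring at least cutoff, best first."""
    return get_close_matches(token, vocabulary, n=n, cutoff=cutoff)


if __name__ == "__main__":
    # "bcd" is the longest common block, so the ratio is 2 * 3 / 8 = 0.75.
    print(similarity("abcd", "bcde"))
    print(candidates("appel", ["ape", "apple", "peach", "puppy"]))
```

Raising the cutoff trades recall for precision: fewer tokens receive a candidate sign, but the candidates that remain are closer to the input.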
The similarity check outputs a list of 3 strings sorted from the most likely to the least likely match. Based on our experiments, we get the best results when the cutoff is equal to 0.7. One other problem that we encounter is that sometimes the top-ranked match (the first element of the list) is not the correct one, and the second or third string in the list is the correct answer. We augment the similarity search algorithm with a Character Error Rate (CER)
to select one of the three offered outputs:

$$\mathrm{CER} = \frac{S + D + I}{N} = \frac{S + D + I}{S + D + C} \tag{1}$$

where S is the number of character substitutions, D is the number of character deletions, I is the number of character insertions, C is the number of correct characters, and N is the total number of characters in the reference, i.e. N = S + D + C.
³ https://docs.python.org/3/library/difflib.html

III. EXPERIMENTS
We used two experiments to evaluate the performance of the proposed system. The first was based on an internal test set, and the second was based on online testing.

A. Test set evaluation
We developed an internal test set comprising 124 sentences for which we provided reference translations to sign language sequences of signs. We took special care to have a varied test set, both in terms of sentence length and in terms of ample coverage of the set of signs known to the system.

B. Online evaluation
To provide a platform for testing the proposed system, we developed a web application that provides a user interface for online testing. The web app was developed based on Flask⁴. HTML was used to build the site layout, while Flask was used to render the website, receive user input and return the results of the translation process.
The website lets the user input words for translation. After the input is submitted, the translation process starts and the website shows the output from the proposed system, including the output sign sequence and the list of unrecognized word tokens. There is also a text form that the user can use to give feedback. For our online tests this was the correct sign sequence in case the system returned an erroneous one. To improve coverage of the known signs in the online tests, the signs known by the system are listed at the end of the web page.
⁴ https://flask.palletsprojects.com/en/2.0.x/

[Fig. 2. The web app developed for the online testing.]

C. Metrics
To assess the level of performance of the proposed system we use the Word Error Rate (WER) to compare the output sign sequence with the reference sign sequence translation, either defined in the test set, or input by the online participants. The WER is similar to the CER used in the similarity checks and is defined as:

$$\mathrm{WER} = \frac{S + D + I}{N} = \frac{S + D + I}{S + D + C} \tag{2}$$

where S is the number of word substitutions, D is the number of word deletions, I is the number of word insertions, C is the number of correct words, and N is the number of words in the reference, i.e. N = S + D + C.

IV. RESULTS
Evaluating the system on the test set gave an average WER of 4.49%, i.e. about 1 erroneous sign in every 20 output signs. A small part of these errors is due to the difference between the output signs as they are given in the system’s dictionary and as they are provided in the reference translations. These are minor differences comprising one or two characters added in some of the signs.
Errors occurred most notably when input words were expanded compared to the way they are found in the system’s database, for example when a word is written in the diminutive plural form. In those cases the algorithm either finds a similar word which is not the correct sign corresponding to the word token, or appends the token to the unrecognized word token list. Such errors do not occur when the input word has only a few changes, or the word is long enough that the relative number of changes is minimal.
The online testing results were also promising, with minimal errors in the output sequences for the input provided by the user. Most errors could be attributed to: i) the word translations
not being part of the signs listed in the web app or ii) being synonyms to words that are provided in the system’s dictionary and which have not been added to the system. In some of these cases the words are appended to the unrecognized word token list, but frequently enough a similar, but erroneous, sign is found.

V. CONCLUSION
The proposed system is able to translate an input text sequence into an output sign language sign sequence. The system is based on a rule-based algorithm that converts the input word tokens sequentially into their sign language equivalents. Even though the system supports a limited sign vocabulary, the results from the internal and online testing are promising. With future improvements the system can be used in combination with a sign language generator to create a complete text-to-sign language solution. This would be of great help for the digital inclusion of, and communication with, the deaf and hearing impaired community.

REFERENCES
[1] A. S.-L.-H. Association. (2021) Hearing loss (ages 5+). [Online]. Available: https://www.asha.org/practice-portal/clinical-topics/hearing-loss/
[2] Dnevnik. (2006) Around 6,000 deaf people are asking for sign language news on mtv. [Online]. Available: https://web.archive.org/web/20110727190206/http://star.dnevnik.com.mk/default.aspx?pbroj=1782&stID=10852
[3] Republic of North Macedonia State Statistical Office. (2002) Census of Population, Households and Dwellings in the Republic of North Macedonia. [Online]. Available: https://www.stat.gov.mk/Publikacii/knigaXIII.pdf
[4] G. R. I. Charles Reilly. (2011) Snapshot of deaf and hard of hearing people, postsecondary attendance and unemployment. [Online]. Available: https://www.gallaudet.edu/office-of-international-affairs/demographics/deaf-employment-reports/
[5] T. G. N. T. Board. (2021) Information for deaf visitors to germany. [Online]. Available: https://www.germany.travel/en/accessible-germany/disability-friendly-travel-for/deafness.html
[6] T. Hanke, “HamNoSys-representing sign language data in language resources and language processing contexts,” in LREC, vol. 4, 2004, pp. 1–6.
[7] K. Kaur and P. Kumar, “Hamnosys to sigml conversion system for sign language automation,” Procedia Computer Science, vol. 89, pp. 794–803, 2016.
[8] M. Varghese and S. K. Nambiar, “English to sigml conversion for sign language generation,” in 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET). IEEE, 2018, pp. 1–6.
[9] S. Ebling and J. Glauert, “Building a swiss german sign language avatar with jasigning and evaluating it among the deaf community,” Universal Access in the Information Society, vol. 15, no. 4, pp. 577–587, 2016.
[10] N. C. Ngon and Q. L. Da, “Application of hamnosys and avatar 3d jasigning to construction of vietnamese sign language animations,” The University of Danang-Journal of Science and Technology, pp. 61–65, 2017.
[11] Z. Kamal and H. Hassani, “Towards kurdish text to sign translation,” in Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives, 2020, pp. 117–122.
[12] N. Aouiti, “Towards an automatic translation from arabic text to sign language,” in Fourth International Conference on Information and Communication Technology and Accessibility (ICTA). IEEE, 2013, pp. 1–4.
[13] D. Stiehl, L. Addams, L. S. Oliveira, C. Guimarães, and A. Britto, “Towards a signwriting recognition system,” in 2015 13th International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2015, pp. 26–30.
[14] A. S. Dhanjal and W. Singh, “An automatic conversion of punjabi text to indian sign language,” EAI Endorsed Transactions on Scalable Information Systems, vol. 7, no. 28, p. e9, 2020.
[15] S. H. Kang and S. H. Park, “Toward korean text-to-sign language translation system (test).”
[16] S. Koceski and N. Koceska, “Development and evaluation of a 3d virtual tutor for macedonian sign language,” in International Conference on Information Technology and Development of Education – ITRO 2015, Zrenjanin, Serbia, 2015.
[17] B. Joksimoski, I. Chorbev, K. Zdravkova, and D. Mihajlov, “Toward 3d avatar visualization of macedonian sign language,” in International Conference on ICT Innovations. Springer, 2015, pp. 195–203.
[18] J. W. Ratcliff and D. E. Metzener, “Pattern matching: the Gestalt approach,” Dr Dobbs Journal, vol. 13, no. 7, p. 46, 1988.
