Papers by Dhamyaa Nasrawi

Gender classification refers to the process of categorizing individuals into one of two gender categories, male or female, typically based on observable characteristics or information. This classification can be done through various methods, including biological and social ones. In recent years, gender classification has become a topic of increasing interest and debate due to evolving societal understandings of gender. The current survey studies the connection between language use and gender in order to categorize gender automatically from text and linguistic style. It provides an in-depth analysis of gender classification based on linguistic patterns in written text, explores the relationship between linguistic patterns and gender classification, and highlights the various approaches, challenges, and future directions in this field. It also covers various datasets that classify people by gender, including official papers, emails, and social media messages. This survey divides the selected studies into three parts: handwriting, names, and text; the main focus is text based on linguistic analysis. The findings show that the most used dataset is Twitter. Many studies use English or Arabic, while others use languages such as Portuguese, Chinese, Spanish, Russian, Brazilian Portuguese, and German. The feature most frequently used in these studies is the Bag of Words (BOW). The methodology used in most studies is machine learning; few use deep learning. Finally, the most important evaluation metrics are accuracy and F1-score.
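The Bag of Words (BOW) feature mentioned above can be sketched in a few lines of pure Python; the vocabulary and sentence below are invented illustrations, not data from any surveyed study:

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Represent `text` as a vector of word counts over `vocabulary`."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

vocab = ["love", "football", "makeup", "engine"]
vec = bag_of_words("I love football and love makeup", vocab)
print(vec)  # -> [2, 1, 1, 0]
```

In the surveyed studies, vectors like this one are fed to a machine learning classifier trained on gender-labeled texts.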

In machine learning, the classification task is about building a model to predict the class of elements based on their attributes and a set of examples. This work aims to classify people based on their names. Two models were developed: the former is based on a single feature, the name itself, whereas the latter is built upon nine features derived from the name, namely: is_longname, is_vowelend, is_vowelbegin, 2_gramend, 2_grambegin, 1_gramend, 1_grambegin, is_contain_abo, and is_contain_abed. Furthermore, two datasets were utilized: the first was collected from the Ministry of Labor and Social Affairs, while the second was gathered from the Iraqi university website. Both datasets contain many uncommon Iraqi names, as well as spelling errors, which represent a real challenge in the classification process. Five machine learning methods were applied and tested within the developed models: Random Forest, Naive Bayes, Logistic Regression, Multilayer Perceptron, and Extreme Gradient Boost. The experimental results demonstrate an increase in accuracy when applying the models to the original dataset, which includes names with their frequencies: the Multilayer Perceptron achieved 97% accuracy in the one-feature model, while Extreme Gradient Boost achieved 97% accuracy in the multi-feature model. In contrast, the results do not exceed 79% when the models are applied to the unique dataset (names without their frequencies).
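The nine name-derived features can be sketched as follows. The length threshold, the Latin transliteration, and the exact substring tests are illustrative assumptions; the paper works on Arabic-script names and does not publish these exact definitions:

```python
VOWELS = set("aeiou")  # assumption: Latin-transliterated names

def name_features(name, long_threshold=6):
    """Derive features in the spirit of the paper's nine name features.
    The threshold and substring definitions are illustrative assumptions."""
    n = name.lower()
    return {
        "is_longname":     int(len(n) >= long_threshold),
        "is_vowelbegin":   int(n[0] in VOWELS),
        "is_vowelend":     int(n[-1] in VOWELS),
        "1_grambegin":     n[0],       # first letter
        "1_gramend":       n[-1],      # last letter
        "2_grambegin":     n[:2],      # first two letters
        "2_gramend":       n[-2:],     # last two letters
        "is_contain_abo":  int("abo" in n),   # e.g. an "Abo..." prefix
        "is_contain_abed": int("abed" in n),  # e.g. an "Abed..." prefix
    }

print(name_features("Abeer"))
```

A feature dictionary like this can be turned into a numeric vector and passed to any of the five classifiers listed above.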


Businesses can target various customer segments with customized marketing efforts and product offers, improving customer satisfaction and engagement, by applying gender classification to text. This study's goal is to use text to classify people into male and female groups based on gender. The study employs frequency and similarity methods to evaluate the extraction of gender-specific features from English texts, using a one-topic TripAdvisor dataset and a multitopic Twitter dataset. The features are organized into an array and used with machine learning classifiers such as Support Vector Machine, Random Forest, and Logistic Regression. The investigation examines developments in language use that are unique to each gender. Comparing the results of the study to those of previous research, the Random Forest classifier yielded the highest accuracy of 87.7% on the multitopic Twitter dataset, whereas Logistic Regression yielded the best accuracy of 74.5% on the one-topic TripAdvisor dataset. The findings indicate that the highest accuracy was achieved with the multitopic dataset, which contains a diverse range of words, compared to the one-topic dataset.
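The frequency method described above can be sketched as ranking words by how differently the two groups use them; this simplified scoring and the toy corpora are assumptions, not the paper's exact procedure:

```python
from collections import Counter

def gender_indicative_words(male_texts, female_texts, top_k=2):
    """Frequency method (simplified): rank words by the absolute difference
    in relative frequency between male- and female-authored texts."""
    male = Counter(w for t in male_texts for w in t.lower().split())
    female = Counter(w for t in female_texts for w in t.lower().split())
    m_total, f_total = sum(male.values()), sum(female.values())
    vocab = set(male) | set(female)
    score = {w: abs(male[w] / m_total - female[w] / f_total) for w in vocab}
    return sorted(vocab, key=score.get, reverse=True)[:top_k]

top = gender_indicative_words(["car car football"], ["makeup makeup dress"])
```

The selected words then serve as the feature array given to the SVM, Random Forest, and Logistic Regression classifiers.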

Journal of Applied Data Sciences, 2024
Text steganography is crucial in information security due to the limited redundancy in text. The features of the Arabic language offer a new method for data concealment. In this paper, the researchers propose a new coverless text information hiding method based on built-in features of Arabic script. The first word of each row in the dataset is tested against eight features, yielding one byte of 1s and 0s that records the presence or absence of the following features: mahmoze (hamza), diacritics, isolated letters, two sharp edges, vowels, dotted letters, looping letters, and high-frequency letters. Each byte is then converted to a decimal number (ASCII code) to implement a dynamic mapping protocol with the most frequent letter. In the hiding process, each character of the secret message is converted to its ASCII code and matched in the dataset; after matching, the candidate text is sent to the receiver. On the receiver's side, the pre-agreed dynamic mapping protocol is applied to extract the secret message. Three Arabic datasets are used in this paper: SANAD (Single-Label Arabic News Articles Dataset), with 45,500 articles; the Arabic Poem Comprehensive Dataset (APCD), with 1,831,770 poetic verses in total; and the Arabic Poetry Dataset, with more than 58,000 poems. The suggested approach withstands existing detection methods because no modification or generation takes place. Moreover, there is an enhancement in hiding capacity, which can conceal one character per word. Hence, all messages are embedded successfully using dynamic mapping.
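The feature-byte idea can be sketched as below: each word yields one bit per feature (presence = 1, absence = 0), and the resulting byte's decimal value is what the mapping protocol matches against the secret message. The Arabic character sets here are simplified assumptions, not the paper's exact definitions:

```python
# Simplified Arabic letter classes (assumptions for illustration only).
HAMZA       = set("ءأؤإئ")                  # mahmoze: carries a hamza
DIACRITICS  = set("\u064B\u064C\u064D\u064E\u064F\u0650\u0651\u0652")
ISOLATED    = set("اأإآدذرزو")               # letters that do not join leftward
SHARP       = set("حجخعغ")                   # letters with sharp edges (assumed)
LONG_VOWELS = set("اوي")
DOTTED      = set("بتثجخذزشضظغفقنة")
LOOPING     = set("فقوةصضطظعغهم")            # letters with loops (assumed)
FREQUENT    = set("الم")                     # high-frequency letters (assumed)

def feature_byte(word):
    """One bit per feature, in a fixed order, giving a byte in 0..255."""
    features = [HAMZA, DIACRITICS, ISOLATED, SHARP,
                LONG_VOWELS, DOTTED, LOOPING, FREQUENT]
    bits = "".join("1" if set(word) & f else "0" for f in features)
    return int(bits, 2)  # decimal value used in the mapping protocol
```

A word whose feature byte equals the ASCII code of a secret character makes the containing text a candidate cover, so nothing in the cover is ever modified.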

Journal of University of Kerbala, 2024
In the digital era, protecting confidential information from unauthorized access is crucial. Information can be represented through a variety of communicative media, including text, video, audio, and images, with text being the most common. Traditional methods require carriers to disguise secret information, leading to carrier modifications that are difficult to protect against steganalysis. In contrast, coverless information hiding does not modify the carrier; it transfers the secret message directly via the stego cover's built-in attributes. A key milestone in the development of coverless information hiding technology is the method based on the Chinese mathematical expression. The primary goal of this review is to examine the most recent findings in the field of coverless hiding: development approaches, selected datasets, performance metrics, and issues in hiding applications in English, Chinese, and other languages. The findings demonstrate that researchers have considered keyword- and tag-based approaches, while the Markov-model-based approach is mainly used with the English language. Additionally, the study reveals that hiding capacity, success rate, and security analysis are the most common metrics used to evaluate coverless information hiding performance. Finally, unresolved issues and potential future directions are addressed: improving hiding capacity and algorithm efficiency, embedding and extracting information correctly, and extending these techniques to other languages.

IAES International Journal of Artificial Intelligence (IJ-AI)
Technological development is a revolutionary process that by now depends mainly on electronic applications in our daily routines (business management, banking, financial transfers, health, and other essential areas of life). Identification, or approving identity, is one of the complicated issues in online electronic applications. A person's writing style can be employed as an identifying biometric characteristic in order to recognize their identity. This paper presents a new way of identifying a person in a social media group from comments, based on a Deep Neural Network. The text samples are short comments collected from a Telegram group in Arabic (Iraqi dialect). The proposed model extracts the person's writing-style features from group comments based on a pre-saved dataset; the analysis of this information and these features forms the identification decision. The model exhibits a range of favorable results: the accuracy of the proposed system reaches 92.88% (+/-0.16%).

2022 International Conference on Data Science and Intelligent Computing (ICDSIC), Nov 1, 2022
An essential step in a conversational agent is intent classification of user-generated text input. The purpose of building an intent classifier for a chatbot is to understand the intention of user queries so as to respond quickly and accurately. Robust chatbots necessitate more utterances for an improved training model; nevertheless, acquiring and annotating data is time-consuming and expensive. This paper investigates machine learning techniques and data augmentation for intent classification. Experiments were conducted on Amazon office-product question answering using Random Forest, Multinomial Naïve Bayes, Logistic Regression, and Support Vector Machine (SVM). Contextual word embedding with BERT was used for generating new synonym utterances. The main experiments compare the performance of these methods after augmenting new data. In general, SVM and Random Forest yield comparable results, followed by Logistic Regression, while the F1 score of Multinomial Naïve Bayes is the lowest. Additionally, we discovered that augmenting new utterances had only a slight effect on the performance of the models.
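The augmentation step can be sketched as below. A dictionary-based synonym swap stands in for the paper's BERT contextual-embedding augmentation, and the lexicon and utterance are invented examples:

```python
import random

# Toy synonym lexicon; in the paper, candidates come from BERT's
# contextual word embeddings rather than a hand-written table.
SYNONYMS = {"price": ["cost"], "printer": ["device"]}

def augment(utterance, rng):
    """Create one new utterance by swapping the first known word
    for one of its synonyms."""
    words = utterance.split()
    for i, w in enumerate(words):
        if w in SYNONYMS:
            words[i] = rng.choice(SYNONYMS[w])
            break
    return " ".join(words)

print(augment("what is the printer price", random.Random(0)))
# -> what is the device price
```

Each augmented utterance keeps the original intent label, enlarging the training set before the classifiers are fit.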

2022 13th International Conference on Computing Communication and Networking Technologies (ICCCNT)
Recently, chatbot technology has been employed in several customer service and support roles. These conversational agents have become valuable assets to companies and organizations because of their ability to communicate with humans and process large amounts of data. Furthermore, since most customers use online shopping to fulfill their requirements instead of traditional methods, they expect full 24-hour service. This survey provides an overview of different studies that used chatbots as a tool to provide customer service and analyzes the techniques applied. The studies are divided into two parts: the algorithms and techniques used in creating a chatbot, and the influence of chatbot use on marketing. The research shows that most studies used deep learning techniques for building chatbots. In addition, chatbots provide services that enable customers to communicate better with the brand, positively affecting long-term relationships.



Design Engineering, Sep 29, 2021
journal of kerbala university, 2013
A graph is a collection of vertices (nodes), pairs of which are joined by lines (edges); graphs can be used not only to represent physical relationships but also logical, biological, and arithmetic relationships. The attributes that define a good graph drawing are called aesthetics. The difficulty of good graph drawing is that some aesthetics conflict with one another. In this paper, an Evolutionary Algorithm is used with a fuzzy fitness function to reduce this conflict and draw a good graph that conveys the most meaning. Two types of crossover and two types of mutation are used; a chromosome represents a graph with N nodes, where N is the chromosome length and each node is a gene in the chromosome. Good results can be obtained when a fuzzy set is used to compute the fitness function.
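One way the fuzzy fitness function could combine conflicting aesthetics is sketched below: each aesthetic gets a membership degree in [0, 1], and the drawing's fitness is their fuzzy AND (minimum). The linear membership function and the min aggregation are assumptions; the paper does not spell out its exact formulation:

```python
def membership(value, worst, best):
    """Linear fuzzy membership degree: 1.0 at `best`, 0.0 at `worst`."""
    if worst == best:
        return 1.0
    t = (value - worst) / (best - worst)
    return max(0.0, min(1.0, t))

def fuzzy_fitness(aesthetics):
    """Fuzzy AND (minimum) over per-aesthetic degrees: a drawing is only
    as good as its worst aesthetic, which softens conflicts between them."""
    return min(membership(v, w, b) for v, w, b in aesthetics)

# e.g. 2 edge crossings where 10 is worst and 0 is best, and an
# edge-length deviation of 0.25 where 1.0 is worst and 0.0 is best:
print(fuzzy_fitness([(2, 10, 0), (0.25, 1.0, 0.0)]))  # -> 0.75
```

An evolutionary algorithm would then select, cross over, and mutate layouts so as to maximize this fitness.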
Journal of Engineering and Applied Sciences, 2019

Network security and secure communication through public and private channels are increasingly important issues, especially as computer usage grows in both social and business areas. Data hiding is one approach to obtaining a secure communication medium and protecting information during transmission. Text steganography is the most challenging kind because text documents contain very little redundant information compared to images and audio. In this paper, a novel method is proposed for data hiding in English script using the Unicode of English letters as they appear in other languages. In this method, 13 characters of the English alphabet that have look-alikes in other languages were chosen for the hiding process. Two bits are embedded at a time, using the ASCII code to embed 00 and the Unicode of the multilingual look-alikes to embed 01, 10, and 11. This method has a high hiding capacity, depending on the occurrence of the specific characters in each document, as well as very good perceptual transparency and no visible changes to the original text.
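The two-bits-per-character idea can be sketched as below: for a letter with look-alike code points in other scripts, the bit pair selects which code point to emit. The homoglyph table (Greek, Cyrillic, and Cherokee look-alikes of Latin "A") is an illustrative assumption, not the paper's 13-character table:

```python
# Bit pair -> code point, all rendering (near-)identically to "A".
HOMOGLYPHS = {"A": ["A",        # U+0041 ASCII            -> bits 00
                    "\u0391",   # Greek capital Alpha     -> bits 01
                    "\u0410",   # Cyrillic capital A      -> bits 10
                    "\u13AA"]}  # Cherokee letter Go      -> bits 11

def embed_bits(cover, bits):
    """Replace eligible letters with homoglyphs, two secret bits each;
    letters run out of bits (or are ineligible) pass through unchanged."""
    out, i = [], 0
    for ch in cover:
        if ch in HOMOGLYPHS and i + 2 <= len(bits):
            out.append(HOMOGLYPHS[ch][int(bits[i:i + 2], 2)])
            i += 2
        else:
            out.append(ch)
    return "".join(out)

stego = embed_bits("A CAT SAT", "0110")  # looks like the cover, carries 4 bits
```

The receiver recovers the bits by checking which code point each eligible letter actually uses, so the rendered text never visibly changes.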

A writing system is a method of using a set of distinct marks that are related, by convention, to a specific structural level of language. Cuneiform is one of the earliest and most influential writing systems in the world. The aim of this paper is to extend the Character Map utility included with Microsoft Windows operating systems to view the characters of a font designed for the ancient Cuneiform writing system using the Unicode standard. The proposed method focuses on relocating the block of Cuneiform symbols in Unicode, which is located in Plane 1, to a block in Plane 0; in other words, it changes the glyphs of symbols in Plane 0 to the desired Cuneiform glyphs. The significance of this work lies in preserving the history and legacy of ancient writing systems and making them accessible in a natural manner. It also helps those who are interested in ancient languages.
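The plane-remapping idea can be sketched as a simple code-point translation; the choice of the Private Use Area (U+E000) as the Plane 0 target is an assumption for illustration, with the designed font drawing the cuneiform glyph at the remapped code point:

```python
# Cuneiform starts at U+12000 in Plane 1; the Basic Multilingual
# Plane's Private Use Area starts at U+E000 (assumed target block).
CUNEIFORM_START, PUA_START = 0x12000, 0xE000

def to_plane0(ch):
    """Map a Plane 1 cuneiform code point to a Plane 0 slot."""
    return chr(ord(ch) - CUNEIFORM_START + PUA_START)

sign = "\U00012000"            # CUNEIFORM SIGN A (Plane 1)
mapped = to_plane0(sign)
print(hex(ord(mapped)))        # -> 0xe000, a single 16-bit code unit
```

Once in Plane 0, the symbol fits in one 16-bit code unit, which is what lets older tools such as Character Map display it.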

Since computer utilization is expanding in both social and commercial areas, secure communication through channels has become a very critical issue. Information hiding is a strategy for obtaining a secure communication medium and securing data during transmission. Text documents have very little redundant information compared to images and audio; therefore, text steganography is the most challenging kind. This paper aims to improve "text steganography based on the Unicode of characters in multiple languages" by designing a new font with special properties for the purpose of hiding data. The method is based on giving the same glyph to multiple codes; the Set of High-Frequency Letters (SHFL) in the English language was chosen for the embedding process. The hiding method replaces the code of an English symbol with another code that has exactly the same glyph. Two bits are hidden at a time, using glyph1 to hide 00 and glyph2, glyph3, and glyph4 to hide 01, 10, and 11. The improvement increases steganographic capacity and transparency and improves the security and robustness of the stego text file.
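Extraction on the receiver's side can be sketched as below: each occurrence of a high-frequency letter maps back to the bit pair its code point encodes. The table is an illustrative assumption, using Cyrillic and two Private Use Area code points (U+E000, U+E001) that the custom font described in the paper would map to the same "e" glyph:

```python
# Four code points, one shared glyph in the designed font -> 2 bits each.
CODES = {"e": "00",           # U+0065 ASCII
         "\u0435": "01",      # U+0435 Cyrillic small IE (same glyph)
         "\ue000": "10",      # Private Use Area, glyph3 (assumed)
         "\ue001": "11"}      # Private Use Area, glyph4 (assumed)

def extract_bits(stego):
    """Recover the secret bit stream from code-point choices."""
    return "".join(CODES[ch] for ch in stego if ch in CODES)

print(extract_bits("h\u0435llo th\ue000re"))  # -> 011000
```

Because all four code points render identically under the custom font, the stego file is visually indistinguishable from the cover.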

With the rapid development of the Internet, safe covert communication in the network environment has become an essential research direction. Steganography is a significant means by which secret information is embedded into cover data imperceptibly for transmission, so that the information cannot easily be noticed by others. Text is low in redundancy and bound by natural-language rules, which limits the manipulation of text; it is therefore a great challenge both to conceal a message in text properly and to detect such concealment. This study proposes a novel text steganography method that takes font types into account. The new method depends on the similarity of English font types; we call it the SEFT (Similarity of English Font Types) technique. It works by replacing a font with more similar fonts: the secret message is encoded and embedded as similar fonts applied to the capital letters of the cover document. The proposed text steganography method can work on cover documents of different font types. The size of cover and stego d...
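The SEFT idea can be sketched by modelling a document as (character, font) pairs, with each capital letter carrying secret bits through the visually similar font assigned to it. The two-font table and the one-bit-per-capital rate are illustrative assumptions:

```python
# Two visually similar fonts stand in for the paper's font set.
SIMILAR_FONTS = ["Arial", "Helvetica"]  # bit 0 -> Arial, bit 1 -> Helvetica

def embed(cover, bits):
    """Return (char, font) pairs; each capital letter consumes one
    secret bit, everything else keeps the default font."""
    out, i = [], 0
    for ch in cover:
        if ch.isupper() and i < len(bits):
            out.append((ch, SIMILAR_FONTS[int(bits[i])]))
            i += 1
        else:
            out.append((ch, SIMILAR_FONTS[0]))
    return out

doc = embed("The Cat Sat", "101")  # capitals T, C, S carry bits 1, 0, 1
```

Since the fonts look nearly identical, the formatted stego document reads the same as the cover while the font metadata carries the message.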

A writing system is a set of visible marks used to represent units of language in a systematic way. Egyptian hieroglyphs were a formal writing system used by the ancient Egyptians that combined logographic and alphabetic elements. In most text editor programs, there is trouble writing documents that include Egyptian hieroglyph symbols, which take more than two bytes each, because there is no way to embed these symbols in a particular document. In this paper, a special text editor is designed for the ancient Egyptian hieroglyph writing system. It supports the Egyptian hieroglyph symbols using the Unicode standard, together with the basic operations of a classical text editor: file manipulation (load and save); selection of font size, type, and color; copy, paste, cut, and select all; and printing the final document. Besides this, hieroglyphic numbers are represented in this editor. More facilities, flexibility, simplicity and benefits can be...
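The "more than two bytes" difficulty comes from hieroglyphs living in Unicode Plane 1 (U+13000 onward), beyond the 16-bit Basic Multilingual Plane; a quick illustration:

```python
glyph = "\U00013000"                    # EGYPTIAN HIEROGLYPH A001
print(hex(ord(glyph)))                  # -> 0x13000, beyond the 16-bit BMP
print(len(glyph.encode("utf-16-le")))   # -> 4 bytes: a surrogate pair
print(len(glyph.encode("utf-8")))       # -> 4 bytes in UTF-8 as well
```

Editors that assume one 16-bit code unit per character mishandle such symbols, which is the gap the proposed editor fills.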