Papers by Nora AlTurayeif

The exponential growth of user-generated content on social media platforms, online news outlets, and digital communication has necessitated the development of automated tools for analyzing opinions and attitudes expressed in text. Stance detection, a critical task in Natural Language Processing (NLP), aims to identify the underlying perspective or viewpoint of an individual or group towards a specific topic or target. This paper explores the challenges of stance detection, particularly in the context of social media, where brevity, informality, and limited contextual information prevail. While sentiment analysis focuses on explicit sentiment polarity, stance detection classifies the stance or viewpoint of a text towards a target, often of an abstract nature. This study introduces two multi-task learning (MTL) models that integrate sentiment analysis and sarcasm detection tasks to enhance stance detection performance. Four task weighting techniques are proposed and evaluated, and the...

Neural Computing and Applications, Jan 28, 2023
Stance detection is an evolving opinion mining research area motivated by the vast increase in the variety and volume of user-generated content. In this regard, considerable research has recently been carried out in the area of stance detection. In this study, we review the different techniques proposed in the literature for stance detection as well as other applications such as rumor veracity detection. In particular, we conducted a systematic literature review of empirical research on machine learning (ML) models for stance detection published from January 2015 to October 2022. We analyzed 96 primary studies, which spanned eight categories of ML techniques. In this paper, we categorize the analyzed studies according to a taxonomy of six dimensions: approaches, target dependency, applications, modeling, language, and resources. We further classify and analyze the corresponding techniques from each dimension's perspective and highlight their strengths and weaknesses. The analysis reveals that deep learning models that adopt a self-attention mechanism have been used more frequently than the other approaches. It is worth noting that emerging ML techniques such as few-shot learning and multi-task learning have been used extensively for stance detection. A major conclusion of our analysis is that although ML models have shown promise in this field, their application in the real world is still limited. Our analysis lists challenges and gaps to be addressed in future research. Furthermore, the taxonomy presented can assist researchers in developing and positioning new techniques for stance detection-related applications.
Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)

Applied Soft Computing, 2015
Machine learning techniques are now widely used in the healthcare domain for the diagnosis, prognosis, and treatment of diseases. These techniques have applications in the field of hematopoietic cell transplantation (HCT), which is a potentially curative therapy for hematological malignancies. Herein, a systematic review of the application of machine learning (ML) techniques in the HCT setting was conducted. We examined the type of data streams included, the specific ML techniques used, and the type of clinical outcomes measured. A systematic review of English articles using the PubMed, Scopus, Web of Science, and IEEE Xplore databases was performed. Search terms included "hematopoietic cell transplantation (HCT)," "autologous HCT," "allogeneic HCT," "machine learning," and "artificial intelligence." Only full-text studies reported between January 2015 and July 2020 were included. Data were extracted by two authors using predefined data fields. Following PRISMA guidelines, a total of 242 studies were identified, of which 27 studies met the inclusion criteria. These studies were subcategorized into three broad topics, and the types of ML techniques used included ensemble learning (63%), regression (44%), Bayesian learning (30%), and support vector machine (30%). The majority of studies examined models to predict HCT outcomes (e.g., survival, relapse, graft-versus-host disease). Clinical and genetic data were the most commonly used predictors in the modeling process. Overall, this review provides a systematic overview of ML techniques applied in the context of HCT. The evidence is not sufficiently robust to determine the optimal ML technique to use in the HCT setting or what minimal data variables are required.
This SLR aims to summarize and clarify the empirical evidence on novices' interaction with compiler error messages.

2018 21st Saudi Computer Society National Computer Conference (NCC), 2018
Diabetes Mellitus (DM) is one of the most prevalent chronic diseases in the world, with around 150 million patients. Patients with chronic diseases are highly susceptible to deterioration in their physical and mental health, which hinders their independence, restricts their daily activities, and imposes a large financial burden on them and the government. If not discovered early, chronic diseases may lead to serious health complications or, in extreme cases, death. Diagnostic solutions have been explored using intelligent methods; however, different ethnic groups have different factors leading to the development of a disease. Therefore, the proposed system aims to preemptively diagnose DM in a region never explored before. Data are retrieved from King Fahd University Hospital (KFUH) in Khobar, Saudi Arabia. The data undergo preprocessing to identify relevant features and prepare for the identification/classification process. Experimental results show that ANN outperformed SVM, Naïve Bayes, and K-Nearest Neighbor with a testing accuracy of 77.5%.
2018 International Conference on Innovations in Information Technology (IIT), 2018
Chronic Kidney Disease (CKD) is a major public health concern with rising prevalence. In Saudi Arabia, approximately 2 billion riyals are allocated solely for renal replacement therapy, which is required for patients with advanced stages of CKD. Therefore, this study aims to decrease the number of patients and the expenses needed for treatment by preemptively and accurately diagnosing chronic kidney disease using data mining and machine learning techniques. Data have been collected from a region that has never been explored before in the literature. This study uses Saudi data retrieved from King Fahd University Hospital (KFUH) in Khobar to carry out the experiment. Experimental results show that ANN, SVM, and Naïve Bayes achieved a testing accuracy of 98.0%, while k-NN achieved an accuracy of 93.9%.

International Journal of Advanced Computer Science and Applications, 2020
Visual programming languages make programming more accessible for novices, which opens more opportunities to innovate and develop problem-solving skills. In addition, deep learning is one of the trending computer science fields that has a profound impact on our daily life, and it is important that young people are aware of how our world works. In this study, we partially attribute the difficulties novices face in building deep learning models to the programming language used. This paper presents DeepScratch, a new programming language extension to Scratch that provides powerful language elements to facilitate building and learning about deep learning models. We present the implementation process of DeepScratch and explain the syntactical and lexical definitions of the extended vocabulary. DeepScratch provides two options to implement deep learning models: training a neural network based on built-in datasets and using pre-trained deep learning models. The two options are provided to serve different age groups and educational levels. The preliminary evaluation shows the usability and the effectiveness of this extension as a tool for kids to learn about deep learning.

Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Sentiment analysis in the finance domain is widely applied by investors and researchers, but most of the work is conducted for English text. In this work, we present a framework to analyze and visualize the sentiments of Arabic tweets related to the Saudi stock market using machine learning methods. For the purpose of training and prediction, the Twitter API was used for collecting offline data, and Apache Kafka was used for streaming tweets in real time. Experiments were conducted using five machine learning classifiers with different feature extraction methods, including word embedding (word2vec) and the traditional BoW methods. The highest accuracy for the sentiment classification of Arabic tweets was 79.08%. This result was achieved with the SVM classifier combined with the TF-IDF feature extraction method. Finally, the sentiments predicted by the best-performing classifier were visualized using several techniques. We developed a website to visualize the offline and streaming tweets in various ways: by sentiments, by stock sectors, and by frequent terms.

Applied Sciences
The outbreak of coronavirus disease (COVID-19) has affected almost all of the countries of the world, and has had significant social and psychological effects on the population. Social media platforms are now being used for emotional self-expression towards current events, including the COVID-19 pandemic. The study of people’s emotions in social media is vital to understand the effect of this pandemic on mental health, in order to protect societies. This work aims to investigate to what extent deep learning models can assist in understanding society’s attitude in social media toward the COVID-19 pandemic. We employ two transformer-based models for fine-grained sentiment detection of Arabic tweets, considering that more than one emotion can co-exist in the same tweet. We also show how the textual representation of emojis can boost the performance of sentiment analysis. In addition, we propose a dynamically weighted loss function (DWLF) to handle the issue of imbalanced datasets. Th...
