2018, International Journal of Advanced Computer Science and Applications
The Internet is the most significant source of opinions, product surveys, and reviews of any type of service or activity. A large volume of reviews about online products and items is produced every day. For example, many individuals share their remarks, reviews, and feelings in their own language on social media networks such as Twitter. Given their colossal quantity and size, it is exceedingly difficult to examine and interpret these reviews. Sentiment Analysis (SA) aims at extracting people's opinions, feelings, and thoughts from their reviews on social websites. SA has recently gained significant attention; however, the vast majority of the resources and frameworks constructed so far are tailored to English and English-like Western languages. The need to design frameworks for other languages is growing, particularly as blogging and micro-blogging sites become popular. This paper presents a comprehensive review of approaches to Urdu sentiment analysis and outlines relevant gaps in the literature.
PeerJ Computer Science
Sentiment analysis in research involves the processing and analysis of sentiments from textual data. Sentiment analysis for high-resource languages such as English and French has been carried out effectively in the past. However, its applications are comparatively few for resource-poor languages due to a lack of textual resources. This systematic literature review explores different aspects of Urdu-based sentiment analysis, a classic case of a resource-poor language. Urdu is a South Asian language understood by one hundred and sixty-nine million people across the planet. There are various shortcomings in the literature, including the limited availability of large corpora and language parsers and the lack of pre-trained machine learning models, which result in poor performance. This article has analyzed and evaluated studies addressing machine learning-based Urdu sentiment analysis. After searching and filtering, forty articles have been inspected. Research objectives have been proposed that lead to rese...
Computers, 2021
Research efforts in the field of sentiment analysis have increased exponentially in the last few years due to its applicability in areas such as online product purchasing, marketing, and reputation management. Social media and online shopping sites have become a rich source of user-generated data. Manufacturing, sales, and marketing organizations are progressively turning their eyes to this source to get worldwide feedback on their activities and products. Millions of sentences in Urdu and Roman Urdu are posted daily on social sites such as Facebook, Instagram, Snapchat, and Twitter. Disregarding people’s opinions in Urdu and Roman Urdu and considering only the resource-rich English language leads to the loss of this vast amount of data. Our research focused on collecting research papers related to the Urdu and Roman Urdu languages and analyzing them in terms of preprocessing, feature extraction, and classification techniques. This paper contains a comprehensive study of research cond...
2020
In the present century, data volume is increasing enormously. The data may take the form of images, text, voice, and video. One factor in this huge growth is the use of social media, where people post data daily while chatting, exchanging information, and uploading their personal and official credentials. Sentiment research seeks to uncover abstract knowledge in published texts in which users communicate their emotions and thoughts about shared content, including blogs, news, and social networks. Roman Urdu is one of the most dominant languages on social networks in Pakistan and India. Roman Urdu is among the varieties of Urdu, the world's third-largest language, yet not enough work has been done on it. In this article we review the prior concepts and strategies used to examine the sentiment of Roman Urdu text and report their results.
Expert Systems, 2019
Sentiment analysis (SA) applications are becoming popular among individuals and organizations for gathering and analysing users' sentiments about products, services, policies, and current affairs. Owing to the availability of a wide range of English lexical resources, such as part-of-speech taggers, parsers, and polarity lexicons, the development of sophisticated SA applications for the English language has attracted many researchers. Although there have been efforts to create polarity lexicons in non-English languages such as Urdu, they suffer from many deficiencies, such as the lack of publicly available sentiment lexicons with a proper scoring mechanism for opinion words and modifiers. In this work, we present a word-level translation scheme for creating the first comprehensive Urdu polarity resource, "Urdu Lexicon", using a merger of existing resources: a list of English opinion words, SentiWordNet, an English-Urdu bilingual dictionary, and a collection of Urdu modifiers. We assign two polarity scores, positive and negative, to each Urdu opinion word. Moreover, modifiers are collected, classified, and tagged with proper polarity scores. We also perform an extrinsic evaluation in terms of subjectivity detection and sentiment classification, and the evaluation results show that the polarity scores assigned by this technique are more accurate than the baseline methods.
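The word-level translation idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy English polarity scores, the tiny bilingual dictionary (written in Roman transliteration rather than Urdu script), the function name `build_urdu_lexicon`, and the choice to average scores when several English words map to the same Urdu word. It is not the authors' actual scheme or data.

```python
from collections import defaultdict

# Hypothetical toy inputs. The real resources named in the abstract
# (English opinion word list, SentiWordNet, bilingual dictionary) are not reproduced here.
ENGLISH_POLARITY = {
    # word: (positive score, negative score)
    "good": (0.75, 0.0),
    "nice": (0.5, 0.0),
    "beautiful": (0.8, 0.0),
    "bad": (0.0, 0.7),
}

# Hypothetical English-to-Urdu dictionary, transliterated for readability.
# A real lexicon would store Urdu-script entries.
EN_TO_UR = {
    "good": ["acha"],
    "nice": ["acha"],
    "beautiful": ["khoobsurat"],
    "bad": ["bura"],
}


def build_urdu_lexicon(polarity, bilingual):
    """Project (positive, negative) scores from English words onto their translations.

    When several English words translate to the same Urdu word, the scores
    are averaged. Averaging is an assumption of this sketch.
    """
    collected = defaultdict(list)
    for en_word, scores in polarity.items():
        for ur_word in bilingual.get(en_word, []):
            collected[ur_word].append(scores)

    lexicon = {}
    for ur_word, scores in collected.items():
        pos = sum(p for p, _ in scores) / len(scores)
        neg = sum(n for _, n in scores) / len(scores)
        lexicon[ur_word] = (round(pos, 3), round(neg, 3))
    return lexicon


lexicon = build_urdu_lexicon(ENGLISH_POLARITY, EN_TO_UR)
for word, (pos, neg) in sorted(lexicon.items()):
    print(f"{word}: positive={pos}, negative={neg}")
```

Keeping separate positive and negative scores per word, rather than a single signed value, mirrors the two-score design the abstract mentions.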
2014
Recent developments in web and mobile technologies, together with the increase in user-generated Hindi content on the internet, motivate sentiment analysis research, which is growing at lightning speed. This information can be very useful for researchers, governments, and organizations to learn what is on the public's mind and to make sound decisions. Opinion Mining, or Sentiment Analysis, is a natural language processing task that mines information from various text forms such as reviews, news, and blogs and classifies them by polarity as positive, negative, or neutral. In the last few years, Hindi content on the Web has increased enormously. Research in opinion mining has mostly been carried out in English, but it is important to perform opinion mining in Hindi as well, since a large amount of information in Hindi is available on the Web. This paper gives an overview of the work that has been done in the Hindi language.
Multimedia Tools and Applications, 2023
Sentiment analysis involves extracting sentiments from various forms of text, including customer reviews, tweets, blogs, and news clips expressing opinions on diverse subjects, even populist events. The advent of tools supporting regional languages has resulted in a substantial surge of regional language texts. As Hindi ranks fourth in terms of native speakers, the development of sentiment analysis mechanisms for Hindi text becomes crucial. This paper provides a comprehensive review of specific approaches used in Hindi sentiment analysis, encompassing negation handling and the evolution of SentiWordNet for the Hindi Language. Moreover, it offers an overview of available Hindi lexicons and insights into diverse stemmers and morphological analyzers designed for the language. Additionally, the paper conducts an in-depth literature review of various sentiment analysis tasks carried out in Hindi, thereby opening avenues for future research in sentiment analysis and opinion mining in the Hindi language.
Sentiment Analysis (SA) concerns the automatic extraction and classification of sentiments conveyed in a given text, i.e. labelling a text instance as positive, negative or neutral. SA research has attracted increasing interest in the past few years due to its numerous real-world applications. The recent interest in SA is also fuelled by the growing popularity of social media platforms (e.g. Twitter), as they provide large amounts of freely available and highly subjective content that can be readily crawled.
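As a rough illustration of labelling a text instance as positive, negative, or neutral, the following minimal sketch counts matches against two small word lists. The word lists, the function name `label_sentiment`, and the simple counting rule are assumptions for illustration only; real systems use far larger lexicons or trained classifiers.

```python
import re

# Hypothetical miniature polarity word lists.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def label_sentiment(text):
    """Return 'positive', 'negative', or 'neutral' from simple word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for example in [
    "I love this great phone",
    "Bad battery, poor screen",
    "The phone arrived on Monday",
]:
    print(example, "->", label_sentiment(example))
```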
11th IEEE International Conference on Semantic Computing, 2017
It is human instinct to express emotions, and with the increasing use of social media, they are expressed through text messages more often than ever before. The emotions and sentiments encoded in these short text messages are of keen interest to various marketing and advertising agencies. Thus, various lexicons and algorithms have been devised for English and French to extract these hidden sentiments. On the other hand, Urdu (or Hindi), the third most widely spoken language in the world [1], lacks any such sentiment lexicons or algorithms. Instead of starting from scratch, we make use of the existing English sentiment lexicons to develop the first sentiment lexicon for Urdu. This lexicon will serve as a baseline for future lexicons developed through more intimate knowledge of the Urdu language. Furthermore, we compare its performance with various machine learning (ML) approaches. We also make public the labeled dataset that we developed for Urdu sentiment analysis. We hope that this lexicon and dataset will serve as a benchmark for the evaluation of future lexicons and ML approaches for the Urdu language.
An increase in the use of smartphones has led to greater use of the internet and social media platforms. The most commonly used social media platforms are Twitter, Facebook, WhatsApp, and Instagram. People share their personal experiences, reviews, and feedback on the web. The information available on the web is unstructured and enormous. Hence, there is huge scope for research on understanding the sentiment of the data available on the web. Sentiment Analysis (SA) can be carried out on the reviews, feedback, and discussions available on the web. Extensive research has been carried out on SA in the English language, but data on the web also contains other languages that should be analyzed. This paper aims to analyze, review, and discuss the approaches, algorithms, and challenges faced by researchers carrying out SA on Indigenous languages.
Web 2.0 refers to the second generation of the World Wide Web (WWW). Web 2.0 allows Internet users to collaborate and share information online, and therefore to create large virtual societies. Web 2.0 includes social network sites, wikis, blogs, Web services, podcasting, multimedia sharing services, etc. Arab users of social network sites (Facebook and Twitter) generate a large volume of Arabic and English textual reviews daily on different social, political, and scientific subjects. These reviews could be about different products, political events, sports teams, economics, video clips, restaurants, books, actors/actresses, new films and songs, universities, etc. This large volume of Arabic and English textual reviews cannot be analyzed manually. Therefore, sentiment analysis is used to identify sentiments and their subjectivity in this huge volume of reviews. To conduct this study, a small dataset consisting of 4,050 Arabic and English reviews was collected. Three polarity dictionaries were also created (Arabic, English, and Emoticons). The collected dataset and those dictionaries were used to conduct a comparison between two free online sentiment analysis tools (SocialMention
People prefer to share and express opinions in their own language. The Internet is the biggest repository for opinion sharing. Opinion mining refers to the use of NLP, text analysis, and computational linguistics to identify and extract subjective information from source material. Opinion mining for the Urdu language is not well explored. The proposed approach is based on the identification and extraction of adji-units and decisions from the given text using a lexicon-based approach. Adji-units are the expressions that contain the subjective text in a sentence. The proposed approach uses a two-step lexicon to extract opinions from text chunks. Unfortunately, no such lexicon exists for the Urdu language. The goal is to develop a diverse two-step lexicon and to highlight the linguistic as well as technical aspects of this multidimensional research problem. The performance of the proposed work is evaluated on multiple texts, and the achieved results are quite satisfactory.
2020
Sentiment analysis is a data mining technique that measures the inclination of people’s opinions. Recent studies have shown that a sentiment lexicon can be developed using automatic and manual tagging techniques. The seminal works on Urdu lexicons done so far do not use a broad Likert scale for data tagging and do not cover all the open word classes. The current study aims to develop a sentiment lexicon and test its validity using manual and automatic methods. The dictionary-based method is used to design this lexicon from three authentic Urdu dictionaries. The data was tagged on a five-point Likert scale, i.e. -2 to +2, using the formulated guidelines. The lexicon is composed of four word classes, namely nouns, verbs, adjectives, and adverbs. Once the lexicon was developed using manual tagging techniques, it was tested both manually and automatically. The manual testing yielded an inter-annotator agreement of 75% while the automatic testing included the comparing ...
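One simple way to quantify agreement between two annotators tagging words on a -2 to +2 scale is raw percent agreement, sketched below. The abstract does not say which agreement measure was used, so the measure, the function name `percent_agreement`, and the made-up annotation lists are all assumptions for illustration.

```python
VALID_TAGS = {-2, -1, 0, 1, 2}  # five-point Likert scale


def percent_agreement(tags_a, tags_b):
    """Percentage of items on which two annotators assigned the same tag."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("annotation lists must be non-empty and of equal length")
    if not all(t in VALID_TAGS for t in list(tags_a) + list(tags_b)):
        raise ValueError("tags must be integers from -2 to +2")
    matches = sum(a == b for a, b in zip(tags_a, tags_b))
    return 100.0 * matches / len(tags_a)


# Made-up tags for ten lexicon entries (not data from the study).
annotator_a = [2, 1, 0, -1, -2, 1, 0, 2, -1, 1]
annotator_b = [2, 1, 0, -2, -2, 0, 0, 2, -1, 1]

print(f"agreement: {percent_agreement(annotator_a, annotator_b):.1f}%")
```

Raw agreement does not correct for agreement expected by chance; chance-corrected statistics such as Cohen's kappa are a common alternative.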
2nd IEEE International Conference On Information Science & Communication Technology, 2020
The significance of labeled datasets is well known to artificial intelligence practitioners. We have seen much phenomenal work in natural language processing for many languages (such as English, Chinese, and Arabic) owing to the availability of substantial data. For the Urdu language, despite it being the third most widely spoken language in the world, very little research work has been reported; hence, it is regarded as a ‘morphologically rich’ but ‘resource-poor’ language. Further, researchers working on Urdu natural language processing are in a quandary due to the lack of labeled/annotated datasets. This paper shares the data, “Urdu Sentiment Corpus” (USC), and insights therein, of Urdu tweets for sentiment analysis and polarity detection. The dataset consists of tweets that cast a political shadow and present a competitive environment between two separate political parties and the government of Pakistan. Overall, the dataset comprises over 17,185 tokens, with 52% of records positive and 48% negative. This paper shares visual insights (from document level to word level) into textual similarities, manifold learning, etc. In addition, this paper presents a part-of-speech-wise analysis and an unpretentious technique for extracting sentiment lexicons from the corpus.
Journal of Intelligent & Fuzzy Systems, 2020
Sentiment analysis also plays an important role in natural language processing for evaluating and analyzing public opinions, sentiments, and views about matters such as products, services, academic institutes, and organizations. Much work has been done on the English language in natural language processing. However, the literature shows that a large research gap remains for Romanized Sindhi and its sentiment analysis, and that no trained data is available for testing. Classifying the sentiment of Romanized Sindhi text is a very difficult task. To evaluate the sentiment of Romanized Sindhi text, readily available online Python tools were used. In this research, one thousand words of Romanized Sindhi text were used for sentiment classification. Issues in sentiment classification of Romanized Sindhi text with the Python tool are also discussed.
Mehran University Research Journal of Engineering and Technology, 2019
The majority of online comments/opinions are written in free-text format. Sentiment analysis can be used to measure the polarity (positive/negative) of comments/opinions. These comments/opinions can be in different languages, e.g. English, Urdu, Roman Urdu, Hindi, Arabic, etc. Most work has addressed sentiment analysis of the English language. Very limited research has been done in Urdu or Roman Urdu, even though Hindi/Urdu is the third-largest language in the world. In this paper, we focus on the sentiment analysis of comments/opinions in Roman Urdu. There is no publicly available Roman Urdu public opinion dataset. We prepare a dataset by taking comments/opinions of people in Roman Urdu from different websites. Three supervised machine learning algorithms, namely NB (Naive Bayes), LRSGD (Logistic Regression with Stochastic Gradient Descent), and SVM (Support Vector Machine), have been applied to this dataset. From results of experiments, it can be con...
International Journal in Foundations of Computer Science & Technology, 2014
Opinions are very important in the life of human beings. Opinions help humans make decisions. As the impact of the Web increases day by day, Web documents can be seen as a new source of opinions. The Web contains a huge amount of information generated by users through blogs, forum entries, social networking websites, and so on. To analyze this large amount of information, a method is required that automatically classifies the information available on the Web. This domain is called Sentiment Analysis and Opinion Mining. Opinion Mining, or Sentiment Analysis, is a natural language processing task that mines information from various text forms such as reviews, news, and blogs and classifies them by polarity as positive, negative, or neutral. In the last few years, Hindi content on the Web has increased enormously. Research in opinion mining has mostly been carried out in English, but it is important to perform opinion mining in Hindi as well, since a large amount of information in Hindi is available on the Web. This paper gives an overview of the work that has been done in the Hindi language.
2020
In the era of technology, people express their opinions on social media platforms very frequently. These opinions are mostly expressed in regional languages, so most of the generated content is in regional languages. Sentiment Analysis (SA) is a natural language processing task defined as finding the opinion (positive, negative, or neutral) of the writer about specific entities. This includes analyzing a person’s emotions, feelings, and attitudes expressed in the content. This paper gives a comparative analysis of sentiment analysis performed in various Indian languages, covering classification techniques based on lexicons, dictionaries, and machine learning. It also lists the lexical resources available for performing SA of Indian languages and the challenges of developing lexical resources for low-resource Indian languages.
IEEE Access
Opinion mining from user reviews is an emerging field. Sentiment analysis of natural language text helps in finding the opinions of customers. These reviews can be in any language, e.g. English, Chinese, Arabic, Japanese, Urdu, and Hindi. This research presents a model to classify the polarity of reviews written in Roman Urdu. For this purpose, raw data was scraped from the reviews of 20 songs from the Indo-Pak music industry. In this research, a new dataset of 24,000 Roman Urdu reviews is created. Nine machine learning algorithms (Naïve Bayes, Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Artificial Neural Networks, Convolutional Neural Network, Recurrent Neural Networks, ID3, and Gradient Boost Tree) are attempted. Logistic Regression outperformed the rest, with testing and cross-validation accuracies of 92.25% and 91.47% respectively.
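To make the supervised setting concrete, here is a minimal pure-Python sketch of one of the listed algorithms, multinomial Naïve Bayes with add-one (Laplace) smoothing, trained on a handful of invented Roman Urdu phrases. The training phrases, labels, and function names (`train_nb`, `predict_nb`) are hypothetical; this is not the authors' pipeline, dataset, or feature set, and it says nothing about the accuracies reported above.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Count class frequencies and per-class word frequencies.

    docs: list of (text, label) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for tok in text.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label in sorted(class_counts):
        logp = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            logp += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Invented toy reviews in Roman Urdu (hypothetical, for illustration only).
TRAIN = [
    ("bohat acha gana hai", "positive"),
    ("zabardast awaz", "positive"),
    ("acha music", "positive"),
    ("bekar gana hai", "negative"),
    ("bohat bura music", "negative"),
    ("awaz bekar hai", "negative"),
]

model = train_nb(TRAIN)
print(predict_nb(model, "zabardast gana"))
print(predict_nb(model, "bekar awaz"))
```

With realistic data, the same train-then-predict structure would be wrapped in a held-out test split or k-fold cross-validation to obtain accuracy figures.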
IRJET, 2021
Sentiment analysis is the process of identifying the emotion of a sentence. It has various applications, such as social media monitoring, customer feedback, and brand monitoring. One of these applications, social media monitoring, will be implemented in our project. We will implement sentiment analysis on the social media platform Twitter. Most of the work in sentiment analysis has been done in the English language; as a result, less work has been done in regional languages. Therefore, we decided to perform sentiment analysis in the Marathi language, which is the official language of Maharashtra. This paper focuses on the lexicon approach to sentiment analysis.