Uploaded by

jeethhirrani

2 Literature Review

2.1 Introduction
This proposal investigates the different techniques for aspect and opinion extraction
from customer reviews. Chapter 1 provided an overview of this proposal. This chapter will
present related definitions and terminologies, then review a number of existing techniques and
applications related to aspect-based opinion mining. Finally, some of the challenges will be
discussed.

Section 2.2 introduces some definitions and terminologies related to the aspect-based
opinion mining model. It presents the main concepts for identifying, analysing and extracting
entities, aspects and opinions. Section 2.3 discusses the state of the art in aspect-based opinion
mining from customer reviews. Section 2.4 discusses the most crucial part, which is aspect
sentiment classification. Section 2.5 lists some existing opinion mining applications and
Section 2.6 discusses some of the challenges. Finally, Section 2.7 concludes the chapter.

2.2 Definitions
Opinion

Opinion, in general, is “a view or a judgement formed about something that is not
necessarily based on a fact or existing knowledge” [\cite{12}]. In the problem of sentiment
analysis, Liu indicates that opinion is “a quintuple of Entity, Aspect, Orientation, Opinion Holder
and Time” [\cite{13}]. The entity is the object being evaluated, for example a product. The
aspect can be a feature, component or function of the entity. The orientation is the opinion
about the entity and/or the aspect that the opinion holder expressed at a specific
time.
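
The quintuple above maps naturally onto a small record type. The following is a minimal sketch, assuming a three-value label set for the orientation and a hypothetical reviewer and date; the field names are chosen here for illustration and are not taken from [\cite{13}].

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion as a quintuple (entity, aspect, orientation, holder, time)."""
    entity: str        # the object being evaluated, e.g. a product
    aspect: str        # a feature, component or function of the entity
    orientation: str   # "positive", "negative" or "neutral" (assumed label set)
    holder: str        # who expressed the opinion
    time: date         # when the opinion was expressed


# Hypothetical example: a reviewer praises the battery of a camera.
op = Opinion("Camera S300", "battery", "positive", "reviewer_1", date(2020, 1, 15))
print(op.aspect, op.orientation)
```

Making the record immutable (frozen) lets opinions be stored in sets or used as dictionary keys when they are later grouped by entity or aspect.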

The term “opinion mining” was first introduced by [\cite{14}], who proposed some
techniques for opinion mining and classified opinions as positive or negative. An opinion is an
individual’s private state; it exemplifies the individual’s assessments, evaluations, beliefs,
judgements and ideas regarding a particular item/subject/topic. Opinions of others can have
great impact on and offer guidance for governments, social communities, individuals and
organisations in the decision-making process [\cite{15}]. When considering other people’s
opinions, human beings need concise, accurate and timely information so they may make
correct and quick decisions. Opinions make human beings capable of integrating the different
experiences, approaches, knowledge and wisdom of several people when making decisions.
For people, it is natural to take part in discussions and convey their viewpoint [\cite{16}].

Sentiments

Sentiments may be described as opinions, ideas or judgements manifested by
emotions [\cite{17}]. As [\cite{18}] stated, “One of the challenges related to sentiment analysis is
identifying the objects of the study of opinions and subjectivity. Originally, subjectivity was
identified by the prominent linguist R. Quirk [\cite{19}]”. Quirk defines a private state as something
that is not open to objective observation or verification [\cite{20}]. These private states include
emotions, opinions and speculations. Computational linguistics mainly focuses on opinions
rather than on sentiments, feelings or emotions. The terms ‘sentiment’ and ‘opinion’ are often
used interchangeably in the literature.

Human sentiment knowledge grows through day-to-day cognitive interactions. Sentiment is
not a direct property of languages, so an intelligent system needs some prior knowledge to
act properly. Sentiment knowledge is generally packaged into a computational lexicon, technically
called a sentiment lexicon.

Information on the Web that is preserved in text documents can be divided into two
categories: factual and opinionated information. Usually, facts are objective statements
about aspects, events, and their attributes. According to [\cite{13}], opinions are usually
subjective expressions that describe people’s sentiments, appraisals, or feelings toward the
aspects, events, and their properties. [\cite{21}], on the other hand, describes an opinion in
terms of four elements: Topic, Holder, Claim, and Sentiment. The Holder believes a Claim about a
Topic that is usually associated with a Sentiment, such as “good” or “poor”. Kim and Hovy define a
sentiment as an explicit or implicit expression in text of the Holder’s positive, negative, or
neutral regard toward the Claim about the Topic, and the sentiments always involve the
Holder’s emotions and desires.

Opinion Mining

Opinion mining, also known as sentiment analysis, is one of the most explored areas in
computer science in the last few decades. Although a formidable amount of research has been
done, the reported solutions and available systems do not yet satisfy the requirements of the
end user. The main issue is the variety of conceptual rules that govern sentiment. There are many
different cues (possibly unlimited) through which a human being turns these concepts from
realisation into verbalisation.

Human psychology relates directly to the paradigms of social psychology, culture and pragmatics,
and governs how we realise sentiment. Proper incorporation of human psychology into
computational sentiment knowledge representation appears to be a step in the right direction.

Sentiment analysis, also termed subjectivity analysis or opinion mining, deals with the
computational treatment of opinion, sentiment, and subjectivity in text, and involves Natural
Language Processing (NLP), text analysis, and computational linguistics [\cite{22}]. It aims at
comprehending the expression or opinion of a speaker or writer with respect to a certain
topic. The expression may reflect the judgement, opinion, or evaluation of the writer and indicate
the writer’s affective state at the time of writing. The affective state means how the writer was
feeling at that time or the emotional communication intended to affect the reader of the text.

Target Detection

As the name suggests, the goal of sentiment target detection is to determine the subject of a
sentiment expression. Depending on the granularity of analysis, a sentiment target may refer to
a concrete entity or to a more abstract topic. For instance, in aspect-oriented review mining, the
interest is in determining the reviewers’ evaluations of very concrete aspects. Such targets
typically become manifest at the phrase or sentence level (e.g., “I really like the picture quality”).
In this case, the task is primarily regarded as information extraction, and it involves such sub-
problems as named entity recognition and relationship extraction. In contrast, sentiment
retrieval systems are generally concerned with identifying opinions related to topics that are
more abstract (e.g., “Which blogs report positively and which negatively on the topic of Israeli
settlement policy?”). Such an analysis is normally conducted at the document level. At coarse-
grained levels of analysis (document or sentence level), sentiment target detection is mostly
viewed as an instance of text categorisation or, more generally, as a problem of information
retrieval. Sentences or documents are classified or ranked according to their relevance towards
a given topic [\cite{23}].

Entity Categories

According to [\cite{24}], there are different categories of entities. A broad overview organises
them into four entity categories that represent different types of words in a review text. These
four categories are components, functions, features, and opinions. Table X shows an example
of entity categories related to “camera” [\cite{24}]. Some entities may not fit in any category.
Therefore, a fifth category, “other” is formed and left open for any suggested categories.
Entity        Description
Components    Physical objects of a camera, including the camera itself, LCD screen, viewfinder, and battery
Functions     Capabilities provided by a camera, including movie playback, zoom, and auto focus
Features      Properties of components or functions, such as colour, speed, size, weight, and clarity
Opinions      Ideas and thoughts expressed by reviewers on products, features, components or functions
Other         Other possible entities defined by the domain

Entity Discovery and Assignment

User-generated content, which in this research is customers’ reviews, contains different
opinions about different products. Opinions mined from such content are of little
use unless the entities that the reviews concern are known. The problem of identifying what
products, features, aspects, or attributes the reviewers have talked about in a sentence is called
“entity discovery”. In a typical review, the opinion holder may give opinions on multiple products,
features, aspects, or attributes even though the review page is about a specific product. The
main issue is how to discover each entity from the four categories described above; therefore,
entities need to be discovered and identified [\cite{24}].

Another essential issue, after discovering entities, is to assign them to the right opinions and
right products; this is called “entity assignment”. Some reviews may have direct opinions
assigned to direct entities, which are explicitly mentioned in the sentence. Other reviews may
contain entities or opinions that are implied and difficult to assign. These two issues are crucial,
as without discovering which entities the review talked about and without assigning the
corresponding opinions to the correct entity, the opinion mining is of no use [\cite{24}].

To illustrate these two issues, consider the following example of a camera review: “I bought Camera
S300 last week. I took so many photos; they were better than those from my old camera. Also
the battery is very good and the colour is amazing.” In some parts of this example, it is
straightforward to discover entities. However, it becomes complicated when the opinion
holder compares products. Therefore, there are difficulties in discovering entities and assigning
them to the right opinions.

Some techniques have been proposed to discover and assign entities. [\cite{24}] proposed a
solution that discovers linguistic patterns and then uses them to extract
entities. They also proposed a technique for assigning entities that relies on extracting the
entities of comparative sentences, as in the Camera S300 example.

Customer Review Mining

Increasingly large numbers of customers choose online shopping as it is more convenient and
reliable, and makes it easy to compare prices and get good feedback from other customers.
Consequently, the number of products sold online is increasing rapidly. This makes it difficult for
customers to make correct purchasing decisions based only on the product description and some
images or even videos of the product. Vast amounts of customer feedback data are available,
such as blogs, product reviews, and forums. However, such data is typically unstructured, so it
needs a text mining approach to extract useful knowledge and interpret important information.
The mined information is then restructured and presented in an ultra-concise form.
Applying sentiment analysis techniques to mine, analyse, and then summarise this type of data
is called “customer review mining”.

In some cases, mining opinions at a sentence or document level is beneficial. However, such
levels of information are not always enough for a complete decision-making process. For
instance, a positive appraisal on a specific item does not mean that the reviewer likes each
aspect of the item. Similarly, a negative review does not imply that the reviewer dislikes
everything. In a typical review, the reviewer writes about both negative and positive aspects of the
item, while the overall opinion about the item may be negative or positive [\cite{16}]. Therefore,
sentence-level and document-level opinions may not offer comprehensive information for
decision-making. To acquire such information, a finer level of granularity is needed.

In the last decade, numerous methods have been suggested to solve the problem of aspect-based
opinion mining. The earliest works are frequency-based approaches wherein simple
filters are applied to high-frequency noun phrases to extract aspects. Although such methods
are effective, they miss low-frequency aspects. To overcome this weakness, relation-based methods
have been suggested. Relation-based methods use NLP approaches to find certain relationships
between related sentiments and aspects.

Sources of Online Review

An online review is a piece of text that is publicly available and written by a known or
anonymous customer regarding a purchase of a product or service. Sources of online reviews
are widely varied; however, according to [\cite{26}], the most popular sources are review sites,
online shopping sites and web logs.

A web log, or “blog” for short, is defined by the Oxford dictionary as “a regularly updated
website or web page, typically one run by an individual or small group, that is written in an
informal or conversational style.” It represents review comments by authors about newly
purchased products (e.g. cameras) and provided services (e.g. hotel or restaurant).

A review site is a website in which reviews are posted by authors about their experiences of
purchased products or provided services. Typically, each product will have single or multiple
pages where all reviews are gathered and published.

An online shopping site is an electronic commerce (EC) site in which goods and services are sold
and purchased over the Internet. The process starts with the merchants, who
advertise goods/services over the Internet to potential customers, who purchase them and provide
feedback. The customer feedback can be provided on the product as a whole, parts of the
product, the shipment, or anything related to the provided service. [Link] and [Link]
are the most popular examples of online shopping sites.

Formats of Online Reviews

Online reviews give authors the freedom to express their thoughts about provided products
and/or services. They are considered to provide essential and informative data and can take
different formats. [\cite{2}] classifies customer reviews on the Web into two common formats,
whereas [\cite{26}] expands the list by adding one more format, as follows:

Format 1 - Free-format detailed review:

This is a free-text box where the author can write the review in the form of free text and
sometimes with no word limit. A very well-known example is [Link].

Format 2 - Pros and Cons:

This form of review is concise, and explicitly represents the positive and negative comments.

Format 3 - Pros, Cons, and the Detailed Review:

This form of review combines both Formats 1 and 2 and allows the author to write comments
and free text, and highlight the pros and cons, for example, [Link].

The entity and sentiment extraction process differs between the free text format and the pros and
cons format. Liu et al. (2005) [\cite{27}] proposed a system called “opinion observer” to extract aspects
and sentiments from pros and cons format reviews. It is based on sequential learning and relies
on two assumptions: that all reviews are very short, and that each segment contains one aspect
and its corresponding sentiment. Consequently, there remains a need for systems that can
extract aspects and opinions from free text.

2.2 Aspect-Based Opinion Mining

For many scenarios, document-level review classification is too coarse-grained and does not
provide the desired information. Pure classification merely helps to gather information about
how many customers are generally satisfied or unsatisfied. Based on these numbers,
trends in the customers’ perceptions of a product can be found, but the exact reasons for
satisfaction or dissatisfaction remain unknown: we do not know what the customers like, and we do
not know what they dislike. Aspect-oriented review mining goes one step further and analyses
the customers’ sentiments with regard to individual product aspects. Whereas review
classification considers only a single dimension (namely “sentiment polarity”), aspect-oriented
review mining involves the joint analysis of two dimensions. The aim is to discover all relevant
product aspects, identify the related expressions of sentiment and determine their
polarity. In contrast to review classification, the aspect-oriented task is better
characterised as a problem in information extraction than a problem in text categorisation. It
transforms the unstructured information of a review text into a structured, aspect-oriented
summary.

The main problem in the context of aspect extraction is to identify those text passages that refer
to mentions of product aspects. Given a dictionary of relevant product aspects, the task would
be relatively easy. However, if the relevant product aspects are not known a priori, the
provided collection of review documents must be examined. Hence the need to devise
methods that automatically extract a set of the most relevant product aspects from a corpus of
reviews. To do so, notions of relevance need to be defined, and a desired level of granularity
must be identified. Either very fine-grained aspects are considered (e.g., “colour accuracy”, “tone
reproduction”, “image noise”, or “chromatic aberration”) or more abstract concepts are
considered (e.g., “image quality”, “ease of use”, “battery”, or “features”). Approaches to aspect
extraction can be subdivided into three main classes: unsupervised, supervised, and topic
modelling approaches.

Unsupervised approaches are typically frequency-based and often involve the use of predefined
linguistic or syntactic patterns to detect candidate phrases. Most commonly, the goal is to
automatically construct a dictionary of product aspects from a given review corpus. Unsupervised
or lexicon-based approaches can be subdivided into two groups. The first group
uses large corpora to find co-occurrences with a small seed list of words in order to find other
words with sentiment connotations, as in [\cite{27}]. The second group uses sentiment
dictionaries and lists, such as WordNet [\cite{2}], to infer the sentiment polarity of a word. The first
strategy is described in Section 2.2.2 and the second in Section
2.2.3. The literature on the proposed methods is discussed in the remaining sections.

2.2.1 Methods based on Association Rules (AR)

[\cite{5}] first presented a scheme to extract product aspects based on association rule mining.
The main ideas are that consumers often use the same words when they comment on the same
product aspects, and that frequent item sets of nouns in reviews are likely to be product
aspects, while infrequent ones are less likely to be product aspects. The basic steps of the
algorithm are as follows:

1. Find frequent nouns and noun phrases. Nouns and noun phrases (or groups) are
identified by a POS tagger. Only the frequent ones are kept. The reason for using
this approach is that when people comment on different aspects of a product, the
vocabulary that they use tends to converge. Thus, frequently used nouns are usually
genuine and important aspects.

2. Find infrequent aspects by exploiting the relationships between aspects and opinion
words. The first step can miss many genuine aspect expressions that are infrequent.
This step tries to find them. The idea is that the same opinion word can be used to
describe or modify different aspects. Opinion words that can modify frequent aspects
can also modify infrequent aspects, and thus can be used to extract infrequent
aspects. For example, suppose “picture” has been found to be a frequent aspect and
appears in the sentence “The pictures are absolutely amazing.” If “amazing” is known to be an
opinion word, then “software” can be extracted as an aspect from the sentence “The
software is amazing,” because the two sentences follow the same dependency
pattern and “software” is also a noun.
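
The two steps above can be sketched on a toy corpus. This is a minimal illustration, not the algorithm of [\cite{5}] itself: it replaces association rule mining with plain sentence-level support counting, the POS tags are hand-written rather than produced by a tagger, and the opinion word list and the nearest-noun heuristic are assumptions made for the example.

```python
from collections import Counter

# Toy reviews, already POS-tagged (NN = noun, JJ = adjective, etc.).
# In practice a POS tagger would produce these tags; they are hand-written here.
SENTENCES = [
    [("the", "DT"), ("picture", "NN"), ("is", "VBZ"), ("amazing", "JJ")],
    [("great", "JJ"), ("picture", "NN"), ("and", "CC"), ("battery", "NN")],
    [("the", "DT"), ("battery", "NN"), ("is", "VBZ"), ("good", "JJ")],
    [("the", "DT"), ("picture", "NN"), ("looks", "VBZ"), ("good", "JJ")],
    [("the", "DT"), ("software", "NN"), ("is", "VBZ"), ("amazing", "JJ")],
]
OPINION_WORDS = {"amazing", "good", "great"}  # assumed to be known in advance


def frequent_aspects(sentences, min_support):
    """Step 1: keep nouns that occur in at least `min_support` sentences."""
    support = Counter()
    for sent in sentences:
        support.update({w for w, tag in sent if tag.startswith("NN")})
    return {noun for noun, count in support.items() if count >= min_support}


def infrequent_aspects(sentences, frequent, opinion_words):
    """Step 2: in sentences with an opinion word but no frequent aspect,
    take the noun nearest to the opinion word as an infrequent aspect."""
    found = set()
    for sent in sentences:
        words = [w for w, _ in sent]
        if frequent & set(words):
            continue
        noun_pos = [i for i, (_, tag) in enumerate(sent) if tag.startswith("NN")]
        opinion_pos = [i for i, w in enumerate(words) if w in opinion_words]
        if noun_pos and opinion_pos:
            nearest = min(noun_pos, key=lambda i: abs(i - opinion_pos[0]))
            found.add(words[nearest])
    return found


frequent = frequent_aspects(SENTENCES, min_support=2)
infrequent = infrequent_aspects(SENTENCES, frequent, OPINION_WORDS)
print(sorted(frequent), sorted(infrequent))
```

On this toy data “picture” and “battery” pass the support threshold, and “software” is recovered in the second step through the shared opinion word “amazing”.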

[\cite{29}] tried to enhance the aspect extraction of previous systems using the results
from the model of [\cite{5}] as a starting point. It aims to acquire information from customer
reviews. Their model mapped the input to a user-defined catalogue of the aspect
hierarchy to eliminate redundancy and to provide conceptual organisation. [\cite{30}]
developed a set of heuristics and selection algorithms to extract aspects from reviews.
This model extracted noun phrases and then selected feature terms using likeness
scores. [\cite{32}] further enhanced the work of [\cite{5}] by presenting an
unsupervised information system that acquires product aspects and opinions by mining
reviews while excluding frequently appearing nouns that do not fall into the aspect
category. This development increased the precision of the results, though with low recall,
relative to previous work. [\cite{24}] applied a different approach, recognising
noun and verb phrases as aspect and opinion expressions. Their model then finds the
relationships between them. The method extends traditional dependency parsing to the
phrase level, which performs better in mining. On the other hand, [\cite{32}] focused on
extracting nouns/noun phrases, and then used dependency parsing to map the
relationships between opinion words and target expressions. Both of these methods
achieved low precision and moderate recall and were unable to extract
infrequent aspects.

2.2.2 Corpus-based Methods

Many studies of sentiment analysis (SA) have focused on extracting specific words, or given
parts of speech (POS) [\cite{33}]. Some POS tags (such as adjectives) or sequences of POS tags
(adjective-noun) have been shown to be more effective in opinion detection. The work of
Justeson and Katz (1995) [\cite{1}] adopts an NLP approach based on POS filtering
[\cite{5}]. The words in the text are automatically processed and marked with appropriate POS
tags. Afterwards, specific POS tags or given phrase patterns are filtered from the text, for example
two adjectives in a row. [\cite{13}] describe an approach based on the idea that conjoined
adjectives have the same orientation, apart from those joined by a contrastive conjunction such
as “but”, which tend to have opposite orientations. They
construct two clusters of adjectives using conjunction counts based on Wall Street Journal
articles. Although they achieve quite high accuracy, it is important to note that they manually
eliminated neutral adjectives in the first step. Other studies have focused on analysing single
words and POS tags to automatically deduce the polarity of a word from the data presented
[\cite{13}].
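
POS filtering of this kind amounts to sliding a window over the tagged tokens and keeping the word sequences whose tags match a pattern. The sketch below assumes hand-assigned Penn Treebank style tags and uses only the two patterns named in the text; the function name `match_pos_patterns` is introduced here for illustration.

```python
def match_pos_patterns(tagged, patterns):
    """Return word sequences whose tags match any of the given tag patterns."""
    matches = []
    for pattern in patterns:
        n = len(pattern)
        for i in range(len(tagged) - n + 1):
            window = tagged[i:i + n]
            if all(tag == want for (_, tag), want in zip(window, pattern)):
                matches.append(" ".join(word for word, _ in window))
    return matches


# Hand-tagged toy sentence (tags assumed, Penn Treebank style).
TAGGED = [("the", "DT"), ("sharp", "JJ"), ("bright", "JJ"), ("screen", "NN"),
          ("drains", "VBZ"), ("the", "DT"), ("small", "JJ"), ("battery", "NN")]

# Two patterns from the text: adjective-noun, and two adjectives in a row.
PATTERNS = [("JJ", "NN"), ("JJ", "JJ")]
phrases = match_pos_patterns(TAGGED, PATTERNS)
print(phrases)
```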

2.2.3 Dictionary-based Methods

A popular research trend includes the use of the lexical database WordNet [\cite{1}]. It
groups words into synonym sets (called Synsets) and records the semantic
relationships between them, such as antonyms, hyponyms, etc. [\cite{13}] used
WordNet to measure the semantic orientation of adjectives by counting the number of
synonym links from the analyzed adjective to seed words, such as good, bad, etc.
One of the successful uses of WordNet to construct a semantic lexicon belongs to
[\cite{3}]. They started from a small set of manually selected words and expanded it using the
WordNet synonym and antonym relationships of adjectives. This work led to the
construction of SentiWordNet, which provides positive, negative and objective scores for
each gloss, that is, the brief definition of a Synset in WordNet. One of the drawbacks of this
lexicon is that some words have several senses, which can take different scores.
Therefore, a thorough POS analysis or word sense disambiguation is needed to
use this lexicon accurately.

In their study, [\cite{22}] used glosses and lexical relations from WordNet. They started
with a small seed word list and extended it by means of lexical relations in WordNet
(synonymy, antonymy and hyponymy). Later, they extracted words carrying sentiment
from glosses and assigned a polarity to the extracted terms. This was accomplished by
computing the word's degree of membership in a specific category based on how many
times the word had been assigned to that category.
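
Seed expansion over lexical relations can be sketched as a breadth-first traversal in which synonyms inherit a polarity and antonyms receive the opposite one. The example below is a simplified illustration only: the small `SYNONYMS` and `ANTONYMS` tables are hand-made stand-ins for WordNet links, and the rule that the first label to reach a word wins is an assumption, not part of the cited methods.

```python
from collections import deque

# Hypothetical hand-made relations standing in for WordNet links.
SYNONYMS = {
    "good": {"fine", "great"},
    "great": {"superb"},
    "bad": {"poor"},
    "poor": {"awful"},
}
ANTONYMS = {"good": {"bad"}, "fine": {"poor"}}


def neighbours(word, relation):
    """Symmetric lookup: a link counts in both directions."""
    out = set(relation.get(word, ()))
    out |= {w for w, linked in relation.items() if word in linked}
    return out


def expand(seeds):
    """Breadth-first propagation of polarity from seed words.
    Synonyms inherit the polarity; antonyms receive the opposite one."""
    polarity = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        linked = [(w, polarity[word]) for w in neighbours(word, SYNONYMS)]
        linked += [(w, -polarity[word]) for w in neighbours(word, ANTONYMS)]
        for other, value in linked:
            if other not in polarity:   # first label reached wins
                polarity[other] = value
                queue.append(other)
    return polarity


lexicon = expand({"good": +1})
print(sorted(lexicon.items()))
```

Starting from the single seed “good”, the traversal labels “fine”, “great” and “superb” as positive and “bad”, “poor” and “awful” as negative.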

The following sections discuss the supervised approaches to aspect extraction. Computational
linguistic tools and techniques are concerned with language grammar generation and
modeling. They provide an interesting formulation and approach to aspect extraction
that cannot be applied directly to computer science methods and requires adaptation.
Therefore, this overview concentrates on the most prominent techniques:
CRF methods (Section 2.2.5), information retrieval methods (Section 2.2.6) and
machine learning (Section 2.2.7).

2.2.4 Methods based on Dependency Relations (DR)

The idea of using the modifying relationship of opinion words and aspects to extract aspects can
be generalized to using dependency relations. [\cite{17}] employed the dependency relation to
extract aspect-opinion pairs from movie reviews. After being parsed by a dependency relation
parser, words in a sentence are linked to each other by a certain dependency relation.
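
As a minimal sketch of how such relations yield aspect-opinion pairs, the example below applies two simple rules to a hand-written parse. The triples, the opinion word list and the two rules are assumptions made for illustration; a dependency parser would normally supply the triples, and the relation labels follow the common Stanford/Universal Dependencies naming.

```python
# Hand-written dependency triples (relation, head, dependent) for
# "The battery is good and it takes amazing pictures."
# A dependency parser would normally produce these; labels follow the
# common Stanford/Universal Dependencies naming (nsubj, amod, ...).
PARSE = [
    ("det", "battery", "The"),
    ("nsubj", "good", "battery"),
    ("cop", "good", "is"),
    ("conj", "good", "takes"),
    ("nsubj", "takes", "it"),
    ("amod", "pictures", "amazing"),
    ("dobj", "takes", "pictures"),
]
OPINION_WORDS = {"good", "amazing"}  # assumed opinion lexicon


def aspect_opinion_pairs(parse, opinion_words):
    """Pair each opinion word with the word it modifies or predicates over."""
    pairs = []
    for relation, head, dependent in parse:
        if relation == "amod" and dependent in opinion_words:
            pairs.append((head, dependent))        # "amazing pictures"
        elif relation == "nsubj" and head in opinion_words:
            pairs.append((dependent, head))        # "battery is good"
    return pairs


pairs = aspect_opinion_pairs(PARSE, OPINION_WORDS)
print(pairs)
```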

Employing dependency patterns has produced promising results in a variety of research areas
involving different approaches to spot product aspects and their analogous opinions from
reviews for multiple languages. Several feature selection schemes have been used alongside
machine learning approaches, such as unigrams and bigrams [\cite{22}]. [\cite{34}] applied
syntactic relations between words in sentences for document sentiment organization. [\cite{35}]
employed dependency relations between words to extract features from text based on the ConceptNet
ontology. Afterwards they used a method called “mRMR”, which works as a feature
selection scheme to eliminate redundant information. [\cite{36}] presented a method that
extracts opinions and product aspects considering the syntactic and semantic information and
based on dependency relations and ontology knowledge. Pre-processing consisted of several
sub-tasks:

● Clean up the dataset by removing abnormal characters. It is necessary to have only
pure text reviews.
● Employ Stanford CoreNLP techniques, such as lemmatization. This transforms each word
to its base form.
● As aspects and corresponding opinions at a sentence level are needed, the sentences
are then split into small parts so that boundaries are drawn within the sentences. This
step satisfies the assumption that the aspects and corresponding opinions can be found
within a single sentence.
● POS tagging is applied to determine the part of speech of each word. This step also
prepares the sentences for dependency parsing.
● Syntactic relations among words within the sentence are determined by applying
dependency parsing.
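
The first and third pre-processing steps (character clean-up and sentence splitting) can be sketched with regular expressions alone. This is a rough illustration under stated assumptions: the character whitelist and the naive splitting rule are choices made here, and lemmatization, POS tagging and dependency parsing are left out because they need an NLP toolkit such as the Stanford CoreNLP pipeline named above.

```python
import re


def clean(text):
    """Drop characters outside a basic whitelist and collapse whitespace."""
    text = re.sub(r"[^A-Za-z0-9\s.,;:!?'()-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by a space."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


# Hypothetical raw review containing two non-text symbols and extra spaces.
raw = "I bought Camera S300 last week!\u2605\u2605 The battery is very good. Colour?  Amazing."
sentences = split_sentences(clean(raw))
print(sentences)
```

A splitter this simple will break on abbreviations such as “e.g.”, which is one reason a trained sentence segmenter is normally preferred.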

At this stage, the dataset is ready for the extraction process and the second phase is initiated.

This is divided into the following sub-tasks:

● Aspect assumptions along with the aspect definitions are evaluated and the relations
between POS tags and the frequencies are probed. After this, each high-frequency noun
and noun phrase that satisfies the aspect definition is located as an aspect
candidate.
● Every dependency rule from previous work is examined together with all aspect assumptions
and definitions to locate the best combination of dependencies. The best-performing
dependencies are then chosen. Now, all the extracted aspects are filtered to find the most
appropriate and most frequent product aspects.
● At this point the Apriori algorithm [\cite{35}] with a minimum support of 1% is applied.
● The final step is to generate the opinion summary that contains the product aspects
along with their corresponding opinion and the orientation.
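
The Apriori filtering step can be illustrated with a compact level-wise implementation. This is a sketch rather than the configuration of [\cite{36}]: the transactions are made-up sets of candidate aspect nouns per sentence, and the threshold is raised from the 1% mentioned above to 40% only because the toy data has five sentences.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets whose relative support
    (fraction of transactions containing them) is at least `min_support`."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    size = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: k / n for c, k in counts.items() if k / n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: unite frequent sets into candidates one item larger.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size-1)-subset must itself be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, size - 1))}
    return frequent


# Each transaction is the set of candidate aspect nouns in one review sentence.
SENTENCE_NOUNS = [
    {"battery", "life"}, {"battery", "life", "screen"},
    {"screen"}, {"battery"}, {"lens", "screen"},
]
# The text uses a 1% threshold on a large corpus; 40% is used here only
# because the toy data has five sentences.
result = apriori(SENTENCE_NOUNS, min_support=0.4)
for itemset, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

The pruning step relies on the property that every subset of a frequent itemset is itself frequent, so candidates with an infrequent subset are never counted.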

2.2.5 Conditional Random Fields (CRFs)

The accuracy of rule-based detection systems is approximately 50%, according to the review paper
presented by [\cite{37}]. Error analysis showed that theme identification
and subjectivity detection are deep semantic issues, and that it is nearly impossible to develop a
complete set of definite rules. To overcome the limitations of a rule-based system, a machine-
learning module was developed using the already identified features along with a few
additional ones, based on the Conditional Random Field (CRF) [\cite{38}] machine learning
algorithm. CRF-based subjectivity detection achieved precision values of 76.08% and
79.90% for the English news and movie review corpora, and 72.16% and 74.6% for the Bengali news
and blog domains, respectively [\cite{39}]. CRFs are a class of statistical modelling methods
frequently used in pattern recognition and machine learning for structured prediction, which is
very important for aspect-based opinion mining (ABOM) applications. Previous as well as ongoing
studies have revealed that sequence labelling approaches based on conditional relations increase
the accuracy and performance of structured prediction tasks. Some noteworthy models for sequence
labelling tasks are CRF [\cite{38}], HMM [\cite{3}] and Max-Margin Markov Networks [\cite{3}]. These
models have brought considerable improvement in several practical fields such as NLP, pattern
recognition and information extraction. The models are applied to encode known relationships
between reviewers’ opinions and construct consistent interpretations of the reviews. With this
scheme, a CRF can predict the sequence of labels for a given input sequence, where the reviews
are treated as input sequences, and POS tags and opinion tags are used as output labels.
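
Predicting a label sequence with a linear-chain model comes down to finding the highest-scoring path through per-position and transition scores, which the Viterbi algorithm does exactly. The sketch below shows only this decoding step: the label set, the weights in `EMISSION` and `TRANSITION`, and the simplification of using POS tags as the observed input are all assumptions made for illustration, whereas a trained CRF would learn its weights from labelled reviews.

```python
LABELS = ["ASPECT", "OPINION", "O"]

# Hand-set weights, purely illustrative; a real CRF learns them from data.
EMISSION = {("ASPECT", "NN"): 2.0, ("OPINION", "JJ"): 2.0, ("O", "DT"): 2.0,
            ("O", "VBZ"): 2.0, ("O", "RB"): 1.0, ("OPINION", "RB"): 0.5}
TRANSITION = {("O", "OPINION"): 0.5, ("OPINION", "OPINION"): 1.0,
              ("ASPECT", "ASPECT"): 0.5}


def viterbi(pos_tags):
    """Return the highest-scoring label sequence for a sequence of POS tags."""
    # best[i][label] = (score of best path ending in label at i, previous label)
    best = [{lab: (EMISSION.get((lab, pos_tags[0]), 0.0), None) for lab in LABELS}]
    for pos in pos_tags[1:]:
        column = {}
        for lab in LABELS:
            prev, score = max(
                ((p, best[-1][p][0] + TRANSITION.get((p, lab), 0.0)) for p in LABELS),
                key=lambda item: item[1],
            )
            column[lab] = (score + EMISSION.get((lab, pos), 0.0), prev)
        best.append(column)
    # Trace back from the best final label.
    label = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [label]
    for column in reversed(best[1:]):
        label = column[label][1]
        path.append(label)
    return path[::-1]


# "the battery is very good" with hand-assigned POS tags.
labels = viterbi(["DT", "NN", "VBZ", "RB", "JJ"])
print(labels)
```

With these weights the transition score between adjacent OPINION labels pulls the adverb “very” into the opinion span, which a per-token classifier without transition scores would not do.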

Generally, opinion extraction from customer reviews falls under the umbrella of phrase-level
information mining. It aims to create a thorough sentiment analysis at the aspect level. Model-
based approaches like HMM and CRF aim to overcome the limitations of the other approaches.
HMMs assume that each feature is generated independently and ignore the underlying
relationships between the actual words and labels, as well as the overlapping features [\cite{32}].
CRF tackles those shortcomings since it is a discriminative model that can accommodate
overlapping, dependent features. [\cite{3}] treat sentiment analysis as an information
extraction problem and use a hybrid approach that combines CRF for sequence tagging with AutoSlog
[\cite{2}] to learn the extraction patterns. Even though their system employs extraction learning
with CRF, it achieved a recall of only 54% with exact match.

2.2.6 Information Retrieval Methods

Representation of documents in most supervised approaches is based on the vector space model
[\cite{3}]. Every document is represented by a multi-dimensional vector, where each dimension
corresponds to some feature (term) in a document. Thus, a collection of documents can be
contained in a document matrix, where an element (x, y) means the number of times feature x
was encountered in document y. The idea is that it is possible to separate two classes of
documents shown as vectors in the feature space. The supervised model is said to be trained
when a classifier, trained on the set of labelled documents, constructs a multiplane that separates
the two classes of documents with reasonable degree of error. The use of the vector space model
for document sentiment classification was explored in the work of [\cite{40}]. They compose
two vectors to represent each document; the first is based on a calculation of the average
document frequency, while the second is built using the average subjective measure. They retain
terms with higher than average document frequency and subjective measure. For the feature
selection, they apply mutual information and Fisher discrimination ratio and then train the SVM
model. Experiments were carried out on different portions of the movie reviews corpus and show
an improvement in performance in comparison to other feature weighting techniques
using the SVM classifier.
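
The document matrix described above can be illustrated with a minimal sketch. The toy documents, the whitespace tokenisation and the function name `build_document_matrix` are assumptions made purely for illustration; they are not taken from the cited works.

```python
from collections import Counter


def build_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[x][y] counts feature x in document y."""
    # Naive lowercase whitespace tokenisation (an assumption for this sketch).
    tokenised = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenised for term in tokens})
    counts = [Counter(tokens) for tokens in tokenised]
    # Rows are features (terms) and columns are documents, as in the text.
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix


docs = ["good camera good lens", "bad battery", "good battery life"]
vocabulary, matrix = build_document_matrix(docs)
```

Each column of `matrix` is the vector of one document; a classifier such as an SVM would then be trained on these column vectors together with their class labels.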

2.2.7 Machine Learning

The field of machine learning has provided many models that are used to solve various text
classification problems. Among them are Naïve Bayes (NB), Support Vector Machines (SVMs),
decision trees, maximum entropy, and Hidden Markov Models (HMMs). A detailed overview
of these and other algorithms can be found in the work of [\cite{41}]. So far, the most popular
machine learning approaches used as baselines are SVM and NB. [\cite{22}] analysed several
supervised machine learning algorithms on a movie reviews dataset, among them SVM, NB and
maximum entropy. They also tested different feature selection techniques. Features are usually
words, or bigrams of words, that could have been somehow pre-processed, for example,
stemmed or lemmatised. The best performance was reported using the SVM method with
unigram text representation. It has to be noted that the authors took into account only the
presence of a feature, and did not take into account POS tagging information, which may improve
the effectiveness of NB and maximum entropy methods but tends to decrease the performance of
SVM. In a later study, [\cite{22}] proposed first separating subjective sentences from the rest
of the text. They assumed that two consecutive sentences would have similar subjectivity labels,
as an author is inclined not to change sentence subjectivity too often. Thus, they reformulate the
task of labelling all sentences as objective or subjective as that of finding the minimum s-t cut
in a graph [\cite{42}].
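
The unigram and bigram presence features mentioned above can be sketched as follows. The function name `presence_features` and the whitespace tokenisation are assumptions for illustration, not the feature extraction used in the cited study.

```python
def presence_features(text, use_bigrams=False):
    """Return the set of unigram (and optionally bigram) features present in text."""
    tokens = text.lower().split()  # naive tokenisation, assumed for this sketch
    features = set(tokens)  # a set records presence only, not frequency
    if use_bigrams:
        features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


unigrams = presence_features("great plot great acting")
with_bigrams = presence_features("great plot great acting", use_bigrams=True)
```

Because a set is used, the repeated word "great" contributes a single feature, which mirrors the presence-only representation described in the text.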

2.3 Aspect-Sentiment Classification

2.3.1 Polarity Classification

Probably the most well-studied task in this field is polarity classification. Typically, polarity
classification is considered a binary classification problem. Given a subjective text (e.g., a
customer review or an editorial comment), the goal is to determine whether the general tone of
the text is predominantly positive or negative. Obviously, a crucial point is how to define the two
poles of sentiment. What is a positive opinion and what is a negative opinion? It is impossible to
provide a single answer here. A definition is heavily dependent on the concrete application
scenario, and differences may be subtle. For example, in the context of political debates,
“positive” may refer to support and “negative” may refer to opposition. When classifying
customer reviews, the definition typically considers the evaluative nature of the text. Does the
reviewer like or dislike the product? Providing a specific definition becomes even more
important when computationally treating the sentiment polarity as a classification task.

Early contributions in this area include those by [\cite{28}] and [\cite{22}] who investigated
different approaches for identifying the polarity of product reviews and movie reviews
respectively. The opinions expressed by a writer towards a target can be divided into a number of
classes such as “positive”, “negative”, and “neutral” (i.e. determining the valence); or into a
discrete measurement scale such as “excellent”, “good”, “satisfactory”, “poor”, and “very poor”;
or by a number of emotions such as “joy”, “sadness”, “anger”, “surprise”, “disgust”, and “fear”.
In this context, a sentiment analysis process is a spectrum of tasks where each task articulates a
sentiment.

When working with only two classes such as “positive” versus “negative” or “good” versus
“poor”, the task is one of “polarity” classification. In the case of movie/product reviews,
star ratings or the terms “thumbs up” and “thumbs down” are very often used, as in
[\cite{28}] and [\cite{22}]. A document’s polarity can also be classified
on a multi-way scale, which was attempted by [\cite{3}] and [\cite{22}] among others.
[\cite{22}] expanded the basic approach of classifying a movie review as either positive or
negative to predicting star-based ratings on either a three-star or four-star scale, while Snyder
performed an elaborate analysis of restaurant reviews, predicting ratings for various aspects of
the given restaurant, such as the food and atmosphere, on a five-star scale.

A different method for determining sentiment is the use of a scaling system. This scheme
involves words generally associated with a negative, neutral, or positive sentiment being given
an associated number ranging from -10 to +10, from most negative to most positive. When a
piece of unstructured text is analysed using natural language processing, the subsequent concepts
are analysed for an understanding of these words and how they are related to the concept. Each
concept is then associated with a score according to the way sentiment words relate to the
concept. Alternatively, texts can be provided with positive and negative sentiment strength
scores if the goal is to measure the sentiment in a text rather than the overall polarity and strength
of the text.
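
A minimal sketch of such a scaling scheme is given below. The lexicon entries, their scores and the choice to average the scores of matched words are all assumptions made for illustration; the text does not prescribe a particular aggregation rule.

```python
# Hypothetical lexicon: scores run from -10 (most negative) to +10 (most positive).
LEXICON = {"excellent": 9, "good": 5, "average": 0, "poor": -5, "terrible": -9}


def scale_score(text, lexicon=LEXICON):
    """Average the lexicon scores of the sentiment words found in text."""
    hits = [lexicon[w] for w in text.lower().split() if w in lexicon]
    # Text without any known sentiment word is treated as neutral (0).
    return sum(hits) / len(hits) if hits else 0.0


score = scale_score("good screen but terrible battery")
```

Here "good" (+5) and "terrible" (-9) average to a mildly negative score, which shows how a single number can hide a mixed opinion; reporting separate positive and negative strengths, as the text suggests, avoids that loss.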

2.3.2 Emotion classification

The task of detecting the expression of emotion in natural language text can be considered as a
refinement of the sentiment polarity classification task. The goal is to classify a piece of text
according to a predefined set of basic emotions. Whereas sentiment polarity is commonly viewed
as dichotomous (positive vs. negative), emotion classification tries to identify more fine-grained
differences in the expression of sentiment. Most commonly, six “basic” emotions -
anger, disgust, fear, happiness, sadness, and surprise - are used as class labels for this task
[\cite{33}]. Besides deriving a categorisation from psychological theories of emotion, class labels
may also be defined ad hoc, based on concrete application needs. Applications for emotion
classification are manifold, ranging from analysis of customer feedback or observing trends in
public mood to analysis of clinical records [\cite{31}].

2.3.3 Subjectivity classification

Subjectivity classification is primarily considered as a binary classification task. Its goal is to
separate subjective from objective information. Again, the problem may be tackled at different
levels of granularity. For instance, at the document level the aim is to distinguish review-like
documents from non-review documents or factual newspaper articles from editorial comments.
Subjectivity classification is also an important subtask in sentiment retrieval. Many supervised
and unsupervised techniques have been explored for subjectivity annotation tasks by various
researchers over a long period [\cite{18}]. Several linguistic resources and tools like dependency
parsing, named entity recognition, morphological analysers, stemmer, SentiWordNet, and
WordNet have been used in the subjectivity detection task. However, in the case of
morphologically rich Indian languages like Bengali, such resources and tools are not readily
available. Inspired by [\cite{44}], the present work was initiated to develop a subjectivity
classifier that works on unannotated text documents. The aim is to design an automatic
process that learns linguistically rich extraction patterns for subjective expressions and produces
rich, language-specific (rather than domain-dependent) ontological knowledge.

2.3.4 Stance-based classification

Over the past ten years, there has been active research on modelling overall positions in
user-generated content. However, the majority of the works focused on congressional debates
[\cite{45}] or debates in online forums [\cite{46}]. Stance detection in other forms
of user-generated content, such as Twitter data and news comments, remains mostly unexplored.

In practically all of the present approaches, supervised machine learning is used for
stance classification: a large set of data is collected and annotated in order to be
used as training data for classifiers. In [\cite{47}], a lexicon was mainly used to identify
arguments. These arguments, combined with their targets and sentiment expressions,
were employed as features in a supervised learner for stance classification.

In [\cite{48}], several features were deployed in a rule-based classifier, such as punctuation
marks, unigrams, bigrams, syntactic dependencies and the dialogic structure of the posts. The
authors showed that there is no significant difference in performance between systems that use
only word unigrams and systems that also use other features such as Linguistic Inquiry and Word
Count (LIWC) and POS generalized dependencies. The relations of agreement and disagreement
between posts were also exploited by [\cite{49}]. In this research, we decided to focus only on
the text of the user-generated content, as these relationships are not provided for our stance
datasets.

[\cite{50}] researched the problem of detecting document-level stance in essays written by
students, using two sets of features that represent stance-taking language. Different
machine learning algorithms are employed for automatic classification of overall position from
unstructured text. SVM and logistic regression were widely used in various studies
[\cite{47}], and Conditional Random Fields (CRF) have also been applied.

[\cite{51}] determined stance at the user level, based on the assumption that if several users
retweet a pair of tweets about a controversial topic, they largely support the same side of the
debate. In this research, the focus is on identifying stance, as far as possible, from a single
tweet. Features that help to this end will likely also be useful when there is access to multiple
tweets from the same tweeter. In another work on Twitter stance detection, a bidirectional Long
Short-Term Memory (LSTM) network was used to encode the target and the tweet [\cite{52}]. In that
method, the representations of the tweet and the target depend on one another, and the
experiments demonstrated an improvement over encoding the tweet and the target independently.

In this research, stance detection in user-generated content can be seen as one of the
applications of textual entailment, where the primary goal is to infer a person’s opinion towards a
given target based on a single tweet written by this person.

2.4 Aspect-Opinion Mining Applications

2.4.1 Opinion Summarisation

The requirements of the end user are the driving force behind sentiment analysis research. The
outcomes of these research endeavours should lead to the development of a real-time sentiment
analysis system, which will successfully satisfy the needs of the end users. Let us have a look at
some real-life needs of the end user. For example, a market surveyor from company A may want
to know how public opinion about their product X has changed after the release of product Y by
company B. The different aspects of product Y that the public consider better than product X are
also points of interest. These aspects could typically be the durability of the product, power
options, weight, colour and many other issues that depend on the particular product. In another
scenario, a voter may be interested in studying the change of public opinion about a leader or
public event before and after an election. In this case, the aspect could be a social event, an
economic recession or other issues. End users are not only looking for binary
(positive/negative) sentiment classification; they are also interested in aspectual sentiment
analysis. Therefore, sentiment detection and classification alone are not enough to satisfy the
needs of the end user.

The topic-opinion model is the most popular, but end users may want an at-a-glance
presentation of opinion-oriented summaries. For example, a market surveyor from company A
might be interested in the root cause for why their product X (e.g., a camera) is becoming less
popular day by day. Company A may want to look into the negative reviews only. Relatively few
research efforts on polarity-wise summarisation can be found in the literature compared to
the popular topic-opinion model. Finally, fine-grained, feature-based opinion summarisation was
defined by [\cite{8}].

2.4.2 Opinion Question Answering

Question Answering (QA) can be framed as an NLP task in which a set of
questions and a collection of documents are presented to an automatic NLP system. This
system is employed to retrieve the answers to the queries in Natural Language (NL). Studies into
building factoid QA systems have a long tradition. However, it is only recently that researchers
began to focus on the development of Opinion Question Answering (OQA) systems.

Due to the immense importance of blog reviews, NLP research began to focus on the
development of OQA systems and on the organisation of international conferences encouraging
the creation of effective QA systems for both factual and subjective texts. [\cite{43}] used an SVM
classifier trained on the MPQA corpus [\cite{44}], English NTCIR8 data and rules based on the
subjectivity lexicon [\cite{44}] [\cite{40}], and performed query analysis to detect the polarity of
the question using defined guidelines. Likewise, they filtered opinion from fact in the retrieved
snippets using a classifier based on Naïve Bayes with unigram features, assigning each sentence a
score that is a linear combination of the opinion and polarity scores.

2.5 Challenges
When trying to identify expressions of sentiment in natural language text and to determine the
conveyed polarity, as well as when addressing the main sub-problem of extracting product aspects
from customer reviews, the following challenges arise:

2.5.1 Contextual Polarity

The sentiment polarity of a phrase may be context dependent. For instance, consider the
sentence “the hotel staff was not very friendly”. The negation “not” flips the otherwise positive
polarity of the word “friendly”. Words, phrases, or syntactic constructions that affect the
sentiment polarity or sentiment strength are commonly denoted as sentiment or valence
shifters. A detailed study of contextual polarity was conducted by [\cite{44}].
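
The effect of a negation shifter can be sketched with a small rule. The word lists, the two-token negation window and the function name `contextual_polarity` are assumptions for illustration only; they do not reproduce the approach of the cited study.

```python
PRIOR_POLARITY = {"friendly": 1, "clean": 1, "rude": -1, "dirty": -1}  # hypothetical
NEGATIONS = {"not", "never", "no"}  # hypothetical
WINDOW = 2  # assumed number of tokens for which a negation stays in effect


def contextual_polarity(sentence):
    """Sum word polarities, flipping those that closely follow a negation."""
    total = 0
    since_negation = None  # tokens seen since the last negation, if any
    for token in sentence.lower().split():
        if token in NEGATIONS:
            since_negation = 0
            continue
        if since_negation is not None:
            since_negation += 1
        polarity = PRIOR_POLARITY.get(token, 0)
        if since_negation is not None and since_negation <= WINDOW:
            polarity = -polarity  # the shifter flips the prior polarity
        total += polarity
    return total
```

For the example sentence from the text, `contextual_polarity("the hotel staff was not very friendly")` yields a negative value, whereas the same sentence without "not very" yields a positive one.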

2.5.2 Implicit Sentiment

Besides explicit expressions of sentiment (e.g., “the check-in process went fast”), sentiment
may also be manifested implicitly (“needed to wait two hours to check in”). Whereas the first
example includes a subjective assessment (“fast”), the second example merely expresses a fact
(two-hour waiting time). To infer a negative evaluation of the check-in process in the second
example, common-sense knowledge needs to be applied, namely that a two-hour waiting time
is normally considered inappropriate. In the literature this form of implicit sentiment is
referred to as objective polar utterance, evaluative fact, or polar fact [\cite{44}].

2.5.3 Implicit Aspect Mentions

Given a dictionary of relevant product aspects, it is relatively easy to identify the explicit
mentions of those aspects in a review text. It is much more difficult to discover implicit mentions.
For instance, consider the phrase “slept like rocks”. This phrase implicitly refers to the
aspect “quality of sleep”. Another example would be “the camera is too heavy”. Without explicitly
mentioning the term, the reviewer criticises the camera’s weight.

2.5.4 Relations Among Aspects

Product aspects are typically related to one another. For instance, part-of or type-of relations
between different aspects can be observed (e.g., a lens cover is part of a camera lens and the
landscape mode is a type of digital camera mode). Depending on the application scenario,
these hierarchical relations need to be made explicit in order to construct some sort of product
aspect taxonomy. Another common relation is similarity. It is reasonable to group similar
aspects and to detect synonyms, for instance by representing aspect references such as “image
quality”, “picture quality”, or “quality of image” by the single canonical form “image quality”.
Automatically grouping entities (e.g., product aspects) and determining relations between them
can be considered as a problem of ontology learning [\cite{3}].
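
Grouping aspect references under a canonical form can be sketched with a simple lookup. The synonym table below is hand-written and hypothetical; in practice such a table would have to be learned or curated, which is the ontology learning problem described above.

```python
# Hypothetical synonym table mapping surface forms to a canonical aspect.
CANONICAL_ASPECT = {
    "image quality": "image quality",
    "picture quality": "image quality",
    "quality of image": "image quality",
}


def group_aspects(mentions):
    """Count aspect mentions under their canonical form."""
    groups = {}
    for mention in mentions:
        # Unknown mentions are kept as their own (lowercased) group.
        key = CANONICAL_ASPECT.get(mention.lower(), mention.lower())
        groups[key] = groups.get(key, 0) + 1
    return groups


grouped = group_aspects(["Picture quality", "quality of image", "battery life"])
```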

2.5.5 Target-Specific Polarity

The sentiment polarity of words and phrases may depend on the modified target. For instance,
consider the adjective “long”. If it modifies the product aspect “battery life”, it refers to a positive
evaluation. However, in the context of the aspect “flash recycle time”, it would be interpreted as
negative. Most sentiment lexicons ignore this phenomenon and only consider the prior polarity
of words.
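
One way to account for this phenomenon is to consult an aspect-specific table before falling back to the prior polarity. The sketch below assumes two small hand-written tables; the entries and the function name `polarity` are illustrative only.

```python
PRIOR = {"long": 0, "great": 1, "poor": -1}  # hypothetical prior polarities
# Hypothetical overrides keyed by (opinion word, aspect).
TARGET_SPECIFIC = {
    ("long", "battery life"): 1,
    ("long", "flash recycle time"): -1,
}


def polarity(word, aspect):
    """Prefer the aspect-specific polarity; fall back to the prior polarity."""
    return TARGET_SPECIFIC.get((word, aspect), PRIOR.get(word, 0))
```

With these tables, `polarity("long", "battery life")` is positive and `polarity("long", "flash recycle time")` is negative, while a word such as "great" keeps its prior polarity for any aspect.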

2.6 Conclusion
In this chapter, the research topic of aspect-based opinion mining with stance detection is
viewed in a wider context. In particular, the chapter provided a broad overview of pre-
processing, enhanced aspect identification and opinion word extraction, aspect polarity
identification and aspect product summary. In Section 2.2, basic terminologies were defined
from the NLP perspective. In addition, it discussed the different definitions of entities in the
context of sentiment analysis, and then defined sentiment and concluded with the nature of
customer review mining. Section 2.3 introduced the task of aspect-based opinion mining. It
elaborated on the supervised and unsupervised methods of identifying and extracting aspects.
In Section 2.4, different types of sentiment classification methods were discussed and sentiment
polarity was explained in detail. Section 2.5 showed the most active applications in recent years,
which included opinion summarisation. Section 2.6 discussed the challenges and limitations.
The following chapter will illustrate and discuss the proposed method using a deep learning
model.
3.0 Research Methodology
3.1 Introduction
With the development of the Internet and the growth of e-commerce, customers increasingly post
reviews of products and share their opinions on shopping websites (e.g. Amazon, eBay, etc.),
which provides plentiful information for marketing intelligence. Product aspect mining, which
aims at extracting the aspects and corresponding opinions from product reviews, will benefit
customers and merchants by helping them make smart purchase decisions and devise efficient
marketing strategies.

Our research methodology aims to detect an aspect, i.e. an entity or target, in a given customer
review and then perform aspect-sentiment classification through stance detection of the review
with respect to that aspect. We propose an RNN-LSTM to extract aspects from a given
customer review and a multi-channel CNN for stance detection on the review text.

3.2 Proposed Architecture

A customer review can have multiple targets or entities. All the targets present in a review text
and the snippets related to them are given. The task is to find the aspect term present in each
snippet and the stance disposition of each review text towards each target. Our proposed system
has two major components: an aspect term extraction model and an aspect sentiment classification
model. First, each review snippet is passed through an enhanced RNN-LSTM based aspect term
extraction model, which recognizes the aspect present. Each target or entity present in the
review text is used to calculate an enhanced word vector. These enhanced word vectors link
stance disposition with the corresponding targets. Finally, an enhanced word vector is passed
into the CNN-based aspect sentiment classification model to generate a stance score
(favourable or against) of a review text towards the target. Figure represents a high-level picture
of our proposed architecture.
3.2.1 Customer Review Dataset
The experiment will be conducted using Hu and Liu’s dataset [\cite{3}], consisting of annotated
customer reviews of five different products (Canon G3, Apex AD2600 Progressive-scan DVD
player, Micro MP3, Nikon Coolpix 4300 and Nokia 6610 mobile phone). These reviews, written
by different customers, were collected from [Link] and [Link]. The reviews contain
3,552 sentences. Each product's review set contains more than 260 sentences that were found to
be opinionated, and the reviews were posted by 525 different customers. The datasets are
unstructured text files. To further evaluate the discovered aspects, all aspects and
associated opinions were read and labelled manually for each sentence by a human tagger.
Before using the datasets, we pass them through a pre-processing filter that removes all human
annotations and keeps the original collected reviews.

Product Name                               No. of Sentences

Canon G3                                   653
Nokia 6610                                 598
Micro MP3                                  1060
Nikon Coolpix 4300                         391
Apex AD2600 Progressive-scan DVD player    850
Total                                      3552
3.2.2 Dataset Preparation
The first task of the proposed method is to prepare and pre-process the dataset by removing all
human annotations, stop words, punctuation and abnormal symbols such as {, :), :(, ##, ...
and more. Regular expressions will be used for this purpose. It is further assumed that
product aspects and their corresponding opinions lie within individual sentence boundaries. To
extract aspects and opinions from reviews, the reviews are passed through a dependency tree
parser and Part-of-Speech (POS) tags are assigned to all words.
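
The regular-expression cleaning step can be sketched as follows. The sketch assumes that each line has the layout `annotation##sentence`, and it uses a tiny hypothetical stop word list; both are illustrative assumptions rather than the final pre-processing filter.

```python
import re

STOP_WORDS = {"the", "is", "a", "and", "it", "this"}  # small hypothetical list


def preprocess(line):
    """Strip annotations, emoticons, punctuation and stop words from one line."""
    # Assumed layout: anything before '##' is a human annotation.
    text = line.split("##", 1)[-1]
    text = re.sub(r"[:;]-?[()]", " ", text)  # emoticons such as :) and :(
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())  # punctuation and symbols
    return [tok for tok in text.split() if tok not in STOP_WORDS]


tokens = preprocess("battery[+2]##The battery is great :) and it lasts {long}...")
```

Lines without the `##` separator are processed unchanged by the first step, so the same function can be applied to raw review text.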

3.2.3 Aspect Term Extraction

Motivated by the recent achievements of deep learning, this proposal adopts recurrent neural
network (RNN) models with word embeddings. The ability of RNN-based models
to recognize long-range patterns in input data makes them a favourable choice for NLP tasks
without any feature engineering effort. To analyse the input in both the forward and backward
directions simultaneously, a bidirectional LSTM RNN is proposed for the aspect term extraction task.

Word vectors are used to represent sentences because a bag-of-words sentence representation is
not effective at capturing the semantic and positional relationships between words. Pre-trained
word embeddings are used in order to prevent overfitting of our models. One of the following
word embedding systems will be used: Stanford GloVe [\cite{53}], Google-News-Word2Vec [\cite{54}],
Godin [\cite{55}], FastText [\cite{56}] or the built-in embedding layer of Keras.

In word vector representation, each sentence is represented as a matrix in

$\mathbb{R}^{n \times d}$

where ‘n’ is the number of words in a sentence and ‘d’ is the dimension of word embedding.

As input, the proposed bidirectional LSTM RNN takes word vectors and provides as output a
probability distribution over multiple aspect classes. The class with the highest
probability is taken as the output aspect. Although the given review text has only a single
aspect per snippet, the model can be extended to multiple aspects, as shown by
[\cite{57}]. To extract multiple aspects, every class whose probability is above a
threshold ‘θ’ is treated as an aspect. Here ‘θ’ is a hyperparameter. Bayesian optimization
[\cite{58}] is used for discovering the best selection of hyperparameters.
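
The thresholding rule for multiple aspects can be sketched as below. The function name `select_aspects`, the default value of θ and the fallback to the single most probable class are assumptions made for illustration; the sketch also assumes per-class scores that need not sum to one.

```python
def select_aspects(probabilities, theta=0.5):
    """Return every aspect class whose probability exceeds theta.

    Falls back to the single most probable class when none exceeds theta
    (a design choice of this sketch, matching the single-aspect case).
    """
    chosen = [aspect for aspect, p in probabilities.items() if p > theta]
    if not chosen:
        chosen = [max(probabilities, key=probabilities.get)]
    return sorted(chosen)


multi = select_aspects({"battery": 0.7, "screen": 0.6, "price": 0.1})
single = select_aspects({"battery": 0.3, "screen": 0.2})
```

A lower θ yields more aspects per snippet at the cost of precision, which is why it is treated as a tunable hyperparameter.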

3.2.4 Aspect Sentiment Classification

Multichannel convolutional neural networks have shown state-of-the-art performance for
sentiment classification tasks [\cite{59}]. The majority of deep learning models classify
sentences into distinct classes, e.g. positive, negative and neutral, but for the task at hand
we need to provide a sentiment intensity score in the form of stance detection as [against, favour].
Rule-based models, which use a predefined intensity score for each word to calculate the stance
disposition of the entire sentence, have been proposed [\cite{3}], but they fail in complex cases
where a negative comment is communicated through the negation of positive words. Complex
situations like these are successfully handled by deep learning models.

We use a slightly modified version of the model proposed by [\cite{59}]. The output layer of our
CNN has only one neuron, with a sigmoid activation function. The proposed CNN-based
deep learning model is trained on a dataset of sentences and their respective
sentiment intensity scores. The input is given in the form of an enhanced word vector and the
output is a sentiment intensity score in the range [0,1], which is further mapped to
[against, favour] before reporting. In this model too, the choice of word embedding is a
hyperparameter, and each channel can incorporate a different word embedding. We use Bayesian
optimization [\cite{58}] for finding the best combination of hyperparameters.
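
The mapping from the single sigmoid output to a stance label can be sketched as follows. The 0.5 cutoff and the function names are assumptions for illustration; the text only states that the score in [0,1] is mapped to [against, favour].

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def stance_label(z, cutoff=0.5):
    """Map the output neuron's pre-activation z to a stance label and score."""
    score = sigmoid(z)
    # Assumed rule: scores at or above the cutoff count as 'favour'.
    return ("favour" if score >= cutoff else "against"), score


label, score = stance_label(2.0)
```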

3.2.5 Model Evaluation

To evaluate the performance of the proposed opinion mining system, different evaluation
metrics are employed and the results are compared with those of Conditional Random Fields (CRF).

1. Accuracy (Acc) for stance detection

2. Precision (Pre) for stance detection

3. Recall (Rec) for stance detection

4. FScore for stance detection

Here, TP is the number of reviews which are in favour of the target and are predicted as favour,
and FP is the number of reviews which are against the target and are predicted as favour.
Similarly, TN is the number of reviews which are against the target and are predicted as against,
and FN is the number of reviews which are in favour of the target and are predicted as against.

5. Precision for product aspect mining (e.g. aspect extraction)

6. Recall for product aspect mining (e.g. aspect extraction)
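
The stance detection metrics listed above can be computed from the counts TP, FP, TN and FN as in the following sketch. It uses the standard definitions and assumes that "FScore" refers to the balanced F1 measure; the function name `stance_metrics` and the example counts are hypothetical.

```python
def stance_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


metrics = stance_metrics(tp=40, fp=10, tn=30, fn=20)
```

The same precision and recall definitions apply to aspect extraction once TP, FP and FN are counted over extracted aspect terms instead of stance labels.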

3.2.6 Environment Setup

Table 3 and Table 4 below show the expected system configuration and technical
specifications required for the proposed research implementation.

Table 3: Hardware Specifications

Memory        Processor         Speed

8 GB RAM      Intel i5 8250U    1.80 GHz

Table 4: Software Specifications

Storage                  Software and Libraries

AWS S3 subscription      Anaconda - Jupyter Notebook; libraries: numpy, pandas, scikit,
                         matplotlib, sklearn, dlib, keras, tensorflow

100% (1)
Homi Bhabha's Third Space and African Identity
11 pages
Waveform 02 Data HC
No ratings yet
Waveform 02 Data HC
19 pages
Fathers of The Church
No ratings yet
Fathers of The Church
12 pages
Combine Result
No ratings yet
Combine Result
12 pages
Action Plan in Araling Panlipunan
No ratings yet
Action Plan in Araling Panlipunan
1 page
Previewpdf
No ratings yet
Previewpdf
52 pages

Raw Content

Uploaded by

Raw Content

Uploaded by
