SAS Individual Assignment

This document is a cover page and assignment for the course INFS5730 Social Media Analytics in Practice, submitted by Shiyu Lei. It includes an analysis of customer sentiments regarding coffee products, focusing on predefined concepts like time and money, as well as custom concepts related to factors affecting customer satisfaction. The findings suggest that flavor, convenience, and price significantly influence customer preferences and satisfaction levels.

Uploaded by

clairel4752

INFS5730 Social Media Analytics in Practice

Cover page for the SAS Hands-On Assignment

Name: Shiyu LEI

Student Number: Z5467025

Course: INFS5730 Social Media Analytics in Practice

Lecturer/Tutor: Chedia Dhaoui / Nizar Hoblos & Shourong Lin

Word Count: 2268

Date Submitted: 2024/3/25

I declare that this assessment item is my own work, except where acknowledged, and has not been
submitted for academic credit elsewhere, and acknowledge that the assessor of this item may, for the
purpose of assessing this item:

• Reproduce this assessment item and provide a copy to another member of the University; and/or,
• Communicate a copy of this assessment item to a plagiarism checking service (which may then
retain a copy of the assessment item on its database for the purpose of future plagiarism checking).

I certify that I have read and understood the University Rules in respect of Student Academic
Misconduct.

Signed and Date:


Shiyu LEI
2024/3/25

This cover sheet has to be completed and signed (electronic signature is also acceptable) for the assignment
submitted. Once completed the document should be scanned/ photographed/ attached and submitted with the
assignment.

Note: Marks may be deducted for this assessment submitted without a fully completed and signed cover
page.
CRICOS Provider Code 00098G

Part. 1
1. Predefined concepts
1) Time

I choose nlpTime as my first predefined concept. What I try to discover here is the time of day
at which people like to have their coffee or post reviews.

First, I open the Concepts node. After selecting the time concept, we can see that 216 of the
4691 documents contain matching text. Then I open the Text Parsing node and sort by role and
frequency for further analysis.

As we can see from the results, “in the morning” is the most common term in the documents,
with a total of 145. Combined with the reviews, this suggests that people are most likely to
drink their coffee in the morning, since this is the time most reviews mention.
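
The kind of count reported here can be illustrated outside SAS with a minimal Python sketch. It assumes that "frequency" means the number of documents containing a phrase at least once; the sample reviews and the function name `document_frequency` are hypothetical and are not taken from the assignment data or from SAS.

```python
from collections import Counter


def document_frequency(documents, phrases):
    """Count how many documents mention each phrase at least once."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for phrase in phrases:
            if phrase in lowered:  # substring match, counted once per document
                counts[phrase] += 1
    return counts


# Hypothetical reviews for illustration only (not from the assignment data).
sample_reviews = [
    "I drink this in the morning before work.",
    "Tried it this morning and loved it.",
    "Best in the morning. I have a second cup in the morning too.",
]

time_counts = document_frequency(sample_reviews, ["in the morning", "this morning"])
for phrase, n_docs in time_counts.most_common():
    print(f"{phrase}: {n_docs} document(s)")
```

Counting each document once keeps a single enthusiastic review from inflating a phrase's rank.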

The second most frequently mentioned term is “this morning”, with a frequency of 24. We can see
that this term mostly appears in reviews posted on the same day the customers first tried
their coffee, which are mostly product reviews or first impressions. By analysing the
reviews that contain this term, we can learn how customers react to each product.

For better interpretation, I use the term “morning” (noun) to generate the term map.

Of the 4691 documents in total, this term appears in 449. The line is thickest for the
term “in the morning”, which means that this phrase has the largest number of documents
associated with the selected term, 145 documents in total. The keywords associated with
“morning” include “morning drink” (146), “work” (25 out of 75 associated with morning drink) and
“wake” (27), from which we can interpret that most customers drink coffee in the morning to
invigorate themselves or simply as a drink. Having discovered these customer usage contexts, we
can design targeted marketing plans, for example highlighting that the product is not only
flavourful but also refreshing, and that a cup of coffee is the best way to start the day.
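
The idea behind the term map, counting how many documents that contain a selected term also contain each linked term, can be sketched in Python as follows. This is only an approximation under stated assumptions: the tokenisation, the sample reviews and the function name `cooccurrence` are hypothetical, and the SAS term map may weight or filter links differently.

```python
import re
from collections import Counter


def cooccurrence(documents, anchor, candidates):
    """For documents containing `anchor`, count how many also contain each candidate."""
    counts = Counter()
    anchor_docs = 0
    for text in documents:
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        if anchor not in tokens:
            continue  # only documents with the selected term contribute
        anchor_docs += 1
        for word in candidates:
            if word in tokens:
                counts[word] += 1
    return anchor_docs, counts


# Hypothetical reviews for illustration only.
sample_reviews = [
    "Perfect morning drink before work.",
    "It helps me wake up every morning.",
    "Too bitter for an evening drink.",
]

anchor_docs, links = cooccurrence(sample_reviews, "morning", ["drink", "work", "wake"])
print(anchor_docs, dict(links))
```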

The good thing about predefined concepts is that they categorise the reviews into
different groups by term, and by looking at each term we can analyse and summarise the
pattern. However, the shortcoming is that the analysis is too loose if only this step is
conducted, and thus the results may be wrong or misleading.

2) Money

I choose nlpMoney as the second predefined concept. The purpose is to discover customers’
opinions regarding price.

After selecting the concept, we can see that there are 237 matched documents in total. For
better understanding, I go to the Text Parsing node for further analysis. Looking at the top
documents, we can see that customers compare prices in dollars per ounce and use this to
judge whether a product is worth the purchase.
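
The dollars-per-ounce comparison that reviewers make can be written out as a small worked example. The prices and weights below are hypothetical and are chosen only to show the arithmetic; they do not come from the reviews.

```python
def price_per_ounce(price_dollars, weight_ounces):
    """Return the unit price in dollars per ounce."""
    if weight_ounces <= 0:
        raise ValueError("weight must be positive")
    return price_dollars / weight_ounces


# Hypothetical offers: name -> (price in dollars, weight in ounces).
offers = {"12 oz bag": (9.00, 12), "16 oz bag": (10.00, 16)}

unit_prices = {name: price_per_ounce(p, w) for name, (p, w) in offers.items()}
best_value = min(unit_prices, key=unit_prices.get)

for name, unit in unit_prices.items():
    print(f"{name}: ${unit:.3f} per ounce")
print("Best value:", best_value)
```

The larger bag costs more in total but less per ounce, which is the kind of judgement the reviews express.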

As we can see, after sorting, the term “$2” appears in 12 documents, making it the most
frequent term.

The second most frequent term is “a pound”, with a frequency of 6. Unlike a price, this is
a unit of weight, and people use it to point out whether the price is reasonable or too
high for a pound of the coffee product.

To see customers’ opinions regarding price, I choose the noun term “price” as the keyword to
generate the term map. Of the 4691 documents, the term “price” appears in 676. As customers
place their orders on Amazon, the map shows 78 documents that link the price to Amazon. We can
also see that customers are generally satisfied with the price, as the terms “good price”,
“great price” and “reasonable price” appear most often.

However, of the 43 documents in which the term “lower” appears, the term “lower price” appears
24 times.

To sum up, customers do not judge whether a price is reasonable simply from the total
price, but also from the weight of the product and the caffeine it contains. Therefore, the
company should not only price its products based on the market price but also consider a
combination of factors.

We can learn a lot about customers’ opinions on the coffee pricing of each brand in this
step. Yet customers’ satisfaction with the products is influenced by multiple factors, and
price is only one of them, so it can provide only limited insight. If we make judgements
based on a single factor, it may lead to misinterpretation and biased conclusions.

Part. 1.2 Auto-generated Topics

The aim of conducting topic analysis is to interpret what is related to customer satisfaction and
dissatisfaction and, among all the reviews, which topics have generated the most positive or
negative sentiment.

1) Negative Product Sentiments Analysis

The purpose of analysing this topic is to find out which factors affect customer satisfaction
and lead to reviews with negative sentiment.

To find out what leads to the high number of negative sentiments, I choose the topic “pod,
cup”. As we can see, there are 562 related documents in total, and within the matched
topic there are 478 positive, 60 neutral and 89 negative sentiments.

Of the 562 matching documents, most talk about coffee pods and their use, with “pod”
having the highest relevancy to the topic.

Since the keyword “pods” has the highest relevancy to the topic, I use this term as the
starting point and try to figure out what it links to, and thus learn why it has such a high
dissatisfaction rate.

Of the 4691 documents, the term “pod” appears in 545. Among the linked terms, the brand name
“SENSEO” appears most frequently, in 315 documents, and the term “senseo pod” appears in 192
documents; the related topics include making and using the pod. Thus, we can assume not only
that customer reviews of pod coffee are generally negative, which may be because pod coffee is
generally difficult to use, but also that customers have a negative opinion of the pod coffee
made by the brand Senseo.

2) Highest Positive Product Sentiments Analysis

The topic “drink, can, illy” has the highest number of positive-sentiment documents.
Among the 727 matched reviews in total, there are 574 positive, 69 neutral and 84 negative
sentiments.
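
To compare the two topics on the same scale, the sentiment counts reported above can be converted into shares. The sketch below uses only the figures stated in the text; the function name `sentiment_shares` is hypothetical. Note that the counts reported for “pod, cup” (478 + 60 + 89) add up to 627 rather than the 562 related documents, so the shares are computed from the sum of the three counts and the mismatch is flagged rather than resolved.

```python
def sentiment_shares(positive, neutral, negative):
    """Return each sentiment's share of the summed sentiment counts."""
    total = positive + neutral + negative
    if total == 0:
        raise ValueError("no sentiment counts given")
    return {
        "positive": positive / total,
        "neutral": neutral / total,
        "negative": negative / total,
    }


# Figures as reported in the text: (positive, neutral, negative).
topics = {
    "pod, cup": {"reported_total": 562, "counts": (478, 60, 89)},
    "drink, can, illy": {"reported_total": 727, "counts": (574, 69, 84)},
}

results = {}
for name, info in topics.items():
    results[name] = {
        "shares": sentiment_shares(*info["counts"]),
        # True only when the three counts add up to the reported document total.
        "consistent": sum(info["counts"]) == info["reported_total"],
    }
    negative_share = results[name]["shares"]["negative"]
    print(f"{name}: negative share {negative_share:.1%}, "
          f"counts match total: {results[name]['consistent']}")
```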

Among all the matched documents, the terms “drink” and “can” appear most frequently, and we
can assume that this relates to the container or the size of the drink. The coffee brand
“illy” also appears frequently in the reviews, and we may conjecture that customers in
general have a higher satisfaction rate regarding other coffee brands, especially for
canned coffee.

Also, by looking at the top documents, we can see that this topic is generally about
ready_to_drink coffee, as it is the only product type that the brand illy produces. Since
“calorie”, “sugar” and the brand “illy” appear with high frequency, it is safe to say that
customers are highly satisfied with the “illy” product. This may be due to its product
design, which is convenient for customers to use, and to the product being low in calories
with the right amount of sugar.

Part 2

1. Custom concepts

1) Concept 1: Factors That Affect Satisfaction Rate For Ready-To-Drink Coffee

The purpose of building this concept follows from the topic analysis above, which shows that
price, flavor and calorie content have a combined impact on coffee products in general. Yet
for the ready-to-drink product type, I wonder whether the determining factor for consumer
purchases changes, since most consumers choose this type of product for convenience. Thus,
I intend to analyse whether price or convenience is what consumers consider most when
purchasing.

First, I create the custom concept “ready_to_drink” and build the first rule to see the
connection between illy coffee and price. To get the documents that contain both terms, I use
the concept rule type with Boolean operators, which returns matches of “price” that occur in
the same document as “illy”.

As we can see, a total of 50 documents talk about the price of illy coffee.
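
The logic of this Boolean rule, a match only when both terms occur in the same document, can be illustrated with a short Python analogue. It is not the SAS rule itself: the tokenisation, the sample reviews and the function name `contains_all` are assumptions made for illustration.

```python
import re


def contains_all(text, terms):
    """Return True when every term occurs as a whole word in the text (Boolean AND)."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return all(term in tokens for term in terms)


# Hypothetical reviews for illustration only.
sample_reviews = [
    "The illy cans are a good price on Amazon.",
    "Great price for this dark roast.",
    "I love illy espresso.",
]

matched = [doc for doc in sample_reviews if contains_all(doc, ["illy", "price"])]
print(len(matched), "document(s) mention both illy and price")
```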

Then I need to find out how many documents contain both “illy” and “convenience”. For this
analysis, I need to account for variants of the word, so I add @ after “convenience” in the
concept rule, which lets me find all the matched documents that contain similar forms of
“convenience”. Then I use the concept rule type and a Boolean operator to combine “illy” with
“convenience”.

In total, 98 documents contain both the term “convenience” and “illy”. For
ready_to_drink coffee, it is quite natural that customers prefer packaging that is easy to open
or carry. Based on the top documents, we can see that customers are mostly pleased with
the packaging design and think the cans are convenient. This can give the design team
ideas for adjusting the product design based on customer reviews.
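
A rough Python analogue of matching several forms of one word and then combining the result with “illy” is sketched below. The list of variants is written by hand and is an assumption; which forms the @ modifier actually expands to in SAS is not shown in the text. The names `build_variant_pattern` and `mentions_both` are hypothetical.

```python
import re


def build_variant_pattern(variants):
    """Compile a whole-word, case-insensitive pattern matching any listed variant."""
    # Longest variants first so that longer forms are tried before shorter ones.
    ordered = sorted(variants, key=len, reverse=True)
    alternatives = "|".join(re.escape(v) for v in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


# Hand-listed forms (an assumption, not the SAS expansion).
CONVENIENCE_FORMS = ["convenience", "conveniences", "convenient", "conveniently"]
convenience_re = build_variant_pattern(CONVENIENCE_FORMS)
illy_re = re.compile(r"\billy\b", re.IGNORECASE)


def mentions_both(text):
    """Return True when the text has a convenience form and the brand name illy."""
    return bool(convenience_re.search(text)) and bool(illy_re.search(text))


print(mentions_both("The illy can is so convenient for travel."))
print(mentions_both("Illy is tasty but pricey."))
```

The word boundaries keep the brand pattern from firing on unrelated words such as “silly”.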

2) Concept 2: Coffee Features That Mostly Lead To Customer Satisfaction

To investigate which factor is most important in affecting customer purchases, I
select 6 frequently appearing keywords from the 4691 documents: calories, richness, tasty,
sweetness (sugar), price (affordable) and convenience. First, I use the classifier concept
to pinpoint the keywords that I want. I also need to consider misspellings and synonyms, so
I use REGEX to broaden the range of terms. When analysing flavour, I need to consider that
there could be “flavorful” or other forms, so I use a concept with @ to make sure I have
the right number of matching documents. From the 4691 documents, I get 1225 matched
documents.
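
How a regular expression can broaden a keyword to cover spelling variants and related forms is shown in the following Python sketch. The pattern is an illustrative assumption and is not the REGEX rule used in the assignment, which the text does not show.

```python
import re

# Matches flavor/flavour plus a few suffixed forms (flavors, flavored, flavourful).
FLAVOR_RE = re.compile(r"\bflavou?r(?:s|ed|ful)?\b", re.IGNORECASE)


def find_flavor_terms(text):
    """Return every flavour-related form found in the text, lower-cased."""
    return [match.group(0).lower() for match in FLAVOR_RE.finditer(text)]


print(find_flavor_terms("Flavourful and rich, the flavor lasts."))
print(find_flavor_terms("Sadly flavorless."))
```

The trailing word boundary means that forms outside the listed suffixes, such as “flavorless”, are deliberately not matched.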

The keywords I choose can be divided into the following broad categories: price, flavour,
calorie intake and convenience. To see which factor has a greater influence on consumers,
I open the Text Parsing node to investigate.

First, among nouns, the keyword “flavour” appears in 1787 documents. Similarly, the phrase
“coffee flavours” appears in 224 documents, which means the term “flavour” occurs in 2011
documents.

Second, the term “calories” appears in 178 documents, and we can see from the reviews
that sugar intake is also a key factor for consumers, just not as influential.

Lastly, the keyword “price” appears in both the “Noun” and “NounGroup” roles, in 733
documents in total. In the adjective role, the terms “cheap” and “expensive” appear
173 and 104 times. In total, price-related terms appear in 1044 documents, which makes price
the second determining factor for purchase.
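
When per-term document counts are added into a factor total, as in 1787 + 224 = 2011 for flavour, a review that contains two of the terms is counted twice. Whether the totals in the text were obtained by simple addition is an assumption based on that sum. The sketch below, using hypothetical reviews and a hypothetical helper `matching_doc_ids`, shows how a summed count and a count of distinct documents can differ.

```python
def matching_doc_ids(documents, term):
    """Return the ids of documents whose text contains the term (substring match)."""
    return {doc_id for doc_id, text in documents.items() if term in text.lower()}


# Hypothetical reviews keyed by document id, for illustration only.
docs = {
    1: "Good price and not expensive at all.",
    2: "A bit expensive for what you get.",
    3: "Cheap, and the price keeps dropping.",
    4: "Smooth flavor.",
}

price_terms = ["price", "cheap", "expensive"]
per_term = {term: matching_doc_ids(docs, term) for term in price_terms}

summed = sum(len(ids) for ids in per_term.values())   # counts overlaps twice
distinct = len(set().union(*per_term.values()))       # each document once

print("summed per-term counts:", summed)
print("distinct documents:", distinct)
```

Ranking factors by distinct documents avoids overstating a factor whose terms tend to appear together.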

After identifying “flavor” as the key factor, I use the term map to see which flavor customers
prefer most, so that the R&D team can create more targeted products.

From the term map, we can see that apart from the most obvious connection, “coffee flavor”,
the term with the second most connections is “vanilla flavor”, which appears in a total of 165
documents. Besides these, vanilla associated with other terms appears 244 times and is
associated with the term “prefer”: 50 out of 55 documents contain the term “prefer” when
“vanilla” appears. From the documents matched by the custom concepts, we can already learn
that consumers prefer coffee with the right amount of sugar to cover the bitterness of the
coffee, with moderate calories. Combined with the term map, we know that vanilla is the most
popular flavor. This can give the R&D department insights on how to build future products,
and it is beneficial for marketing teams to focus their promotion on flavor.

2. Custom Categories:
1) Customer Satisfaction Category

To better analyse which factors affect customers’ satisfaction rate, and their combined
effect, I create a category that contains all the positive term combinations that I identified
earlier. The terms include “price, flavor, calorie, sugar”, and I use Boolean operators to
combine positive adjectives with these terms to generate a category with only positive
sentiments.

To group all positive terms regarding price, I put in “good price” and “reasonable price”, as
they are key indicators of customer satisfaction. For flavour, I look for keywords such as
balanced/balance that suggest positive sentiment. To retrieve documents that contain not only
both of the terms but also the individual terms, I first use the Boolean type “and” for all
matches, then change it to “or”. Considering misspellings and synonyms, the “or” Boolean
operator is used to match documents in which any of these keywords may occur.
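
One way to picture the combination of “and” and “or” described here is a category that fires when any one positive pair matches (OR), where a pair matches only if both of its words occur in the document (AND). The Python sketch below is an analogue under that assumption and is not the SAS category rule; the pair list is limited to terms named in the text, and `in_satisfaction_category` is a hypothetical name.

```python
import re

# Positive adjective and feature pairs named in the text (a partial list).
POSITIVE_PAIRS = [
    ("good", "price"),
    ("reasonable", "price"),
    ("balanced", "flavor"),
]


def in_satisfaction_category(text):
    """Return True when at least one pair has both of its words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    # OR across pairs, AND within each pair; the words need not be adjacent.
    return any(adj in words and noun in words for adj, noun in POSITIVE_PAIRS)


# Hypothetical reviews for illustration only.
for review in [
    "Reasonable price and quick delivery.",
    "Balanced flavor, a little pricey.",
    "Too bitter for me.",
]:
    print(in_satisfaction_category(review), "-", review)
```

Because the two words only have to share a document, a review such as “good coffee, bad price” would also match, so a stricter rule would need a distance or adjacency constraint.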

Out of 4691 documents, there are 1575 matched documents; each of them has assigned
terms and all have positive sentiments. By categorising documents, the company can further
study the pattern and build related products. Because these are all positive reviews,
customers are satisfied with at least one feature of the product, which enables the company
to study which features make customers happy and what its competitive advantage is, and
then to expand that advantage or highlight the selling points.

2) Comparative Analysis Categories

I want to compare coffee brands with one another and see how customers feel about these coffee
products. By comparing, coffee brands can learn where they are doing well and where they
are falling short, which can help the company better analyse its product
design.

From the previous analysis, the brand “illy” focuses on producing ready-to-drink coffee and,
based on the documents, most customers tend to compare its coffee products to Starbucks.
From the 469 matched documents, we can see that most customers have a positive opinion
of “illy” coffee, especially regarding flavor and sugar intake. Many matched documents
state that they found illy “not too sweet” and “contain only half of the calorie” compared
with Starbucks’s products.
