Marketing Analytics MBusA MBS 2023 Practice Questions Solution ©Nico Neumann
Practice questions class 1 - Solution
Question A-a
We need to do two things for this kind of question:
1) Consider which of the few tests we covered is possible/ most appropriate.
2) Work out how to get the data into the format we need for this test in R (here it’s basically about
transforming proportions to frequencies or vice versa).
Solution:
1) We have one column of data: a Chi-square goodness-of-fit test is required (we compare our
given data to a theoretical distribution, here the age tier distribution from the census).
2) We need to find the proportions that we would expect, i.e. calculate the census proportions
that we need to provide to the test in R.
3) We are given proportions for our observed data, but need frequencies.
We carry out a Chi-square goodness-of-fit test, which results in a test statistic of 12.485 (df = 5,
p-value = 0.02872). We can reject the Null hypothesis that the two distributions are equal. In other
words, the customer age distribution is significantly different from the Census age distribution in
Australia.
Step 1: Obtain expected proportions
census_total <- 20258833 # total census sample size/ population
prop_18_24 <- 2392093 / census_total
….
Step 2: Calculate observed frequencies
observed_18_24 <- 0.11*5500
observed_25_34 <- round(0.195*5500,0)
…
Step 3: Calculate Chi-square goodness of fit
See also R-code for details.
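The three steps above can be sketched end to end as follows. This is a minimal sketch and not the course's R file: only the 18-24 census count, the census total, the first two observed proportions and the sample size of 5500 come from the text. Every value marked PLACEHOLDER is invented so that the code runs, which means the output will not reproduce the reported statistic of 12.485.

```r
# Step 1: expected proportions from the census.
# Only the first count and the total appear in the text; the other five
# counts are PLACEHOLDERS chosen so that they add up to the stated total.
census_total  <- 20258833
census_counts <- c(2392093, 3700000, 3500000, 3300000, 3000000, 4366740)
expected_prop <- census_counts / census_total

# Step 2: observed frequencies from observed proportions (n = 5500 customers).
# The first two proportions are from the text; the last four are PLACEHOLDERS.
n_customers   <- 5500
observed_prop <- c(0.11, 0.195, 0.18, 0.17, 0.15, 0.195)
observed_freq <- round(observed_prop * n_customers, 0)

# Step 3: Chi-square goodness-of-fit test of the observed counts against
# the expected (census) proportions. Six categories give df = 5.
gof <- chisq.test(x = observed_freq, p = expected_prop)
print(gof)
```

Note that `chisq.test()` expects counts in `x` and probabilities that sum to 1 in `p`, which is why the observed proportions are converted to frequencies first.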
Question A-b
This could be a typical open-ended question, similar to what you could face for an exam. Please note
that I will often leave out specific details to assess how you think through the process and recognise
business/ marketing related points that are missing.
Our framework: Order of issues we address may vary/ depend on your preference or the problem.
What theory, framework or best practices should be considered?
• What marketing issue is mentioned? Branding/ advertising? Pricing? Product development?
Segmentation? Metrics? …
• What important concepts and theories were discussed in the respective lecture on the
weekly topic?
• Can you apply a specific topic-related framework (e.g. 4Ps)?
Theoretical considerations: We know the information is supposed to be used for segmentation.
What we don’t know is the type and purpose of the segmentation and how the manager wants to
use our test information - for product decisions or ad campaigns? Usually, segmentation would be
used to find attractive customers that match the product strengths and to help differentiate from
competitors (strategic purposes) or to communicate with customers that may be interested in our
product (tactical purposes).
This could make a difference. We could choose to consider each scenario (for product
development vs. for communication) or just presume one situation (e.g. just assuming it’s for
marketing communication) if this matters.
Analytical considerations: We find evidence against the null hypothesis that our age groups are
identical to the Australian Census population in terms of age, but how reliable is our applied test?
There are only a few assumptions for a Chi-square test (which is a non-parametric test) and no
obvious issues can be raised here. We could note that a Chi-square test only tells us that the two
distributions are different, but not by how much! We have a large sample (greater power) and if
we look closely at the proportions, then we can see that absolute percentage differences are only
minor between our customers and the census distribution.
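One way to put a number on "by how much" is an effect size. The sketch below uses Cohen's w, which is not part of the solution above and is brought in here only as an illustration; it assumes the sample size of 5500 used in Step 2 and the reported test statistic.

```r
# Cohen's w for a Chi-square goodness-of-fit test: w = sqrt(chi-square / n).
# chi_sq and n are taken from the solution text; the use of Cohen's w is an
# illustrative assumption, not something the course solution prescribes.
chi_sq <- 12.485   # reported test statistic
n      <- 5500     # customer sample size used to build the frequencies

cohens_w <- sqrt(chi_sq / n)
round(cohens_w, 3)
```

The result is roughly 0.05, which is below the value of 0.1 that Cohen's conventions label a small effect. This is consistent with the point that the large sample, not a large difference, is what makes the result significant.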
HD level answer:
It would be fair to question how much we can learn as a business from the finding that the
two distributions are different: we are only looking at associations and have not
controlled for confounders.
Even better: Make suggestions of what could be done to provide a more causal analysis.
Contextual considerations: We can build on the previous bullet point (how much we can learn as a
business from the finding that the two distributions are different) and raise that some additional
important contextual information is unclear:
• We don't know the product category – perhaps the age distribution is not significantly
different for our type of product versus the census population?
• We don’t know the regions we operate in – is the Australian population the best
benchmark for us? Maybe our customers are all in Western Australia?
• How consistent are the age group results in our database? Is this a yearly average?
Monthly?
HD level answer – adding a bit of marketing expertise:
Why would it matter to be different in terms of age from the census distribution? Is a
demographic feature like age a great segmentation feature to use (in particular, if we don’t
know the purpose of the segmentation)?
Why would age matter for our (unknown) product/ service unless it’s something that only
a certain age group can use (but then the shown age distribution would not make sense)?
Finally, we need to make an overall recommendation or conclusion:
At this stage (before getting clarifications on the points above), we would best recommend the
manager not to use the information for further decision guidance.
Please note: The order of our 3 main considerations is not important. The exact structure is also
less relevant (e.g. in case you mention causal interpretation concerns under theoretical
considerations instead of analytical ones). Just use some structure and try to think through all
issues.
Question B
We could carry out a one-sample proportion test or a Chi-square goodness of fit test (testing our
customers against the theoretical Australian e-commerce proportions). However, we need to
recognise that the sample size seems too small to assume a normal distribution (e.g. all counts
must be greater than 10 for a one-sample z-test).
Therefore, we need an exact binomial test (comparing our proportion with the 89% baseline),
which results in a p-value of 0.04051. This result suggests that the two proportions are statistically
different (we have enough evidence to reject the Null hypothesis that these are equal).
(see R code file)
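The reasoning for Question B can be sketched as follows. The counts are PLACEHOLDERS, because the question's data are not reproduced here, so the p-value will differ from the reported 0.04051. Only the 89% baseline and the "counts greater than 10" rule of thumb come from the text.

```r
# PLACEHOLDER data: the real counts are in the question, not in this solution.
n_customers <- 60   # hypothetical sample size
n_success   <- 48   # hypothetical number of customers with the characteristic
p0          <- 0.89 # baseline proportion from the text

# Rule of thumb for the one-sample z-test: expected successes and expected
# failures should both exceed 10.
expected_counts <- c(success = n_customers * p0,
                     failure = n_customers * (1 - p0))
normal_ok <- all(expected_counts > 10)

# The rule fails here (expected failures are below 10), so we use the
# exact binomial test instead of the normal approximation.
exact <- binom.test(x = n_success, n = n_customers, p = p0)
exact$p.value
```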
Question C
We have more than two levels for our two factor variables. The possible (given what we covered
in our class) tests are either a Chi-square test of independence or a Fisher’s test of independence.
We don’t need the latter (all cells>5), which can also be RAM intensive. Hence, a Chi-square test is
fine. With a resulting Chi-square statistic of 12.088 (df=6) and p-value = 0.06003, we cannot reject
the Null hypothesis as we don’t find enough evidence (based on our standard threshold) that
there is an association between loyalty program tiers/ memberships and the four locations.
(see R code file)
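A minimal sketch of the Question C workflow is shown below. The table is a PLACEHOLDER: three loyalty tiers are assumed because df = 6 with four locations implies (3 - 1) x (4 - 1), and all counts and labels are invented, so the output will not match the reported statistic of 12.088. The cell-size check is applied to the expected counts, which is the usual formulation of the rule.

```r
# PLACEHOLDER contingency table: 3 loyalty tiers x 4 locations.
loyalty_by_location <- matrix(
  c(30, 25, 20, 28,
    22, 27, 31, 24,
    18, 20, 26, 29),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    tier     = c("Tier 1", "Tier 2", "Tier 3"),
    location = c("Location A", "Location B", "Location C", "Location D")
  )
)

# Chi-square test of independence between tier and location.
indep <- chisq.test(loyalty_by_location)

# If every expected cell count is above 5, the Chi-square approximation is
# usually considered adequate and Fisher's exact test is not needed.
cells_ok <- all(indep$expected > 5)

indep$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
indep$p.value
```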
Question D
This question is similar to the syndicate task in class 1. What would make site visits an ideal KPI?
Our general structural guide:
• Contextual information: We don’t know the type of business. Site visits is a digital metric
used for online business. Hence, one assumption would be that the business is only
online (or that site visits must be only one of at least two KPIs otherwise – we would need another
KPI for the offline side/ brick-and-mortar stores). Moreover, the text mentions site visits
alone. One interesting question that is left open is the time window to look at: site visits
per day, hour, week or month? This needs to be determined, ideally motivated by some
theoretical argument (see point 1 below).
• Analytical considerations: No statistical test or method applied here (but you could
mention some points from the discussion below here).
• Theoretical considerations: This is a variant of a specific task we practiced, with a specific
theoretical framework we can apply. We need to use the four characteristics:
Ideal KPI characteristics:
o Indicator of future success
o Tied to one executive/ department/ team
o Reported frequently
o Hard to game/ fake
1. Indicator of success
Does a site visit constitute an indicator of success for a key result? The assumption seems
reasonable as a site visit is a required step (for an online business!) for customers to
eventually buy a product. The link/ impact seems strictly positive: The more people visit a
website, the more future conversions/ purchases we should have. That’s what we want
from a great KPI.
We also need to keep in mind for which team or department our proposed KPI metric is
supposed to be a success indicator. Here we only know about people in charge of
advertising campaigns. But is the goal of ad campaigns to trigger a site visit (=are ads a
precursor of site visits)?
One can certainly argue that nudging people to the website (or store) is all that is expected
from the marketing-communications team (because then it’s not up to them anymore
what customers do – if the website design is bad, customers may bounce/ leave).
From a causal perspective: We are looking for a mediator of a KRI (Key Result Indicator),
where we have as many precursors in our control as possible.
Overall, we could say that using the metric site visits would satisfy this KPI criterion.
2. Tied to one group
How do we know that a website visit was due to the work of the marketing team in charge
of ad campaigns? This criterion is problematic. Customers could visit the website
because of many different reasons. For example, they may have heard from someone else
about the product. They could just google products (in search engines). They could also
directly enter the website URL.
So we could argue that this criterion is (at best) partly met, while it’s probably safer to say
that it is not met! One could argue either way.
3. Reported frequently
Modern website software allows tracking traffic in real time (this may be new to some of
you, but it’s something you should know after completing this subject). Hence, it is
possible to report this metric for daily/ weekly/ monthly data. Criterion met.
4. Hard to game
This criterion is always one that requires some deeper thinking (recall the rat massacre
KPI that backfired).
The question is whether site visits is a bulletproof (= impossible to fake) metric.
In other words, can I somehow increase site visits but still damage/ not help the business?
Unfortunately, there are several creative ways of how I could do this. If only the number
of total site visits counts, then I could just find creative ways to have people enter the
website and then leave again. I could ask friends, visit every day myself, or find advanced
ways to redirect traffic to the website (with online users who will likely never buy
anything).
(if you wanted you could even mention malware programs at this point – creating bots to
visit websites).
In short, it seems that site visits can be manipulated and we are not meeting this criterion.
Overall conclusion:
Site visits meets 2 (or maybe 3, if we count one as partly met) of the 4 ideal KPI criteria. Meeting two criteria is not too bad as a single
metric rarely meets all criteria. Hence, we can see why it’s fairly popular as a KPI but it’s not an
ideal one that meets all criteria.
High Distinction consideration for this exercise:
To score high for such a standard question that we practice in class, you can try to demonstrate
further reasoning (showing that you thought through the problem) or integrate
marketing/ digital-specific knowledge.
Some examples –BTW: this is just supposed to give you some ideas. There is no suggestion
anyone is supposed to write or know about all these points!
• Can we further improve the metric? We could highlight that (total) site visits is composed of
two measures: unique visitors × repeated visits. It would be prudent to track both elements.
• One issue is that we don’t know which marketing department would obtain credit when
using [total] site traffic as a KPI. Maybe we can create a special site visit KPI metric where we
remove direct site visits and Google searches (or other traffic sources)?
• Alternatively, because (total) site visits cannot be completely tied to one team/ executive
only, we could mention that site visits would be a good criterion for the entire marketing team
(but not necessarily only for those in charge of the ad campaigns).
• Can we make this KPI fake-proof? We could highlight that we should explore a way to remove
‘poor or bot traffic’.
o Maybe we can filter out users who leave within 10 seconds (never browsing our
website and showing true intentions)? Maybe there is technology to filter out
problematic traffic/ malware etc. and create a metric called ‘human site visits per week’.
o Alternatively, are there guardrail metrics we could consider? For example, we could
track the ratio of people visiting the website/ people buying (or similar actions). If the
ratio changes too much, we know we get bad traffic. Reminder: Guardrail metrics are
tracked as a precaution and just flag problems for further review.
o Another interesting question is whether costs should be added as another
denominator/ scaling factor?
• We could notice that we need to determine a time dimension as well (site visits over an hour,
day or week?). Given our context, what is a reasonable time to allow customers to come to
the website after seeing an ad? This is not easy to answer. Often advertising may not affect us
immediately and may work subconsciously. We may come back to a product website at a later
stage because of an ad. Hence, it seems reasonable not to choose too short a time horizon to
assess any effects of advertising. One could argue to use weekly or monthly data, though
monthly may be a problem for KPI criterion 3 (= reported frequently).
• An interesting question from web-analytics: Is it technically even feasible to track how many
people are visiting a website? Are there measurement errors? For example, cookies don’t
represent people, but just browsers. Hence, if someone uses multiple devices (mobile, laptop)
or browsers, then this metric would be wrong.
• Finally, remember: The whole exercise is much easier if you add/ consider a causal diagram to
identify or visualise the factors of interest and their relationships.
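Two of the ideas above, splitting total site visits into its components and tracking a guardrail ratio, can be illustrated with a small sketch. Everything here is hypothetical: the visit log, the column names and the threshold are invented for illustration. The guardrail is written as purchases per visit, i.e. the inverse of the visits/ buyers ratio mentioned above.

```r
# Hypothetical visit log: one row per site visit (all values invented).
visits <- data.frame(
  visitor_id = c("a", "a", "b", "c", "c", "c", "d"),
  purchased  = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)
)

# Decompose total visits into unique visitors and visits per visitor.
total_visits       <- nrow(visits)
unique_visitors    <- length(unique(visits$visitor_id))
visits_per_visitor <- total_visits / unique_visitors

# Guardrail metric: share of visits that end in a purchase.
purchase_rate <- mean(visits$purchased)

# Hypothetical threshold; in practice it would come from historical data.
guardrail_threshold <- 0.10
flag_for_review <- purchase_rate < guardrail_threshold
```

If total visits rise while `purchase_rate` drops below the threshold, the flag only signals that the extra traffic deserves a closer look; it does not by itself prove that the metric was gamed.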