Chapter 1 - Introduction & Overview

This document provides an introduction and overview to a course on customer analytics and A/B testing in Python. It discusses what A/B testing is, why it is important, the A/B test process, where A/B testing can be used, and how the course will progress from understanding users and key performance indicators to exploratory data analysis, A/B test design, and analyzing results. The document also provides examples of how A/B testing could be used for a meditation app to test hypotheses about subscriptions and purchases.


Course introduction and overview
Customer Analytics and A/B Testing in Python

Ryan Grossman
Data Scientist, EDO
What is A/B testing?
A/B testing: Test different ideas against each other in the real world

Choose the one that statistically performs better

CUSTOMER ANALYTICS AND A/B TESTING IN PYTHON


Why is A/B testing important?
No guessing

Provides accurate answers - quickly

Lets you iterate rapidly on ideas

...and establish causal relationships


A/B test process
1. Develop a hypothesis about your product or business

2. Randomly assign users to two different groups

3. Expose:
Group 1 to the current product rules
Group 2 to a product that tests the hypothesis

4. Pick whichever performs better according to a set of KPIs
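
The four steps above can be sketched on simulated data. This is only an illustration, not code from the course: the user IDs, the group names, and the "true" conversion rates of 10% and 12% are all made up, and the comparison at the end ignores the question of statistical significance.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Step 2: randomly assign each (hypothetical) user to one of two groups
users = pd.DataFrame({"uid": np.arange(10_000)})
users["group"] = rng.choice(["control", "treatment"], size=len(users))

# Step 3: simulate exposure; the conversion probabilities are assumptions
true_rate = users["group"].map({"control": 0.10, "treatment": 0.12})
users["converted"] = rng.random(len(users)) < true_rate.to_numpy()

# Step 4: compare the KPI (conversion rate) between the groups
kpi_by_group = users.groupby("group")["converted"].mean()
print(kpi_by_group)
```

Random assignment is what makes the two groups comparable, so a difference in the KPI can be attributed to the change being tested rather than to who ended up in each group.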



Where can A/B testing be used?
Users + ideas → A/B test

testing impact of drugs

incentivizing spending

driving user growth

...and many more!



Course progression
1. Understanding users — Key Performance Indicators

2. Identifying trends — Exploratory Data Analysis

3. Optimizing performance — Design of A/B Tests

4. Data driven decisions — Analyzing A/B Test Results



Key performance indicators (KPIs)
A/B tests: Measure the impact of changes on KPIs

KPIs: metrics important to an organization
likelihood of a side-effect

revenue

conversion rate

...



How to identify KPIs
Experience + Domain knowledge + Exploratory data analysis

Experience & Knowledge - What is important to a business

Exploratory Analysis - What metrics and relationships impact these KPIs



Next Up...
Exploratory Data Analysis (EDA)

Identify KPIs and areas for further analysis



Let's practice!

Identifying and understanding KPIs
Example: meditation app
Services
Paid subscription

In-app purchases

Goals/KPIs
Maintain high free → paid conversion rate



Dataset 1: User demographics
import pandas as pd
# load customer_demographics
customer_demographics = pd.read_csv('customer_demographics.csv')
# print the head of customer_demographics
print(customer_demographics.head())

uid       reg_date    device  gender  country  age
54030035  2017-06-29  and     M       USA      19
72574201  2018-03-05  iOS     F       TUR      22
64187558  2016-02-07  iOS     M       USA      16
92513925  2017-05-25  and     M       BRA      41
99231338  2017-03-26  iOS     M       FRA      59


Dataset 2: User actions
# load customer_subscriptions
customer_subscriptions = pd.read_csv('customer_subscriptions.csv')
# print the head of customer_subscriptions
print(customer_subscriptions.head())

uid       lapse_date  subscription_date  price
59435065  2017-07-06  2017-07-08         499
26485969  2018-03-12  None               0
64187658  2016-02-14  2016-02-14         499
99231339  2017-04-02  None               0
64229717  2017-05-24  2017-05-25         499


KPI: Conversion Rate
Conversion rate: Percentage of users who subscribe after the free trial
Of users who convert within one week? One month?...
Across all users or just a subset?
...

Choosing a KPI
Stability over time
Importance across different user groups
Correlation with other business factors


Joining the demographic and subscription data
Merging: the pandas equivalent of a SQL JOIN

In pandas:
pd.merge(df1, df2)

df1.merge(df2)
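
As a quick check that the two spellings are interchangeable, here is a small sketch on made-up frames (`df1`, `df2` and their values are illustrative only). With no `on` argument, pandas joins on the columns the two frames share, and the default join type is inner.

```python
import pandas as pd

# Toy frames (made-up values) sharing a 'uid' key
df1 = pd.DataFrame({"uid": [1, 2, 3], "device": ["iOS", "and", "iOS"]})
df2 = pd.DataFrame({"uid": [2, 3, 4], "price": [499, 0, 699]})

via_function = pd.merge(df1, df2)  # top-level function
via_method = df1.merge(df2)        # DataFrame method

# Both keep only the uids present in both frames (2 and 3)
print(via_function.equals(via_method))
```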



Merging mechanics
# merge customer_demographics (left) and customer_subscriptions (right)
sub_data_demo = customer_demographics.merge(
    # right dataframe
    customer_subscriptions,
    # join type
    how='inner',
    # columns to match
    on=['uid'])
sub_data_demo.head()

uid       reg_date    device  ...price
54030729  2017-06-29  and     ...499
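
One consequence of `how='inner'` worth keeping in mind is that users present in only one of the two tables are dropped. A minimal sketch on made-up rows (the `demo` and `subs` frames are hypothetical stand-ins for the two datasets) compares it with a left join, using the `indicator` option to show where each row came from:

```python
import pandas as pd

demo = pd.DataFrame({"uid": [1, 2, 3], "country": ["USA", "TUR", "BRA"]})
subs = pd.DataFrame({"uid": [2, 3, 4], "price": [499, 0, 499]})

# inner: only uids found in both frames survive
inner = demo.merge(subs, how="inner", on=["uid"])

# left: every row of demo is kept; unmatched rows get NaN for price
left = demo.merge(subs, how="left", on=["uid"], indicator=True)

print(len(inner), len(left))
print(left[["uid", "price", "_merge"]])
```

If the row count drops unexpectedly after a merge, comparing against a left join like this is a simple way to see which users had no match.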



Next steps
Aggregate combined dataset

Calculate the potential KPIs



Let's practice!

Exploratory analysis of KPIs
KPIs
Reminder: conversion rate is just one KPI

Most companies will have many KPIs

Each serves a different purpose


Methods for calculating KPIs
Group: pandas.DataFrame.groupby()

DataFrame.groupby(by=None, axis=0, level=None,
                  as_index=True, sort=True,
                  group_keys=True, squeeze=False, **kwargs)

Aggregate: pandas.DataFrame.agg()

DataFrame.agg(func, axis=0, *args, **kwargs)


Grouping Data: .groupby()
by : fields to group by

axis : axis=0 (the default) splits along rows, so rows are grouped by the values in the by fields; axis=1 splits along columns

as_index : as_index=True will use group labels as index

# sub_data_demo - combined demographics and purchase data
sub_data_grp = sub_data_demo.groupby(by=['country', 'device'],
                                     axis=0,
                                     as_index=False)
sub_data_grp

<pandas.core.groupby.DataFrameGroupBy object at 0x10ec29080>
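
To make the effect of `as_index` concrete, here is a small sketch on a made-up frame (`toy` and its values are illustrative, not the course data). The `axis` argument is left at its default.

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["BRA", "BRA", "CAN", "CAN"],
    "device": ["and", "iOS", "and", "and"],
    "price": [0, 499, 699, 299],
})

# as_index=True (the default): group labels become a MultiIndex
indexed = toy.groupby(by=["country", "device"])["price"].mean()

# as_index=False: group labels stay as ordinary columns
flat = toy.groupby(by=["country", "device"], as_index=False)["price"].mean()

print(indexed)
print(flat)
```

The flat form is usually easier to plot or merge back onto other tables, which is presumably why the slides use `as_index=False`.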



Aggregating data - mean price paid per group
# Mean price paid for each country/device
sub_data_grp.price.mean()

  country device       price
0     BRA    and  312.163551
1     BRA    iOS  247.884615
2     CAN    and  431.448718
3     CAN    iOS  505.659574
4     DEU    and  398.848837


Aggregate data: .agg()
Pass the name of an aggregation function to agg():

# Find the mean price paid with agg
sub_data_grp.price.agg('mean')

  country device       price
0     BRA    and  312.163551
1     BRA    iOS  247.884615
2     CAN    and  431.448718
3     CAN    iOS  505.659574
4     DEU    and  398.848837


.agg(): multiple functions
Pass a list of names of aggregation functions:

# Mean and median price paid for each country/device
sub_data_grp.price.agg(['mean', 'median'])

                      mean  median
country device
BRA     and     312.163551       0
        iOS     247.884615       0
CAN     and     431.448718     699
        iOS     505.659574     699
DEU     and     398.848837     499
        iOS     313.128000       0


.agg(): multiple functions, multiple columns
Pass a dictionary of column names and aggregation functions

# Calculate multiple metrics across different groups
sub_data_grp.agg({'price': ['mean', 'min', 'max'],
                  'age': ['mean', 'min', 'max']})

  country device       price                  age
                        mean  min  max       mean  min  max
0     BRA    and  312.163551    0  999  24.303738   15   67
1     BRA    iOS  247.884615    0  999  24.024476   15   79
2     CAN    and  431.448718    0  999  23.269231   15   58
3     CAN    iOS  505.659574    0  999  22.234043   15   38
4     DEU    and  398.848837    0  999  23.848837   15   67
5     DEU    iOS  313.128000    0  999  24.208000   15   54


.agg(): custom functions
def truncated_mean(data):
    """Compute the mean excluding outliers"""
    top_val = data.quantile(.9)
    bot_val = data.quantile(.1)
    trunc_data = data[(data <= top_val) & (data >= bot_val)]
    mean = trunc_data.mean()
    return mean

# Find the truncated mean age by group
sub_data_grp.agg({'age': [truncated_mean]})

  country device            age
                 truncated_mean
0     BRA    and      22.636364
...
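
To see what the truncation does, the same function can be run on a small made-up series with one extreme value (the `ages` values below are invented for illustration). Values outside the 10th to 90th percentile range are dropped before averaging.

```python
import pandas as pd


def truncated_mean(data):
    """Mean of the values between the 10th and 90th percentiles (inclusive)."""
    top_val = data.quantile(.9)
    bot_val = data.quantile(.1)
    trunc_data = data[(data <= top_val) & (data >= bot_val)]
    return trunc_data.mean()


ages = pd.Series([15, 20, 21, 22, 23, 24, 25, 26, 27, 95])

print(ages.mean())           # 29.8, pulled up by the single value 95
print(truncated_mean(ages))  # 23.5, after dropping 15 and 95
```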



Let's practice!

Calculating KPIs - a practical example
Goal - comparing our KPIs
Goal: Examine the KPI "user conversion rate" after the free trial

Week One Conversion Rate: Limit to users who convert in their first week after the trial ends


Conversion rate : maximum lapse date
import pandas as pd
from datetime import datetime, timedelta

current_date = pd.to_datetime('2018-03-17')

Lapse Date: Date the trial ends for a given user

# What is the maximum lapse date in our data
print(sub_data_demo.lapse_date.max())

'2018-03-17'
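
The date comparisons in this example only work if `lapse_date` and `subscription_date` hold datetime values rather than text. The loading code shown earlier does not say how the dates were parsed, so the following is a sketch under the assumption that they arrive as strings; the `raw` frame and its rows are made up.

```python
import pandas as pd

# Toy stand-in for the merged data, with dates stored as text (made-up rows)
raw = pd.DataFrame({
    "uid": [1, 2, 3],
    "lapse_date": ["2017-07-06", "2018-03-12", "2016-02-14"],
    "subscription_date": ["2017-07-08", None, "2016-02-14"],
})

for col in ["lapse_date", "subscription_date"]:
    # errors='coerce' turns missing or unparseable entries into NaT
    raw[col] = pd.to_datetime(raw[col], errors="coerce")

print(raw.dtypes)
print(raw.lapse_date.max())
```

Once the columns are datetimes, `max()`, comparisons against `current_date`, and `timedelta` arithmetic all behave as the slides assume.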



KPI calculation : restrict users by lapse date
# latest lapse date: a week before today
max_lapse_date = current_date - timedelta(days=7)
# restrict to users lapsed before max_lapse_date
conv_sub_data = sub_data_demo[(sub_data_demo.lapse_date < max_lapse_date)]
# count the users remaining in our data
total_users_count = conv_sub_data.price.count()
print(total_users_count)

2787



KPI calculation: restrict subscription date
# latest subscription date: within 7 days of lapsing
max_sub_date = conv_sub_data.lapse_date + timedelta(days=7)

# filter the users with non-zero subscription price
# who subscribed before max_sub_date
total_subs = conv_sub_data[
    (conv_sub_data.price > 0) &
    (conv_sub_data.subscription_date <= max_sub_date)
]
# count the users remaining in our data
total_subs_count = total_subs.price.count()
print(total_subs_count)

648


KPI calculation: find the conversion rate
Conversion Rate: Total Subscribers / Potential Subscribers

# calculate the conversion rate with our previous values
conversion_rate = total_subs_count / total_users_count
print(conversion_rate)

0.23250807319698599
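
The whole week-one calculation can be traced by hand on a tiny made-up dataset. The five rows below are invented so that each filter visibly removes someone; only the logic mirrors the steps above.

```python
from datetime import timedelta

import pandas as pd

current_date = pd.to_datetime("2018-03-17")

toy = pd.DataFrame({
    "lapse_date": pd.to_datetime(
        ["2018-03-01", "2018-03-02", "2018-03-05", "2018-03-08", "2018-03-15"]),
    "subscription_date": pd.to_datetime(
        ["2018-03-03", None, "2018-03-20", "2018-03-09", "2018-03-16"]),
    "price": [499, 0, 499, 499, 499],
})

# 1. keep users whose first post-trial week has fully elapsed
max_lapse_date = current_date - timedelta(days=7)
eligible = toy[toy.lapse_date < max_lapse_date]

# 2. of those, paying users who subscribed within 7 days of lapsing
max_sub_date = eligible.lapse_date + timedelta(days=7)
converted = eligible[(eligible.price > 0) &
                     (eligible.subscription_date <= max_sub_date)]

# 3. subscribers / potential subscribers
conversion_rate = len(converted) / len(eligible)
print(conversion_rate)  # 0.5
```

The user who lapsed on 2018-03-15 is excluded from the denominator because their week is not over yet, and the user who subscribed 15 days after lapsing is excluded from the numerator.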



Cohort conversion rate
# keep users who lapsed prior to the last 14 days (2 weeks)
max_lapse_date = current_date - timedelta(days=14)
# copy the filtered rows so that a new column can be added to them safely
conv_sub_data = sub_data_demo[
    (sub_data_demo.lapse_date <= max_lapse_date)
].copy()


Cohort conversion rate
Sub Time: How long it took a user to subscribe

import numpy as np

# Find the days between lapse and subscription if they
# subscribed ... and pd.NaT otherwise
sub_time = np.where(
    # if: a subscription date exists
    conv_sub_data.subscription_date.notnull(),
    # then: find how many days since their lapse
    (conv_sub_data.subscription_date - conv_sub_data.lapse_date).dt.days,
    # else: set the value to pd.NaT
    pd.NaT)
# create a new column 'sub_time'
conv_sub_data['sub_time'] = sub_time
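
As an aside, when both date columns are already datetimes, missing subscription dates propagate through the subtraction on their own, so the same column can be built without `np.where`. This is an alternative sketch on made-up rows, not the approach the slides use, and it yields `NaN` rather than `pd.NaT` for non-subscribers.

```python
import pandas as pd

toy = pd.DataFrame({
    "lapse_date": pd.to_datetime(["2017-07-06", "2018-03-12", "2016-02-14"]),
    "subscription_date": pd.to_datetime(["2017-07-08", None, "2016-02-14"]),
})

# NaT - date is NaT, and .dt.days turns NaT into NaN
toy["sub_time"] = (toy.subscription_date - toy.lapse_date).dt.days
print(toy.sub_time.tolist())
```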



Cohort conversion rate
gcr7(), gcr14(): calculate the 7 and 14 day conversion rates

# group by the relevant cohorts
purchase_cohorts = conv_sub_data.groupby(by=['gender', 'device'], as_index=False)

# find the conversion rate for each cohort using gcr7, gcr14
# (the dictionary key is the column name, so it must be the string 'sub_time')
purchase_cohorts.agg({'sub_time': [gcr7, gcr14]})

  gender device  sub_time
                     gcr7     gcr14
0      F    and  0.221963  0.230140
1      F    iOS  0.229310  0.237931
2      M    and  0.252349  0.257718
3      M    iOS  0.218045  0.225564
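
The slides do not show how `gcr7` and `gcr14` are defined. One plausible reading, offered here purely as an assumption, is "the share of users in the cohort whose `sub_time` is at most 7 (or 14) days". The sketch below implements that reading on made-up data with a numeric `sub_time` column; the real helpers may differ.

```python
import numpy as np
import pandas as pd


def gcr7(sub_time):
    """Share of the cohort that subscribed within 7 days (assumed definition)."""
    return (sub_time <= 7).sum() / len(sub_time)


def gcr14(sub_time):
    """Share of the cohort that subscribed within 14 days (assumed definition)."""
    return (sub_time <= 14).sum() / len(sub_time)


# Made-up cohort data: NaN means the user never subscribed
toy = pd.DataFrame({
    "device": ["and", "and", "and", "and", "iOS", "iOS"],
    "sub_time": [2, 10, np.nan, np.nan, 0, np.nan],
})

rates = toy.groupby("device").agg({"sub_time": [gcr7, gcr14]})
print(rates)
```

Because comparisons with `NaN` are false, non-subscribers count toward the denominator but never toward the numerator, which matches the "subscribers / potential subscribers" definition used earlier.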



How to choose KPI metrics?
Infinitely many potential KPIs

How long does it take to measure?
Monthly conversion rate = one-month wait time

Leverage exploratory data analysis
Reveals relationships between metrics and key results

Keep in mind: how do these KPIs and my business goals relate?


Why is conversion rate important?
Strong measure of growth

Potential early warning sign of problems


Sensitive to changes in the overall ecosystem



Next chapter: continue exploring conversion rates
How does this KPI evolve over time?

See how changes can impact different groups differently


Let's practice!