Purchase data - Legend
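The dataset consists of information about the purchases of chocolate candy bars of 500 individuals from a given area when entering a physical 'FMCG' store over a period of 2 years. All data has been collected through the loyalty cards they use at checkout. The data has been preprocessed and there are no missing values. In addition, the volume of the dataset has been restricted and anonymised to protect the privacy of the customers.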
| Variable | Data type | Range | Description |
|---|---|---|---|
| ID | numerical | integer | Unique identifier of a customer |
| Day | numerical | integer | Day when the customer visited the store |
| Incidence | categorical | {0,1} | Purchase incidence |
| | | 0 | The customer has not purchased an item from the category of interest |
| | | 1 | The customer has purchased an item from the category of interest |
| Brand | categorical | {0,1,2,3,4,5} | Shows which brand the customer has purchased |
| | | 0 | No brand was purchased |
| | | 1,2,3,4,5 | Brand ID |
| Quantity | numerical | integer | Number of items bought by the customer from the product category of interest |
| Last_Inc_Brand | categorical | {0,1,2,3,4,5} | Shows which brand the customer purchased on their previous store visit |
| | | 0 | No brand was purchased |
| | | 1,2,3,4,5 | Brand ID |
| Last_Inc_Quantity | numerical | integer | Number of items bought by the customer from the product category of interest during their previous store visit |
| Price_1 | numerical | real | Price of an item from Brand 1 on a particular day |
| Price_2 | numerical | real | Price of an item from Brand 2 on a particular day |
| Price_3 | numerical | real | Price of an item from Brand 3 on a particular day |
| Price_4 | numerical | real | Price of an item from Brand 4 on a particular day |
| Price_5 | numerical | real | Price of an item from Brand 5 on a particular day |
| Promotion_1 | categorical | {0,1} | Indicator of whether Brand 1 was on promotion or not on a particular day |
| | | 0 | There is no promotion |
| | | 1 | There is promotion |
| Promotion_2 | categorical | {0,1} | Indicator of whether Brand 2 was on promotion or not on a particular day |
| | | 0 | There is no promotion |
| | | 1 | There is promotion |
| Promotion_3 | categorical | {0,1} | Indicator of whether Brand 3 was on promotion or not on a particular day |
| | | 0 | There is no promotion |
| | | 1 | There is promotion |
| Promotion_4 | categorical | {0,1} | Indicator of whether Brand 4 was on promotion or not on a particular day |
| | | 0 | There is no promotion |
| | | 1 | There is promotion |
| Promotion_5 | categorical | {0,1} | Indicator of whether Brand 5 was on promotion or not on a particular day |
| | | 0 | There is no promotion |
| | | 1 | There is promotion |
| Sex | categorical | {0,1} | Biological sex (gender) of a customer. In this dataset there are only 2 different options. |
| | | 0 | male |
| | | 1 | female |
| Marital status | categorical | {0,1} | Marital status of a customer |
| | | 0 | single |
| | | 1 | non-single (divorced / separated / married / widowed) |
| Age | numerical | integer | The age of the customer in years, calculated as current year minus the year of birth of the customer at the time of creation of the dataset |
| | | 18 | Min value (the lowest age observed in the dataset) |
| | | 75 | Max value (the highest age observed in the dataset) |
| Education | categorical | {0,1,2,3} | Level of education of the customer |
| | | 0 | other / unknown |
| | | 1 | high school |
| | | 2 | university |
| | | 3 | graduate school |
| Income | numerical | real | Self-reported annual income of the customer in US dollars |
| | | 38247 | Min value (the lowest income observed in the dataset) |
| | | 309364 | Max value (the highest income observed in the dataset) |
| Occupation | categorical | {0,1,2} | Category of occupation of the customer |
| | | 0 | unemployed / unskilled |
| | | 1 | skilled employee / official |
| | | 2 | management / self-employed / highly qualified employee / officer |
| Settlement size | categorical | {0,1,2} | The size of the city that the customer lives in |
| | | 0 | small city |
| | | 1 | mid-sized city |
| | | 2 | big city |
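
The legend can also be written down as a small schema and used to sanity-check records after loading. The following is a minimal sketch, not part of the original dataset documentation. It uses only the Python standard library and assumes each record is available as a dictionary keyed by the variable names above (for example, a row from `csv.DictReader`). The function name `validate_row`, the sample record, and the rule that a visit without a purchase must have Brand 0 are assumptions made for illustration; the allowed values and the Age and Income bounds are taken from the legend.

```python
# Allowed codes for each categorical variable, as listed in the legend.
CATEGORICAL = {
    "Incidence": {0, 1},
    "Brand": {0, 1, 2, 3, 4, 5},
    "Last_Inc_Brand": {0, 1, 2, 3, 4, 5},
    **{f"Promotion_{k}": {0, 1} for k in range(1, 6)},
    "Sex": {0, 1},
    "Marital status": {0, 1},
    "Education": {0, 1, 2, 3},
    "Occupation": {0, 1, 2},
    "Settlement size": {0, 1, 2},
}

# Observed minimum and maximum values stated in the legend.
BOUNDS = {"Age": (18, 75), "Income": (38247, 309364)}


def validate_row(row):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for column, allowed in CATEGORICAL.items():
        value = int(row[column])  # accepts ints or integer strings
        if value not in allowed:
            problems.append(f"{column}: {value} not in {sorted(allowed)}")
    for column, (low, high) in BOUNDS.items():
        value = float(row[column])
        if not low <= value <= high:
            problems.append(f"{column}: {value} outside [{low}, {high}]")
    # Assumed consistency rule: a visit without a purchase has Brand 0.
    if int(row["Incidence"]) == 0 and int(row["Brand"]) != 0:
        problems.append("Brand must be 0 when Incidence is 0")
    return problems


# Hypothetical record, invented for illustration only (not real data).
sample = {
    "ID": 1, "Day": 1, "Incidence": 0, "Brand": 0, "Quantity": 0,
    "Last_Inc_Brand": 0, "Last_Inc_Quantity": 0,
    "Price_1": 1.0, "Price_2": 1.0, "Price_3": 1.0,
    "Price_4": 1.0, "Price_5": 1.0,
    "Promotion_1": 0, "Promotion_2": 0, "Promotion_3": 0,
    "Promotion_4": 0, "Promotion_5": 0,
    "Sex": 0, "Marital status": 0, "Age": 40, "Education": 1,
    "Income": 100000, "Occupation": 1, "Settlement size": 2,
}

print(validate_row(sample))  # prints []
```

Because the Age and Income bounds are the minimum and maximum observed in this dataset, they are useful for checking that a file was loaded correctly but should not be read as general limits on valid customer data.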