BA Lab
BA Lab
PRACTICAL FILE
ON
BUSINESS ANALYTICS – LAB (MS 265)
1
Table of Content
2
MODULE-1
1.1 Classification of Data
[Link] CUSTOMER REGION TRANSACTION SOURCE AMOUNT PRODUCT TIME OF THE DAY
ID
1 10001 East 93816545 Web Rs20007 X 22:19
2 10002 West 74083490 Web Rs45000 Y 13:27
3 10003 North 64942368 Web Rs60000 X 14:45
4 10004 West 70560957 Email Rs11000 Y 16:20
5 10005 South 35208817 Email Rs33000 Z 19:25
6 10006 West 20978903 Email Rs26000 Z 20:22
7 10007 East 80103311 Web Rs24000 Z 11:56
8 10008 West 14132683 Email RS12000 X 9:08
9 10009 West 40128225 Email Rs14500 Y 22:12
10 10010 South 49073721 Web Rs64234 Z 8:10
11 10011 North 23456740 Web Rs90000 Z 5:12
12 10012 West 34210967 Web Rs28000 X 9:22
13 10013 East 54523109 Email Rs19984 Y 23:11
Solution:
Element Classification
Customer ID Ordinal
Region Categorial
Transaction Ordinal
Source Categorial
Amount Ratio
Product Categorial
Time of day Interval
3
1.2 Types of Data
[Link] CUSTOME REGIO TRANSACTIO SOURC AMOUN PRODUC TIME OF THE DAY
R ID N N E T T
1 10001 East 93816545 Web Rs20007 X 22:19
2 10002 West 74083490 Web Rs45000 Y 13:27
3 10003 North 64942368 Web Rs60000 X 14:45
4 10004 West 70560957 Email Rs11000 Y 16:20
5 10005 South 35208817 Email Rs33000 Z 19:25
6 10006 West 20978903 Email Rs26000 Z 20:22
7 10007 East 80103311 Web Rs24000 Z 11:56
8 10008 West 14132683 Email RS12000 X 9:08
9 10009 West 40128225 Email Rs14500 Y 22:12
10 10010 South 49073721 Web Rs64234 Z 8:10
11 10011 North 23456740 Web Rs90000 Z 5:12
12 10012 West 34210967 Web Rs28000 X 9:22
13 10013 East 54523109 Email Rs19984 Y 23:11
Solution:
Elements Classification
Customer ID Nominal
Region Nominal
Transaction Nominal
Amount Nominal
Product Ratio
Time of the date Interval
Source Nominal
4
1.3 Break Even Analysis
A firm installs 1500 AC which needs to be serviced. The firm can hire from its logistics department
at a fixed cost of Rs 100000.
Each unit will be serviced by the team at cost of Rs 300.
The firm can also outsource at a cost of Rs 400 inclusive of all charges.
1) For the given number of units compute the total cost of servicing from both options.
Which is better decision?
2) Find the break even and interpret
Outsourcing the
servicing:
5
Solution:
Quantity TC(Outsource) TC
200 80000 160000
400 160000 220000
600 240000 280000
800 320000 340000
1000 400000 400000
1200 480000 460000
1400 560000 520000
1500 600000 550000
1600 640000 580000
1800 720000 640000
800000
700000
600000
500000
400000
300000
200000
100000
0
0 500 1000 1500 2000
Interpretation: The break-even point is at 1000 units. If the number of units is less than
1000, it's cheaper to use the logistics department. If it's more than 1000 units, outsourcing becomes
the more cost-effective option.
6
1.4 Developing Model
A manufacturer of mp3 player is preparing to set the price in a new model. Demand id thought to
depend on the price and is represented by the model D=2500-3P. The accounting department
estimates that the total cost can be represented by C=5000+5D. Develop a model for the total profit
in terms of price P
C=5000+5D
C= 5000+5(2500-3P)
C=17500-15P
TC= 17500-15P
7
Chart Title
Demand Revenue cost profit
600000
500000
400000
300000
200000
100000
0
0 100 200 300 400 500 600 700 800 900 1000
-100000
-200000
-300000
Interpretation: B_E at 831 Rs. In order to have a profit the price should be within Rs.
831/-
8
MODULE-2
2.1 Lookup
1 Find the amount paid by customer ID 10011 using LOOKUP
2 Find the transaction Id of 10010 using LOOKUP
3 Find the region, transaction, source, amount, product and payment of customer
ID 10009
CUSTOMER REGION TRANSACTION SOURCE AMOUNT PRODUCT Payment
ID
10001 East 93816545 Web Rs20007 X Credit
10002 West 74083490 Web Rs45000 Y Paytm
10003 North 64942368 Web Rs60000 X UPI
10004 West 70560957 Email Rs11000 Y DEBIT
10005 South 35208817 Email Rs33000 Z Credit
10006 West 20978903 Email Rs26000 Z Paytm
10007 East 80103311 Web Rs24000 Z UPI
10008 West 14132683 Email RS12000 X Debit
10009 West 40128225 Email Rs14500 Y Credit
10010 South 49073721 Web Rs64234 Z Paytm
10011 North 23456740 Web Rs90000 Z UPI
10012 West 34210967 Web Rs28000 X Debit
10013 East 54523109 Email Rs19984 Y Credit
Solution 1: FORMULA: =vlookup,10011, table array
CUSTOMER ID AMOUNT
10011 Rs90000
Solution 2: FORMULA: =vlookup,10010, table array
CUSTOMER ID TRANSACTION
10010 49073721
Solution 3: FORMULA: =vlookup,10009, lookup vector, result vector
CUSTOMER
REGION TRANSACTION SOURCE AMOUNT PRODUCT Payment
ID
10009 West 40128225 Email Rs14500 Y Credit
9
2.2 V-Lookup
1 FROM THE GIVEN DATA, FIND OUT THE AMOUNT PAID BY CUSTOMER ID 10009
2 FROM THE GIVEN DATA, FIND OUT THE PRODUCT CHOOSEN BY CUSTOMER ID 10012
4 FROM THE GIVEN DATA, FIND OUT THE REGION, TRANSACTION, SOURCE, AMOUNT,
PRODUCT AND PAYMENT OF CUSTOMER ID 10013, 10010, 10012
customer id amount
10009 Rs14500
Solution 2: FORMULA: =vlookup,10012, table array, 6 ,0
customer id product
10012 X
10
2.3 H-lookup
1. Find the amount paid by the customer ID 10008 and 10009 using HLOOKUP
2. Find the transaction code of the customer ID 10007 using HLOOKUP
3. Find the payment method of customer ID 10005 using HLOOKUP
Customer ID 10001 10002 10003 10004 10005 10006 10007 10008 10009
Region East West North South East West North South East
Payment Credit Paytm UPI Debit Credit Paytm UPI Credit Debit
Transaction 21389147 284929 9876676 4567889 9876555 889809 334334 4565767 134788
code
Source Web Email Web Web Email Web Web Email Web
Amount RS 10000 Rs20000 Rs Rs RS Rs Rs Rs Rs
30000 40000 10000 1500 24000 28900 34677
Product X Y X Y X Y X Y X
11
2.4 What If Analysis
The following table is with respect to 20% transportation cost. Apply what if analysis with different
scenario managers with respect to Price and Profit by increasing 100$ in price and decreasing the
price by $[Link] generate scenario summary. Prepare table to find the value of price by setting
the profit value to $[Link] data table for profit with price between $100 to $1000.
Solution:
12
SCENARIO SUMMARY
Scenario Summary
Increased by decreased by
Current Values: original 100 50
Created by AMIT Created by AMIT Created by AMIT
SINGH BISHT SINGH BISHT SINGH BISHT
Modified by
WELCOME on
11/24/2023
Changing
Cells:
$B$2 $200 $250 $350 $200
Result Cells:
price $200 $250 $350 $200
profit $3,000 $5,000 $9,000 $3,000
Notes: Current Values column represents values of changing cells at
time Scenario Summary Report was created. Changing cells for each
scenario are highlighted in gray.
13
2.5 COUNT IF ANLAYSIS
1. Identify the number of iPhones, Motorola, Nokia and Samsung in the data set by using Count if
analysis
2. Find the number of customers having the amount in the specified category by using Count if
analysis.
Solution 1: formula: (count if, range, iPhone), (count if, range, motorola),
(count if, range, nokia), (count if, range, samsung)
Phone type count if
iPhone 7
Motorola 4
Nokia 5
Samsung 4
14
Solution 2: formula: (=COUNTIFS(M15:M24,">=0",M15:M24,"<1000"))
Customer Name Amount in Bank
a $0
b $500
c $4,589
d $25
e $158
f $4,525
g $526
h $2,258
i $1,129
j $5,598
Category frequency
0 upto 1000 5
1000 to 2000 1
2000 to 3000 1
3000 to 4000 0
4000 to 5000 2
5000 to 6000 1
15
2.6 Pivot Table and Pivot Chart
The worksheet contains monthly sales data for three different products product 1, product
2, product 3.
1. Figure out which on the three has been generating the most sales
Solution:
Region (All)
16
2. What were the total profit in the South region in 2016?
3. Which retailer is in the top in terms of profit?
REGION RETAILER TYPE CUSTOMER QUANTITY REVENUE PROFIT
SOUTH FOOD AND STAPLES WINN-DIXIE 1100 22000 7480
WEST MULTILINE NORDSTORM 1600 123200 43120
NORTH FOOD AND STAPLES COSTCO 1600 44800 11200
EAST
SOUTH SPECIALITY FOOTLOCKER 1700 93500 31790
MID WEST FOOD AND STAPLES WINN-DIXIE 800 48800 6832
MID WEST FOOD AND STAPLES NORDSTORM 400 9200 1288
WEST FOOD AND STAPLES FOOTLOCKER 300 56000 2100
MID WEST MULTILINE FOOTLOCKER 500 32500 1710
SOUTH MULTILINE KOHLS 1600 106400 14560
MID WEST FOOD AND STAPLES COSTCO 1400 12600 9425
NORTH SPECIALITY KOHLS 900 14400 20216
EAST
MID WEST MULTILINE COSTCO 1200 58500 3780
WEST MULTILINE KOHLS 1500 13400 3312
Solution 1:
Solution 2:
17
2.7 Conditional Formatting
1. Condition 1: Cell value greater than 75(Green Fill)
Solution:
Column1 Column2
Months Units
January 69
February 86
March 54
April 62
May 48
June 96
July 47
August 72
September 70
October 21
November 65
December 71
18
2.8 Data Validation
Apply data validation for:
Year purchased 1/1/1990 - 1/1/2020
Time 9Am-10PM
Star Rating 1,2,3,4,5
Value >100
Genre Synthpop,New Wave,Classic
Synthpop
Location Living room,Basement
Solution:
Music inventory Column1 Column2 Column3 Column4 Column5 Column6
Date
Title Purchased Time Star Rating Value Genre Location
Blume-Rise From Cassic
Grey 1/1/2019 [Link] AM 1 1000 Synthpop Basement
The Stabilizers- [Link]
Tyranny 1/2/2009 AM 2 2000 New Wave Living room
Cassic
Code 64-Storm 1/3/2019 [Link] AM 1 3000 Synthpop Basement
Code 64- [Link]
Departure 1/1/2020 PM 3 4000 New Wave Basement
BI Machine-
Hybrid 1/5/2008 [Link] PM 4 5000 Synthpop Living room
BI Machine- Cassic
Aftermath 1/6/2004 [Link] PM 1 6000 Synthpop Basement
Midnight
Resistance-
Remote 1/7/2019 [Link] PM 1 7000 New Wave Basement
Depeche Mode- Cassic
Violator 1/8/2019 [Link] PM 5 8000 Synthpop Living room
Alphaville-Forever
Young 1/9/2019 [Link] PM 5 9000 New Wave Basement
Alphaville-
Catching Rays on
Giant 1/10/2019 [Link] PM 5 10000 Synthpop Basement
Cosmicity-Escape Cassic
pod for Two 1/11/2019 [Link] PM 1 11000 Synthpop Living room
19
MODULE 3
3.1 Box Plot
A market research company asks 30 people to evaluate three brands of tablet computers using
a questionnaire. The 30 people. are divided at random into 3 groups of 10 people each, where
the first group evaluates Brand A, the second evaluates Brand B, and the third evaluates Brand
C Summaries the questionnaire scores from these groups. Prepare boxplots to identify
outliners.
Solution:
20
Brand A Brand B Brand C
Sum 7840 7560 10100
Count 10 10 10
Mean 784 756 1010
Median 725 705 905
Standard Deviation 349.1959 437.9548 386.6091911
Minimum 380 300 430
Maximum 1560 1850 1750
Range 1180 1550 1320
Skewness 1.1919217 1.856916 0.623429668
Kurtosis 1.6795012 4.636038 0.196386275
21
3.2 Normality of data
1. Check the normality of the given data:
2. Also find out:
Sum
Count
Mean
Median
Standard Deviation
Minimum
Maximum
Range
Skewness
Kurtosis
Q1(First Quartile)
Q3(Third Quartile)
IQR
Solution:
Column1 Column2
Sample Descriptive Statistic
2950
Mean 3365
4000
Standard Error 98.8966913
3300 Median 3350
3350 Mode 3350
3500 Standard Deviation 312.738798
Sample Variance 97805.5556
3550
Kurtosis 1.16008123
3500
Skewness 0.42446406
2900 Range 1100
3250 Minimum 2900
3350 Maximum 4000
Sum 33650
Count 10
mean 3365
standard 312.7388
deviation
22
Sorted sample Size
Z Score
2.5
1.5
0.5
0
0 2 4 6 8 10 12
-0.5
-1
-1.5
-2
23
MODULE-4
4.1 Simple Regression Analysis
Advertising_Expenditure Sales_Revenue
50 2.5
60 3
45 2.3
55 2.8
65 3.2
70 3.5
40 2
75 3.7
55 2.7
80 4
48 2.6
52 2.4
Solution:
24
4.2 Multiple Regression Analysis
25
4.3 Principle Component Analysis
Output:
Communalities
Initial Extraction
26
27
Rotated Component Matrixa
Component
1 2 3
Price1 .852
Price2 .756
Price3 .607
Price4 .730
Advertising_Expenses1 .787
Advertising_Expenses2 .847
Advertising_Expenses3 .703
Advertising_Expenses4 .819
Sales1 .859
Sales2 .854
Sales3 .850
Sales4 .864
28
4.4 Cluster Analysis
x y QCL_1
8.00 10.00 5.72014
20.00 2.00 5.67646
16.00 8.00 2.35702
8.00 7.00 3.75766
1.00 4.00 3.86264
13.00 10.00 5.08811
15.00 1.00 4.71405
19.00 7.00 3.59011
3.00 4.00 2.12603
3.00 2.00 3.75766
11.00 6.00 4.67856
OUTPUT:
1 2
x 4.60 15.67
y 5.40 5.67
29
ANOVA
Cluster Error
Cluster 1 5.000
2 6.000
Valid 11.000
Missing .000
30