
BA Notes

The document outlines various statistical methods used in business analytics, including hypothesis testing, t-tests, and ANOVA, with a focus on analyzing household income data. It discusses the importance of understanding dispersion, standard deviation, and the significance of differences in income among different groups. Additionally, it emphasizes the role of data visualization tools like Power BI and Looker Studio in presenting analytical findings.

[DAY 1 - 6/8/2024]:

After session 5: MID-TERM


Business Analytics life cycle -> Identify Problem (most important)
+ Who is your business?
+ What
+ How
Power BI
Looker Studio
[DAY 2]:
+ Mode -> the worst choice because it does not meet up to 50% of customers
+ Median/Mean -> meets up to 50% of customers -> not a good one
- The midrange => the average of Min and Max
- Dispersion: how much the data varies from the mean
+ Values close to the mean => low dispersion
+ Values far from the mean => high dispersion
=> ∑(xi − x̄) = 0 => to get a positive measure, square the deviations
- Midspread (interquartile range): the interval from Q1 to Q3
- Standard Deviation
Ex: Noodles: 80g ± 5g
- CV = std/mean => the larger it is, the higher the risk
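A minimal sketch of the measures listed above, using only Python's standard library. The noodle-pack weights are hypothetical values chosen for illustration; they are not from the lecture.

```python
import statistics

# Hypothetical noodle-pack weights in grams (illustrative, not from the notes).
weights = [75, 78, 80, 80, 82, 85]

mean = statistics.mean(weights)
midrange = (min(weights) + max(weights)) / 2

# Deviations from the mean always sum to zero, which is why they are squared.
deviations = [x - mean for x in weights]
sum_dev = sum(deviations)

# Sample standard deviation (divides by n - 1).
std = statistics.stdev(weights)

# Midspread: distance between the first and third quartiles.
q1, _, q3 = statistics.quantiles(weights, n=4)
midspread = q3 - q1

# Coefficient of variation: spread relative to the mean.
cv = std / mean

print(f"mean={mean}, midrange={midrange}, sum of deviations={sum_dev}")
print(f"std={std:.3f}, midspread={midspread:.3f}, CV={cv:.3f}")
```

Because CV divides by the mean, it lets you compare the risk of series measured on different scales.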

Note: WHY do banks usually announce the mean of employees' salaries?


+ attract talents
+ attract investors, shareholders, stakeholders and satisfy customers.
+ attract media/ public relations

CHAPTER 7:
1. ONE-SAMPLE T-TEST: x̄ vs C
x̄ income vs 60,000 (a constant)
- Step 1: Hypothesis:
+ Ho: x̄ income = 60K => There is no difference between the average household income and 60,000
+ H1: x̄ income ≠ 60K (income < 60,000 or > 60,000) => There is a difference between the average household income and 60,000
- Step 2: Methodology: One-sample t-test

- Step 3: Descriptive Statistics: N, x̄, s

A survey of 6,400 people showed that the average household income is 69,474$ and the std = 78,718$
- Step 4: Result of T-test

t = 9.629, p < 0.001. There is a significant difference between 60K and the average household income. 60K is 9,474$ smaller than the average income, and the difference is significant at the 0.001 level.
(Different or not different => if different, state whether it is smaller or bigger)
Note: If there is no significant difference, there is no need to report the mean difference. Ho stands as true => nothing more to report.
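As a rough check, the t value can be recomputed from the reported summary figures (N = 6400, std = 78,718$, mean difference = 9,474$). This is a sketch only: the p-value uses a normal approximation, which is an assumption made here to stay within the standard library, not the method SPSS uses.

```python
import math
from statistics import NormalDist

n = 6400            # sample size reported in the notes
s = 78718           # sample standard deviation reported in the notes
c = 60000           # test value (the constant)
mean_diff = 9474    # reported mean difference (sample mean minus c)

standard_error = s / math.sqrt(n)
t = mean_diff / standard_error

# Assumption: with df = n - 1 = 6399 the t distribution is nearly normal,
# so a normal approximation is used for the two-sided p-value.
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))

significant = p_approx < 0.001
print(f"t = {t:.3f}, significant at the 0.001 level: {significant}")
```

The result lands very close to the reported t = 9.629; the small gap is presumably due to rounding of the mean difference.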
2. INDEPENDENT SAMPLES T-TEST
- S1: Hypothesis:
+ Ho: Income F = Income M
+ H1: Income F ≠ Income M => state which is bigger or smaller
- S2: Methodology: Independent samples t-test
- S3: Descriptive Statistics:
+ N1, x̄1, s1 (of each group)
+ N2, x̄2, s2
- S4: Homogeneity of Variance
Note: Only compare when the 2 groups are homogeneous
F Levene = ? (through Levene's test)
● p <= 0.05 => Equal variances NOT assumed (because the variances are different) => There is a difference … (see p under Sig. (2-tailed))
● p > 0.05 => Equal variances assumed
=> Identify t = ? and whether p < 0.05
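The choice between the two rows of the t-test output can be sketched as follows. The incomes and the Levene p-value are hypothetical, the helper `independent_t` is an illustrative name, and the cutoff is assumed to be the conventional 0.05. Only t and df are computed; the p-value would still be read from the software output.

```python
import math
import statistics

def independent_t(group_a, group_b, equal_var):
    """Return (t, df) for an independent samples t-test."""
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = statistics.mean(group_a), statistics.mean(group_b)
    v1, v2 = statistics.variance(group_a), statistics.variance(group_b)
    if equal_var:
        # Pooled variance: the "Equal variances assumed" row.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Welch's version: the "Equal variances not assumed" row.
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return (m1 - m2) / se, df

# Hypothetical incomes (in thousands); not data from the notes.
female = [52, 61, 58, 70, 66, 59]
male = [64, 75, 58, 90, 81, 70]

# Hypothetical Levene p-value, as it would be read off the output table.
p_levene = 0.20
equal_var = p_levene > 0.05

t, df = independent_t(female, male, equal_var)
print(f"equal variances assumed: {equal_var}, t = {t:.3f}, df = {df:.1f}")
```

A negative t here means the first group (female) has the smaller mean, which is the "bigger or smaller" part of the report.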
Solution:
S1: Hypothesis
● Ho: There is no significant difference in average household income between male and female
● H1: There is a significant difference in average household income between male and female
S2: Methodology: Independent samples t-test
S3: Descriptive Statistics

Go to Analyze -> Compare Means -> Independent Samples T-test

S4: Homogeneity of Variance


NOTE:
● Correlation => the two-way relationship between two variables
● Independent samples t-test (2 groups) => measures the effect of one variable on the other (one-way)
● One-sample t-test (1 variable) => compares one variable with a number
Ex: Impact of Marital Status on Car => use the independent samples t-test

REVIEW:
1. One sample: x̄ vs c
2. Two-sample tests:
- xA vs xB => Independent samples t-test
- xA vs xA′ (same group but in a different context) => Paired samples t-test

3. PAIRED SAMPLES T-TEST
Step 1:
Ho: There is no significant difference between the Estimated and Actual data
H1: There is a significant difference between the Estimated and Actual data
Step 2: Methodology: Paired samples t-test
Step 3: Descriptives
N, x̄, x̄′, and the std of both variables
Note: Homogeneity is not needed because there is only 1 group
Step 4: Correlation
r = ?, p <= 0.05?
The movement of the data => whether the 2 variables move in the same direction or not
If they move together, the best case is perfect movement => correlation should be = 1
Move up or move down
Step 5: t-test
t = ?, p <= 0.05?
If p <= 0.05, report:
+ smaller or bigger (the variable with the larger mean is the bigger one)
+ Mean difference
+ Sig-level
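Steps 4 and 5 can be sketched in plain Python as below. The estimated and actual values are hypothetical (they are not the Pile Foundation data), and the p-values are left to the software output.

```python
import math
import statistics

# Hypothetical estimated vs. actual values for the same items
# (illustrative only, not the Pile Foundation data).
estimated = [10.0, 12.5, 9.0, 14.0, 11.5, 13.0]
actual = [12.0, 15.0, 10.5, 17.5, 13.0, 16.0]

n = len(estimated)
diffs = [e - a for e, a in zip(estimated, actual)]

# Paired t: the mean of the differences over its standard error.
mean_diff = statistics.mean(diffs)
t = mean_diff / (statistics.stdev(diffs) / math.sqrt(n))

# Pearson correlation, computed by hand to show what r measures.
me, ma = statistics.mean(estimated), statistics.mean(actual)
cov = sum((e - me) * (a - ma) for e, a in zip(estimated, actual))
r = cov / math.sqrt(
    sum((e - me) ** 2 for e in estimated) * sum((a - ma) ** 2 for a in actual)
)

direction = "smaller" if mean_diff < 0 else "bigger"
print(f"r = {r:.3f}, t = {t:.3f}, df = {n - 1}")
print(f"Estimated is {direction} than Actual by {abs(mean_diff):.3f} on average")
```

The sign of t follows the order of subtraction (Estimated minus Actual), so a negative t means the Estimated values are the smaller ones.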
Worked solution: Pile Foundation:
S1:
Ho: There is no significant difference between Estimated data and Actual data
H1: There is a significant difference between Estimated data and Actual data
S2: Paired samples t-test
S3: Statistics

S4: Correlation
r = 0.797, p < 0.001 => The Estimated data is highly correlated with the Actual data

S5: Result of t-test

t = -10.929, p < 0.001
=> There is a significant difference between Estimated data and Actual data
=> The Estimated data is smaller than the Actual data; the mean difference is 6.38 and it is significant at the 0.001 level.

4. ANOVA:
>= 2 groups
Step 1:
Ho: x̄A = x̄B = x̄C
H1: At least 1 significant difference
Step 2: Methodology: ANOVA
Step 3: Descriptive Statistics
F=?
p<=0.05?
Step 4: Assumption of Homogeneity of Variance
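The F value asked for in Step 3 compares variation between group means with variation inside the groups. A minimal sketch, assuming three hypothetical groups (the function name `one_way_anova_f` and the data are illustrative, not taken from the demo file):

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: spread of observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical incomes (in thousands) for three car categories; not the demo file.
economy = [20, 22, 25, 21]
standard = [40, 45, 42, 41]
luxury = [120, 140, 135, 125]

f, df_b, df_w = one_way_anova_f([economy, standard, luxury])
print(f"F({df_b}, {df_w}) = {f:.3f}")
```

A large F means the group means differ by much more than the spread within groups would explain; the p-value for that F is read from the ANOVA table.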

Exercise: Demo file


carcart (Factor) -> Income ??? => Income: Dependent list
Step 1:
Ho: There is no significant difference in average household income among the people who drive different types of car
H1: There is AT LEAST 1 significant difference in average household income among the people who drive different types of car.
Step 2: 3 car categories => use ANOVA
Step 3: Descriptive statistics
There are 1,841 respondents who drive an economy car; their average household income is 21,887$ with std = 5,241$
The highest average household income belongs to the respondents who drive a luxury car, with a mean of 134,621$ and std = 102,355$

Step 4: Assumption of Homogeneity of Variance

F Levene's test = 1129.720, p Levene's < 0.001


=> Equal variances NOT assumed
=> ANOVA table could be wrong
=> Additional test for Equality of Means (Welch test / Brown-Forsythe)
Note:
● p Levene <= 0.05 => Equal variances NOT assumed => Robust test
● p Levene > 0.05 => Equal variances assumed => ANOVA (enough, no need for the Robust test)
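The decision rule in this note, together with Welch's robust F, can be sketched as follows. The data and the Levene p-value are hypothetical, `welch_anova_f` is an illustrative helper written from the textbook formula for Welch's test, and it is not a reproduction of the SPSS routine.

```python
import statistics

def welch_anova_f(groups):
    """Return (F, df1, df2) for Welch's test of equality of means."""
    k = len(groups)
    n = [len(g) for g in groups]
    means = [statistics.mean(g) for g in groups]
    # Weight each group by n / variance, so noisy groups count for less.
    w = [len(g) / statistics.variance(g) for g in groups]
    w_total = sum(w)
    weighted_mean = sum(wi * mi for wi, mi in zip(w, means)) / w_total

    numerator = sum(wi * (mi - weighted_mean) ** 2 for wi, mi in zip(w, means)) / (k - 1)
    lam = sum((1 - wi / w_total) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    return numerator / denominator, k - 1, (k ** 2 - 1) / (3 * lam)

# Hypothetical incomes (in thousands) with very unequal spread; not the demo file.
economy = [20, 22, 25, 21, 23]
standard = [40, 48, 35, 44, 41]
luxury = [90, 180, 130, 60, 210]

p_levene = 0.001  # hypothetical Levene p-value read from the output
use_robust_test = p_levene <= 0.05

if use_robust_test:
    f, df1, df2 = welch_anova_f([economy, standard, luxury])
    print(f"Welch F({df1}, {df2:.1f}) = {f:.3f}")
```

With only two groups this statistic reduces to the square of the "Equal variances not assumed" t value, which is a handy sanity check.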

Step 5:
F Welch's test = 5752.869, p < 0.001
=> There is at least 1 significant difference in the average household income among the people who drive different types of car.
Step 6: Post-hoc test

The average household income of respondents who drive an economy car is 20,672$ smaller than that of respondents who drive a standard car. The difference is significant at the 0.001 level.
… (compare the remaining groups in the same way)

NOTE: Report only 3 decimal places
