
Introduction To Stats

Stats involves collecting and analyzing sample data to draw conclusions. There are two main types: descriptive stats, which analyzes and summarizes data, and inference stats, which draws conclusions from the analyzed data. Measures of central tendency indicate the central or typical value of a data set and include the mean, median, and mode; the mode is the one used for categorical variables. Measures of variability show how spread out the data is and include range, variance, and standard deviation. Variance measures how far data points are from the mean: the higher the variance, the further the values lie from the mean.

Uploaded by

Sai

What is Stats?

Stats is a domain that involves:

1 Collecting the right data (sample data)
2 Analyzing it (summarizing the data)
3 Drawing conclusions from it

e.g. Covid vaccination

Types of Stats
Descriptive Stats: analyze and summarize the data
Inference Stats: draw conclusions from the data

Descriptive Stats

Measures of Central Tendency

Mean
Median
Mode

Measures of Variability: how far is the data spread from the mean?
Range = Max - Min
Variance
Standard Deviation
Which player is more suitable to bat at the 3rd position?

Match   Virat Kohli   Ishan Kishan
1       40            60
2       50            40
3       60            20
4       45            0
5       55            50
6       47            40
7       53            5
8       48            10
9       52            35
10      50            25

Mean = Sum of all values / Number of values

Measure        Virat Kohli   Ishan Kishan
Mean/Average   50            28.5
Median         50            30
Mode           50            40
Range          20            60

Better suited for the 3rd position: Virat Kohli (higher mean, smaller range)
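
The four summary figures above can be reproduced with a short Python sketch. It assumes the scores are typed in exactly as they appear in the match table; the `summarize` helper and the variable names are illustrative and not part of the original sheet.

```python
from statistics import mean, median, mode

# Scores copied from the 10-match table (one entry per match).
kohli = [40, 50, 60, 45, 55, 47, 53, 48, 52, 50]
kishan = [60, 40, 20, 0, 50, 40, 5, 10, 35, 25]

def summarize(scores):
    """Return mean, median, mode and range for a list of scores."""
    return {
        "mean": mean(scores),                # sum of values / number of values
        "median": median(scores),            # middle of the sorted values
        "mode": mode(scores),                # most frequent value
        "range": max(scores) - min(scores),  # Max - Min
    }

kohli_summary = summarize(kohli)
kishan_summary = summarize(kishan)
print("Virat Kohli: ", kohli_summary)
print("Ishan Kishan:", kishan_summary)
```

Running it on other players only needs a new list of scores passed to `summarize`.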

When to use which measure of CT?

It depends on whether the variable/col/feature is numeric or categorical.

numeric       mean or median
              mean: when there are no extreme values
              median: when there are extreme values
categorical   mode
Gender
M
F
F
F
M
M
F
F
F
F

Mode F
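
For a categorical column like this one, the mode is simply the most frequent label. A minimal sketch, assuming the ten Gender values are entered in the order shown above (the variable names are illustrative):

```python
from collections import Counter

# Gender column copied from the example above.
gender = ["M", "F", "F", "F", "M", "M", "F", "F", "F", "F"]

# Count how often each category appears, then take the most common one.
counts = Counter(gender)
gender_mode, gender_mode_count = counts.most_common(1)[0]

print(dict(counts))
print("Mode:", gender_mode)
```

A mean or median is not meaningful here because the labels have no numeric value, which is why the mode is used for categorical data.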

Match   Virat Kohli (x)   (x - mean)^2
1       60                56.25
2       59                42.25
3       56                12.25
4       50                6.25
5       50                6.25
6       50                6.25
7       50                6.25
8       50                6.25
9       50                6.25
10      50                6.25

Mean    52.5              15.45


Range      10 (Max - Min = 60 - 50)
Variance   15.45 (mean of the squared differences from the mean)

Variance is a measure of spread: how far the data points lie from the mean.

Variance is low when more data points are close to the mean.

Variance is high when more data points are far from the mean.
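
The variance calculation can be sketched step by step in Python. This assumes, as the figures above imply, that the sum of squared differences is divided by n (population variance); `population_variance` is an illustrative helper name.

```python
from statistics import pvariance

# The ten scores used in the variance example.
scores = [60, 59, 56, 50, 50, 50, 50, 50, 50, 50]

def population_variance(values):
    """Mean of the squared differences from the mean (divides by n)."""
    m = sum(values) / len(values)
    squared_diffs = [(x - m) ** 2 for x in values]
    return sum(squared_diffs) / len(values)

variance = population_variance(scores)
print("Mean:", sum(scores) / len(scores))
print("Variance:", variance)

# Cross-check against the standard library's population variance.
print("statistics.pvariance:", pvariance(scores))
```

Note that `statistics.variance` divides by n - 1 instead (sample variance), so it would not reproduce the 15.45 shown here.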
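Scores sorted in ascending order:

Virat Kohli   Ishan Kishan
40            0
45            5
47            10
48            20
50            25
50            35
52            40
53            40
55            50
60            60

The median is the middle of the sorted values, and how it is found depends on whether n is odd or even. Here n = 10 (even), so the median is the average of the 5th and 6th values: (50 + 50) / 2 = 50 for Virat Kohli and (25 + 35) / 2 = 30 for Ishan Kishan.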
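
The odd/even rule for the median can be sketched as a small function. It assumes the usual convention: for odd n take the value at position (n + 1) / 2 of the sorted data, and for even n average the values at positions n / 2 and n / 2 + 1. The function name `median_by_position` is illustrative.

```python
def median_by_position(values):
    """Median via the odd/even position rule on the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    if n % 2 == 1:
        # Odd n: the single middle value, 1-based position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the two middle values,
    # 1-based positions n / 2 and n / 2 + 1.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

kohli = [40, 50, 60, 45, 55, 47, 53, 48, 52, 50]
kishan = [60, 40, 20, 0, 50, 40, 5, 10, 35, 25]
print(median_by_position(kohli))
print(median_by_position(kishan))
```

The input does not need to be sorted beforehand, since the function sorts a copy first.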

Salary   Salary + extreme value
10000    10000
15000    15000
35000    35000
12000    12000
18000    18000
20000    20000
11000    11000
14000    14000
10000    10000
14000    140000

         Salary   Salary + extreme value   Difference   Based on
mean     15900    28500                    12600        sum of values
median   14000    14500                    500          middle values

The mean is impacted more by extreme values than the median.
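
The effect of a single extreme value can be checked with a short sketch. It assumes the salary column above, with the last entry changed from 14000 to 140000 in the second version; the variable names are illustrative.

```python
from statistics import mean, median

# Original salaries, and the same list with the last value made extreme.
salaries = [10000, 15000, 35000, 12000, 18000,
            20000, 11000, 14000, 10000, 14000]
with_extreme = salaries[:-1] + [140000]

# How far each measure moves when the extreme value is introduced.
mean_shift = mean(with_extreme) - mean(salaries)
median_shift = median(with_extreme) - median(salaries)

print("mean:  ", mean(salaries), "->", mean(with_extreme), "shift", mean_shift)
print("median:", median(salaries), "->", median(with_extreme), "shift", median_shift)
```

The mean uses every value in the sum, so one large salary pulls it strongly, while the median only depends on the middle of the sorted list.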

Median position in the sorted data
n is odd:    middle value, at position (n+1)/2
n is even:   average of the values at positions n/2 and (n+2)/2

           Player A   Player B
           40         100
           50         0
           60         50

Mean       50         50
Variance   66.67      1666.67

Which player will have higher variance?
Player B has higher variance.
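
The comparison of the two players can be reproduced with the standard library. This sketch assumes population variance (dividing by n), which matches the figures in the table, and it also prints the standard deviation, taken here as the square root of that variance.

```python
from statistics import mean, pstdev, pvariance

player_a = [40, 50, 60]
player_b = [100, 0, 50]

# Both players have the same mean but very different spread.
var_a = pvariance(player_a)
var_b = pvariance(player_b)

for name, scores in (("Player A", player_a), ("Player B", player_b)):
    print(name,
          "mean:", mean(scores),
          "variance:", round(pvariance(scores), 2),
          "std dev:", round(pstdev(scores), 2))

higher = "Player B" if var_b > var_a else "Player A"
print("Higher variance:", higher)
```

Because the means are equal, the mean alone cannot separate the two players; the variance shows that Player B's scores lie much further from the mean.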
