University of Mumbai
Program – Bachelor of Engineering in
Computer Science and Engineering (Artificial Intelligence
and Machine Learning)
Class - T.E.
Course Code – CSDLO5011
Course Name – Statistics for Artificial Intelligence and Data Science
By
Prof. A.V.Phanse
Module 3 - Statistical Experiments and Significance Testing
Statistical Experiments
A statistical experiment is a process or set of procedures that generate data,
allowing you to observe the outcome of a random variable.
Experiments are designed to test hypotheses by manipulating one or more
independent variables to observe their effect on a dependent variable.
Steps in Conducting an Experiment:
1. Define the Hypothesis: Establish a clear, testable hypothesis (null and
alternative).
2. Design the Experiment: Plan how to manipulate variables and control
extraneous factors.
3. Collect Data: Perform the experiment and gather data.
4. Analyze Data: Use statistical methods to analyze the data.
5. Draw Conclusions: Determine whether the results support or reject the
hypothesis.
A/B Testing
An A/B test is an experiment with two groups to establish which of two
treatments, products, or procedures is superior to the other. A/B tests are
common in web design and marketing.
A/B testing is used to compare two versions of a webpage, app, or other digital
product to determine which one performs better in terms of a specific metric,
like conversion rate, click-through rate, or user engagement.
When you run an A/B test, you compare one page (the control) against one or
more variations that each contain one major difference in an element of the
control page.
After a set amount of time, or number of visits, you compare the results to see
how the change affected your chosen metric.
Every visitor will see one version of the page or another, and you’ll measure
conversions from each set of visitors.
A/B tests allow you to test one version of copy, images, forms, etc. against
another.
How A/B Testing Works:
Create Two Variants (A and B):
Variant A is typically the current version (the control).
Variant B is the modified version (the treatment).
Randomly Split the Audience:
Users are randomly assigned to either Variant A or Variant B. This
randomization helps ensure that any differences in performance are due to
the changes made, rather than external factors.
Measure Performance:
Performance of both versions is tracked against the desired metric (e.g., sign-
ups, purchases etc.).
Analyze Results:
After collecting enough data, statistical analysis is performed to determine
whether the differences in performance between A and B are statistically
significant. This helps in deciding whether the new version (B) should replace
the current version (A).
To conclude this example (an illustrative test of an orange versus a green button):
It appears quite likely that the “A” variant (orange button) has a higher
conversion rate than the “B” variant (green button).
Decision: Keep the orange button
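The “analyze results” step is usually a two-proportion z-test on the conversion rates. Below is a minimal Python sketch (assuming SciPy is available); the visitor and conversion counts for the orange and green buttons are hypothetical, not taken from the slides.

```python
# Minimal sketch: two-proportion z-test for an A/B test.
# The counts below are hypothetical illustration values.
from math import sqrt
from scipy.stats import norm

n_a, conv_a = 1000, 120   # variant A (orange button, control): visitors, conversions
n_b, conv_b = 1000, 90    # variant B (green button, treatment)

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)               # pooled rate under H0 (no difference)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se
p_value = 2 * (1 - norm.cdf(abs(z)))                   # two-tailed p-value

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the two conversion rates differ significantly.")
else:
    print("Fail to reject H0: no significant difference detected.")
```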
Applications of A/B Testing:
Web Design: Testing different layouts, button colors, or call-to-action text.
Marketing: Comparing different ad copy or email subject lines.
Product Development: Assessing new features or changes to existing ones.
Significance Testing
Significance testing is a statistical method used to determine if the observed
results in a sample are strong enough to infer that they apply to a larger
population.
A sample is used to test a condition and make a statement about the whole
population.
Steps of Hypothesis Testing
Hypotheses can be framed as follows:
1. Null Hypothesis (H₀): states that there is no effect, no difference, or no association.
2. Alternative Hypothesis (H₁): states that an effect, difference, or association exists.
The alternative hypothesis can take several forms:
Difference Hypothesis – predicts a difference between groups.
Example – students taught with method A score differently from students taught with method B.
Association Hypothesis – predicts a relationship between variables.
Example – hours of study are related to exam scores.
Undirected (non-directional) Hypothesis – states that a difference or association exists without specifying its direction; it is tested with a two-tailed test.
Example – the average study time of freshmen is different from 20 hours per week.
Directed (directional) Hypothesis – specifies the direction of the difference or association (greater than or less than); it is tested with a one-tailed test.
Example – the average study time of freshmen is more than 20 hours per week.
A hypothesis is a proposed explanation or prediction about a phenomenon or
a relationship between variables, often used as a starting point for further
investigation.
It is typically framed in a way that can be tested through experimentation,
observation, or analysis.
There are two main types of hypotheses:
Null Hypothesis (H₀): This assumes that
there is no relationship between the
variables being studied, or no effect or
difference exists. Researchers usually
test against the null hypothesis to see if
it can be rejected.
Alternative Hypothesis (H₁ or Ha): This
proposes that there is a relationship,
effect, or difference between the
variables being studied. It is the opposite
of the null hypothesis.
Hypothesis testing is a method for testing a claim about a parameter in a
population using data measured in a sample.
Hypothesis Testing is a type of statistical analysis in which you put your
assumptions about a population parameter to the test.
It is used to assess the relationship between two statistical variables.
Few examples of statistical hypothesis from real-life -
1. A teacher assumes that 60% of his college's students come from lower-middle-
class families.
2. A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for
diabetic patients.
In "classical" inferential statistics, it is always the null hypothesis that is
tested with a hypothesis test, i.e., the hypothesis that there is no difference
or no relationship.
Strictly speaking, a hypothesis test can only ever reject or fail to reject the
null hypothesis H0.
The non-rejection of H0 is not a sufficient reason to conclude that H0 is true.
Therefore, the wording "H0 was not rejected" is preferable to "H0 was retained."
Developing Null and Alternative Hypotheses
In statistical hypothesis testing, there are always two hypotheses.
The hypothesis to be tested is called the null hypothesis and given the symbol
H0.
The null hypothesis states that there is no difference between a hypothesized
population mean and a sample mean.
For example, if we were to test the hypothesis that college freshmen study 20
hours per week, we would express our null hypothesis as:
H₀: μ = 20
In this example, our alternative hypothesis would express that freshmen do not
study 20 hours per week:
H₁: μ ≠ 20
Deciding Whether to Reject or Not Reject the Null Hypothesis
The alternative hypothesis can be supported only by rejecting the null
hypothesis.
To reject the null hypothesis means to find a large enough difference between
your sample mean and the hypothesized (null) mean.
If the difference between the hypothesized mean and the sample mean is very
large, we reject the null hypothesis.
If the difference is very small, we do not reject the null hypothesis.
In each hypothesis test, we have to decide in advance what the magnitude of
that difference must be to allow us to reject the null hypothesis.
If we fail to find a large enough difference to reject, we fail to reject the null
hypothesis.
Why is there a probability of error in a hypothesis test?
Each time you take a sample, you of course get a different one, which means
that the results are different every time.
In the worst case, a sample is taken that happens to deviate very strongly from
the population and the wrong statement is made.
Therefore there is always a probability of error for every statement or
hypothesis.
Level of significance
The probability that the test statistic will fall in the critical region when the
null hypothesis is true is called the level of significance, denoted α.
The significance level is used to decide whether the null hypothesis should be
rejected or not.
If the p-value is smaller than the significance level, the null hypothesis is to be
rejected; otherwise, it is not to be rejected.
It is important to note that the significance level is always set before the test
and may not be changed afterwards in order to obtain the "desired" statement
after all. To ensure a certain degree of comparability, the significance level is
usually 5% or 1%.
If a significance level of 5% is set, it means that it is 5% likely to reject the null
hypothesis even though it is actually true. Similarly, If a significance level of 1% is
set, it means that it is 1% likely to reject the null hypothesis even though it is
actually true.
p-value
In hypothesis testing, the p-value is a probability that measures the strength of
evidence against the null hypothesis.
It tells us how likely it is to observe a test statistic as extreme as (or more
extreme than) the one obtained, assuming that the null hypothesis is true.
Assumption –
In the population, there is no difference in salary between men and women.
p-Value –
How likely is it to draw a sample in which the salaries of men and women differ
by more than 250 Euros?
A hypothesis test can be either one-tailed or two-tailed.
Two-tailed Hypothesis Tests
The examples discussed in previous slides indicate that the average study time
is either 20 hours per week, or it is not. Computer use averages 3.2 hours per
week, or it does not. We do not specify whether we believe the true mean to be
higher or lower than the hypothesized mean. We just believe it must be
different.
In a two-tailed test, you will reject the null hypothesis if your sample mean falls
in either tail of the distribution.
For this reason, the alpha level (let’s assume 0.05) is split across the two tails.
The curve shows the critical regions for a two-tailed test. These are the regions
under the normal curve that, together, sum to a probability of 0.05.
Each tail has a probability of 0.025.
The z-scores that designate the start of the critical region are called the critical
values.
If the sample mean taken from the population falls within these critical regions,
or "rejection regions," we would conclude that there was too much of a
difference and we would reject the null hypothesis.
However, if the mean from the sample falls in the middle of the distribution (in
between the critical regions) we would fail to reject the null hypothesis.
One-Tailed Hypothesis Test
We would use a single-tail hypothesis test when the direction of the results is
anticipated or we are only interested in one direction of the results.
When performing a single-tail hypothesis test, our alternative hypothesis looks
a bit different. We use the symbols of greater than or less than.
A single-tail hypothesis test also means that we have only one critical region
because we put the entire critical region into just one side of the distribution.
When the alternative hypothesis is that the sample mean is greater, the critical
region is on the right side of the distribution.
When the alternative hypothesis is that the sample mean is smaller, the critical
region is on the left side of the distribution.
Critical Values –
Level of significance      1%          5%          10%
Two-tailed test            ±2.576      ±1.96       ±1.645
Right-tailed test          +2.326      +1.645      +1.282
Left-tailed test           −2.326      −1.645      −1.282
For 5% level of significance in case of a two tailed test, the shaded area under
the curve on both tails is considered the critical region.
Since this is a two-tailed test, half of 5% i.e. 2.5% of the values would be in
the left tail, and the other 2.5% would be in the right tail.
Looking up the Z-score associated with 0.025 on a reference table, we find
1.96.
Therefore, +1.96 is the critical value of the right tail and -1.96 is the critical
value of the left tail.
The critical value for a 95% confidence level is Z = +/−1.96.
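These critical values can be reproduced from the standard normal quantile function; a minimal sketch (assuming SciPy is available):

```python
# Minimal sketch: reproducing the z critical values in the table above.
from scipy.stats import norm

for alpha in (0.01, 0.05, 0.10):
    two_tail = norm.ppf(1 - alpha / 2)   # two-tailed critical value (use +/-)
    one_tail = norm.ppf(1 - alpha)       # right-tailed critical value (negate for left tail)
    print(f"alpha = {alpha:.2f}: two-tailed ±{two_tail:.3f}, one-tailed {one_tail:.3f}")
```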
Type I and Type II Errors
When we decide to reject or not reject the null hypothesis, we have four possible
scenarios:
a. A true hypothesis is rejected. (Type I error)
b. A true hypothesis is not rejected. (correct decision)
c. A false hypothesis is not rejected. (Type II error)
d. A false hypothesis is rejected. (correct decision)
Rejecting a true null hypothesis (a) is a Type I error; failing to reject a false
null hypothesis (c) is a Type II error.
The probability of committing a Type I error is denoted by the symbol α (alpha),
which is typically set at a significance level, such as 0.05.
This means there’s a 5% risk of rejecting the null hypothesis when it should not
be rejected.
Understanding Type I errors is crucial in research, as they can lead to misleading
conclusions.
The probability of committing a Type II error is denoted by the symbol β (beta).
Committing type II error means failing to detect an effect or difference that truly
exists.
The power of a test is the probability of correctly rejecting a false null
hypothesis, which is calculated as 1−β. Higher power means a lower chance of
making a Type II error.
Minimizing Type II errors is crucial for ensuring that real differences or effects
are identified.
Selection of test statistic
Sample size     Population standard deviation known     Population standard deviation unknown
n ≥ 30          z-test                                   z-test
n < 30          z-test                                   t-test
When conducting a hypothesis test, we are asking ourselves whether the
information in the sample is consistent, or inconsistent, with the null
hypothesis about the population.
We follow a series of four basic steps:
1. State the null and alternative hypotheses.
2. Select the appropriate significance level and check the test assumptions.
3. Analyze the data and compute the test statistic.
4. Interpret the result
If we reject the null hypothesis, we are saying that the difference between the
observed sample mean and the hypothesized population mean is too great to be
attributed to chance.
When we fail to reject the null hypothesis, we are saying that the difference
between the observed sample mean and the hypothesized population mean is
small enough to be attributable to chance if the null hypothesis is true.
Example A
A researcher claims that black horses are, on average, more than 30 lbs heavier
than white horses, which average 1100 lbs. What is the null hypothesis, and what
kind of test is this?
Solution :
The null hypothesis would be notated H0 : µ ≤ 1130 lbs
This is a right-tailed test, since the tail of the graph would be on the right.
Example B
A package of gum claims that the flavor lasts more than 39 minutes. What would be
the null hypothesis of a test to determine the validity of the claim? What sort of
test is this?
Solution
The null hypothesis would be notated as H0 : µ ≤ 39.
This is a right-tailed test, since the rejection region would consist of values greater
than 39.
z test
A z-test is a statistical test used to determine whether there is a significant
difference between a sample statistic (such as a sample mean) and a population
parameter (such as a population mean) or between two sample statistics.
The test is based on the z-statistic, which measures how many standard
deviations a sample statistic is from the population parameter.
When to Use z-Test:
Large Sample Size (n ≥ 30): The sample size should be sufficiently large (at
least 30).
Known Population Standard Deviation (σ): The population standard deviation
is known.
Types of z-Tests:
1. One-Sample Z-Test
•Purpose: To test whether the mean of a single sample is significantly different
from a known population mean.
•When to Use: When you have one sample and want to compare its mean to a
known population mean, with a known population variance.
•Example: You want to determine if the average weight of apples from one city is
different from the national average weight.
Formula:
z = (x̄ − μ) / (σ / √n)
where x̄ is the sample mean, μ the hypothesized population mean, σ the known
population standard deviation, and n the sample size.
2. Two-Sample Z-Test
Purpose : To test if the means of two independent samples are significantly
different from each other
When to Use: When you want to compare the means of two groups, assuming
both groups have known population variances and are independent of each other.
Example: You want to compare the average heights of men and women in a
population to see if there is a significant difference.
Formula:
z = (x̄₁ − x̄₂) / √(σ₁²/n₁ + σ₂²/n₂)
where x̄₁ and x̄₂ are the two sample means, σ₁² and σ₂² the known population
variances, and n₁ and n₂ the sample sizes.
Example: One-Sample Z-Test
Suppose the average weight of apples in a population is μ=150 grams with a
standard deviation of σ=20 grams. You collect a sample of 40 apples and find the
sample mean weight is 155 grams. At a 5% significance level, does this sample
provide enough evidence to suggest that the average apple weight is different
from 150 grams?
Solution :
H₀: μ = 150   H₁: μ ≠ 150 (two-tailed test, α = 0.05, critical values ±1.96)
z = (155 − 150) / (20/√40) = 5 / 3.162 ≈ 1.58
Since |z| = 1.58 < 1.96, the test statistic does not fall in the critical region, so
we fail to reject the null hypothesis.
Conclusion: There is not enough evidence to suggest that the average apple weight
is different from 150 grams.
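A minimal Python check of this calculation, using the same numbers (assuming SciPy is available):

```python
# Minimal sketch: one-sample z-test for the apple-weight example.
from math import sqrt
from scipy.stats import norm

mu0, sigma = 150, 20      # hypothesized mean and known population standard deviation
n, x_bar = 40, 155        # sample size and sample mean

z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = 2 * (1 - norm.cdf(abs(z)))        # two-tailed p-value

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
# z ≈ 1.58 and p ≈ 0.11 > 0.05, so we fail to reject H0 at the 5% level.
```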
Example: Two-Sample Z-Test
Compare the average test scores of two classes to see if their average scores differ
significantly.
Solution :
Step 5: Conclusion
There is not enough evidence to suggest that the average test scores of Class A
and Class B are significantly different at the 5% significance level.
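The original class data are not reproduced on the slide, so the sketch below uses made-up summary statistics purely to illustrate the two-sample z-test formula:

```python
# Minimal sketch: two-sample z-test with hypothetical class statistics.
from math import sqrt
from scipy.stats import norm

x1, sigma1, n1 = 75.0, 10.0, 50   # Class A: sample mean, known population std dev, size (hypothetical)
x2, sigma2, n2 = 72.0, 12.0, 60   # Class B (hypothetical)

z = (x1 - x2) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)
p_value = 2 * (1 - norm.cdf(abs(z)))

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
# With these made-up numbers |z| < 1.96, so H0 would not be rejected,
# matching the slide's conclusion of "not enough evidence".
```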
Example A
The school nurse thinks the average height of 7th graders has increased. The
average height of a 7th grader five years ago was 145 cm with a standard deviation
of 20 cm. She takes a random sample of 200 students and finds that the average
height of her sample is 147 cm. Are 7th graders now taller than they were before?
Conduct a single-tailed hypothesis test using a .05 significance level to evaluate the
null and alternative hypotheses.
Solution :
First, we develop our null and alternative hypotheses:
H₀: μ ≤ 145   H₁: μ > 145
Choose α = .05.
The critical value for this one-tailed test is z = 1.645; a z-score of 1.645 cuts off
5% in the single (right) tail.
Any test statistic greater than 1.645 will be in the rejection region.
Next, we calculate the test statistic for the sample:
z = (147 − 145) / (20/√200) = 2 / 1.414 ≈ 1.414
The calculated z-score of 1.414 is smaller than 1.645 and thus does not fall in the
critical region.
Our decision is to fail to reject the null hypothesis and conclude that a sample
mean of 147 is likely to have occurred by chance.
Example B
A farmer is trying out a planting technique that he hopes will increase the yield on
his pea plants. The average number of pods on one of his pea plants is 145 pods
with a standard deviation of 100 pods. This year, after trying his new planting
technique, he takes a random sample of 144 plants and finds the average number
of pods to be 147. He wonders whether or not this is a statistically significant
increase. What are his hypotheses and the test statistic?
Solution :
First, we develop our null and alternative hypotheses:
H₀: μ ≤ 145   H₁: μ > 145
The alternative hypothesis uses > since he believes that there might be a gain in
the number of pods.
Next, we calculate the test statistic for the sample of pea plants:
z = (147 − 145) / (100/√144) = 2 / 8.33 ≈ 0.24
If we choose α = 0.05, the critical value will be 1.645 for a one-tailed test.
We will reject the null hypothesis if the test statistic is greater than 1.645.
The test statistic of 0.24 is less than 1.645, so our decision is to fail to reject the
null hypothesis.
Based on our sample, we do not have evidence that the new planting technique has
increased the mean number of pods.
t test
A t-test is a statistical test used to compare the means of two groups or a
sample to a population when the population variance is unknown.
Unlike the Z-test, the t-test is more commonly used when the sample size is
small (n < 30) and the population standard deviation is not known.
Types of t-tests:
1. One-Sample t-Test:
Purpose: To test if the mean of a single sample is significantly different from a
known or hypothesized population mean.
When to Use: When you want to compare the mean of one sample to a known
population mean, but the population standard deviation is unknown.
Example: Testing whether the average weight of apples in a sample differs from a
hypothesized population mean of 150 grams.
Formula:
t = (x̄ − μ) / (s / √n),   with df = n − 1
where s is the sample standard deviation.
2. Independent Two-Sample t-Test:
•Purpose: To compare the means of two independent groups to determine if there
is a statistically significant difference between them.
•When to Use: When you have two independent samples (e.g., different groups of
people) and want to test whether their means are significantly different.
•Example: Comparing the average test scores of two different classes to see if there
is a significant difference in performance.
Formula (assuming equal variances):
t = (x̄₁ − x̄₂) / ( sₚ √(1/n₁ + 1/n₂) ),   where   sₚ² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²] / (n₁ + n₂ − 2)   and   df = n₁ + n₂ − 2
3. Paired t-Test:
•Purpose: To compare the means of two related groups, such as measurements
taken from the same group at different times.
•When to Use: When you have paired or matched samples, like before-and-after
measurements or the same participants undergoing two treatments.
•Example: Measuring the weight of individuals before and after a diet program to
test if there is a significant change in weight.
Formula:
t = d̄ / (s_d / √n),   with df = n − 1
where d̄ is the mean of the paired differences, s_d their standard deviation, and
n the number of pairs.
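A minimal sketch of a paired t-test in Python (assuming SciPy); the before/after weights below are hypothetical:

```python
# Minimal sketch: paired t-test on hypothetical before/after diet weights (kg).
from scipy.stats import ttest_rel

before = [82, 75, 90, 68, 77, 85, 92, 70]
after  = [79, 74, 86, 67, 75, 81, 88, 70]

t_stat, p_value = ttest_rel(before, after)
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
# A p-value below 0.05 would suggest a significant change in weight.
```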
Back in the early 1900s, William Sealy Gosset, a chemist at a brewery in Ireland,
discovered that when he was working with very small samples, the distributions
of the mean differed significantly from the normal distribution.
He noticed that as his sample sizes changed, the shape of the distribution
changed as well.
He published his results under the pseudonym ‘Student’ and this concept and
the distributions for small sample sizes are now known as “Student’s
t−distributions.”
The differences between the t-distribution and the normal distribution are more
exaggerated when there are fewer data points, and therefore fewer degrees of
freedom.
Degrees of freedom are essentially the number of observations that are free to
vary without changing the sample mean.
If you were conducting a two-tailed hypothesis test on a sample of 25 students,
your df = 25-1 = 24
Example A
The high school athletic director is asked if football players are doing as well
academically as the other student athletes. We know from a previous study that the
average GPA for the student athletes is 3.10. After an initiative to help improve the
GPA of student athletes, the athletic director randomly samples 20 football players
and finds that the average GPA of the sample is 3.18 with a sample standard
deviation of 0.54. Is there a significant improvement? Use a 0.05 significance level.
Solution :
Step 1: Clearly state the null and alternative hypotheses.
H₀: μ = 3.10   H₁: μ ≠ 3.10
Step 2: Identify the appropriate significance level and confirm the test assumptions.
We were told that we should use a 0.05 significance level. The size of the sample
also helps here, as we have 20 players. So, we can conclude that the assumptions
for the single sample t-test have been met.
Step 3: Analyze the data
We use our t-test formula:
t = (3.18 − 3.10) / (0.54/√20) = 0.08 / 0.121 ≈ 0.66
We know that we have 20 observations, so our degrees of freedom for this test is 19.
Nineteen degrees of freedom at the 0.05 significance level gives us a critical value of
± 2.093.
Step 4: Interpret your results
Since our calculated t-test value is lower than our t-critical value, we fail to
reject the Null Hypothesis.
Therefore, the average GPA of the sample of football players is not significantly
different from the average GPA of student athletes.
Thus, the athletic director can conclude that the mean academic performance
of football players does not differ from the mean performance of other student
athletes.
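A minimal Python check of this example, working from the summary statistics above (assuming SciPy is available):

```python
# Minimal sketch: one-sample t-test from summary statistics (GPA example).
from math import sqrt
from scipy.stats import t as t_dist

mu0, x_bar, s, n = 3.10, 3.18, 0.54, 20

t_stat = (x_bar - mu0) / (s / sqrt(n))          # ≈ 0.66
t_crit = t_dist.ppf(1 - 0.05 / 2, df=n - 1)     # ≈ 2.093 for df = 19

print(f"t = {t_stat:.2f}, critical value = ±{t_crit:.3f}")
# |t| < 2.093, so we fail to reject H0, as in the worked solution.
```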
Example B
Duracell manufactures batteries that the CEO claims will last an average of 300
hours under normal use. A researcher randomly selected 20 batteries from the
production line and tested these batteries. The tested batteries had a mean life
span of 270 hours with a standard deviation of 50 hours. Do we have enough
evidence to suggest that the claim of an average lifetime of 300 hours is false?
Solution :
Step 1: Clearly state the Null and Alternative Hypothesis
H₀: μ = 300   H₁: μ ≠ 300
Step 2: Identify the appropriate significance level and confirm the test
assumptions.
We’ll use the standard significance level of 0.05, and we assume a normal
population distribution.
Step 3: Analyze the data and compute the test statistic
t = (270 − 300) / (50/√20) = −30 / 11.18 ≈ −2.68
We know that we have 20 batteries, so our degrees of freedom for this test is
(20 − 1) = 19.
Nineteen degrees of freedom at the 0.05 significance level gives us a critical
value of ± 2.093.
Step 4: Interpret your results
Since the absolute value of our calculated t statistic (2.68) is greater than our
t-critical value (2.093), it lies in the critical region; therefore, we reject the
Null Hypothesis.
The average battery life of the sample is significantly different from the average
battery life claimed by the CEO. Therefore, the claim of an average lifetime of 300
hours is not supported.
Example : Independent Two Sample Test
A researcher wants to determine if two different diets have different effects on
weight loss. The researcher takes a random sample of 10 people from each group:
Group 1 (Diet A): Their weight losses in pounds are: 5, 7, 6, 9, 8, 4, 7, 5, 6, 8
Group 2 (Diet B): Their weight losses in pounds are: 8, 10, 6, 9, 12, 11, 9, 10, 8, 11
The researcher wonders if there is a significant difference in the average weight loss
between Diet A and Diet B.
Solution :
Null hypothesis (H₀): There is no difference in the means of the two groups.
H0:μ1=μ2
Alternative hypothesis (H₁): There is a difference in the means of the two groups.
H1:μ1≠μ2
This is a two-tailed test since we are testing for any difference between the
means, not specifically an increase or decrease.
Sample means: x̄₁ = 6.5 (Diet A) and x̄₂ = 9.4 (Diet B); sample variances: s₁² = 2.5 and s₂² ≈ 3.16.
Pooled variance: sₚ² = (9 × 2.5 + 9 × 3.16) / 18 ≈ 2.83, so sₚ ≈ 1.68.
t = (6.5 − 9.4) / (1.68 × √(1/10 + 1/10)) ≈ −2.9 / 0.75 ≈ −3.86
With df = 10 + 10 − 2 = 18, the two-tailed critical value at α = 0.05 is 2.101.
Since |t| ≈ 3.86 is greater than the critical value of 2.101, we reject the null hypothesis.
There is significant evidence to suggest that the average weight loss between the two diets is
different.
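A minimal Python check of this example using SciPy's pooled (equal-variance) two-sample t-test on the data given above:

```python
# Minimal sketch: independent two-sample t-test for the diet example.
from scipy.stats import ttest_ind

diet_a = [5, 7, 6, 9, 8, 4, 7, 5, 6, 8]
diet_b = [8, 10, 6, 9, 12, 11, 9, 10, 8, 11]

t_stat, p_value = ttest_ind(diet_a, diet_b, equal_var=True)
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
# |t| ≈ 3.86 exceeds the critical value 2.101 (df = 18), so H0 is rejected.
```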
Chi square test
The Chi-Squared test is a statistical method used to examine the relationships
between categorical variables.
It compares observed results with expected outcomes to determine whether
differences between these two are due to chance or if they signify a statistically
significant pattern.
There are two primary types of Chi-Squared tests:
1. Chi-Squared Test for Independence
•Purpose: Determines if there is a significant association between two categorical
variables.
•Example: You could test if there is an association between gender (male, female)
and voting preference (candidate A, candidate B).
Test statistic:
χ² = Σ (Oᵢ − Eᵢ)² / Eᵢ
where Oᵢ is the observed frequency and Eᵢ is the expected frequency.
The degrees of freedom (df) for this test are calculated as:
df = (r − 1)(c − 1)
where r is the number of rows and c is the number of columns in the contingency table.
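A minimal sketch of the test for independence in Python (assuming SciPy); the gender/voting counts below are hypothetical:

```python
# Minimal sketch: chi-squared test of independence on a hypothetical 2x2 table.
from scipy.stats import chi2_contingency

# Rows: male, female; columns: candidate A, candidate B (hypothetical counts)
observed = [[40, 60],
            [50, 50]]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, df = {dof}, p-value = {p_value:.3f}")
# A p-value below 0.05 would indicate an association between gender and voting preference.
```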
2. Chi-Squared Goodness-of-Fit Test
Purpose: Tests whether a sample data matches a population with a specific
distribution.
Example: Testing if a die is fair by comparing the observed outcomes with the
expected frequencies (equal probability for each face).
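A minimal sketch of the die example in Python (assuming SciPy); the observed roll counts are made up:

```python
# Minimal sketch: chi-squared goodness-of-fit test for a fair die.
from scipy.stats import chisquare

observed = [18, 22, 16, 14, 19, 31]         # hypothetical counts for faces 1-6 (120 rolls)
expected = [sum(observed) / 6] * 6          # 20 per face if the die is fair

chi2, p_value = chisquare(observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.3f}")
# A small p-value (< 0.05) would suggest the die is not fair.
```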
Important things to note about the chi-square test:
At the 5% level of significance, if the calculated chi-square value is less than the
tabulated (critical) chi-square value, the null hypothesis is not rejected; if it
exceeds the tabulated value, the null hypothesis is rejected.
Fisher's Exact Test
Fisher's Exact Test is a statistical significance test used to determine if there are
nonrandom associations between two categorical variables in a contingency
table, typically 2x2.
It is particularly useful when sample sizes are small, and the assumptions of the
more common Chi-square test might not be valid.
Key Features:
Exact: Unlike the Chi-square test, which is an approximation, Fisher's Exact Test
calculates the exact probability of obtaining a given distribution of the data,
assuming the null hypothesis is true.
Small Sample Sizes: It's especially valuable when dealing with small sample sizes
(e.g., when an expected cell count is less than 5) because it doesn't rely on
large-sample approximations.
Example Use:
Suppose you are testing whether a new treatment works better than a standard
one:
Fisher's Exact Test can determine if the success rate of the new drug is
significantly different from that of the standard treatment, without needing large
sample sizes.
How It Works:
It computes the probability of the observed contingency table under the null
hypothesis by calculating hypergeometric probabilities for all possible tables with
the same marginal totals and compares them with the observed table.
When to Use:
When you have a small sample size (e.g., fewer than 5 expected observations in
some cell of the table).
When you need to evaluate the significance of associations between two
categorical variables.
Example:
Let’s say you have a table that compares the outcome of a new drug versus a
placebo for patient recovery:
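The original table is not reproduced here, so the sketch below uses hypothetical drug/placebo recovery counts to show how the test is run in Python (assuming SciPy):

```python
# Minimal sketch: Fisher's Exact Test on a hypothetical 2x2 recovery table.
from scipy.stats import fisher_exact

#            recovered  not recovered
table = [[8, 2],    # new drug (hypothetical counts)
         [3, 7]]    # placebo

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p-value = {p_value:.4f}")
# A p-value below 0.05 would indicate a significant association between treatment and recovery.
```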
Thank You…