STAT PRELIM - Hemal

The document contains statistical calculations and analyses, including mean and standard deviation of age distribution, correlation coefficients, binomial and Poisson probabilities, and hypothesis testing using Chi-square and t-tests. It also covers concepts such as population vs. sample, Type I and II errors, critical regions, and the power of tests. Various statistical problems are solved with detailed steps and final answers provided.

Uploaded by

bhomaleprajwal19

Name: Vedant Mundada

Roll no: SEBD23201
Division: B

STAT PRELIM PAPER

Q.1 a) Calculate the mean and standard deviation for the following table giving the age
distribution of 542 members.

Age (in years)    20 - 30   30 - 40   40 - 50   50 - 60   60 - 70   70 - 80   80 - 90
No. of members    3         61        132       153       140       51        2

Ans:

Mean = x̄ = Σ fi xi / Σ fi
and S.D. = σ = sqrt[(Σ fi xi² / Σ fi) − x̄²]

Age Group (years)   Midpoint xi   No. of Members fi   fi xi    fi xi²
20 - 30             25            3                   75       1875
30 - 40             35            61                  2135     74725
40 - 50             45            132                 5940     267300
50 - 60             55            153                 8415     462825
60 - 70             65            140                 9100     591500
70 - 80             75            51                  3825     286875
80 - 90             85            2                   170      14450
Total               -             542                 29660    1699550

x̄ = 29660 / 542 ≈ 54.72 years

σ = sqrt((1699550 / 542) - (29660 / 542)^2)
σ = sqrt(3135.70 - 2994.63)
σ = sqrt(141.07)
σ ≈ 11.88 years

• Mean Age = 54.72 years
• Standard Deviation = 11.88 years
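
As a cross-check on the hand arithmetic, the following minimal Python sketch recomputes the column totals, the mean and the standard deviation directly from the class midpoints and frequencies in the table. The function name `grouped_mean_sd` is only an illustrative choice, and the population form of the variance (dividing by N) is assumed, as in the formula above.

```python
import math

# Class midpoints and frequencies copied from the age table above.
midpoints = [25, 35, 45, 55, 65, 75, 85]
frequencies = [3, 61, 132, 153, 140, 51, 2]


def grouped_mean_sd(x, f):
    """Return (N, sum f*x, sum f*x^2, mean, SD) for grouped data.

    Uses the population form: variance = sum(f*x^2)/N - mean^2.
    """
    n = sum(f)
    sum_fx = sum(fi * xi for xi, fi in zip(x, f))
    sum_fx2 = sum(fi * xi ** 2 for xi, fi in zip(x, f))
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2
    return n, sum_fx, sum_fx2, mean, math.sqrt(variance)


n, sum_fx, sum_fx2, mean, sd = grouped_mean_sd(midpoints, frequencies)
print(f"N = {n}, sum f*x = {sum_fx}, sum f*x^2 = {sum_fx2}")
print(f"mean = {mean:.2f} years, SD = {sd:.2f} years")
```

Running it prints the totals of the fi xi and fi xi² columns alongside the two summary statistics, so each intermediate figure can be compared with the table.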

Q.1 b) In a partially destroyed laboratory record of an analysis of correlation data, the following
results only are legible: Variance of X = 9. Regression equations: 8X - 10Y + 66 = 0, 40X - 18Y = 214.
Find: i) The mean values X and Y, ii) The correlation coefficient between X and Y, and iii) The
standard deviation of Y.

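Ans: Variance of X = 9; regression equations: 8X - 10Y + 66 = 0 and 40X - 18Y = 214.

i) Mean values

Both regression lines pass through the point (X̄, Ȳ), so the means satisfy both equations:

8X̄ - 10Ȳ = -66
40X̄ - 18Ȳ = 214

Multiplying the first equation by 5 and subtracting it from the second gives 32Ȳ = 544, so Ȳ = 17. Substituting back, 8X̄ = 10(17) - 66 = 104, so X̄ = 13.

ii) Correlation coefficient

Take the first equation as the regression line of Y on X and the second as the regression line of X on Y:

Y = 0.8X + 6.6, so bYX = 8/10 = 0.8
X = 0.45Y + 5.35, so bXY = 18/40 = 0.45

(The opposite assignment would give a product of regression coefficients of 2.22 × 1.25 > 1, which is impossible.)

r² = bYX × bXY = 0.8 × 0.45 = 0.36
r = +0.6 (positive, because both regression coefficients are positive)

iii) Standard deviation of Y

σX = sqrt(9) = 3
bYX = r × σY / σX, so σY = bYX × σX / r = 0.8 × 3 / 0.6 = 4

• Mean of X = 13
• Mean of Y = 17
• Correlation Coefficient (r) = 0.6
• Standard Deviation of Y = 4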
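
The same quantities can be checked numerically. This is a minimal sketch, assuming the variance of X is 9 and that the first equation is the regression line of Y on X and the second the line of X on Y. It relies on the standard relations r² = bYX × bXY and bYX = r σY / σX; the variable names are illustrative only.

```python
import math

# Regression lines written as a*X + b*Y = c.
a1, b1, c1 = 8.0, -10.0, -66.0    # 8X - 10Y + 66 = 0
a2, b2, c2 = 40.0, -18.0, 214.0   # 40X - 18Y = 214
var_x = 9.0                       # variance of X (assumed reading of the question)

# The means are the intersection of the two lines (Cramer's rule).
det = a1 * b2 - a2 * b1
mean_x = (c1 * b2 - c2 * b1) / det
mean_y = (a1 * c2 - a2 * c1) / det

# Regression coefficients under the assumed assignment of the lines.
b_yx = -a1 / b1   # slope of Y on X, from the first line
b_xy = -b2 / a2   # slope of X on Y, from the second line
assert b_yx * b_xy <= 1.0, "product of regression coefficients cannot exceed 1"

# r carries the common sign of the two regression coefficients.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# From b_yx = r * sd_y / sd_x.
sd_x = math.sqrt(var_x)
sd_y = b_yx * sd_x / r

print(f"mean X = {mean_x:.2f}, mean Y = {mean_y:.2f}")
print(f"r = {r:.2f}, SD of Y = {sd_y:.2f}")
```

The internal `assert` guards the choice of which line is Y on X: with the lines swapped, the product of the two slopes would exceed 1.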

Q.3) a) Assume that on average, out of 15 telephone numbers called between 2 PM and 3 PM on
weekdays, a certain number are busy. What is the probability that, when 6 randomly selected telephone
numbers are called: i) not more than 3 are busy, ii) at least 3 are busy?

Ans:

Step 1: Given Data

Number of calls (n) = 6
Probability of a busy line (p) = (number busy) / 15

The question as reproduced above does not state how many of the 15 numbers are busy. The working below assumes 1 busy number in 15:

p = 1 / 15 ≈ 0.0667
Probability of a free line (q) = 1 - p = 14 / 15 ≈ 0.9333

Since each call is independently either busy or free, the number of busy lines X follows a binomial distribution, with probability formula:

P(X = k) = nCk × p^k × q^(n-k)

Step 2: Compute Probability for "Not More Than 3 Busy"

P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)

Using binomial expansion:

X          Binomial Probability P(X)
0          0.6610
1          0.2833
2          0.0506
3          0.0048
P(X ≤ 3)   0.9997

Step 3: Compute Probability for "At Least 3 Busy"

P(X ≥ 3) = 1 - P(X ≤ 2)
= 1 - (0.6610 + 0.2833 + 0.0506)
= 1 - 0.9949
= 0.0051

Final Answers (assuming p = 1/15)

P(X ≤ 3) = 0.9997
P(X ≥ 3) = 0.0051
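
The binomial sums can be evaluated with a short Python sketch. The question as reproduced does not give the number of busy lines out of 15, so the value `p = 1 / 15` below is an assumption made purely for illustration; substitute the intended count to obtain the matching probabilities. The helper name `binom_pmf` is hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n = 6
p = 1 / 15  # ASSUMPTION: 1 busy number in 15; not stated in the question text

pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

p_at_most_3 = sum(pmf[:4])        # P(X <= 3)
p_at_least_3 = 1 - sum(pmf[:3])   # P(X >= 3) = 1 - P(X <= 2)

for k in range(4):
    print(f"P(X = {k}) = {pmf[k]:.4f}")
print(f"P(X <= 3) = {p_at_most_3:.4f}")
print(f"P(X >= 3) = {p_at_least_3:.4f}")
```

Because the two events overlap at X = 3, the two probabilities need not sum to 1.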

Q.3) b) If the probability that an individual suffers a bad reaction from a certain injection is 0.001,
determine, using the Poisson distribution, the probability that out of 2000 people: i) exactly 3 will suffer a
bad reaction, ii) more than 1 will suffer a bad reaction.

Ans :

Step 1: Given Data

Probability of bad reaction (p) = 0.001
Total individuals (n) = 2000
Mean (λ) = n × p = 2000 × 0.001 = 2

Since p is very small and n is large, we use Poisson's probability formula:

P(X = k) = (e^(−λ) × λ^k) / k!

Step 2: Compute Probability for "Exactly 3 Suffer a Bad Reaction"

P(X = 3) = (e^(−2) × 2^3) / 3!
P(X = 3) = (0.1353 * 8) / 6
P(X = 3) ≈ 0.180

Step 3: Compute Probability for "More Than 1 Suffer a Bad Reaction"

P(X > 1) = 1 - (P(X = 0) + P(X = 1))

Using Poisson calculations:

X          Poisson Probability P(X)
0          0.1353
1          0.2707
P(X > 1)   1 - (0.1353 + 0.2707) = 0.594

Final Answers

P(X = 3) = 0.180
P(X > 1) = 0.594
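
A minimal Python sketch of the same Poisson calculation, using only the standard library. The helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)


n, p = 2000, 0.001
lam = n * p  # Poisson mean, lambda = n * p

p_exactly_3 = poisson_pmf(3, lam)
p_more_than_1 = 1 - (poisson_pmf(0, lam) + poisson_pmf(1, lam))

print(f"lambda = {lam}")
print(f"P(X = 3) = {p_exactly_3:.4f}")
print(f"P(X > 1) = {p_more_than_1:.4f}")
```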

Q.3) c) In a sample of 1000 cases, the mean of a certain test is 14 and the standard deviation is 2.5.
Assuming the distribution to be normal, find: i) How many students scored between 12 and 15. ii)
How many scored below 8. [Given: A(z = 0.8) = 0.2881, A(z = 0.4) = 0.1554, A(z = 2.4) = 0.4918]

Ans:

Step 1: Given Data


Sample size (n) = 1000
Mean (μ) = 14
Standard deviation (σ) = 2.5

Since the distribution is normal, we use the Z-score formula:

Z = (X - μ) / σ

Step 2: Compute Probability for Students Scoring Between 12 and 15

Z(12) = (12 - 14) / 2.5 = -0.8


Z(15) = (15 - 14) / 2.5 = 0.4

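Using given Z-table values, where A(z) is the area under the standard normal curve between 0 and z:

A(0.8) = 0.2881
A(0.4) = 0.1554

By symmetry, the area between Z = -0.8 and Z = 0 equals A(0.8), so:

P(12 ≤ X ≤ 15) = P(-0.8 ≤ Z ≤ 0.4) = A(0.8) + A(0.4)
= 0.2881 + 0.1554
= 0.4435

Total students = 1000 × 0.4435 = 443.5, i.e. about 444

Step 3: Compute Probability for Students Scoring Below 8

Z(8) = (8 - 14) / 2.5 = -2.4

Using given Z-table value:

A(2.4) = 0.4918

A(2.4) is the area between 0 and 2.4, so the tail below Z = -2.4 is the remainder of the lower half of the curve:

P(X < 8) = P(Z < -2.4) = 0.5 - A(2.4) = 0.5 - 0.4918 = 0.0082

Total students = 1000 × 0.0082 = 8.2, i.e. about 8

Final Answers

Number of students scoring between 12 and 15 ≈ 444
Number of students scoring below 8 ≈ 8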
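
The two normal probabilities can also be computed without a Z-table. The sketch below builds the normal CDF from `math.erf`, so its results may differ from table lookups in the fourth decimal place. The helper name `normal_cdf` is illustrative.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


n_students, mu, sigma = 1000, 14.0, 2.5

# i) P(12 <= X <= 15) is a difference of two CDF values.
p_between = normal_cdf(15, mu, sigma) - normal_cdf(12, mu, sigma)

# ii) P(X < 8) is the lower tail below Z = -2.4.
p_below = normal_cdf(8, mu, sigma)

print(f"P(12 <= X <= 15) = {p_between:.4f} -> about {n_students * p_between:.1f} students")
print(f"P(X < 8) = {p_below:.4f} -> about {n_students * p_below:.1f} students")
```

The lower tail is a small probability: a score of 8 lies 2.4 standard deviations below the mean, so only a handful of the 1000 students are expected there.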
Q.5) a) The following table gives the number of accidents that took place in an industry during
various days of the week. Test if accidents are uniformly distributed over the week.

Days               Mon   Tue   Wed   Thu   Fri   Sat
No. of Accidents   14    18    12    11    15    14

Given: χ²₀.₀₅,₅ = 11.09.

Ans :

Step 1: Compute Expected Frequencies

Total number of accidents:

N = 14 + 18 + 12 + 11 + 15 + 14 = 84

Since we are testing uniform distribution, the expected frequency for each day:

E = Total Accidents / Total Days = 84 / 6 = 14

Step 2: Apply Chi-Square Formula

Chi-Square formula:

χ² = Σ ((O - E)² / E)

where:

• O = Observed frequency
• E = Expected frequency

Days       Observed (O)   Expected (E)   (O - E)   (O - E)²   (O - E)² / E
Mon        14             14             0         0          0.000
Tue        18             14             4         16         1.143
Wed        12             14             -2        4          0.286
Thu        11             14             -3        9          0.643
Fri        15             14             1         1          0.071
Sat        14             14             0         0          0.000
Total χ²   -              -              -         -          2.143

Step 3: Compare with Critical Value

Given critical value:

χ²₀.₀₅,₅ = 11.09
Since:

2.143 < 11.09

we fail to reject the null hypothesis of a uniform distribution at the 5% level. The data are consistent with accidents being uniformly distributed over the week.

Final Answer

Since χ² = 2.143 is less than the critical value 11.09, we do not reject the hypothesis that accidents are
uniformly distributed over the week.
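
The goodness-of-fit statistic can be reproduced with a few lines of Python. This is a minimal sketch that takes the critical value from the question as given rather than computing it from the chi-square distribution.

```python
# Observed accident counts per day, copied from the question.
observed = {"Mon": 14, "Tue": 18, "Wed": 12, "Thu": 11, "Fri": 15, "Sat": 14}
critical_value = 11.09  # chi-square critical value as given in the question

total = sum(observed.values())
expected = total / len(observed)  # uniform hypothesis: same count every day
degrees_of_freedom = len(observed) - 1

# Chi-square statistic: sum of (O - E)^2 / E over all days.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
reject_h0 = chi_square > critical_value

print(f"total = {total}, expected per day = {expected}")
print(f"chi-square = {chi_square:.3f} with {degrees_of_freedom} degrees of freedom")
print("reject H0" if reject_h0 else "fail to reject H0")
```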

Q.5) b) A normal population has mean 6.8 and standard deviation 1.5. A sample of
400 members gave a mean of 6.75. Is the difference significant?
Given: Zα = 1.96 at 5% level of significance.

Ans:

Step 1: Given Data

Population mean (μ) = 6.8


Population standard deviation (σ) = 1.5
Sample size (n) = 400
Sample mean (X̄) = 6.75
Significance level (α) = 5%
Critical value (Zα) = 1.96

Since the population standard deviation is known and the sample size is large (n ≥ 30), we use the Z-test formula:

Z = (X̄ − μ) / (σ / √n)

Step 2: Compute Z-score

Z = (6.75 − 6.8) / (1.5 / √400)
Z = (-0.05) / (1.5 / 20)
Z = (-0.05) / 0.075
Z ≈ -0.67

Step 3: Compare with Critical Value

|Z| = 0.67 < Zα = 1.96

Since |Z| is less than the critical value, we fail to reject the null hypothesis. This means the
difference between the sample mean and the population mean is not significant at the 5% level.

Final Answer

Since |Z| = 0.67 < 1.96, we conclude that the difference in means is not significant.
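
A minimal Python sketch of the same one-sample Z-test, using the figures given in the question. Variable names are illustrative.

```python
import math

mu0 = 6.8      # population mean under the null hypothesis
sigma = 1.5    # known population standard deviation
n = 400        # sample size
xbar = 6.75    # observed sample mean
z_crit = 1.96  # two-sided critical value at the 5% level (given)

standard_error = sigma / math.sqrt(n)
z = (xbar - mu0) / standard_error
significant = abs(z) > z_crit

print(f"standard error = {standard_error:.3f}")
print(f"Z = {z:.2f}")
print("difference is significant" if significant else "difference is not significant")
```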

Q.5) c) Suppose that sweets are sold in packages of fixed weight of contents. The producer of
the packages is interested in testing whether the average weight of contents in packages is 1 kg. The sum of
squares of deviations from the mean of 12 samples is 0.011967. Using the above data, should we
conclude that the average weight is 1 kg?
Given: X̄ = 0.9883, t₀.₀₅,₁₁ = 2.201.

Ans:

Step 1: Given Data

Sample size (n) = 12


Sample mean (x̄) = 0.9883
Sum of squares of deviations (∑(xi − x̄)²) = 0.011967
Hypothesized mean (μ) = 1
Significance level (α) = 5%
Critical value (t₀.₀₅,₁₁) = 2.201

The sample variance is calculated as: s² = ∑(xi − x̄)² / (n − 1)

s² = 0.011967 / (12 - 1) = 0.011967 / 11 = 0.001088

The sample standard deviation is:

s = sqrt(0.001088) ≈ 0.0330

Using the t-test formula: t = (X̄ - μ) / (s / sqrt(n))

Step 2: Compute t-score

t = (0.9883 - 1) / (0.0330 / sqrt(12))
t = (-0.0117) / (0.0330 / 3.464)
t = (-0.0117) / 0.00953
t ≈ -1.23

Step 3: Compare with Critical Value

|t| = 1.23 < 2.201

Since |t| is less than the critical value, we fail to reject the null hypothesis. This means the
difference between the sample mean weight and 1 kg is not significant at the 5% level.

Final Answer

Since |t| = 1.23 < 2.201, we conclude that the difference in average weight is not significant.
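
The t-statistic can be recomputed from the summary figures in the question with the short sketch below. It carries the unrounded standard deviation through the calculation, so the result may differ slightly from the hand working in later decimal places.

```python
import math

n = 12             # sample size
xbar = 0.9883      # sample mean (given)
ss_dev = 0.011967  # sum of squared deviations from the sample mean (given)
mu0 = 1.0          # hypothesized mean weight in kg
t_crit = 2.201     # two-sided critical value t(0.05, 11) (given)

s = math.sqrt(ss_dev / (n - 1))        # sample standard deviation
t = (xbar - mu0) / (s / math.sqrt(n))  # one-sample t-statistic
reject_h0 = abs(t) > t_crit

print(f"s = {s:.4f}")
print(f"t = {t:.2f} with {n - 1} degrees of freedom")
print("reject H0" if reject_h0 else "fail to reject H0")
```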

Q.8) a) Write short notes on:

i) Population and sample


ii) Type I and Type II error
iii) Critical region
iv) Power of test

Ans:

Population and Sample

Population: The entire group that is being studied or analyzed.
Sample: A subset of the population selected for analysis.
Example: In a survey of students' test scores, all students in a university form the population, while
students from one class form a sample.

Type I and Type II Error

Type I Error (False Positive): Rejecting the null hypothesis when it is actually true.
Type II Error (False Negative): Failing to reject the null hypothesis when it is false.

Example:

• Type I Error: A medical test wrongly detects a disease in a healthy person.
• Type II Error: A test fails to detect a disease in an affected person.

Critical Region

Critical region: The set of values for which the null hypothesis is rejected.

It is determined by the level of significance (α), which is the probability that the test statistic falls in this region when the null hypothesis is true. If the test statistic falls within this region, we
reject the null hypothesis.

Power of Test

Power of a test: The probability of correctly rejecting a false null hypothesis.

Formula: Power = 1 - P(Type II Error) = 1 - β

A higher power means the test is more effective in detecting true differences.
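
To make the idea concrete, the sketch below computes the power of a two-sided Z-test in closed form. It reuses the setting of Q.5 b) (μ₀ = 6.8, σ = 1.5, n = 400, critical value 1.96); the alternative mean 6.6 is a hypothetical value chosen only for illustration, and the function names are not from the source.

```python
import math


def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_two_sided_z(mu_true, mu0, sigma, n, z_crit=1.96):
    """Probability that |Z| > z_crit when the true mean is mu_true."""
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    # Under the true mean, Z ~ N(shift, 1); add the two rejection tails.
    return phi(-z_crit - shift) + phi(-z_crit + shift)


mu0, sigma, n = 6.8, 1.5, 400

# When H0 is true, the rejection probability is the Type I error rate.
power_at_null = power_two_sided_z(mu0, mu0, sigma, n)

# HYPOTHETICAL alternative: the true mean is 6.6 instead of 6.8.
power_at_alt = power_two_sided_z(6.6, mu0, sigma, n)
type_2_error = 1 - power_at_alt

print(f"rejection probability under H0 = {power_at_null:.3f}")
print(f"power at true mean 6.6 = {power_at_alt:.3f}")
print(f"P(Type II error) at true mean 6.6 = {type_2_error:.3f}")
```

The further the true mean lies from μ₀, or the larger the sample, the closer the power gets to 1.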

Q.8) b) Let X₁, X₂, ... Xₙ be a random sample of size n from a normal distribution N(μ, σ²)
where μ and σ² both are unknown. Show that the Likelihood Ratio Test (LRT) used to test: H₀:
μ = μ₀ vs H₁: μ ≠ μ₀ for 0 < σ² < ∞ is equivalent to the t-test.
Ans :

Step 1: Given Data

We have a random sample X₁, X₂, ..., Xₙ from a normal distribution N(μ, σ²), where both μ and σ² are unknown. We want to test:

H₀: μ = μ₀ vs H₁: μ ≠ μ₀

for 0 < σ² < ∞

Step 2: Likelihood Function


The likelihood function for the sample is:

L(μ, σ²) = (1 / (σ√(2π)))^n × exp(−Σ(Xi − μ)² / (2σ²))

The maximum likelihood estimates (MLEs) for μ and σ² are:

μ̂ = Σ Xi / n = X̄
σ̂² = Σ(Xi − μ̂)² / n

Under H₀, we restrict μ to μ₀, so the likelihood function becomes:

L(μ₀, σ²) = (1 / (σ√(2π)))^n × exp(−Σ(Xi − μ₀)² / (2σ²))

which is maximized at σ̂₀² = Σ(Xi − μ₀)² / n.

Step 3: Likelihood Ratio Test (LRT) Statistic

The likelihood ratio test statistic is the ratio of the two maximized likelihoods:

λ = L(μ₀, σ̂₀²) / L(μ̂, σ̂²) = (σ̂² / σ̂₀²)^(n/2)

Taking the logarithm:

−2 log(λ) = n × log(Σ(Xi − μ₀)² / Σ(Xi − μ̂)²)

Under H₀, this follows a chi-square distribution with 1 degree of freedom asymptotically.

Step 4: Equivalence to t-Test

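The t-test statistic for testing H₀: μ = μ₀ against H₁: μ ≠ μ₀ is:

t = (X̄ − μ₀) / (s / √n)

where:

• X̄ is the sample mean,
• s is the sample standard deviation, with s² = Σ(Xi − X̄)² / (n − 1).

To connect the two statistics, use the identity Σ(Xi − μ₀)² = Σ(Xi − X̄)² + n(X̄ − μ₀)². Dividing by Σ(Xi − X̄)² = (n − 1)s² gives:

Σ(Xi − μ₀)² / Σ(Xi − X̄)² = 1 + n(X̄ − μ₀)² / ((n − 1)s²) = 1 + t² / (n − 1)

so that λ = (1 + t² / (n − 1))^(−n/2).

Since the likelihood ratio test statistic is a monotonically decreasing function of |t|, rejecting H₀ for small λ is the same as rejecting H₀ for large |t|. The rejection region for the LRT is therefore
equivalent to the rejection region for the two-sided t-test.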
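
The relationship between λ and t can be checked numerically. The sketch below uses a small made-up sample (hypothetical values, not data from the paper) and compares λ computed from the two sums of squares with the closed form λ = (1 + t² / (n − 1))^(−n/2). The function name `lrt_and_t` is illustrative.

```python
import math


def lrt_and_t(sample, mu0):
    """Return (lambda, t) for testing H0: mu = mu0 with unknown variance."""
    n = len(sample)
    xbar = sum(sample) / n
    ss_mean = sum((x - xbar) ** 2 for x in sample)  # sum of (Xi - Xbar)^2
    ss_mu0 = sum((x - mu0) ** 2 for x in sample)    # sum of (Xi - mu0)^2
    lam = (ss_mean / ss_mu0) ** (n / 2)             # likelihood ratio
    s = math.sqrt(ss_mean / (n - 1))                # sample standard deviation
    t = (xbar - mu0) / (s / math.sqrt(n))           # one-sample t-statistic
    return lam, t


# HYPOTHETICAL sample, used only to illustrate the identity.
sample = [0.97, 1.02, 0.99, 1.01, 0.95, 1.00, 0.98, 1.03]
mu0 = 1.0

lam, t = lrt_and_t(sample, mu0)
n = len(sample)
lam_from_t = (1 + t ** 2 / (n - 1)) ** (-n / 2)

print(f"lambda from sums of squares = {lam:.6f}")
print(f"lambda from t               = {lam_from_t:.6f}")
print(f"t = {t:.4f}")
```

The two values of λ agree up to floating-point rounding for any sample with non-zero variance, which is the numerical counterpart of the monotonic relationship between λ and |t|.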
