Name: Vedant Mundada
Rollno:SEBD23201
Division: B
STAT PRELIM PAPER
Q.1 a) Calculate the mean and standard deviation for the following table giving the age
distribution of 542 members.
Age (in years) 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80 80 - 90
No. of members 3 61 132 153 140 51 2
Ans:
Mean = x̄ = (Σ fi xi) / (Σ fi)
and S.D. = σ = √[(Σ fi xi²) / (Σ fi) − x̄²]
Age Group (years)   Midpoint xi   No. of Members fi   fi xi    fi xi²
20 - 30             25            3                   75       1875
30 - 40             35            61                  2135     74725
40 - 50             45            132                 5940     267300
50 - 60             55            153                 8415     462825
60 - 70             65            140                 9100     591500
70 - 80             75            51                  3825     286875
80 - 90             85            2                   170      14450
Total               -             542                 29660    1699550
x̄ = 29660 / 542 = 54.72 years
σ = sqrt((1699550 / 542) - (29660 / 542)^2)
σ = sqrt(3135.70 - 2994.63)
σ = sqrt(141.07) ≈ 11.88 years
Mean Age = 54.72 years
Standard Deviation = 11.88 years
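The grouped-data calculation can be checked with a short script. This is a minimal sketch that recomputes the column totals directly from the class midpoints and frequencies in the table; the variable names are illustrative only.

```python
import math

# Class midpoints and frequencies taken from the age-distribution table.
midpoints = [25, 35, 45, 55, 65, 75, 85]
freqs = [3, 61, 132, 153, 140, 51, 2]

n = sum(freqs)                                               # Σ fi
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))        # Σ fi xi
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))   # Σ fi xi²

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2   # population variance of grouped data
sd = math.sqrt(variance)

print(n, sum_fx, sum_fx2)
print(round(mean, 2), round(sd, 2))
```

Recomputing the fi xi² column this way is a quick guard against addition slips in the hand-worked total.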
Q.1 b) In a partially destroyed laboratory record of an analysis of correlation data, the following
results only are legible: Variance of X = 9. Regression equations: 8X – 10Y + 66 = 0, 40X – 18Y = 214.
Find: i) The mean values X and Y, ii) The correlation coefficient between X and Y, and iii) The
standard deviation of Y.
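Ans: Variance of X = 9, Regression equations: 8X - 10Y + 66 = 0, 40X - 18Y = 214
i) Mean values
Both regression lines pass through the point (X̄, Ȳ), so the means are found by solving the two equations simultaneously.
From the first equation: Y = (8X + 66) / 10 = 0.8X + 6.6
Substituting into the second: 40X - 18(0.8X + 6.6) = 214
25.6X = 332.8
X̄ (Mean of X) = 13
Ȳ (Mean of Y) = 0.8(13) + 6.6 = 17
ii) Correlation coefficient
Take the first equation as the regression line of Y on X and the second as the regression line of X on Y (X = (18Y + 214) / 40). The opposite choice would give a product of coefficients greater than 1, which is impossible.
Calculation                                   Value
Regression coefficient of Y on X (bYX)        8/10 = 0.8
Regression coefficient of X on Y (bXY)        18/40 = 0.45
Product of coefficients                       0.8 × 0.45 = 0.36
Correlation coefficient r                     sqrt(0.36) = 0.6 (positive, since both coefficients are positive)
iii) Standard deviation of Y
bYX = r × σY / σX, so σY = bYX × σX / r
Calculation                                   Value
Standard Deviation of X (σX)                  sqrt(9) = 3
Regression coefficient of Y on X (bYX)        0.8
Standard Deviation of Y (σY)                  0.8 × 3 / 0.6 = 4
Mean of X = 13
Mean of Y = 17
Correlation Coefficient (r) = 0.6
Standard Deviation of Y = 4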
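The same quantities can be worked out numerically. The sketch below solves the two regression equations with Cramer's rule and then uses r² = bYX × bXY. It assumes that the first equation is the line of Y on X and the second is the line of X on Y, and the variable names are illustrative.

```python
import math

# Regression lines written as a*X + b*Y = c:
#   8X - 10Y + 66 = 0   ->   8X - 10Y = -66
#   40X - 18Y = 214
a1, b1, c1 = 8.0, -10.0, -66.0
a2, b2, c2 = 40.0, -18.0, 214.0

# Both lines pass through (mean_x, mean_y); solve by Cramer's rule.
det = a1 * b2 - a2 * b1
mean_x = (c1 * b2 - c2 * b1) / det
mean_y = (a1 * c2 - a2 * c1) / det

# Assumption: line 1 is Y on X, line 2 is X on Y.
b_yx = -a1 / b1          # slope of Y on X: 8/10
b_xy = -b2 / a2          # slope of X on Y: 18/40
assert b_yx * b_xy <= 1  # otherwise the assignment of lines is wrong

r = math.sqrt(b_yx * b_xy)   # both coefficients positive, so r > 0
sigma_x = math.sqrt(9)       # Variance of X = 9
sigma_y = b_yx * sigma_x / r

print(mean_x, mean_y, round(r, 4), round(sigma_y, 4))
```

The internal check on the product of the coefficients is what rules out the other assignment of the two lines.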
Q.3) a) Assume that on average, out of 15 telephone numbers called between 2 PM to 3 PM on
weekdays, a certain number are busy. What is the probability that 6 randomly selected telephone
numbers called: i) Not more than 3 are busy. ii) At least 3 are busy.
Ans:
Step 1: Given Data
Total calls (n) = 6
Probability of a busy line (p) = (average number of busy numbers) / 15
Probability of a free line (q) = 1 - p
Since we are dealing with a binomial distribution, the probability formula is:
P(X = k) = (nCk) * p^k * q^(n-k)
Step 2: Compute Probability for "Not More Than 3 Busy"
P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
Using binomial expansion:
X Binomial Probability P(X)
0 0.0820
1 0.2458
2 0.3687
3 0.2765
P(X ≤ 3) 0.9730
Step 3: Compute Probability for "At Least 3 Busy"
P(X ≥ 3) = 1 - P(X ≤ 2)
Using binomial calculations:
X Binomial Probability P(X)
0 0.0820
1 0.2458
2 0.3687
P(X ≥ 3) 0.3042
Final Answers
P(X ≤ 3) = 0.9730
P(X ≥ 3) = 0.3042
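The binomial sums can be evaluated with a small helper. The question as recorded does not state how many of the 15 numbers are busy, so the value p = 1/15 below is purely an assumption for illustration; the results change with p, and the function and variable names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 6
p = 1 / 15  # ASSUMPTION: one busy number in 15; not stated in the question as recorded

# i) Not more than 3 busy: P(X <= 3) = P(0) + P(1) + P(2) + P(3)
p_at_most_3 = sum(binom_pmf(k, n, p) for k in range(0, 4))

# ii) At least 3 busy: P(X >= 3) = 1 - P(X <= 2)
p_at_least_3 = 1 - sum(binom_pmf(k, n, p) for k in range(0, 3))

print(round(p_at_most_3, 4), round(p_at_least_3, 4))
```

Replacing the assumed `p` with the actual proportion of busy numbers gives the required probabilities.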
Q.3) b) If the probability that an individual suffers a bad reaction from a certain injection is 0.001,
determine the probability out of 2000 people, by using Poisson's distribution. i) Exactly 3 will suffer a
bad reaction. ii) More than 1 will suffer a bad reaction.
Ans :
Step 1: Given Data
Probability of bad reaction (p) = 0.001
Total individuals (n) = 2000
Mean (λ) = n × p = 2000 × 0.001 = 2
Since p is very small and n is large, we use Poisson's probability formula:
P(X = k) = (e^(-λ) * λ^k) / k!
Step 2: Compute Probability for "Exactly 3 Suffer a Bad Reaction"
P(X = 3) = (e^(-2) * 2^3) / 3!
P(X = 3) = (0.1353 * 8) / 6
P(X = 3) ≈ 0.180
Step 3: Compute Probability for "More Than 1 Suffer a Bad Reaction"
P(X > 1) = 1 - (P(X = 0) + P(X = 1))
Using Poisson calculations:
X Poisson Probability P(X)
0 0.1353
1 0.2707
P(X > 1) 1 - (0.1353 + 0.2707) = 0.594
Final Answers
P(X = 3) = 0.180
P(X > 1) = 0.594
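The Poisson values can be reproduced with the standard library. This is a minimal sketch using the data of the question; the helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 2000 * 0.001  # λ = n × p

# i) Exactly 3 bad reactions
p_exactly_3 = poisson_pmf(3, lam)

# ii) More than 1 bad reaction: 1 - [P(0) + P(1)]
p_more_than_1 = 1 - (poisson_pmf(0, lam) + poisson_pmf(1, lam))

print(round(p_exactly_3, 3), round(p_more_than_1, 3))
```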
Q.3) c) In a sample of 1000 cases, the mean of a certain test is 14 and the standard deviation is 2.5.
Assuming the distribution to be normal, find: i) How many students scored between 12 and 15. ii)
How many scored below 8. [Given: A(z = 0.8) = 0.2881, A(z = 0.4) = 0.1554, A(z = 2.4) = 0.4918]
Ans:
Step 1: Given Data
Sample size (n) = 1000
Mean (μ) = 14
Standard deviation (σ) = 2.5
Since the distribution is normal, we use the Z-score formula:
Z = (X - μ) / σ
Step 2: Compute Probability for Students Scoring Between 12 and 15
Z(12) = (12 - 14) / 2.5 = -0.8
Z(15) = (15 - 14) / 2.5 = 0.4
Using given Z-table values:
A(0.8) = 0.2881
A(0.4) = 0.1554
P(12 ≤ X ≤ 15) = P(-0.8 ≤ Z ≤ 0.4) = A(0.8) + A(0.4)
= 0.2881 + 0.1554
= 0.4435
Total students = 1000 × 0.4435 = 443
Step 3: Compute Probability for Students Scoring Below 8
Z(8) = (8 - 14) / 2.5 = -2.4
Using given Z-table value:
A(2.4) = 0.4918
Since A(z) is the area between 0 and z, the tail area below -2.4 is:
P(X < 8) = P(Z < -2.4) = 0.5 - A(2.4) = 0.5 - 0.4918 = 0.0082
Total students = 1000 × 0.0082 ≈ 8
Final Answers
Number of students scoring between 12 and 15 = 443
Number of students scoring below 8 = 8
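As a cross-check on the table lookups, the areas can be computed from the normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library; because it uses exact CDF values instead of the four-figure table, the counts may differ slightly in the last digit.

```python
from statistics import NormalDist

n_students = 1000
scores = NormalDist(mu=14, sigma=2.5)

# i) P(12 <= X <= 15) is the difference of two CDF values.
p_between = scores.cdf(15) - scores.cdf(12)

# ii) P(X < 8) is the lower-tail area, i.e. 0.5 - A(2.4) in table notation.
p_below_8 = scores.cdf(8)

print(round(p_between, 4), round(n_students * p_between))
print(round(p_below_8, 4), round(n_students * p_below_8))
```

The lower-tail area is small because 8 lies 2.4 standard deviations below the mean.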
Q.5) a) The following table gives the number of accidents that took place in an industry during
various days of the week. Test if accidents are uniformly distributed over the week.
Days Mon Tue Wed Thu Fri Sat
No. of Accidents 14 18 12 11 15 14
Given: χ²₀.₀₅,₅ = 11.09.
Ans :
Step 1: Compute Expected Frequencies
Total number of accidents:
N = 14 + 18 + 12 + 11 + 15 + 14 = 84
Since we are testing uniform distribution, the expected frequency for each day:
E = Total Accidents / Total Days = 84 / 6 = 14
Step 2: Apply Chi-Square Formula
Chi-Square formula:
χ² = Σ ((O - E)² / E)
where:
O= Observed frequency
E= Expected frequency
Days Observed (O) Expected (E) (O - E) (O - E)² (O - E)² / E
Mon 14 14 0 0 0.000
Tue 18 14 4 16 1.143
Wed 12 14 -2 4 0.286
Thu 11 14 -3 9 0.643
Fri 15 14 1 1 0.071
Sat 14 14 0 0 0.000
Total χ² - - - - 2.143
Step 3: Compare with Critical Value
Given critical value:
χ²₀.₀₅,₅ = 11.09
Since:
2.143 < 11.09
we fail to reject the null hypothesis at the 5% level. The data are consistent with accidents being uniformly distributed over the week.
Final Answer
Since χ² = 2.143 is less than the critical value 11.09, there is no significant evidence against accidents being uniformly
distributed over the week.
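The chi-square statistic is easy to recompute from the observed counts. This is a minimal sketch; the critical value is the one given in the question, and the variable names are illustrative.

```python
# Observed accidents, Monday to Saturday.
observed = [14, 18, 12, 11, 15, 14]

# Under the uniform hypothesis every day has the same expected count.
expected = sum(observed) / len(observed)

chi_sq = sum((o - expected) ** 2 / expected for o in observed)

critical = 11.09                    # given in the question for 5 degrees of freedom
reject_uniform = chi_sq > critical  # True would mean rejecting uniformity

print(round(chi_sq, 3), reject_uniform)
```

The degrees of freedom are 6 − 1 = 5 because the expected counts are fixed only by the total.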
Q.5) b) A normal population has mean 6.8 and standard deviation 1.5. A sample of
400 members gave a mean of 6.75. Is the difference significant?
Given: Zα = 1.96 at 5% level of significance.
Ans:
Step 1: Given Data
Population mean (μ) = 6.8
Population standard deviation (σ) = 1.5
Sample size (n) = 400
Sample mean (X̄) = 6.75
Significance level (α) = 5%
Critical value (Zα) = 1.96
Since the sample size is large (n ≥ 30), we use the Z-test formula:
Z = (X̄ − μ) / (σ / √n)
Step 2: Compute Z-score
Z = (6.75 − 6.8) / (1.5 / √400)
Z = (-0.05) / (1.5 / 20)
Z = (-0.05) / 0.075
Z ≈ -0.67
Step 3: Compare with Critical Value
|Z| = 0.67 < Zα = 1.96
Since |Z| is less than the critical value, we fail to reject the null hypothesis. This means the
difference is not significant.
Final Answer
Since |Z| = 0.67 < 1.96, we conclude that the difference in means is not significant.
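The Z statistic can be verified in a few lines. The sketch below also reports a two-sided p-value, which is not asked for in the question and is included only as an extra check; it uses `statistics.NormalDist` from the standard library.

```python
import math
from statistics import NormalDist

mu0, sigma = 6.8, 1.5   # population mean and standard deviation
n, xbar = 400, 6.75     # sample size and sample mean
z_crit = 1.96           # given critical value at the 5% level

standard_error = sigma / math.sqrt(n)
z = (xbar - mu0) / standard_error

# Two-sided p-value: probability of a |Z| at least this large under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

significant = abs(z) > z_crit
print(round(z, 2), round(p_value, 2), significant)
```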
Q.5) c) Suppose that sweets are sold in packages of fixed weight of contents. The producer of
the packages is interested in testing whether the average weight of contents in the packages is 1 kg. The sum of
squares of deviations from the mean of 12 samples is 0.011967. Using the above data, should we
conclude that the average weight is 1 kg?
Given: X̄ = 0.9883, t₀.₀₅,₁₁ = 2.201.
Ans:
Step 1: Given Data
Sample size (n) = 12
Sample mean (x̄) = 0.9883
Sum of squares of deviations (∑(xi− x̄)2) = 0.011967
Hypothesized mean (μ) = 1
Significance level (α) = 5%
Critical value (t₀.₀₅,₁₁) = 2.201
The sample variance is calculated as: s² = ∑(xi − x̄)² / (n − 1)
s² = 0.011967 / (12 - 1) = 0.011967 / 11 = 0.001088
The sample standard deviation is:
s = sqrt(0.001088) ≈ 0.0330
Using the t-test formula: t = (X̄ - μ) / (s / sqrt(n))
Step 2: Compute t-score
t = (0.9883 − 1) / (0.0330 / √12)
t = (-0.0117) / (0.0330 / 3.464)
t = (-0.0117) / 0.00953
t ≈ -1.23
Step 3: Compare with Critical Value
|t| = 1.23 < 2.201
Since |t| is less than the critical value, we fail to reject the null hypothesis. This means the
difference in weight is not significant.
Final Answer
Since |t| = 1.23 < 2.201, we conclude that the difference in average weight is not significant.
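The t statistic follows directly from the summary figures in the question. This is a minimal sketch with illustrative variable names; the critical value is the one given.

```python
import math

n = 12
xbar = 0.9883              # sample mean
mu0 = 1.0                  # hypothesized mean weight in kg
sum_sq_dev = 0.011967      # Σ(xi − x̄)²
t_crit = 2.201             # given value for 11 degrees of freedom

s = math.sqrt(sum_sq_dev / (n - 1))      # sample standard deviation
t = (xbar - mu0) / (s / math.sqrt(n))    # one-sample t statistic

significant = abs(t) > t_crit
print(round(s, 4), round(t, 2), significant)
```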
Q.8) a) Write short notes on:
i) Population and sample
ii) Type I and Type II error
iii) Critical region
iv) Power of test
Ans:
Population and Sample
Population: The entire group that is being studied or analyzed.
Sample: A subset of the population selected for analysis.
Example: In a survey of students' test scores, all students in a university form the population, while
students from one class form a sample.
Type I and Type II Error
Type I Error (False Positive): Rejecting the null hypothesis when it is actually true.
Type II Error (False Negative): Failing to reject the null hypothesis when it is false.
Example:
Type I Error: A medical test wrongly detects a disease in a healthy person.
Type II Error: A test fails to detect a disease in an affected person.
Critical Region
Critical region: The set of values for which the null hypothesis is rejected.
It is determined by the level of significance (α). If the test statistic falls within this region, we
reject the null hypothesis.
Power of Test
Power of a test: The probability of correctly rejecting a false null hypothesis.
Formula: Power = 1 - P(Type II Error)
A higher power means the test is more effective in detecting true differences.
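To make the idea of power concrete, the sketch below computes the power of a two-sided Z-test. The setting (μ₀ = 6.8, σ = 1.5, n = 400) is borrowed from Q.5 b, while the alternative mean 7.0 and the function name are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

def power_two_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-sided z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean mu1 the test statistic is N(shift, 1).
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))
    # Reject when Z < -z_crit or Z > z_crit.
    return std_normal.cdf(-z_crit - shift) + (1 - std_normal.cdf(z_crit - shift))

# When H0 is true, the rejection probability equals alpha (Type I error rate).
power_at_null = power_two_sided_z(6.8, 6.8, 1.5, 400)

# Hypothetical alternative mean of 7.0.
power_at_alt = power_two_sided_z(6.8, 7.0, 1.5, 400)
type_2_error = 1 - power_at_alt

print(round(power_at_null, 3), round(power_at_alt, 2))
```

Moving the alternative mean further from μ₀, or increasing n, raises the power and lowers the Type II error probability.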
Q.8) b) Let X₁, X₂, ... Xₙ be a random sample of size n from a normal distribution N(μ, σ²)
where μ and σ² both are unknown. Show that the Likelihood Ratio Test (LRT) used to test: H₀:
μ = μ₀ vs H₁: μ ≠ μ₀ for 0 < σ² < ∞ is equivalent to the t-test.
Ans :
Step 1: Given Data
We have a random sample X₁, X₂, ..., Xₙ from a normal distribution
N(μ, σ²), where both μ and σ² are unknown. We want to test:
H₀: μ = μ₀ vs H₁: μ ≠ μ₀
for 0 < σ² < ∞
Step 2: Likelihood Function
The likelihood function for the sample is:
L(μ, σ²) = (1 / (σ√(2π)))^n * exp(−Σ(Xi − μ)² / (2σ²))
The maximum likelihood estimates (MLEs) for μ and σ² are:
μ̂ = (Σ Xi) / n = X̄
σ̂² = Σ(Xi − μ̂)² / n
Under H₀, we restrict μ to μ₀, so the likelihood function becomes:
L(μ₀, σ²) = (1 / (σ√(2π)))^n * exp(−Σ(Xi − μ₀)² / (2σ²))
which is maximized over σ² at σ̂₀² = Σ(Xi − μ₀)² / n.
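Step 3: Likelihood Ratio Test (LRT) Statistic
The likelihood ratio test statistic is the ratio of the maximized likelihoods:
λ = L(μ₀, σ̂₀²) / L(μ̂, σ̂²) = (σ̂² / σ̂₀²)^(n/2)
where σ̂₀² = Σ(Xi − μ₀)² / n and σ̂² = Σ(Xi − μ̂)² / n are the MLEs of σ² under H₀ and without restriction.
Taking the logarithm:
−2 log(λ) = n * log(Σ(Xi − μ₀)² / Σ(Xi − μ̂)²)
Under H₀, this follows a chi-square distribution with 1 degree of freedom asymptotically.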
Step 4: Equivalence to t-Test
The t-test statistic for testing H₀: μ = μ₀ vs H₁: μ ≠ μ₀ is:
t = (X̄ − μ₀) / (s / √n)
where:
X̄ is the sample mean,
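s is the sample standard deviation, with s² = Σ(Xi − X̄)² / (n − 1).
Since Σ(Xi − μ₀)² = Σ(Xi − X̄)² + n(X̄ − μ₀)², the ratio Σ(Xi − μ₀)² / Σ(Xi − X̄)² equals 1 + t² / (n − 1).
The likelihood ratio test statistic is therefore a monotonic function of |t|, so the rejection region for the LRT is
equivalent to the rejection region for the two-sided t-test.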
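The link between the two statistics rests on the identity Σ(Xi − μ₀)² / Σ(Xi − X̄)² = 1 + t² / (n − 1), which can be checked numerically. The sample values below are hypothetical, made up only to exercise the identity; any non-constant sample would do.

```python
import math
from statistics import mean, stdev

# Hypothetical sample and null value, for illustration only.
sample = [0.95, 1.02, 0.98, 1.01, 0.97, 0.99]
mu0 = 1.0

n = len(sample)
xbar = mean(sample)
s = stdev(sample)  # sample standard deviation, divisor n - 1

# One-sample t statistic.
t = (xbar - mu0) / (s / math.sqrt(n))

# Sums of squares about mu0 (restricted) and about the sample mean (unrestricted).
ss_null = sum((x - mu0) ** 2 for x in sample)
ss_alt = sum((x - xbar) ** 2 for x in sample)

ratio = ss_null / ss_alt
minus_2_log_lambda = n * math.log(ratio)

# The ratio depends on the data only through t².
identity_rhs = 1 + t ** 2 / (n - 1)
print(abs(ratio - identity_rhs) < 1e-9)
```

Because −2 log(λ) = n × log(1 + t² / (n − 1)) increases with |t|, rejecting for large −2 log(λ) is the same as rejecting for large |t|.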