
PS Sample Paper

The document is a sample question paper for a Probability and Statistics exam scheduled for May 2025, consisting of multiple-choice, short-answer, and long-answer questions. It includes instructions, example questions with solutions, and covers topics such as Bayes' Theorem, hypothesis testing, normal distribution, and regression analysis. The total marks for the exam are 100, with a time limit of 3 hours.

Probability and Statistics (PS)

Sample Question Paper with Solutions

May 2025

Total Marks: 100 Time: 3 Hours

Instructions
• Answer all questions.
• Section A consists of multiple-choice questions (MCQs). Each question carries 2 marks.
• Section B consists of short-answer questions. Each question carries 5 marks.
• Section C consists of long-answer questions. Each question carries 10 marks.
• Use of a standard normal table, t-table, F-table, and Chi-Square table is allowed.
• Solutions are provided below each question for reference.

Section A: Multiple-Choice Questions (5 × 2 = 10 Marks)

1. A disease affects 1% of the population. A test for the disease has a 95% true positive
rate and a 5% false positive rate. If a person tests positive, what is the probability
they actually have the disease? (Bayes’ Theorem)

(a) 0.161
(b) 0.214
(c) 0.324
(d) 0.461

Solution:
Let $D$ be the event that a person has the disease, and $T^+$ be the event that the test is
positive. Given: $P(D) = 0.01$, $P(D^c) = 0.99$, $P(T^+ \mid D) = 0.95$, $P(T^+ \mid D^c) = 0.05$.
We need $P(D \mid T^+)$. Using Bayes’ Theorem:

\[
P(D \mid T^+) = \frac{P(T^+ \mid D)\,P(D)}{P(T^+ \mid D)\,P(D) + P(T^+ \mid D^c)\,P(D^c)}
\]
\[
P(D \mid T^+) = \frac{(0.95)(0.01)}{(0.95)(0.01) + (0.05)(0.99)} = \frac{0.0095}{0.0095 + 0.0495} = \frac{0.0095}{0.059} \approx 0.161
\]
Answer: (a) 0.161
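
The Bayes' Theorem calculation above can be cross-checked with a few lines of Python. This is a minimal sketch; the function name `posterior` and its parameter names are illustrative assumptions, not notation from the paper.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(D | T+) using Bayes' theorem."""
    # Numerator: P(T+ | D) * P(D)
    true_pos = true_positive_rate * prior
    # Second denominator term: P(T+ | D^c) * P(D^c)
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


p_disease_given_positive = posterior(
    prior=0.01, true_positive_rate=0.95, false_positive_rate=0.05
)
print(round(p_disease_given_positive, 3))
```

Even with an accurate test, the low prior (1%) keeps the posterior small because false positives from the healthy 99% outnumber true positives.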

Probability and Statistics Question Paper May 2025

2. In hypothesis testing, rejecting the null hypothesis when it is true is called: (Type
I and Type II Errors)

(a) Type I error


(b) Type II error
(c) Power of the test
(d) Significance level

Solution:
Rejecting the null hypothesis when it is true is defined as a Type I error (also known
as a false positive). A Type II error is failing to reject the null when it is false.
Answer: (a) Type I error
3. The p-value in hypothesis testing represents: (Hypothesis Testing Definitions)

(a) The probability of rejecting the null hypothesis


(b) The probability of observing the test statistic or more extreme under the null
hypothesis
(c) The probability of Type I error
(d) The probability of Type II error

Solution:
The p-value is the probability of observing a test statistic as extreme or more
extreme than the one observed, assuming the null hypothesis is true.
Answer: (b) The probability of observing the test statistic or more
extreme under the null hypothesis
4. According to the Central Limit Theorem, the distribution of the sample mean
approaches: (Central Limit Theorem)

(a) A uniform distribution


(b) A binomial distribution
(c) A normal distribution
(d) A Poisson distribution

Solution:
The Central Limit Theorem states that for a sufficiently large sample size, the
distribution of the sample mean approaches a normal distribution, regardless of the
shape of the population distribution, provided the population variance is finite.
Answer: (c) A normal distribution


5. If the correlation coefficient between two variables X and Y is 0, it indicates:
(Correlation - 2D)

(a) A perfect linear relationship


(b) No linear relationship
(c) A strong negative relationship
(d) A strong positive relationship

Solution:
A correlation coefficient of 0 indicates no linear relationship between the variables
X and Y. There may still be a non-linear relationship.
Answer: (b) No linear relationship

Section B: Short-Answer Questions (6 × 5 = 30 Marks)


1. In a normal distribution, 31% of the items are under 45, and 8% are over 64. Find
the mean and standard deviation of the distribution. (Normal Distribution Parameter
Estimation)

Solution:
Given: $X \sim N(\mu, \sigma^2)$, $P(X < 45) = 0.31$, $P(X > 64) = 0.08$. Thus, $P(X < 45) = 0.31$
and $P(X < 64) = 1 - 0.08 = 0.92$. Standardize:

\[
P\left(Z < \frac{45 - \mu}{\sigma}\right) = 0.31 \implies \frac{45 - \mu}{\sigma} = z_1, \qquad
P\left(Z < \frac{64 - \mu}{\sigma}\right) = 0.92 \implies \frac{64 - \mu}{\sigma} = z_2
\]

From standard normal tables: $P(Z < z_1) = 0.31 \implies z_1 \approx -0.4959$ (since
$P(Z < 0) = 0.5$, $0.5 - 0.31 = 0.19$, $P(Z < -0.4959) \approx 0.31$),
$P(Z < z_2) = 0.92 \implies z_2 \approx 1.4051$ (since $P(Z < 1.4051) \approx 0.92$).
Equations:

\[
\frac{45 - \mu}{\sigma} = -0.4959 \implies 45 - \mu = -0.4959\sigma \tag{1}
\]
\[
\frac{64 - \mu}{\sigma} = 1.4051 \implies 64 - \mu = 1.4051\sigma \tag{2}
\]

Subtract (1) from (2):

\[
(64 - \mu) - (45 - \mu) = 1.4051\sigma - (-0.4959\sigma) \implies 64 - 45 = (1.4051 + 0.4959)\sigma \implies 19 = 1.901\sigma
\]
\[
\sigma = \frac{19}{1.901} \approx 9.995 \approx 10
\]

Substitute $\sigma \approx 10$ into (1):

\[
45 - \mu = -0.4959 \times 10 \implies 45 - \mu \approx -4.959 \implies \mu \approx 45 + 4.959 \approx 49.96 \approx 50
\]

Graph of the normal distribution:

[Figure: normal density curve over X with the mean at 50; 31% of the area lies below 45 and 8% of the area lies above 64.]

Answer: Mean µ ≈ 50, Standard Deviation σ ≈ 10.
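
The two-quantile method above can be sketched in Python, using the standard library's inverse normal CDF in place of a printed table. The function name `normal_from_two_quantiles` is a hypothetical helper introduced for illustration.

```python
from statistics import NormalDist


def normal_from_two_quantiles(x1, p1, x2, p2):
    """Solve for (mu, sigma) given P(X < x1) = p1 and P(X < x2) = p2."""
    z1 = NormalDist().inv_cdf(p1)  # z-score with area p1 to its left
    z2 = NormalDist().inv_cdf(p2)  # z-score with area p2 to its left
    # Subtracting x1 = mu + z1*sigma from x2 = mu + z2*sigma eliminates mu.
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma


mu, sigma = normal_from_two_quantiles(45, 0.31, 64, 0.92)
print(round(mu, 2), round(sigma, 2))
```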


2. In a survey of 400 people, 220 support a new policy. Test the null hypothesis that
the population proportion supporting the policy is 0.5 against the alternative that
it is greater than 0.5 at a 5% significance level. State the hypotheses and compute
the test statistic. (Large Sample Test for Single Proportion)

Solution:
Hypotheses: $H_0: p = 0.5$ vs $H_1: p > 0.5$ (one-tailed test).
Sample proportion: $\hat{p} = \frac{220}{400} = 0.55$.
Standard error under $H_0$ (with $p_0 = 0.5$):

\[
SE = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{0.5 \times 0.5}{400}} = \sqrt{\frac{0.25}{400}} = \sqrt{0.000625} = 0.025
\]

Test statistic:

\[
z = \frac{\hat{p} - p_0}{SE} = \frac{0.55 - 0.5}{0.025} = \frac{0.05}{0.025} = 2
\]
Critical value for a one-tailed test at α = 0.05 is z0.05 = 1.645. Since z = 2 > 1.645,
reject H0 .
Conclusion: There is evidence to suggest the population proportion is
greater than 0.5.
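
A minimal Python sketch of the one-sample proportion z-test worked above. The function name `one_proportion_z` is an illustrative assumption, and the upper-tailed p-value is an addition that the hand solution does not compute.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Upper-tailed large-sample z-test for H0: p = p0 vs H1: p > p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


z_stat, p_value = one_proportion_z(successes=220, n=400, p0=0.5)
z_crit = NormalDist().inv_cdf(0.95)  # one-tailed critical value at alpha = 0.05
print(round(z_stat, 3), round(p_value, 4), z_stat > z_crit)
```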
3. A sample of 10 students has a mean test score of 78 with a sample standard
deviation of 5. Test the null hypothesis that the population mean score is 80 against
a two-sided alternative at a 5% significance level. Use the t-test and provide your
conclusion. (t-test for Single Mean)

Solution:
Hypotheses: $H_0: \mu = 80$ vs $H_1: \mu \neq 80$ (two-tailed test).
Sample mean: $\bar{x} = 78$, $s = 5$, $n = 10$.
Standard error: $SE = \frac{s}{\sqrt{n}} = \frac{5}{\sqrt{10}} \approx \frac{5}{3.162} \approx 1.581$.
Test statistic:

\[
t = \frac{\bar{x} - \mu_0}{SE} = \frac{78 - 80}{1.581} = \frac{-2}{1.581} \approx -1.265
\]

Degrees of freedom: df = n − 1 = 9. Critical value for a two-tailed test at α = 0.05
is t0.025,9 ≈ 2.262. Since |t| = 1.265 < 2.262, fail to reject H0 .
Conclusion: There is insufficient evidence to suggest the population
mean differs from 80.
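
The t-statistic above can be reproduced with the short Python sketch below. The standard library has no t-distribution, so the critical value 2.262 is taken from the t-table as in the solution; the names `one_sample_t` and `T_CRIT_0025_DF9` are illustrative assumptions.

```python
from math import sqrt

# Two-tailed critical value t_{0.025, 9}, read from a t-table (not computed here).
T_CRIT_0025_DF9 = 2.262


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """Return the one-sample t statistic and its degrees of freedom."""
    se = sample_sd / sqrt(n)
    return (sample_mean - mu0) / se, n - 1


t_stat, df = one_sample_t(sample_mean=78, mu0=80, sample_sd=5, n=10)
reject_h0 = abs(t_stat) > T_CRIT_0025_DF9
print(round(t_stat, 3), df, reject_h0)
```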
4. A researcher wants to estimate the mean of a population with variance 36.
Determine the minimum sample size required to ensure that the probability of the sample
mean being within 2 units of the true mean is at least 0.95, using the Central Limit
Theorem. (Sample Size Determination)

Solution:
Given: $\sigma^2 = 36$, so $\sigma = 6$. We need $P(|\bar{X} - \mu| \le 2) \ge 0.95$. By CLT, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
Standardize:

\[
P\left(\left|\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right| \le \frac{2}{\sigma/\sqrt{n}}\right) = P\left(|Z| \le \frac{2}{\sigma/\sqrt{n}}\right) \ge 0.95
\]

For a standard normal, $P(|Z| \le z) = 0.95$ when $z = 1.96$. Thus:

\[
\frac{2}{\sigma/\sqrt{n}} = 1.96 \implies \frac{2}{6/\sqrt{n}} = 1.96 \implies \frac{2\sqrt{n}}{6} = 1.96 \implies \sqrt{n} = \frac{1.96 \times 6}{2} = 5.88
\]
\[
n = (5.88)^2 \approx 34.57
\]
Round up to the nearest integer: n = 35.
Answer: Minimum sample size n = 35.
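
The sample-size rule above, n ≥ (zσ/E)², can be sketched in Python as follows. The function name `min_sample_size` and the parameter name `margin` (for the allowed distance E from the true mean) are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_sample_size(sigma, margin, confidence):
    """Smallest n with P(|Xbar - mu| <= margin) >= confidence under the CLT."""
    # Two-sided z value: confidence 0.95 leaves 0.025 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * sigma / margin) ** 2)


def coverage(sigma, margin, n):
    """P(|Xbar - mu| <= margin) for a given n, using the normal approximation."""
    return 2 * NormalDist().cdf(margin / (sigma / sqrt(n))) - 1


n_required = min_sample_size(sigma=6, margin=2, confidence=0.95)
print(n_required, round(coverage(6, 2, n_required), 4))
```

Checking `coverage` at n − 1 confirms that one fewer observation falls just short of 0.95.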
5. For the data X: 1, 2, 3, 4, 5, 6, 7 and Y : 9, 8, 10, 12, 11, 13, 14, find the equation
of the regression line of Y on X. (Regression - 2D)

Solution:
Regression line: $Y = a + bX$. Compute

\[
b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} \quad \text{and} \quad a = \bar{Y} - b\bar{X}.
\]

Data: $n = 7$, $\sum X = 1+2+3+4+5+6+7 = 28$, $\sum Y = 9+8+10+12+11+13+14 = 77$,
$\sum X^2 = 1 + 4 + 9 + 16 + 25 + 36 + 49 = 140$,
$\sum XY = (1)(9) + (2)(8) + (3)(10) + (4)(12) + (5)(11) + (6)(13) + (7)(14) = 9 + 16 + 30 + 48 + 55 + 78 + 98 = 334$.

\[
b = \frac{7(334) - (28)(77)}{7(140) - (28)^2} = \frac{2338 - 2156}{980 - 784} = \frac{182}{196} \approx 0.9286
\]
\[
\bar{X} = \frac{28}{7} = 4, \quad \bar{Y} = \frac{77}{7} = 11, \quad a = 11 - (0.9286)(4) \approx 11 - 3.7144 = 7.2856
\]
Regression line: Y ≈ 7.29 + 0.93X.
Answer: Y = 7.29 + 0.93X
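
The least-squares formulas above translate directly into Python. This is a minimal sketch using the raw-sums form from the solution; the function name `least_squares_line` is an illustrative assumption.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the regression line Y = a + bX of Y on X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Slope from the raw-sums formula, intercept from the means.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


x_data = [1, 2, 3, 4, 5, 6, 7]
y_data = [9, 8, 10, 12, 11, 13, 14]
intercept, slope = least_squares_line(x_data, y_data)
print(round(intercept, 2), round(slope, 2))
```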
6. Compute the coefficient of correlation between X and Y for the following data:
X: 5, 10, 15, 20, 25
Y : 16, 19, 23, 26, 30
(Correlation - 2D)

Solution:
Formula:

\[
r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}
\]

Data: $n = 5$, $\sum X = 5+10+15+20+25 = 75$, $\sum Y = 16+19+23+26+30 = 114$,
$\sum X^2 = 25+100+225+400+625 = 1375$, $\sum Y^2 = 256+361+529+676+900 = 2722$,
$\sum XY = (5)(16) + (10)(19) + (15)(23) + (20)(26) + (25)(30) = 80 + 190 + 345 + 520 + 750 = 1885$.

\[
r = \frac{5(1885) - (75)(114)}{\sqrt{[5(1375) - (75)^2][5(2722) - (114)^2]}} = \frac{9425 - 8550}{\sqrt{[6875 - 5625][13610 - 12996]}}
\]
\[
= \frac{875}{\sqrt{(1250)(614)}} = \frac{875}{\sqrt{767500}} \approx \frac{875}{876.07} \approx 0.999
\]

Answer: r ≈ 0.999 (strong positive correlation).
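
The raw-sums correlation formula can be checked with a short Python sketch, which avoids the rounding that creeps into hand-computed square roots. The function name `pearson_r` is an illustrative assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient from the raw-sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


x_vals = [5, 10, 15, 20, 25]
y_vals = [16, 19, 23, 26, 30]
r_value = pearson_r(x_vals, y_vals)
print(round(r_value, 4))
```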


Section C: Long-Answer Questions (6 × 10 = 60 Marks)


1. The joint probability mass function of (X, Y) is given by p(x, y) = k(2x + 3y),
x = 0, 1, 2, y = 1, 2, 3. Find all the marginal and conditional probability
distributions. Also find the probability distribution of (X + Y). (Discrete Joint
Probability Mass Function - 2D)

Solution:

(a) Find k by ensuring the total probability sums to 1:


Compute p(x, y) for all combinations:

X \Y 1 2 3
0 k(0 + 3) = 3k k(0 + 6) = 6k k(0 + 9) = 9k
1 k(2 + 3) = 5k k(2 + 6) = 8k k(2 + 9) = 11k
2 k(4 + 3) = 7k k(4 + 6) = 10k k(4 + 9) = 13k

Sum: 3k + 6k + 9k + 5k + 8k + 11k + 7k + 10k + 13k = 72k. Set equal to 1:

\[
72k = 1 \implies k = \frac{1}{72}
\]

Joint PMF: $p(x, y) = \frac{2x + 3y}{72}$.
(b) Marginal distributions:
$p_X(x) = \sum_{y=1}^{3} p(x, y)$:
$p_X(0) = 3k + 6k + 9k = 18k = \frac{18}{72} = \frac{1}{4}$,
$p_X(1) = 5k + 8k + 11k = 24k = \frac{24}{72} = \frac{1}{3}$,
$p_X(2) = 7k + 10k + 13k = 30k = \frac{30}{72} = \frac{5}{12}$.
$p_Y(y) = \sum_{x=0}^{2} p(x, y)$:
$p_Y(1) = 3k + 5k + 7k = 15k = \frac{15}{72} = \frac{5}{24}$,
$p_Y(2) = 6k + 8k + 10k = 24k = \frac{24}{72} = \frac{1}{3}$,
$p_Y(3) = 9k + 11k + 13k = 33k = \frac{33}{72} = \frac{11}{24}$.
(c) Conditional distributions:
$p_{X|Y}(x \mid y) = \frac{p(x, y)}{p_Y(y)}$. For $y = 1$:
$p_{X|Y}(0 \mid 1) = \frac{3k}{15k} = \frac{3}{15} = \frac{1}{5}$, $p_{X|Y}(1 \mid 1) = \frac{5k}{15k} = \frac{1}{3}$, $p_{X|Y}(2 \mid 1) = \frac{7k}{15k} = \frac{7}{15}$.
For $y = 2$, $y = 3$, compute similarly.
$p_{Y|X}(y \mid x) = \frac{p(x, y)}{p_X(x)}$. For $x = 0$:
$p_{Y|X}(1 \mid 0) = \frac{3k}{18k} = \frac{1}{6}$, $p_{Y|X}(2 \mid 0) = \frac{6k}{18k} = \frac{1}{3}$, $p_{Y|X}(3 \mid 0) = \frac{9k}{18k} = \frac{1}{2}$.
(d) Distribution of $X + Y$: Possible values of $X + Y$: 1 to 5.
$P(X + Y = 1)$: (0, 1) $\implies \frac{3}{72}$,
$P(X + Y = 2)$: (0, 2), (1, 1) $\implies \frac{6}{72} + \frac{5}{72} = \frac{11}{72}$,
$P(X + Y = 3)$: (0, 3), (1, 2), (2, 1) $\implies \frac{9}{72} + \frac{8}{72} + \frac{7}{72} = \frac{24}{72} = \frac{1}{3}$,
$P(X + Y = 4)$: (1, 3), (2, 2) $\implies \frac{11}{72} + \frac{10}{72} = \frac{21}{72} = \frac{7}{24}$,
$P(X + Y = 5)$: (2, 3) $\implies \frac{13}{72}$.
Table of distributions:

X \Y      1        2        3        pX(x)
0         3/72     6/72     9/72     1/4
1         5/72     8/72     11/72    1/3
2         7/72     10/72    13/72    5/12
pY(y)     5/24     1/3      11/24

X + Y         1        2        3       4        5
P(X + Y)      3/72     11/72    1/3     7/24     13/72

Answer: See table for distributions.
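
All of the distributions above, including the conditional distributions left as "compute similarly", can be generated with exact fractions in Python. This is a minimal sketch; the dictionary names (`joint`, `p_x`, `p_y`, `x_given_y`, `y_given_x`, `p_sum`) are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction

x_values, y_values = (0, 1, 2), (1, 2, 3)

# Normalising constant: the weights 2x + 3y must sum to 1 after scaling by k.
k = Fraction(1, sum(2 * x + 3 * y for x in x_values for y in y_values))
joint = {(x, y): k * (2 * x + 3 * y) for x in x_values for y in y_values}

# Marginals: sum the joint PMF over the other variable.
p_x = {x: sum(joint[x, y] for y in y_values) for x in x_values}
p_y = {y: sum(joint[x, y] for x in x_values) for y in y_values}

# Conditionals, keyed as (value, given): joint probability over the marginal.
x_given_y = {(x, y): joint[x, y] / p_y[y] for x in x_values for y in y_values}
y_given_x = {(y, x): joint[x, y] / p_x[x] for x in x_values for y in y_values}

# Distribution of X + Y: accumulate joint probabilities by the value of the sum.
p_sum = defaultdict(Fraction)
for (x, y), prob in joint.items():
    p_sum[x + y] += prob

for total in sorted(p_sum):
    print(total, p_sum[total])
```

Using `Fraction` keeps every probability exact, so the results can be compared with the table entry by entry.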


2. Compute the coefficient of correlation between X and Y for the following data:
X: 65, 67, 66, 71, 67, 70, 68, 69
Y : 67, 68, 68, 70, 64, 67, 72, 70
Interpret the result. (Correlation - 2D)

Solution:
Formula:

\[
r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}
\]

Data: $n = 8$, $\sum X = 543$, $\sum Y = 546$, $\sum X^2 = 36885$, $\sum Y^2 = 37306$, $\sum XY = 37073$.

\[
r = \frac{8(37073) - (543)(546)}{\sqrt{[8(36885) - (543)^2][8(37306) - (546)^2]}} = \frac{296584 - 296478}{\sqrt{(231)(332)}} = \frac{106}{\sqrt{76692}} \approx \frac{106}{276.93} \approx 0.383
\]

Interpretation: r ≈ 0.383 indicates a weak-to-moderate positive linear relationship.
Answer: r ≈ 0.383 (weak-to-moderate positive correlation).
3. Ten students got the following percentage of marks in Economics and Statistics.
Calculate the coefficient of correlation and interpret the result.

Roll. No 1 2 3 4 5 6 7 8 9 10
Marks in Economics 78 36 98 25 75 82 90 62 65 39
Marks in Statistics 84 51 91 60 68 62 86 58 53 47

(Correlation - 2D)

Solution:
Let X be the marks in Economics and Y the marks in Statistics.
$n = 10$, $\sum X = 650$, $\sum Y = 660$, $\sum X^2 = 47648$, $\sum Y^2 = 45784$, $\sum XY = 45604$.

\[
r = \frac{10(45604) - (650)(660)}{\sqrt{[10(47648) - (650)^2][10(45784) - (660)^2]}} = \frac{27040}{\sqrt{(53980)(22240)}} \approx \frac{27040}{34648.5} \approx 0.780
\]

Interpretation: r ≈ 0.780 indicates a strong positive linear relationship.
Answer: r ≈ 0.780 (strong positive correlation).
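
With larger data values, the sums of squares and products are easy to get wrong by hand, so it helps to recompute r directly from the raw data. The sketch below uses the equivalent deviations-from-the-mean form of the formula and applies it to the data of questions 2 and 3; the function and variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import mean


def pearson_r_centered(xs, ys):
    """Pearson r computed from deviations about the sample means."""
    mean_x, mean_y = mean(xs), mean(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Question 2 data.
q2_x = [65, 67, 66, 71, 67, 70, 68, 69]
q2_y = [67, 68, 68, 70, 64, 67, 72, 70]

# Question 3 data: marks in Economics (X) and Statistics (Y).
q3_x = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]
q3_y = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]

r_q2 = pearson_r_centered(q2_x, q2_y)
r_q3 = pearson_r_centered(q3_x, q3_y)
print(round(r_q2, 3), round(r_q3, 3))
```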
4. A survey of 500 individuals examines the association between age group (Under 30,
30–50, Over 50) and product preference (Like, Dislike). The observed frequencies
are:


Age Group     Like     Dislike
Under 30      80       70
30–50         100      120
Over 50       60       70

At a 5% significance level, perform a Chi-Square test to determine whether age
group and product preference are independent. State the hypotheses, calculate the
test statistic, and conclude. (Chi-Square Test for Independence)

Solution:
Hypotheses: $H_0$: Age group and product preference are independent. $H_1$: They
are not independent.
Expected frequencies: Total = 500, row totals: 150, 220, 130; column totals: 240,
260.
$E(\text{Under 30, Like}) = \frac{150 \times 240}{500} = 72$, $E(\text{Under 30, Dislike}) = 78$,
$E(\text{30–50, Like}) = 105.6$, $E(\text{30–50, Dislike}) = 114.4$,
$E(\text{Over 50, Like}) = 62.4$, $E(\text{Over 50, Dislike}) = 67.6$.

\[
\chi^2 = \sum \frac{(O - E)^2}{E} \approx 2.458
\]

$df = (3 - 1)(2 - 1) = 2$, $\chi^2_{0.05,2} = 5.991$. Since $2.458 < 5.991$, fail to reject $H_0$.
Conclusion: There is insufficient evidence to suggest age group and
product preference are dependent.
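
The expected frequencies and the test statistic can be sketched in Python as below. The critical value 5.991 is taken from the Chi-Square table as in the solution; the p-value line relies on the fact that a chi-square variable with 2 degrees of freedom has survival function exp(−x/2), so it applies only to this df = 2 case. The names `chi_square_independence` and `CHI2_CRIT_005_DF2` are illustrative assumptions.

```python
from math import exp

# Critical value chi^2_{0.05, 2}, read from a Chi-Square table.
CHI2_CRIT_005_DF2 = 5.991


def chi_square_independence(table):
    """Return (statistic, df, expected) for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp_) ** 2 / exp_
        for obs_row, exp_row in zip(table, expected)
        for obs, exp_ in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


observed = [[80, 70], [100, 120], [60, 70]]  # rows: Under 30, 30-50, Over 50
chi2_stat, chi2_df, expected_counts = chi_square_independence(observed)
p_value_df2 = exp(-chi2_stat / 2)  # valid only because df == 2
print(round(chi2_stat, 3), chi2_df, chi2_stat > CHI2_CRIT_005_DF2)
```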
5. The daily production of a factory follows a distribution with a mean of 200 units
and a variance of 25 units. A random sample of 50 days is taken. Using the Central
Limit Theorem, find the probability that the sample mean production is between
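198 and 202 units. (Central Limit Theorem)

Solution:
Given: $\mu = 200$, $\sigma = 5$, $n = 50$. By CLT, $\bar{X} \sim N\left(200, \frac{25}{50}\right)$.
Standard error: $\frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{50}} \approx 0.707$.
$z_1 = \frac{198 - 200}{0.707} \approx -2.828$, $z_2 = \frac{202 - 200}{0.707} \approx 2.828$.
$P = P(Z \le 2.828) - P(Z \le -2.828) \approx 0.9977 - 0.0023 = 0.9954$.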
Answer: P ≈ 0.9954.
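
A minimal Python sketch of the same calculation, using the normal approximation for the sample mean. Working at full precision gives about 0.9953; the table-based figure differs in the fourth decimal place because the z-value is rounded before the lookup. The function name `prob_sample_mean_between` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_between(mu, variance, n, low, high):
    """P(low < Xbar < high) under the CLT approximation Xbar ~ N(mu, variance/n)."""
    sampling_dist = NormalDist(mu, sqrt(variance / n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


p_between = prob_sample_mean_between(mu=200, variance=25, n=50, low=198, high=202)
print(round(p_between, 4))
```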
6. A survey in two cities, A and B, determines support for a new policy. In City
A, 320 out of 500 residents support the policy, while in City B, 280 out of 450
support it. At a 5% significance level, test the null hypothesis that the proportion
of supporters is the same in both cities against the alternative that they differ.
State the hypotheses, compute the test statistic, and conclude. (Large Sample Test
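for Difference of Proportions)

Solution:
Hypotheses: $H_0: p_A = p_B$ vs $H_1: p_A \neq p_B$ (two-tailed test).
Sample proportions: $\hat{p}_A = \frac{320}{500} = 0.64$, $\hat{p}_B = \frac{280}{450} \approx 0.6222$.
Pooled proportion: $\hat{p} = \frac{320 + 280}{500 + 450} = \frac{600}{950} \approx 0.632$.
Standard error under $H_0$:

\[
SE = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{500} + \frac{1}{450}\right)} \approx 0.03134
\]

Test statistic:

\[
z = \frac{\hat{p}_A - \hat{p}_B}{SE} = \frac{0.64 - 0.6222}{0.03134} \approx 0.567
\]

Critical value: $z_{0.025} = 1.96$. Since $|z| \approx 0.567 < 1.96$, fail to reject $H_0$.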
Conclusion: There is insufficient evidence to suggest the proportions
differ.
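
A minimal Python sketch of the pooled two-proportion z-test, working at full floating-point precision rather than with rounded intermediate proportions. The function name `two_proportion_z` is an illustrative assumption, and the two-sided p-value is an addition to the hand solution.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided large-sample z-test for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails
    return z, p_value


z_cities, p_cities = two_proportion_z(x1=320, n1=500, x2=280, n2=450)
reject_equal = abs(z_cities) > 1.96  # two-tailed critical value at alpha = 0.05
print(round(z_cities, 3), round(p_cities, 3), reject_equal)
```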
