
Study Notes on Estimation

1. Sampling Distributions

 Population vs. Sample:


o Population: Entire group being studied.
o Sample: Subset of the population.
 Population Parameter vs. Sample Statistic:
o Population Parameter: Numerical value from the entire population (e.g., mean, variance).
o Sample Statistic: Value derived from a sample (e.g., sample mean, sample variance).
 Population Distribution vs. Sampling Distribution:
o Population Distribution: Probability distribution for the entire population.
o Sampling Distribution: Distribution of a sample statistic (like sample mean) across multiple
samples.
 Estimators: A sample statistic that estimates a population parameter (e.g., sample mean estimating
population mean).

2. Sampling Distribution of the Sample Mean

 Example: If the class has test scores 70, 78, 80, 80, and 95:
o Population Distribution: Probability distribution for the 5 test scores.
o Sample Mean: Calculate using:

$\bar{x} = \frac{\sum x}{n}$

o Standard Deviation of the Sample Mean:

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

 Important Properties:
o The mean of the sampling distribution of sample means ($\mu_{\bar{x}}$) is equal to the population mean ($\mu$).
o The standard deviation of the sampling distribution ($\sigma_{\bar{x}}$) is the population standard deviation ($\sigma$) divided by the square root of the sample size ($n$); see the sketch below.
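These two properties can be verified numerically. The following is a minimal Python sketch (assuming NumPy is installed) that enumerates every possible sample of size 2 drawn with replacement from the five test scores; the sample size n = 2 is an illustrative choice, not from the document.

```python
import itertools
import numpy as np

scores = np.array([70, 78, 80, 80, 95])   # the population of 5 test scores
mu = scores.mean()                         # population mean = 80.6
sigma = scores.std()                       # population SD (divide by N, not N-1)

n = 2  # illustrative sample size; sampling is with replacement
sample_means = [np.mean(s) for s in itertools.product(scores, repeat=n)]

print(mu, np.mean(sample_means))                 # equal: mean of sample means = mu
print(sigma / np.sqrt(n), np.std(sample_means))  # equal: SD of means = sigma / sqrt(n)
```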

3. Central Limit Theorem (CLT)

 CLT: For large sample sizes ($n \geq 30$), the sampling distribution of the sample mean becomes approximately normal, even if the population is not normally distributed.
 Key Formula:
o For a large sample size, the sample mean ($\bar{x}$) follows a normal distribution:

$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
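As an illustration not taken from the document, the sketch below draws many samples of size 30 from a strongly right-skewed exponential population and checks that the sample means behave as the CLT predicts; the population choice and simulation sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 30, 100_000   # n >= 30, per the rule of thumb above

# Exponential population: mean 1, SD 1, strongly right-skewed (not normal).
means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

print(means.mean())   # close to mu = 1
print(means.std())    # close to sigma / sqrt(n) = 1 / sqrt(30), about 0.183
```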

4. Estimation

 Definition: Estimation is the process of assigning values to a population parameter based on sample
statistics.
 Estimator: A sample statistic used to estimate a population parameter.
o Point Estimate: A single value estimate of a population parameter (e.g., sample mean).
o Interval Estimate: A range around the point estimate, providing more confidence that the
true parameter lies within this range.

5. Confidence Interval (CI)

 Formula for Confidence Interval (when population standard deviation $\sigma$ is known):

$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

o Where:
 $\bar{x}$: Sample mean
 $z_{\alpha/2}$: Z-value corresponding to the desired confidence level (e.g., 1.96 for a 95% CI)
 Confidence Level: The long-run proportion of such intervals that contain the true population parameter (e.g., 95% confidence means that 95% of intervals constructed this way would contain the true mean); a sketch follows.
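A minimal sketch of the z-based interval, assuming SciPy is available; the sample mean, σ, and n below are made-up values for illustration.

```python
import numpy as np
from scipy import stats

x_bar, sigma, n = 154.8, 31.0, 28   # made-up sample mean, known sigma, sample size
conf = 0.95

z = stats.norm.ppf(1 - (1 - conf) / 2)   # z_{alpha/2}; 1.96 for 95% confidence
margin = z * sigma / np.sqrt(n)
print(x_bar - margin, x_bar + margin)    # the 95% confidence interval for mu
```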

6. Estimating the Population Mean

 When $\sigma^2$ is Known:
o Point Estimate: Use the sample mean $\bar{x}$ to estimate the population mean $\mu$.
o Confidence Interval:

$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$

 When $\sigma^2$ is Unknown (using the $t$-distribution for small samples):

$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

o Degrees of Freedom: $df = n - 1$
o Critical Value: $t_{\alpha/2}$ from the $t$-table based on the degrees of freedom (see the sketch below).
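When σ is unknown, the t-based interval can be computed as below; the data are hypothetical, and SciPy's `t.interval` is shown as a cross-check of the manual calculation.

```python
import numpy as np
from scipy import stats

data = np.array([12.1, 11.8, 12.5, 12.0, 11.6, 12.3])   # hypothetical small sample
n = len(data)
x_bar, s = data.mean(), data.std(ddof=1)   # ddof=1 gives the sample SD

t_crit = stats.t.ppf(0.975, df=n - 1)      # t_{alpha/2} with df = n - 1
margin = t_crit * s / np.sqrt(n)
print(x_bar - margin, x_bar + margin)

# Cross-check with SciPy's built-in interval (same result):
print(stats.t.interval(0.95, df=n - 1, loc=x_bar, scale=stats.sem(data)))
```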

7. Interval Estimation for Two Population Means

 Independent Samples:
o When variances are known:

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \cdot \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

o When variances are unknown, estimate them with the sample variances (if equal variances are assumed, the pooled form $s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ is used instead):

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2, df} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

 Dependent Samples (paired data):
o Calculate the difference for each pair and use these differences to construct a confidence interval for the population mean difference.
8. Estimating Population Variance

 Confidence Interval for Variance:

$\frac{(n-1)s^2}{\chi^2_{\alpha/2, df}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2, df}}$

o Where:
 $\chi^2$: Chi-square distribution critical values based on the degrees of freedom ($df = n - 1$); see the sketch below.
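A short sketch of this interval with a made-up sample variance; note that SciPy's `chi2.ppf` takes a lower-tail probability, so the upper critical value $\chi^2_{\alpha/2}$ corresponds to `ppf(1 - alpha/2)`.

```python
from scipy import stats

s2, n, alpha = 1.5, 20, 0.05   # made-up sample variance, sample size, alpha
df = n - 1

# Divide (n-1)s^2 by the upper critical value for the lower bound, and
# by the lower critical value for the upper bound.
lower = df * s2 / stats.chi2.ppf(1 - alpha / 2, df)
upper = df * s2 / stats.chi2.ppf(alpha / 2, df)
print(lower, upper)   # 95% CI for sigma^2; take square roots for a CI on sigma
```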

9. Chi-Square and F Distributions

 Chi-Square Distribution: Used for variance estimation; skewed to the right and based on degrees of
freedom.
 F Distribution: Used to test the ratio of two variances.

Key Takeaways

 Sample Size and Confidence: Larger sample sizes reduce the margin of error in confidence
intervals.
 CLT: The larger the sample size, the closer the sampling distribution of the sample mean is to normal, even for
non-normal populations.
 Estimators: Good estimators are unbiased, consistent, and efficient.
 Interval Estimation: Use CI formulas to estimate population parameters with a given confidence
level.

This summary should provide a solid foundation for beginner students to understand estimation in statistics.
Study Notes on Hypothesis Testing

1. Introduction to Hypothesis Testing

 Hypothesis in Statistics: A hypothesis is a statement or conjecture about a population parameter.


 Null Hypothesis ($H_0$): The assumption that there is no effect or difference; it is presumed true until the evidence suggests otherwise.
 Alternative Hypothesis ($H_1$): The hypothesis that contradicts $H_0$, suggesting there is an effect or difference.

2. Types of Hypotheses

 Null Hypothesis: Represents the statement we assume to be true initially. Example: "The mean is
20."
 Alternative Hypothesis: Represents what we aim to prove. Example: "The mean is different from
20."

3. Types of Errors

 Type I Error (False Positive): Rejecting $H_0$ when it is actually true. The probability of this error is denoted $\alpha$, also called the significance level.
 Type II Error (False Negative): Failing to reject $H_0$ when it is actually false. The probability of this error is denoted $\beta$.

4. Rejection and Non-Rejection Regions

 Rejection Region: The part of the distribution where you reject the null hypothesis if the test statistic
falls in this region.
 Non-Rejection Region: The part of the distribution where you do not reject the null hypothesis.

5. Tails of a Test

 Two-Tailed Test: The rejection region is in both tails of the distribution. Used when the alternative
hypothesis claims the parameter is different (not equal).
 Left-Tailed Test: The rejection region is in the left tail of the distribution. Used when the alternative
hypothesis claims the parameter is less than a certain value.
 Right-Tailed Test: The rejection region is in the right tail of the distribution. Used when the
alternative hypothesis claims the parameter is greater than a certain value.

6. Approaches to Hypothesis Testing

1. $p$-Value Approach:
o Calculate the $p$-value, which is the probability of observing the test statistic or something more extreme, assuming the null hypothesis is true.
o Decision Rule: Reject $H_0$ if $p \leq \alpha$; otherwise, do not reject $H_0$.
2. Critical Value Approach:
o Determine the critical value(s) from the table (normal or $t$ distribution) based on the significance level ($\alpha$).
o Decision Rule: Compare the test statistic with the critical value:
 Two-tailed test: Reject $H_0$ if $z_{\text{calc}} \geq z_{\alpha/2}$ or $z_{\text{calc}} \leq -z_{\alpha/2}$.
 Right-tailed test: Reject $H_0$ if $z_{\text{calc}} \geq z_{\alpha}$.
 Left-tailed test: Reject $H_0$ if $z_{\text{calc}} \leq -z_{\alpha}$.

7. Steps to Perform a Hypothesis Test

1. State the Hypotheses: Define $H_0$ and $H_1$.
o Two-tailed: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
o Right-tailed: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
o Left-tailed: $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
2. Assume $H_0$ is true and select the appropriate distribution.
3. Determine the rejection region based on $\alpha$.
4. Calculate the test statistic (either from software output or using the formulas).
5. Compare the test statistic with the critical value, or the $p$-value with $\alpha$.
6. Make a decision: Reject or do not reject $H_0$.
7. Draw a conclusion: Based on the decision, state whether the evidence supports $H_1$ (see the sketch below).
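These steps can be carried out in a few lines of Python. The sketch below tests $H_0: \mu = 20$ against a two-sided alternative on a made-up sample, using SciPy's one-sample t-test.

```python
import numpy as np
from scipy import stats

data = np.array([21.3, 19.8, 22.1, 20.5, 23.0, 18.9, 21.7, 20.2])  # made-up sample
mu0, alpha = 20.0, 0.05

# Steps 1-4: H0: mu = 20 vs H1: mu != 20; t-distribution; test statistic.
t_stat, p_value = stats.ttest_1samp(data, popmean=mu0)

# Steps 5-7: compare p with alpha, decide, and conclude.
print(t_stat, p_value)
print("reject H0" if p_value <= alpha else "do not reject H0")
```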

8. Examples of Hypothesis Tests

1. One Population Mean Test:


o Example: Testing if the mean length of calls differs from 13.14 minutes using a sample of
150 calls.
2. Two Population Means Test:
o Example: Testing if the mean sales per day for two supermarkets differ.
3. Variance and Standard Deviation Tests:
o Example: Testing if the variance of scores in a class is less than the population variance.
4. Testing Differences in Variances:
o Example: Comparing the variances of waiting times in two hospitals.

9. Statistical Methods for Testing

1. $z$-Test: Used when the population standard deviation is known and the sample size is large.
2. $t$-Test: Used when the population standard deviation is unknown or the sample size is small.
3. Chi-Square Test: Used for testing variance or standard deviation.
4. F-Test: Used for comparing variances between two samples.

10. Common Statistical Tests

 One-Sample t-Test: Used to test the mean of a single population.


 Two-Sample t-Test: Used to test if two population means are different.
 Paired t-Test: Used to test the difference in means from two related samples.
 Chi-Square Test: Used to test the variance of a single population (to compare variances between two populations, use the F-test).
Key Formulas for Hypothesis Testing

 Test Statistic for One Mean:

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

Where:

o $\bar{x}$: Sample mean
o $\mu_0$: Hypothesized population mean
o $s$: Sample standard deviation
o $n$: Sample size

 Test Statistic for Two Means (with equal variances):

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

Where:

o $\bar{x}_1, \bar{x}_2$: Sample means
o $s_p$: Pooled standard deviation
o $n_1, n_2$: Sample sizes

 Chi-Square Test for Variance:

$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$

Where:

o $n$: Sample size
o $s^2$: Sample variance
o $\sigma_0^2$: Hypothesized population variance (a worked sketch follows)
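As a worked illustration of the last formula, the sketch below runs a two-tailed chi-square test for a variance; the values of n, s², and σ₀² are assumptions for illustration, not from the document.

```python
from scipy import stats

n, s2, sigma0_sq = 25, 16.2, 12.0   # made-up sample size, sample variance, H0 variance
alpha = 0.05
df = n - 1

chi2_stat = (n - 1) * s2 / sigma0_sq
lower = stats.chi2.ppf(alpha / 2, df)       # lower critical value
upper = stats.chi2.ppf(1 - alpha / 2, df)   # upper critical value

print(chi2_stat, (lower, upper))
print("reject H0" if chi2_stat < lower or chi2_stat > upper else "do not reject H0")
```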

Conclusion

Hypothesis testing is a fundamental statistical method used to make inferences about populations based on
sample data. By following the correct steps and using the appropriate test statistic, you can determine
whether there is enough evidence to reject the null hypothesis and support the alternative hypothesis.
Study Notes on Analysis of Variance (ANOVA)

1. Introduction to ANOVA

 Purpose: ANOVA is used to test if there are significant differences between the means of three or
more groups.
 Example: Testing if three different teaching methods produce different mean scores for students.

2. Hypotheses in ANOVA

 Null Hypothesis ($H_0$): All population means are equal.
o Example: $H_0: \mu_1 = \mu_2 = \mu_3$
 Alternative Hypothesis ($H_1$): At least one population mean is different from the others.
o Example: $H_1$: At least one mean is different

3. Key Definitions in Experimental Design

 Treatment: A condition or set of conditions imposed by the experimenter on the groups.


 Level: The magnitude of a treatment (e.g., different doses or methods).
 Factor: A general category of treatments (e.g., type of medication).
 Block: Groups of subjects with homogeneous characteristics.
 Randomisation: Randomly assigning elements to groups.
 Designed Experiment: The experimenter controls the assignment of elements to groups.
 Observational Study: The experimenter observes without controlling the group assignments.

4. One-Way ANOVA

 Purpose: Analyzes one factor or independent variable with multiple levels (groups).
 Assumptions:
o Populations are normally distributed.
o Equal variances across groups.
o Random and independent samples.

Hypotheses for One-Way ANOVA:

 Null Hypothesis ($H_0$): All group means are equal.
 Alternative Hypothesis ($H_1$): At least two group means are different.

5. Key Components of One-Way ANOVA

 Between-group variance (MSA): Variance due to differences between group means.
 Within-group variance (MSE): Variance within each group, unaffected by differences between group means.
 F-Statistic: Ratio of MSA to MSE.

$F = \frac{MSA}{MSE}$

 Degrees of Freedom:
o Numerator ($d.f.N.$): $k - 1$ (where $k$ is the number of groups).
o Denominator ($d.f.D.$): $N - k$ (where $N$ is the total sample size).

6. Steps in One-Way ANOVA

1. State the Hypotheses:
o $H_0: \mu_1 = \mu_2 = \mu_3$
o $H_1$: At least one mean is different
2. Calculate the F-statistic using MSA and MSE.
3. Determine the critical value for $F$ or use the $p$-value approach.
4. Make a decision: Reject or fail to reject $H_0$.
5. Draw a conclusion.

7. Example: One-Way ANOVA

 Data: Three teaching methods with student scores.
 Calculate the F-statistic based on between-group and within-group variance.
 Conclusion: If the $p$-value < 0.05, reject $H_0$, indicating a significant difference between methods (see the sketch below).
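A minimal sketch of this example with hypothetical scores for the three methods; `scipy.stats.f_oneway` computes the F-statistic and p-value directly.

```python
from scipy import stats

# Hypothetical scores under three teaching methods.
method_a = [85, 78, 92, 88, 76]
method_b = [72, 69, 80, 74, 71]
method_c = [90, 94, 85, 91, 89]

f_stat, p_value = stats.f_oneway(method_a, method_b, method_c)
print(f_stat, p_value)
print("reject H0" if p_value < 0.05 else "do not reject H0")
```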

8. ANOVA Table

An ANOVA table summarizes the calculations for each component of the ANOVA.

 Source of Variation:
o Treatments: Between-group variance.
o Error: Within-group variance.
o Total: Total variance.

Example ANOVA table:

Source of Variation  Degrees of Freedom  Sum of Squares  Mean Square  F-Statistic
Treatments           $k - 1$             SSA             MSA          $F$
Error                $N - k$             SSE             MSE
Total                $N - 1$             SST

9. Randomised Complete Block Design

 Blocking: Dividing subjects into homogeneous blocks (e.g., age groups) to control for confounding
variables.
 ANOVA Table for Block Design: Includes additional terms for blocks.

Example ANOVA table for Randomized Block Design:

Source of Variation  Degrees of Freedom  Sum of Squares  Mean Square  F-Statistic
Treatments           $k - 1$             SSA             MSA          $F_A$
Blocks               $b - 1$             SSB             MSB          $F_B$
Error                $(k - 1)(b - 1)$    SSE             MSE
Total                $kb - 1$            SST

10. Two-Way ANOVA

 Purpose: Analyzes the impact of two factors on a dependent variable and their interaction.
 Main Effects: Effects of each independent variable (e.g., type of gasoline and type of automobile).
 Interaction Effect: Whether the effects of one factor depend on the level of the other factor.

Hypotheses for Two-Way ANOVA:

1. Interaction Effect: $H_0$: No interaction effect between factors A and B.
2. Main Effect (Factor A): $H_0$: No difference between levels of factor A.
3. Main Effect (Factor B): $H_0$: No difference between levels of factor B.

11. Two-Way ANOVA Table

For two factors (A and B), the table includes:

 Factor A: Main effect of factor A.


 Factor B: Main effect of factor B.
 Interaction (A × B): Interaction effect between A and B.

Example ANOVA table for Two-Way ANOVA:

Source of Variation  Degrees of Freedom  Sum of Squares  Mean Square  F-Statistic
Factor A             $a - 1$             SSA             MSA          $F_A$
Factor B             $b - 1$             SSB             MSB          $F_B$
Interaction (A × B)  $(a - 1)(b - 1)$    SSAB            MSAB         $F_{A \times B}$
Error                $ab(n - 1)$         SSE             MSE
Total                $abn - 1$           SST

12. Two-Way ANOVA Example

 Data: Effects of gasoline type and automobile type on fuel efficiency.
 Hypotheses:
o $H_0$: No interaction between gasoline type and automobile type.
o $H_0$: No difference in fuel efficiency due to gasoline type.
o $H_0$: No difference in fuel efficiency due to automobile type.

Conclusion: If the $p$-value for the interaction effect is small, reject $H_0$ and conclude that there is an interaction between the two factors. A sketch of this analysis follows.
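Below is a sketch of this two-way analysis using statsmodels (assuming pandas and statsmodels are installed); the fuel-efficiency numbers are invented for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical data: 2 gasoline types x 2 automobile types, 3 replicates each.
df = pd.DataFrame({
    "gas": ["regular"] * 6 + ["premium"] * 6,
    "car": (["sedan"] * 3 + ["suv"] * 3) * 2,
    "mpg": [28, 27, 29, 22, 21, 23, 30, 31, 29, 24, 23, 25],
})

# 'C(gas) * C(car)' expands to both main effects plus the interaction term.
model = smf.ols("mpg ~ C(gas) * C(car)", data=df).fit()
print(anova_lm(model, typ=2))   # F and p-value for each source of variation
```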

Key Takeaways
 ANOVA is used to compare the means of three or more groups.
 One-Way ANOVA: Tests one factor.
 Two-Way ANOVA: Tests the effect of two factors and their interaction.
 Blocking reduces experimental error by grouping similar units.
 Use F-statistics to determine if the differences between group means are statistically significant.

This breakdown should help you better understand ANOVA for analyzing variance in data, starting from
simple one-way designs to more complex two-way designs.
Solutions for Questions 2 to 4

QUESTION 2

a) Calculate the missing values of $\bar{x}$ and $s^2$

Given:

 Sample size $n = 28$
 $SE = 6.50$
 $s = 34.40$
 Sum of data $\Sigma x = 4334.00$
 Sum of squares $\Sigma x^2 = 702796.00$

To calculate $\bar{x}$ and $s^2$:

1. Sample Mean $\bar{x}$:

$\bar{x} = \frac{\Sigma x}{n} = \frac{4334.00}{28} = 154.79$

2. Sample Variance $s^2$:

Using the formula for variance:

$s^2 = \frac{\Sigma x^2 - \frac{(\Sigma x)^2}{n}}{n-1}$

Substituting the values:

$s^2 = \frac{702796 - \frac{4334^2}{28}}{27} = \frac{702796 - 670841.29}{27} = \frac{31954.71}{27} = 1183.51$

(This agrees with the given values, since $s = \sqrt{1183.51} = 34.40$ and $SE = s/\sqrt{28} = 6.50$.)

b) Construct the 98% confidence interval for the mean serum cholesterol level

For the 98% confidence interval:

 $\bar{x} = 154.79$
 $s = 34.40$
 $n = 28$

Using the formula for the confidence interval:

$CI = \bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

Where $t_{\alpha/2}$ for 98% confidence and 27 degrees of freedom is approximately 2.473 (from the t-table).

Thus:

$CI = 154.79 \pm 2.473 \cdot \frac{34.40}{\sqrt{28}} = 154.79 \pm 2.473 \cdot 6.50 = 154.79 \pm 16.08$

So, the confidence interval is:

$CI = (138.71, 170.87)$

c) Test at 2% significance level whether the standard deviation is different from 31 mg/dL

Null hypothesis $H_0: \sigma = 31$
Alternative hypothesis $H_1: \sigma \neq 31$

Use the Chi-Square test:

$\chi^2 = \frac{(n-1) s^2}{\sigma_0^2}$

Where:

 $n = 28$
 $s^2 = 1183.51$
 $\sigma_0 = 31$

Thus:

$\chi^2 = \frac{(28-1) \cdot 1183.51}{31^2} = \frac{27 \cdot 1183.51}{961} = 33.25$

Using the Chi-Square distribution with 27 degrees of freedom and a significance level of $\alpha = 0.02$, the critical values for the two-tailed test are approximately $\chi^2_{0.99, 27} = 12.879$ and $\chi^2_{0.01, 27} = 46.963$.

Since $12.879 < 33.25 < 46.963$, the test statistic falls in the non-rejection region, so we fail to reject $H_0$. There is insufficient evidence that the standard deviation differs from 31 mg/dL.

d) Can it be concluded that the distribution of serum cholesterol levels for low-income rural residents is normal with mean 160 mg/dL and standard deviation 31 mg/dL?

Based on the results from (b) and (c), the confidence interval for the mean was $(138.71, 170.87)$, which includes 160, so it is plausible that the population mean is 160 mg/dL. The Chi-Square test in (c) also failed to reject the hypothesis that the standard deviation is 31 mg/dL, so the data are consistent with that standard deviation as well.

However, neither result assesses normality itself, so we cannot conclude that the distribution is normal with mean 160 mg/dL and standard deviation 31 mg/dL without a separate test of normality (e.g., a goodness-of-fit test).
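The arithmetic in parts (a) to (c) can be verified with a short script (SciPy assumed available):

```python
import numpy as np
from scipy import stats

n, sum_x, sum_x2 = 28, 4334.00, 702796.00

x_bar = sum_x / n                        # 154.79
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)   # 1183.51, so s = 34.40
se = np.sqrt(s2 / n)                     # 6.50

t_crit = stats.t.ppf(0.99, df=n - 1)     # 2.473 for a 98% CI (df = 27)
print(x_bar - t_crit * se, x_bar + t_crit * se)   # approx (138.71, 170.87)

chi2_stat = (n - 1) * s2 / 31**2         # 33.25
print(chi2_stat)
print(stats.chi2.ppf(0.01, 27), stats.chi2.ppf(0.99, 27))   # 12.879 and 46.963
```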

QUESTION 3

a) Missing values for paired t-test

We are given the following data:

 Mean for Clay Pipe: 1.632
 Mean for Cement Blocks & Brush: 2.189
 Standard deviation for Clay Pipe: 2.662
 Standard deviation for Cement Blocks & Brush: 3.195
 Standard deviation of differences $s_d$: 0.810

i) Finding the missing values of $n$, $\bar{d}$, and $s_d$

 $n = 12$ (already provided)
 The mean difference (Clay Pipe minus Cement Blocks & Brush): $\bar{d} = 1.632 - 2.189 = -0.557$
 Standard deviation of the differences: $s_d = 0.810$

ii) Showing the T-value

To calculate the T-value:

$T = \frac{\bar{d}}{SE_{\bar{d}}}$

Where $SE_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{0.810}{\sqrt{12}} = 0.234$.

Thus:

$T = \frac{-0.557}{0.234} = -2.38$

This matches the given value of $T = -2.38$ (the sign is negative because the differences are taken as Clay Pipe minus Cement Blocks & Brush).

iii) Hypothesis Test for the Mean Difference

 Null hypothesis $H_0: \mu_d = 0$
 Alternative hypothesis $H_1: \mu_d < 0$

Using the calculated T-value, we find the critical value for a left-tailed test at $\alpha = 0.05$ with 11 degrees of freedom. The critical value is approximately $-1.796$.

Since $T = -2.38 < -1.796$, we reject $H_0$. Therefore, there is sufficient evidence to conclude that the mean weight of fish caught using Clay Pipes is less than using Cement Blocks & Brush.

iv) Independent or Dependent Samples?

Since the same fishing periods are used to compare the two methods, the samples are dependent.
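A minimal sketch verifying the paired-test calculations in (i) to (iii), using only the summary statistics given above:

```python
import numpy as np
from scipy import stats

n, s_d = 12, 0.810
d_bar = 1.632 - 2.189                  # mean difference (Clay - Cement) = -0.557

se = s_d / np.sqrt(n)                  # 0.234
t_stat = d_bar / se                    # -2.38
t_crit = stats.t.ppf(0.05, df=n - 1)   # -1.796 for a left-tailed test at alpha = 0.05

print(t_stat, t_crit)
print("reject H0" if t_stat < t_crit else "do not reject H0")
```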

QUESTION 4

a) Data for Tomato Yield and EC Levels

i) Total Sum of Squares (SST)

The total sum of squares is calculated as:

$SST = \sum x^2 - \frac{(\sum x)^2}{N}$

For this problem, it is provided that SST = 570.1.

ii) Testing the Difference in Mean Yield

Null hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ (no difference in mean yield)
Alternative hypothesis: $H_1$: At least one mean is different

Using the F-statistic:

$F = \frac{MSA}{MSE} = \frac{149.870}{8.605} = 17.42$

For $\alpha = 0.05$, the critical value from the F-distribution table with 3 and 14 degrees of freedom is approximately 3.34.

Since $F = 17.42 > 3.34$, we reject $H_0$ and conclude that there is a significant difference in the mean yield of tomatoes at different salinity levels.
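The same comparison can be made with SciPy's F-distribution functions; a minimal sketch:

```python
from scipy import stats

f_stat = 149.870 / 8.605                    # MSA / MSE = 17.42
f_crit = stats.f.ppf(0.95, dfn=3, dfd=14)   # approx 3.34 at alpha = 0.05

print(f_stat, f_crit)
print("reject H0" if f_stat > f_crit else "do not reject H0")
```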

iii) Reason for Using One-Way ANOVA

One-Way ANOVA was used because we are testing the effect of a single factor (salinity, measured as EC level) on the yield of tomatoes across multiple groups.

These solutions should help you understand the methodology and reasoning behind hypothesis testing and
ANOVA in various contexts.

QUESTION 3(b)

Given Information:

The experiment involves testing whether learning could be transferred by nucleic acid, using trained rats
and untrained rats. The data provided include the number of errors made by rats in 20 trials.

We are provided with the following data and results:

Sample               N   Mean   StDev  SE Mean
Untrained Injection  10  10.40  2.55   0.81
Trained Injection    10  9.20   2.30   0.73

 Test and CI for Two Variances (F-Test):


o Estimated Ratio of Variances: 1.22689
o 95% Confidence Interval for Ratio: [0.305, 4.939]
o Test Statistic (F): 1.23
o P-Value: 0.766
o Degrees of Freedom: 9 for both groups (Untrained and Trained).

i) 95% Confidence Interval for the Ratio of Variances

 The 95% Confidence Interval for the Ratio of Variances is already provided as [0.305, 4.939].

Interpretation:

 Since the confidence interval includes 1 (which is the value indicating equal variances), we fail to
reject the null hypothesis that the variances of the two groups are equal. Therefore, we do not have
sufficient evidence to conclude that the variances of errors between the untrained and trained rats are
significantly different.

ii) Test to Check if the Average Number of Errors Made by Rats Injected with Untrained Nucleic
Acid is More Than Those Injected with Trained Nucleic Acid

 Null Hypothesis ($H_0$): There is no difference in the mean number of errors made by rats injected with untrained nucleic acid and those injected with trained nucleic acid.

$H_0: \mu_{\text{Untrained}} = \mu_{\text{Trained}}$

 Alternative Hypothesis ($H_1$): Rats injected with untrained nucleic acid make more errors than those injected with trained nucleic acid.

$H_1: \mu_{\text{Untrained}} > \mu_{\text{Trained}}$


This is a one-tailed test where we are testing if the mean of the untrained group is greater than the mean of
the trained group.

Test Statistic (t-test for Two Independent Samples):

Since we have two independent samples with equal variances (as shown by the F-test), we can use a pooled
t-test for the comparison of means.

The formula for the t-statistic is:

$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

Where:

 $\bar{X}_1 = 10.40$, $\bar{X}_2 = 9.20$ (means of the untrained and trained groups)
 $s_p$: the pooled standard deviation (calculated below)
 $n_1 = n_2 = 10$ (sample sizes for both groups)

First, calculate the pooled standard deviation ($s_p$):

$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{9 \cdot 6.5025 + 9 \cdot 5.29}{18}} = \sqrt{\frac{58.5225 + 47.61}{18}} = \sqrt{5.8963} = 2.43$

Now, calculate the t-statistic:

$t = \frac{10.40 - 9.20}{\sqrt{\frac{5.8963}{10} + \frac{5.8963}{10}}} = \frac{1.20}{\sqrt{1.1793}} = \frac{1.20}{1.0859} = 1.11$
Degrees of Freedom:

For a pooled t-test:

$df = n_1 + n_2 - 2 = 10 + 10 - 2 = 18$

Critical Value:

For $\alpha = 0.05$ (one-tailed test) and $df = 18$, the critical value $t_{\text{critical}}$ from the t-distribution table is approximately 1.734.

Comparison:

 Calculated t-value: 1.11


 Critical t-value: 1.734

Since 1.11 < 1.734, we fail to reject the null hypothesis.

Conclusion:

There is insufficient evidence to conclude that rats injected with untrained nucleic acid make more errors
than those injected with trained nucleic acid at the 5% significance level.
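SciPy can reproduce this pooled test directly from the summary statistics (the `alternative` keyword requires SciPy 1.6 or newer):

```python
from scipy import stats

# Pooled two-sample t-test from the summary statistics in the table above;
# alternative="greater" tests H1: untrained mean > trained mean.
t_stat, p_value = stats.ttest_ind_from_stats(
    mean1=10.40, std1=2.55, nobs1=10,
    mean2=9.20,  std2=2.30, nobs2=10,
    equal_var=True, alternative="greater",
)
print(t_stat, p_value)   # t approx 1.11, p approx 0.14 -> fail to reject H0
```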
iii) Type I Error

A Type I error occurs when the null hypothesis is rejected when it is actually true.

Based on the result in ii), we failed to reject the null hypothesis, so no Type I error can have occurred. A Type I error would only have been possible if we had rejected $H_0$ and concluded that the untrained group made more errors when in fact there was no real difference between the groups.

Thus, I disagree with the claim that a Type I error occurred in this case, because the null hypothesis was not rejected.

Summary of the Solution:

1. Ratio of Variances: The 95% confidence interval for the ratio of variances included 1, meaning we fail to reject the null hypothesis that the variances are equal.
2. Test for Mean Errors: The calculated t-statistic was 1.11, which was less than the critical value of 1.734, so we fail to reject the null hypothesis of equal means; the data do not show that untrained rats make more errors than trained rats.
3. Type I Error: Since the null hypothesis was not rejected, no Type I error occurred in this case.

