Answer To Multiple Regression Task

The document outlines a research study investigating whether the final grades of nutrition students can be predicted from lecture attendance, seminar attendance and first-year grades using multiple regression analysis. The model accounts for 80.1% of the variance in final grades (adjusted R²), and first-year grade is the only individually significant predictor. The study also reports that the assumptions for multiple regression were checked, including normality (Shapiro-Wilk tests) and collinearity (Tolerance and VIF within accepted limits).

Uploaded by

jaasiya

MSc Research Methods

Regression Analysis (Task 1)


ENTER method

A researcher wanted to investigate whether the final grades of nutrition students could be explained by attendance in lectures, attendance in seminars, and first year grades. The researcher collected data on each of these variables from 15 students and the data are presented below.
Student   Lectures attended (%)   Seminars attended (%)   1st year grade (%)   Final grade (%)
1         80                      65                      65.6                 67.2
2         98                      78                      62.4                 68
3         65                      49                      36.8                 54.4
4         81                      57                      56                   68
5         94                      86                      75.2                 67.2
6         75                      65                      64                   60.8
7         62                      48                      24                   57.6
8         100                     100                     80                   80
9         92                      78                      76.8                 71.2
10        56                      48                      19.2                 48
11        65                      52                      64                   61.6
12        69                      59                      68.8                 67.2
13        54                      48                      32                   45.6
14        83                      67                      52.8                 59.2
15        90                      78                      78.4                 72

a) Which of the variables are the predictor (independent) variables?

Lecture attendance, seminar attendance and first year grade

b) What is the outcome (dependent) variable?

Final grade

c) What level of data do you have?

[ ] Nominal
[ ] Ordinal
[ ] Interval
[X] Ratio

d) Does this level of data meet the requirements for a multiple regression analysis?

[X] Yes, but I have to check if the relationship is linear, if there are any outliers and whether the data is normally distributed
[ ] No, I may have to consider transforming my data and then re-check for a linear relationship, outliers and normal distribution

1. Make a scatterplot in Excel and visually inspect the graph to assess if the
relationship is linear or not and if there are any outliers

e) The relationship is linear: YES (delete as appropriate)

f) There are outliers: YES (delete as appropriate)

2. SETUP IN SPSS
The data should be entered into SPSS as shown in the table above. Each student has 4 pieces of information, so there should be 4 columns in SPSS (one row per student). Label your variables in the Variable View screen and enter the data in the Data View screen.

3. CONDUCT TESTS OF NORMALITY


Analyze → Descriptive Statistics → Explore → Move all variables to the Dependent List box → Plots → Select ‘None’ in the Boxplots area → De-select ‘Stem-and-leaf’ but select ‘Histogram’ in the Descriptive area → Select ‘Normality plots with tests’ → Continue.

Now also click Statistics → Select Descriptives and Outliers → Continue → OK.

g) Examine the ‘extreme values’ table to identify if there are any values that seem
excessively high/low compared to the rest.

h) Record the Shapiro-Wilk significance value for lecture attendance, seminar attendance, 1st year grade and final year grade:

i) Lecture attendance: p = 0.454

ii) Seminar attendance: p = 0.128

iii) 1st year grade: p = 0.069

iv) Final grade: p = 0.663

i) Write a short statement to report your findings:

The assumption of normality was met for all variables (p > 0.05), as assessed using Shapiro-Wilk tests.

3. TEST PROCEDURE IN SPSS FOR LINEAR REGRESSION
Analyse Regression  Linear  Move independent (predictor) variables into the
independent box  move dependent (outcome) variable into the dependent box 
OK.

Now click on Statistics. Some tests/measures are already selected but you will need to tick some additional boxes so that finally you have the following tests/measures ticked:

[X] Model fit (pre-selected)
[X] Estimates (pre-selected)
[X] Confidence intervals
[X] Descriptives
[X] Part and partial correlations
[X] Collinearity diagnostics

Continue → OK

4. OUTPUT OF LINEAR REGRESSION ANALYSIS:


SPSS will generate quite a few tables in its results section for a multiple linear
regression.

The first table is a table of Descriptives which provides the mean and standard
deviation of all variables.

The next table is a Correlations table, which is a matrix of the correlations between all pairs of variables. Generally we would not want any two independent variables to have a correlation greater than 0.7 with each other. We can see from our table that some correlation coefficients are higher than this (but for the purposes of this example we will continue with our analysis).
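As a rough cross-check outside SPSS, the correlations between the predictors can be recomputed from the data table. The sketch below is plain Python and is not part of the worksheet's SPSS procedure; the list names and the `pearson_r` helper are illustrative, and 0.7 is the rule-of-thumb cut-off quoted above.

```python
from math import sqrt

# Predictor data copied from the table of 15 students in this worksheet.
lectures = [80, 98, 65, 81, 94, 75, 62, 100, 92, 56, 65, 69, 54, 83, 90]
seminars = [65, 78, 49, 57, 86, 65, 48, 100, 78, 48, 52, 59, 48, 67, 78]
first_year = [65.6, 62.4, 36.8, 56, 75.2, 64, 24, 80,
              76.8, 19.2, 64, 68.8, 32, 52.8, 78.4]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Correlation for every pair of predictors.
pairs = {
    ("Lectures", "Seminars"): pearson_r(lectures, seminars),
    ("Lectures", "OneGrade"): pearson_r(lectures, first_year),
    ("Seminars", "OneGrade"): pearson_r(seminars, first_year),
}

# Pairs that exceed the 0.7 rule of thumb.
flagged = [names for names, r in pairs.items() if abs(r) > 0.7]

for names, r in pairs.items():
    note = "above 0.7" if names in flagged else "ok"
    print(f"{names[0]} vs {names[1]}: r = {r:.3f} ({note})")
```

Any pair printed as "above 0.7" is one to keep in mind when the Tolerance and VIF values are inspected later.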
The main table of interest is the Model Summary table. This table provides the R and R² values. The R value is the multiple correlation between the predictors taken together and the outcome variable. The Adjusted R² value indicates how much of the variance in the outcome variable (final grade) can be explained by the predictor variables (lecture attendance, seminar attendance and first year grade), adjusted for the number of predictors in the model.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .918(a)   .843       .801                4.11874

a. Predictors: (Constant), OneGrade, Seminars, Lectures

j) Record the R and adjusted R² values:

R = 0.918

R²adj = 0.801

In this case, the correlation between the predictors and the outcome variable is very strong, and 80.1% of the variance observed in the Final Grade can be explained by the predictor variables.
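The Model Summary figures can be reproduced by hand from the sums of squares in the ANOVA table that follows in this worksheet. The sketch below is plain Python with illustrative variable names; it assumes the regression and total sums of squares read 1005.076 and 1191.680 (these sum correctly with the residual of 186.604), with n = 15 students and k = 3 predictors.

```python
from math import sqrt

# Sums of squares as read from the worksheet's ANOVA table (assumed values).
ss_regression = 1005.076
ss_total = 1191.680

n = 15  # number of students
k = 3   # number of predictors (Lectures, Seminars, OneGrade)

# R squared: proportion of total variance explained by the model.
r_squared = ss_regression / ss_total

# R: multiple correlation between the predictors and the outcome.
multiple_r = sqrt(r_squared)

# Adjusted R squared: penalises R squared for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R = {multiple_r:.3f}")
print(f"R Square = {r_squared:.3f}")
print(f"Adjusted R Square = {adjusted_r_squared:.3f}")
```

With only 15 students and 3 predictors the adjustment is noticeable, which is why the adjusted value rather than the raw R Square is reported.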

The next table is the ANOVA table. This table indicates whether the regression
model predicts the outcome variable significantly well.

ANOVA(b)

Model           Sum of Squares   df   Mean Square   F        Sig.
1  Regression   1005.076         3    335.025       19.749   .000(a)
   Residual     186.604          11   16.964
   Total        1191.680         14

a. Predictors: (Constant), OneGrade, Seminars, Lectures
b. Dependent Variable: FinalGrade

If we look at the ‘Regression’ row we can see that the F value (the test statistic) = 19.749 on 3 and 11 degrees of freedom. The model is also significant (Sig. = .000, i.e. p < 0.001, so p < 0.05). This means the model as a whole predicts the outcome variable significantly better than a model with no predictors.
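The F value is simply the ratio of the two mean squares in the ANOVA table, so it can be checked by hand. The sketch below is plain Python with illustrative variable names; it assumes the regression row reads 1005.076 on 3 degrees of freedom and the residual row 186.604 on 11.

```python
from math import sqrt

# Values as read from the worksheet's ANOVA table (assumed values).
ss_regression, df_regression = 1005.076, 3
ss_residual, df_residual = 186.604, 11

# Mean square = sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic: explained variance per predictor over unexplained variance.
f_value = ms_regression / ms_residual

# The square root of the residual mean square is the
# "Std. Error of the Estimate" shown in the Model Summary table.
std_error_estimate = sqrt(ms_residual)

print(f"Mean Square (Regression) = {ms_regression:.3f}")
print(f"Mean Square (Residual)   = {ms_residual:.3f}")
print(f"F({df_regression}, {df_residual}) = {f_value:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.5f}")
```

The significance value itself comes from the F distribution with 3 and 11 degrees of freedom, which SPSS looks up for you.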

How do we write up what we have so far?
So far we have an R² value, an F statistic and a p (significance) value.

Multiple regression analysis using the enter method was used to test the hypothesis that lecture attendance, seminar attendance and first year grade could predict final year grade. The predictor variables accounted for 80.1% of the variance observed in final year grades (R²adj = 0.801) and the model was significant (F(3, 11) = 19.749, p < 0.05).

But it’s not over yet!

Coefficients
The table below, Coefficients, provides us with information on each predictor
variable, identifying which are individually significant predictors of our outcome
variable.

k) Which predictor variables contribute significantly to the model? (Look at the Sig. column.)

Significant predictors: first year grade

Remember that the regression equation is as follows:

y = a + b1x1 + b2x2 + b3x3

where a is the constant and x1, x2 and x3 are lecture attendance, seminar attendance and first year grade (OneGrade).

By looking at the B column under the Unstandardized Coefficients heading we can present the regression equation as follows:

Final Year Grade = 28.312 + 0.304(Lecture) - 0.041(Seminar) + 0.245(First Year)
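The B column can be cross-checked outside SPSS by fitting the same model by ordinary least squares to the data table. The sketch below is plain Python and is only an illustration: it solves the normal equations with a small Gaussian-elimination helper (`solve` is a hypothetical name, as are the list names), which is not necessarily how SPSS computes the coefficients internally. The results should agree with the equation above to rounding.

```python
# Data copied from the table of 15 students in this worksheet.
lectures = [80, 98, 65, 81, 94, 75, 62, 100, 92, 56, 65, 69, 54, 83, 90]
seminars = [65, 78, 49, 57, 86, 65, 48, 100, 78, 48, 52, 59, 48, 67, 78]
first_year = [65.6, 62.4, 36.8, 56, 75.2, 64, 24, 80,
              76.8, 19.2, 64, 68.8, 32, 52.8, 78.4]
final = [67.2, 68, 54.4, 68, 67.2, 60.8, 57.6, 80,
         71.2, 48, 61.6, 67.2, 45.6, 59.2, 72]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a square list of lists)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


# Design matrix: a leading 1 for the constant, then the three predictors.
rows = [[1.0, l, s, g] for l, s, g in zip(lectures, seminars, first_year)]

# Normal equations: (X'X) b = X'y
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
xty = [sum(r[i] * y for r, y in zip(rows, final)) for i in range(4)]

constant, b_lecture, b_seminar, b_first_year = solve(xtx, xty)

print(f"Constant = {constant:.3f}")
print(f"Lecture  = {b_lecture:.3f}")
print(f"Seminar  = {b_seminar:.3f}")
print(f"OneGrade = {b_first_year:.3f}")
```

Note that the seminar coefficient is negative and very small, which is consistent with it not being a significant predictor once the other two variables are in the model.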

Note: the ENTER method has included all the variables in the regression equation even though only one of them (first year grade) is a significant predictor.
So what would the predicted final year grade be for student 5?

Student 5
Lecture attendance: 94%   Seminar attendance: 86%   First year grade: 75.2%

• Final Year Grade = 28.312 + 0.304(94) - 0.041(86) + 0.245(75.2)

• Final Year Grade = 28.312 + 28.576 - 3.526 + 18.424

• Final Year Grade = 71.8% (predicted score)

Compared to the actual final year grade of 67.2%, the equation has overpredicted by 4.6 percentage points, i.e. the prediction is within 5 percentage points of the actual score.

l) Try the same again to predict the final year grade for student 13.

Student 13
Lecture attendance: 54%   Seminar attendance: 48%   First year grade: 32%

• Final Year Grade = 28.312 + 0.304(54) - 0.041(48) + 0.245(32)
• Final Year Grade = 28.312 + 16.416 - 1.968 + 7.84
• Final Year Grade = 50.6% (predicted score)

How does this compare to the actual score?

Compared to the actual final year grade of 45.6%, the equation has overpredicted by 5.0 percentage points.
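The two hand calculations can be wrapped in a small function so that any student's predicted score and prediction error are easy to check. This is a minimal sketch in plain Python; the function name `predict_final_grade` and the `students` dictionary are illustrative, and the coefficients are the rounded B values from the regression equation in this worksheet.

```python
def predict_final_grade(lecture, seminar, first_year):
    """Predicted final year grade (%) from the worksheet's regression
    equation, using the rounded unstandardized B coefficients."""
    return 28.312 + 0.304 * lecture - 0.041 * seminar + 0.245 * first_year


# (lecture %, seminar %, first year grade %, actual final grade %)
students = {
    5: (94, 86, 75.2, 67.2),
    13: (54, 48, 32, 45.6),
}

for student, (lecture, seminar, first_year, actual) in students.items():
    predicted = predict_final_grade(lecture, seminar, first_year)
    residual = actual - predicted  # negative means the model overpredicts
    print(f"Student {student}: predicted {predicted:.1f}%, "
          f"actual {actual}%, residual {residual:+.1f}")
```

A negative residual means the equation has predicted a higher grade than the student actually achieved.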

Collinearity
We should also take note of the Collinearity Statistics to make sure there is no collinearity among our predictor variables. These can be found at the end of the Coefficients table.

If the Tolerance value is less than 0.1 (or the VIF is greater than 10) this indicates a collinearity problem. In this example, all the Tolerance values are (just!) greater than 0.1 and all the VIFs are lower than 10, so we can be fairly confident that we do not have a problem with collinearity in this particular data set.

Note: we should report this in our results!

VIF and Tolerance values are within accepted limits, indicating that collinearity is not a problem.
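Tolerance and VIF can also be worked out from the data to see where these values come from. For each predictor, Tolerance is 1 minus the R² obtained when that predictor is regressed on the other predictors, and VIF is 1 divided by Tolerance. The sketch below is plain Python with illustrative names (`pearson_r`, `tolerance`); it uses the standard two-predictor formula for R² in terms of pairwise correlations, which applies here because each predictor has exactly two others. The output should be compared against the Collinearity Statistics columns in SPSS rather than taken as the SPSS output itself.

```python
from math import sqrt

# Predictor data copied from the table of 15 students in this worksheet.
lectures = [80, 98, 65, 81, 94, 75, 62, 100, 92, 56, 65, 69, 54, 83, 90]
seminars = [65, 78, 49, 57, 86, 65, 48, 100, 78, 48, 52, 59, 48, 67, 78]
first_year = [65.6, 62.4, 36.8, 56, 75.2, 64, 24, 80,
              76.8, 19.2, 64, 68.8, 32, 52.8, 78.4]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tolerance(r_ja, r_jb, r_ab):
    """Tolerance of predictor j given the other two predictors a and b.

    R squared for regressing j on a and b, written in terms of the
    three pairwise correlations; Tolerance is 1 minus that value.
    """
    r_squared = (r_ja ** 2 + r_jb ** 2 - 2 * r_ja * r_jb * r_ab) / (1 - r_ab ** 2)
    return 1 - r_squared


r_ls = pearson_r(lectures, seminars)
r_lg = pearson_r(lectures, first_year)
r_sg = pearson_r(seminars, first_year)

tolerances = {
    "Lectures": tolerance(r_ls, r_lg, r_sg),
    "Seminars": tolerance(r_ls, r_sg, r_lg),
    "OneGrade": tolerance(r_lg, r_sg, r_ls),
}

for name, tol in tolerances.items():
    vif = 1 / tol
    verdict = "collinearity problem" if tol < 0.1 or vif > 10 else "within limits"
    print(f"{name}: Tolerance = {tol:.3f}, VIF = {vif:.3f} ({verdict})")
```

Because VIF is just the reciprocal of Tolerance, the two cut-offs (Tolerance below 0.1, VIF above 10) are the same test expressed in two ways.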

5. REPORTING THE RESULTS OF MULTIPLE REGRESSION

We already have our report about the first section done:

Multiple regression analysis using the enter method was used to test the hypothesis that lecture attendance, seminar attendance and first year grade could predict final year grade. The predictor variables accounted for 80.1% of the variance observed in final year grades (R²adj = 0.801) and the model was significant (F(3, 11) = 19.749, p < 0.05).

We should also state that we have checked the data set to ensure it meets the assumptions of the test, e.g. describe the visual inspection of scatterplots used to assess linearity and outliers, report the Shapiro-Wilk test results, and comment on the Tolerance and VIF values.

Now we can also add the results of the regression equation:

First year grade was the only significant predictor of final year grade (b = 0.245, t = 2.58, p = 0.026) and the unstandardised coefficient indicated that this was a positive relationship: as first year grade increases, so does final year grade. There were no other significant predictors of final year grade.

Note: You should present your coefficients and standard errors in a supporting table.
