SINGLE-EQUATION
REGRESSION MODELS
ECN 5121 Econometric Methods I
Chapter Outlines
The Modern Interpretation of Regression
Regression versus Causation
Regression versus Correlation
The Nature and Sources of Data for
Economic Analysis
The Modern Interpretation of
Regression
Regression analysis is concerned with the
study of the dependence of one variable,
the dependent variable, on one or more
other variables, the explanatory variables,
with a view to estimating and/or predicting
the (population) mean or average value of
the former in terms of the known or fixed
(in repeated sampling) values of the latter.
Examples
Average height of sons increases with the
father’s height
Average height increases with age
Marginal propensity to consume (MPC)
Price elasticity
Rate of change of money wages in relation to
the unemployment rate
Etc.
Statistical Vs Deterministic
Relationships
Regression is concerned with statistical, not deterministic,
dependence among variables.
It deals with random or stochastic variables, that is,
variables that have probability distributions.
There is always some “intrinsic” or random
variability in the dependent variable that
cannot be fully explained.
Relationships
We are not considering deterministic or
mathematical relationships.
We are considering a statistical
relationship:
there is not a unique value of y for given values
of x,
but there is a relationship that can be
described exactly in probabilistic terms.
Relationships (cont.)
y = 10 + 5x
y is known exactly if x is known.
x is known exactly if y is known.
Which is the dependent variable here?
y = 10 + 5x + u, where u = +3, 0 or -3 with equal
probability
The expected value of y is known if x is known,
because we also know the distribution of the
error or disturbance term u.
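As a minimal sketch of the second example, the conditional distribution and expected value of y can be enumerated directly. The function names below are illustrative and not part of the course material; the only inputs taken from the slide are the equation y = 10 + 5x + u and the three equally likely values of u.

```python
from fractions import Fraction

# Disturbance u takes the values +3, 0, -3, each with probability 1/3
# (the distribution stated on the slide).
U_VALUES = [3, 0, -3]
U_PROB = Fraction(1, 3)


def conditional_distribution(x):
    """Map each possible value of y = 10 + 5x + u to its probability."""
    return {10 + 5 * x + u: U_PROB for u in U_VALUES}


def conditional_mean(x):
    """E(y | x): probability-weighted average of the possible y values."""
    return sum(y * p for y, p in conditional_distribution(x).items())


for x in (0, 1, 2):
    # y itself is not unique for a given x, but its mean is.
    print(x, sorted(conditional_distribution(x)), conditional_mean(x))
```

Because the three values of u average to zero, the conditional mean works out to 10 + 5x for every x, even though y can take three different values.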
The Linear Regression Model
Postulate: The dependent variable, Y, is a
function of the explanatory variable, X, or
Yi = ƒ(Xi)
However, the relationship is not deterministic
Value of Y is not completely determined by
value of X
Thus, we incorporate an error term (disturbance)
into the model, which provides a statistical
relationship
Yi = ƒ(Xi) + εi
The Linear Regression Model
(cont.)
Probabilistic Model: yi = β0 + β1xi + εi
where yi = a value of the dependent variable, y
xi = a value of the independent variable, x
β0 = the y-intercept of the regression line
β1 = the slope of the regression line
εi = the random error (disturbance)
Deterministic Model:
ŷi = b0 + b1xi, where b0 ≈ β0 and b1 ≈ β1 are estimates of the parameters,
and ŷi is the predicted value of y, in contrast to the
actual value of y.
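The slide does not say how b0 and b1 are obtained. One common choice, assumed here purely for illustration, is ordinary least squares. The sketch below simulates data from a probabilistic model with hypothetical parameter values (β0 = 10, β1 = 5, normally distributed errors) and then computes b0, b1, the fitted values ŷ and the residuals.

```python
import random


def ols_fit(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical population parameters, chosen only for this illustration.
BETA0, BETA1 = 10.0, 5.0

rng = random.Random(0)  # fixed seed so the run is reproducible
x = [float(i) for i in range(1, 51)]
# Probabilistic model: y_i = beta0 + beta1 * x_i + eps_i, eps_i ~ N(0, 3^2)
y = [BETA0 + BETA1 * xi + rng.gauss(0.0, 3.0) for xi in x]

b0, b1 = ols_fit(x, y)
y_hat = [b0 + b1 * xi for xi in x]                 # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]  # actual minus predicted

print("b0 =", b0, "b1 =", b1)
```

The estimates b0 and b1 will be close to, but not exactly equal to, β0 and β1, and the residuals (computed from the sample) are distinct from the unobserved errors εi.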
Regression Vs Causation
Regression: Estimate relationships between two
or more variables.
Causation: A specific action leads to a specific,
measurable consequence.
A statistical relationship in itself cannot logically
imply causation.
To establish causality, appeal to a priori or theoretical considerations.
Regression Vs Correlation
Correlation analysis: measures the strength or
degree of linear association between two
variables.
Correlation is only concerned with the strength of the
relationship
No causal effect is implied with correlation
There is no distinction between dependent and
explanatory variables.
Both variables are assumed to be
random.
A scatter plot (or scatter diagram) can be used
to show the relationship between two variables.
Properties of Coefficient of
Correlation
Can be positive or negative.
-1 ≤ r ≤ 1.
Symmetrical in nature, rXY = rYX.
Independent of the origin and scale.
If X and Y are statistically independent, rXY = 0; the converse does not hold.
A measure of linear association or linear dependence
only.
Does not necessarily imply any cause-and-effect
relationship.
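Several of these properties can be checked numerically. The sketch below uses small made-up data sets (not from the course material) and a hand-written sample correlation function. Note that invariance to scale holds for positive scale factors; a negative factor would flip the sign of r.

```python
import math


def corr(x, y):
    """Sample correlation coefficient r between two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r_xy = corr(x, y)
r_yx = corr(y, x)  # symmetry: same value as r_xy

# Change of origin and (positive) scale leaves r unchanged.
x_shifted = [10.0 + 2.0 * a for a in x]
y_shifted = [-5.0 + 0.5 * b for b in y]
r_shifted = corr(x_shifted, y_shifted)

# Exact nonlinear dependence (w = z squared) with zero linear correlation.
z = [-2.0, -1.0, 0.0, 1.0, 2.0]
w = [a * a for a in z]
r_nonlinear = corr(z, w)

print(r_xy, r_yx, r_shifted, r_nonlinear)
```

The last case illustrates why r measures linear association only: w is completely determined by z, yet r is zero.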
Type of Relationships
[Figure: scatter plots of Y against X illustrating linear versus curvilinear relationships, strong versus weak relationships, and no relationship.]
Regression Vs Correlation (cont.)
In regression, we treat the dependent
variable (y) and the independent
variable(s) (x’s) very differently
The y variable is assumed to be random or
“stochastic” in some way.
The x variables are assumed to have fixed
(“non-stochastic”) values in repeated
samples.
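One way to see this asymmetry is to compare the slope from regressing y on x with the slope from regressing x on y. The sketch below uses hypothetical data and least-squares slopes (an assumption, since the slide does not name an estimator). The two slopes differ, while their product equals r², which is the same in either direction.

```python
def slope(dep, expl):
    """Least-squares slope from regressing `dep` on `expl` (with an intercept)."""
    n = len(dep)
    d_bar = sum(dep) / n
    e_bar = sum(expl) / n
    s_de = sum((d - d_bar) * (e - e_bar) for d, e in zip(dep, expl))
    s_ee = sum((e - e_bar) ** 2 for e in expl)
    return s_de / s_ee


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

b_y_on_x = slope(y, x)  # y treated as the dependent variable
b_x_on_y = slope(x, y)  # roles reversed

# The product of the two slopes equals the squared correlation coefficient.
r_squared = b_y_on_x * b_x_on_y

print(b_y_on_x, b_x_on_y, r_squared)
```

Choosing which variable is dependent therefore changes the regression results, whereas the correlation coefficient is unaffected by the choice.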
Terminology and Notation
Dependent Variable (Y)     Explanatory Variable (X)
- Explained variable       - Independent variable
- Predictand               - Predictor
- Regressand               - Regressor
- Response                 - Stimulus
- Endogenous               - Exogenous
- Outcome                  - Covariate
- Controlled variable      - Control variable
Terminology and Notation (cont.)
Simple, or two-variable, regression
analysis
The dependence of a variable on only a single
explanatory variable.
Multiple regression analysis
The dependence of one variable on more
than one explanatory variable
The term random is a synonym for the
term stochastic.
Structure of Economic Data
1. Time series data
Observations on one or several variables over time
Arranged in chronological order; can have different time
frequencies (annual, quarterly, monthly, weekly, daily,
hourly)
Examples: GDP, unemployment, inflation, stock prices
Vectors or columns (as in a spreadsheet)
Denoted with the subscript t.
Data at certain frequencies (e.g. quarterly or monthly) might exhibit a strong
seasonal pattern.
Structure of Economic Data
2. Cross-sectional data
A sample of individuals, households, firms, countries,
regions, cities or any other type of units at a specific point
in time.
The data on all units need not correspond to exactly the
same time period
Denoted by the subscript i
Mainly associated with applied microeconomics
Labour economics, health economics, state and local public
finance, business economics
Example (cross-sectional data): Economic Growth and Private Sector Credit
[Figure: cross-country scatter plot of economic growth (%) against private sector credit (% of GDP), with linear fit and 95% CI.]
Structure of Economic Data
3. Panel data
A combination of time series and cross-sectional data
Examples:
• the sales and the number of employees for 50 firms over a
five-year period.
• GDP and money supply data for a set of 10 countries over a
20-year period.
Matrices (columns and rows to make an n times m
matrix)
Denoted by the use of both i and t subscripts
Example: Panel Data Structure
Country    Year  Human Capital (Barro and Lee)  RGDPC  FDI
Singapore  1980  1.5412                         :      :
Singapore  1985  1.7128                         :      :
Singapore  1990  1.8569                         :      :
Singapore  1995  2.2783                         :      :
Singapore  2000  2.6958                         :      :
Singapore  2005  2.8993                         :      :
Malaysia   1980  1.7737                         :      :
Malaysia   1985  2.4028                         :      :
Malaysia   1990  2.3729                         :      :
Malaysia   1995  3.2757                         :      :
Malaysia   2000  3.7768                         :      :
Malaysia   2005  4.1357                         :      :
Thailand   1980  0.6081                         :      :
Thailand   1985  0.7427                         :      :
Thailand   1990  0.8542                         :      :
Thailand   1995  1.0379                         :      :
Thailand   2000  1.2924                         :      :
Thailand   2005  1.5419                         :      :
Viet Nam   1980  1.7826                         :      :
Viet Nam   1985  1.1871                         :      :
Viet Nam   1990  0.6027                         :      :
Viet Nam   1995  0.8133                         :      :
Viet Nam   2000  1.1196                         :      :
Viet Nam   2005  1.5265                         :      :
Notation – Structure of Economic Data
Time series: Yt, t=1990, 1991, …, 2012
Cross-sectional: Yi, i=1, 2, 3, …, 40
Panel: Yit, i and t defined as above.
It is common to denote each observation by the letter t
and the total number of observations by T for time series
data, and to denote each observation by the letter i and
the total number of observations by N for cross-sectional
data.
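The relationship between the three notations can be sketched in code. The example below stores a small panel Yit as a dictionary keyed by (i, t), using a subset of the human capital values from the panel data table; the helper function names are illustrative, not standard. Fixing i yields a time series and fixing t yields a cross-section.

```python
# Human capital values copied from the panel data table
# (subset: three countries, three years).
panel = {
    ("Singapore", 1980): 1.5412,
    ("Singapore", 1985): 1.7128,
    ("Singapore", 1990): 1.8569,
    ("Malaysia", 1980): 1.7737,
    ("Malaysia", 1985): 2.4028,
    ("Malaysia", 1990): 2.3729,
    ("Thailand", 1980): 0.6081,
    ("Thailand", 1985): 0.7427,
    ("Thailand", 1990): 0.8542,
}


def time_series(data, unit):
    """Y_t for one unit: fix i, let t vary (in chronological order)."""
    return [(t, v) for (i, t), v in sorted(data.items()) if i == unit]


def cross_section(data, year):
    """Y_i for one period: fix t, let i vary."""
    return {i: v for (i, t), v in data.items() if t == year}


N = len({i for i, _ in panel})  # number of cross-sectional units
T = len({t for _, t in panel})  # number of time periods

print(time_series(panel, "Thailand"))
print(cross_section(panel, 1985))
print(N, T, len(panel))
```

In this balanced panel every unit is observed in every period, so the total number of observations is N × T.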
Nature and Sources of Data
Types of data
Time series data: Ct = β1 + β2Yt + εt
Cross-section data: Ci = β1 + β2Yi + εi
Pooled data: Ci,t = β1 + β2Yi,t + εi,t
(a combination of time series and cross-section data)
Structure of Economic Data
– Quantitative vs Qualitative
The data may be quantitative or qualitative
Quantitative (e.g. GDP per capita, exchange
rates, stock prices, unemployment rates)
Qualitative (e.g. day of the week, gender, level
of education)
Sources of Data
Governmental agency
Census Bureau
Central Bank
International agency
IMF
World Bank
Public information
Internet
Other econometricians (individual)
Published articles
Measurement Scales of Variables
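Ratio scale – both the ratio (X1/X2) and the distance (X2 – X1)
between two values are meaningful, and values can be ordered
Interval scale – the distance between two values is meaningful but
the ratio is not, e.g. the distance between two time periods (1995 – 2000)
Ordinal scale – values have a natural ordering only, e.g. (A, B, C
grades)
Nominal scale – categories with no natural ordering, e.g. (male, female)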