Chapter IV
Transformations and Dummy Variables
Dr Hédi ESSID
Topics to be Covered
Logarithmic transformations - log-linear (constant
elasticity) models.
Dummy variables for qualitative factors.
An application to illustrate the above.
Econometrics 2
Log-linear Regression Models
In many cases, relationships between economic
variables may be non-linear. However, we can
distinguish between functional forms that are
intrinsically non-linear (and will need to be estimated by
some kind of iterative non-linear least squares method)
and those that can be transformed into an equation to
which we can apply ordinary least squares techniques.
Of those non-linear equations that can be transformed,
the best known is the multiplicative power function form
(sometimes called the Cobb-Douglas functional form),
which is transformed into a linear format by taking
logarithms.
Production functions
For example, suppose we have cross-section data on
firms in a particular industry with observations both on
the output (Q) of each firm and on the inputs of Labour
(L) and Capital (K).
Consider the following functional form:
$Q = A L^{\alpha} K^{\beta}$   [1]
Equation [1] means that the impact on Q of a change in
L (K held constant) will not be constant but will differ
according to the values of L and K.
In mathematical terms, $\partial Q/\partial L$ (the derivative of Q with
respect to L) is not a constant.
Differentiating [1], we can see that $\partial Q/\partial L = \alpha A L^{\alpha-1} K^{\beta}$.
Note that this can also be written as $\partial Q/\partial L = \alpha (Q/L)$, and so
$\alpha = (\partial Q/Q)/(\partial L/L) = \partial \log Q/\partial \log L$.
In economic terms, $\partial Q/\partial L$ is the Marginal Product of
Labour (MPL).
Note: If we differentiate again with respect to L, we can
see that if $0 < \alpha < 1$ the MPL will be a positive but declining
function of L.
$\alpha = \partial \log Q/\partial \log L$ is the elasticity of Q with respect to L.
This elasticity is defined to be the proportional change
in Q for a given proportional change in L.
Similarly, $MPK = \partial Q/\partial K = \beta (Q/K)$, and
$\beta = \partial \log Q/\partial \log K$ is the elasticity of Q with respect to K.
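As a rough numerical check of these relations, the following Python sketch assumes the Cobb-Douglas form Q = A * L**alpha * K**beta. The values A = 2, alpha = 0.6, beta = 0.3, L = 50 and K = 20 are illustrative assumptions, not values from the course. It compares a finite-difference marginal product of labour with alpha*(Q/L), and approximates the elasticity by differencing in logarithms.

```python
import math

def output(L, K, A=2.0, alpha=0.6, beta=0.3):
    """Cobb-Douglas output Q = A * L**alpha * K**beta (illustrative parameters)."""
    return A * L**alpha * K**beta

def marginal_product_labour(L, K, h=1e-6):
    """Central finite-difference approximation to dQ/dL with K held constant."""
    return (output(L + h, K) - output(L - h, K)) / (2 * h)

def labour_elasticity(L, K, h=1e-6):
    """Approximate d(log Q)/d(log L) by a small proportional change in L."""
    up = math.log(output(L * math.exp(h), K))
    down = math.log(output(L * math.exp(-h), K))
    return (up - down) / (2 * h)

L, K = 50.0, 20.0
mpl = marginal_product_labour(L, K)

# The two numbers printed here should agree closely: MPL = alpha * (Q / L).
print(mpl, 0.6 * output(L, K) / L)

# The log-difference ratio should be close to alpha = 0.6.
print(labour_elasticity(L, K))

# With 0 < alpha < 1, the MPL at a larger L should be smaller.
print(marginal_product_labour(2 * L, K) < mpl)
```

The same check can be repeated for capital by differencing in K instead of L.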
Further, if
$\alpha + \beta = 1$, the production function has constant returns
to scale, meaning that doubling the usage of capital
K and labour L will also double output Q. If
$\alpha + \beta < 1$, returns to scale are decreasing, and if
$\alpha + \beta > 1$, returns to scale are increasing.
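The three cases can be illustrated with a small Python sketch. It assumes the Cobb-Douglas form Q = A * L**alpha * K**beta and uses made-up parameter pairs; the function names are hypothetical. Doubling both inputs multiplies output by 2**(alpha + beta), which equals 2 only when alpha + beta = 1.

```python
def output(L, K, alpha, beta, A=1.0):
    """Cobb-Douglas output Q = A * L**alpha * K**beta."""
    return A * L**alpha * K**beta

def scale_factor(alpha, beta, L=10.0, K=4.0, scale=2.0):
    """Ratio Q(scale*L, scale*K) / Q(L, K), which equals scale**(alpha + beta)."""
    base = output(L, K, alpha, beta)
    scaled = output(scale * L, scale * K, alpha, beta)
    return scaled / base

# Illustrative (alpha, beta) pairs: constant, decreasing and increasing returns.
for alpha, beta in [(0.5, 0.5), (0.4, 0.3), (0.7, 0.6)]:
    print(alpha + beta, scale_factor(alpha, beta))
```

The ratio does not depend on the starting values of L and K, which is why returns to scale are a property of the exponents alone in this functional form.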
Now let's see what happens when we take logarithms of
both sides of [1]:
$\log Q = \log A + \alpha \log L + \beta \log K$   [2]
The logarithmic transformation has converted the
equation to one which is linear in the logarithms.
Of course, up until now we have neglected to incorporate an
error term into the equation. For it to be additive in
equation [2], it must have been multiplicative in equation [1],
i.e. $Q = A L^{\alpha} K^{\beta} e^{u}$   [1*]
becomes $\log Q = \log A + \alpha \log L + \beta \log K + u$   [2*]
The parameters $\alpha$ and $\beta$ can be estimated directly from a
regression of the variable $\log Q$ on $\log L$ and $\log K$.
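A minimal sketch of this regression in Python with NumPy is given below. The data are simulated, so the sample size, the "true" values A = 3, alpha = 0.6, beta = 0.3 and the error standard deviation are all assumptions made for illustration. Output is generated with a multiplicative error, and ordinary least squares is then applied to the logged variables.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
A, alpha, beta = 3.0, 0.6, 0.3          # illustrative "true" parameters

# Simulated inputs and a multiplicative error term exp(u).
L = rng.uniform(10, 100, n)
K = rng.uniform(5, 50, n)
u = rng.normal(0.0, 0.1, n)
Q = A * L**alpha * K**beta * np.exp(u)

# Regress log Q on a constant, log L and log K by ordinary least squares.
X = np.column_stack([np.ones(n), np.log(L), np.log(K)])
coef, *_ = np.linalg.lstsq(X, np.log(Q), rcond=None)
log_A_hat, alpha_hat, beta_hat = coef

# The slope estimates should be close to 0.6 and 0.3 in this simulation.
print(log_A_hat, alpha_hat, beta_hat)
```

Note that the intercept estimates log A rather than A itself, so it has to be exponentiated if an estimate of A is wanted.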
Demand functions
Similarly, we might find it necessary to use a multiplicative functional
form when studying demand,
e.g. $Q = A P^{\beta_1} Y^{\beta_2} R^{\beta_3} e^{U}$
or $\log Q = \beta_0 + \beta_1 \log P + \beta_2 \log Y + \beta_3 \log R + U$
where Q = quantity demanded of a commodity
P = the price of the commodity
Y = consumer's income
R = the price of the related commodity
and we write $\beta_0$ for $\log A$.
This functional form would be consistent with curves
which are convex to the origin. It also has the advantage
that the regression parameters can be interpreted as
elasticities.
For example, the (own) price elasticity $= (\partial Q/\partial P)(P/Q) = \beta_1$
[Prove this as an exercise]. Of course, this is expected to
be negative.
Similarly, the income elasticity of demand will be $\beta_2$ and
the cross-price elasticity of demand is $\beta_3$ (>0 if our
commodity is a substitute for the related commodity, <0 if
the goods are complements).
Logarithms in Regression
Logarithms can be used to transform the dependent
variable Y, an independent variable X, or both (but the
variable being transformed must be positive).
The following table summarizes three cases and the
interpretation of the regression coefficient $\alpha_1$. In each
case, $\alpha_1$ can be estimated by applying OLS after taking
the logarithm of the dependent variable and/or
independent variable.
Case | Regression specification | Interpretation of $\alpha_1$
I | $Y_i = \alpha_0 + \alpha_1 \log X_i + u_i$ | A 1% change in X is associated with a change in Y of $0.01\alpha_1$.
II | $\log Y_i = \alpha_0 + \alpha_1 X_i + u_i$ | A change in X by one unit ($\Delta X = 1$) is associated with a $100\alpha_1$% change in Y.
III | $\log Y_i = \alpha_0 + \alpha_1 \log X_i + u_i$ | A 1% change in X is associated with an $\alpha_1$% change in Y, so $\alpha_1$ is the elasticity of Y with respect to X.
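The interpretations in the table are approximations that work well for small changes. The Python sketch below compares each rule of thumb with the exact change implied by the corresponding specification, holding the error term fixed. The coefficient values and function names are hypothetical and chosen only for illustration.

```python
import math

def level_log_change(alpha1, pct_change_x):
    """Case I: exact change in Y when X rises by pct_change_x percent."""
    return alpha1 * math.log(1 + pct_change_x / 100)

def log_level_pct_change(alpha1, delta_x=1.0):
    """Case II: exact percentage change in Y when X rises by delta_x units."""
    return 100 * (math.exp(alpha1 * delta_x) - 1)

def log_log_pct_change(alpha1, pct_change_x):
    """Case III: exact percentage change in Y when X rises by pct_change_x percent."""
    return 100 * ((1 + pct_change_x / 100) ** alpha1 - 1)

# Each exact value should be close to the rule of thumb from the table.
print(level_log_change(50.0, 1.0))    # rule of thumb: 0.01 * 50 = 0.5
print(log_level_pct_change(0.05))     # rule of thumb: 100 * 0.05 = 5 percent
print(log_log_pct_change(0.8, 1.0))   # rule of thumb: 0.8 percent

# For a large coefficient the Case II approximation becomes poor:
# the exact change is well above the 50 percent suggested by 100 * 0.5.
print(log_level_pct_change(0.5))
```

In Case II the exact percentage change is 100*(exp(alpha1) - 1), so the 100*alpha1 reading is best reserved for small coefficients.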
Dummy Variables
Dummy variables (sometimes called dichotomous
variables) are variables that are created to allow for
qualitative effects in a regression model.
A dummy variable will take the value 1 or 0 according
to whether the condition is present or absent for
a particular observation.
For example, suppose we are investigating the
relationship between the wage (Wage) and the number
of years of education (Education) in the textile sector.
Our initial model is
$Y = \beta_1 + \beta_2 X + u$
where Y is the wage and X is the number of years of education.
However, we are concerned that the wages of female
workers may be below those of male workers with
similar education.
To test for this, we can introduce a dummy variable to
distinguish between the observations for male and
female workers in the regression.
Define D = 1 for male workers and 0 for female workers.
The overall equation becomes
$Y = \beta_1 + \beta_2 X + \delta D + u$
where $\delta$ will measure the differential between male and female
workers, having taken account of differences in education. We can
run a normal multiple regression with X and D as explanatory
variables.
Assuming that $\delta$ is positive, the regression line for male
workers lies above that for female workers, and $\delta$ measures the extent
of the upward shift.
We can use its t-value to test whether this difference is
statistically significant.
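The procedure can be sketched in Python with NumPy on simulated data. Everything numeric here is an assumption for illustration: the sample size, the range of years of education, and a "true" upward shift of 1.5 for male workers. The sketch estimates the coefficient on the dummy D by ordinary least squares and computes its t-value from the classical OLS covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

# Simulated regressors: years of education and a male dummy (1 = male, 0 = female).
educ = rng.integers(6, 17, n).astype(float)
male = rng.integers(0, 2, n).astype(float)
u = rng.normal(0.0, 1.0, n)

# Illustrative data-generating process with an intercept shift of 1.5 for males.
wage = 2.0 + 0.5 * educ + 1.5 * male + u

# OLS with a constant, education and the dummy as explanatory variables.
X = np.column_stack([np.ones(n), educ, male])
coef, *_ = np.linalg.lstsq(X, wage, rcond=None)

# Classical standard errors: s^2 * (X'X)^{-1}, with s^2 = RSS / (n - k).
resid = wage - X @ coef
sigma2 = resid @ resid / (n - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov))
t_values = coef / se

delta_hat, t_delta = coef[2], t_values[2]
print(delta_hat, t_delta)
```

A t-value on the dummy that is large in absolute value would lead us to reject the hypothesis of no shift between the two regression lines.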
Example: A wage-discrimination model
Our dataset includes wage as the dependent variable, and education and
dummy variables for gender and race as independent variables.
Let us consider the following model:
$\text{Wage} = \beta_1 + \beta_2\,\text{Educ} + \delta_1\,\text{Black} + \delta_2\,\text{Female} + \delta_3\,(\text{Black} \times \text{Female}) + U$
1. Is there a difference between wages of male and female workers with
similar education?
2. What are the expected wages for the different categories?
- White/male.
- Black/male.
- White/female.
- Black/female.
3. Given the same education, calculate the difference between black
female and white male workers.
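One way to organise the arithmetic for questions 2 and 3 is sketched below in Python. The coefficient names b1, b2 (intercept and education) and d1, d2, d3 (Black, Female and their interaction) are labels chosen here, and the numerical values are purely hypothetical, not estimates from any dataset. The expected wage for each category follows from switching the two dummies on or off.

```python
def expected_wage(educ, black, female, b1, b2, d1, d2, d3):
    """Expected wage from the model with two dummies and their interaction."""
    return b1 + b2 * educ + d1 * black + d2 * female + d3 * black * female

# Hypothetical coefficient values, chosen only to illustrate the calculation.
params = dict(b1=1.0, b2=0.5, d1=-0.4, d2=-0.6, d3=0.1)
educ = 12  # same years of education for every category

# (black, female) dummy settings for the four categories.
groups = {
    "White/male": (0, 0),
    "Black/male": (1, 0),
    "White/female": (0, 1),
    "Black/female": (1, 1),
}

wages = {name: expected_wage(educ, b, f, **params) for name, (b, f) in groups.items()}
for name, value in wages.items():
    print(name, value)

# With education held fixed, the black-female versus white-male gap
# is the sum of the three dummy coefficients.
gap = wages["Black/female"] - wages["White/male"]
print(gap, params["d1"] + params["d2"] + params["d3"])
```

Because education enters identically for every category, the gap between any two categories depends only on the dummy coefficients.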