Lecture Notes - Production Functions - 2017
2 Introduction
Also can be important as inputs into other interesting questions, e.g. dynamic models of industry evolution, evaluation of firm conduct (e.g. collusion).
For this lecture note, we will work with a simple two input Cobb-Douglas production function
$$Y_i = e^{\beta_0} K_i^{\beta_1} L_i^{\beta_2} e^{\varepsilon_i}$$
where $i$ indexes firms, $K_i$ is units of capital, $L_i$ is units of labor, and $Y_i$ is units of output. $(\beta_0, \beta_1, \beta_2)$ are parameters and $\varepsilon_i$ captures unobservables that affect output (e.g. weather, soil quality, management quality).
– Additional inputs, e.g. R&D (knowledge capital), dummies representing discrete technologies, different types of labor/capital, intermediate inputs.
– Later we will see more flexible models (lowercase letters denote logs)
$$y_i = f(k_i, l_i; \beta) + \varepsilon_i$$
3 Endogeneity Issues
Problem is that inputs $k_i$, $l_i$ are typically choice variables of the firm. Typically, these choices are made to maximize profits, and hence will often depend on unobservables $\varepsilon_i$.
Of course, this dependence depends on what the firm knows about $\varepsilon_i$ when it makes these input choices.
Example: Suppose a firm operating in perfectly competitive output and input markets (with respective prices $p_i$, $r_i$, and $w_i$) perfectly observes $\varepsilon_i$ before optimally choosing inputs. Profit maximization problem is:
$$\max_{K_i, L_i} \; p_i e^{\beta_0} K_i^{\beta_1} L_i^{\beta_2} e^{\varepsilon_i} - r_i K_i - w_i L_i$$
FOC's will imply that optimal choices of $K_i$ and $L_i$ (and hence $k_i$ and $l_i$) will depend on $\varepsilon_i$. Intuition: $\varepsilon_i$ positively affects the marginal product of inputs. Hence firms with higher $\varepsilon_i$'s will want to use more inputs.
As a result, one cannot consistently estimate the log production function
$$y_i = \beta_0 + \beta_1 k_i + \beta_2 l_i + \varepsilon_i$$
using OLS because $k_i$ and $l_i$ are correlated with $\varepsilon_i$. Generally one would expect coefficients to be positively biased.
Similar problems would arise in more complicated models (e.g. non-perfectly competitive output or input markets, $\varepsilon_i$ only partially observed), except the special case where the firm has no knowledge of $\varepsilon_i$ when choosing inputs.
If $k_i$ is a "less variable" input than $l_i$, one might expect the firm to have less knowledge about $\varepsilon_i$ when choosing $k_i$ (relative to $l_i$). Generally, this will imply $k_i$ will be less correlated with $\varepsilon_i$ than $l_i$ is. So one might expect more bias in the labor coefficient.
Note: we will generally assume that the unobservables $\varepsilon_i$ are generated or evolve exogenously, i.e. they are not choice variables of the firm. Things get considerably harder when the unobservables are choice variables of the firm.
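The asymmetric case above (capital fixed before the firm learns $\varepsilon_i$, labor chosen after) can be illustrated with a small simulation. This is only a sketch under assumed values: the parameters (1.0, 0.3, 0.6), the shock variances, the firm-specific log wage, and the helper names `ols` and `simulate_ols` are all hypothetical and not from these notes.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[r][k] for r in range(k)]

def simulate_ols(n=5000, seed=0):
    """OLS of y on (1, k, l) when labor responds to eps but capital does not."""
    rng = random.Random(seed)
    b0, b1, b2 = 1.0, 0.3, 0.6            # assumed "true" parameters
    X, y = [], []
    for _ in range(n):
        eps = rng.gauss(0, 0.3)           # observed by the firm before hiring
        k = rng.gauss(3.0, 0.5)           # log capital, set without knowledge of eps
        log_w = rng.gauss(0, 0.3)         # assumed firm-specific log wage (output price = 1)
        # Labor FOC b2*Y/L = w  =>  l = (ln b2 - ln w + b0 + b1*k + eps) / (1 - b2)
        l = (math.log(b2) - log_w + b0 + b1 * k + eps) / (1 - b2)
        X.append([1.0, k, l])
        y.append(b0 + b1 * k + b2 * l + eps)
    return ols(X, y)                      # [constant, capital coef, labor coef]

const_hat, b1_hat, b2_hat = simulate_ols()
print(b1_hat, b2_hat)
```

In this design the labor coefficient comes out well above the assumed 0.6, because $l_i$ is positively correlated with $\varepsilon_i$. The capital coefficient need not be biased upward here, since capital is uncorrelated with $\varepsilon_i$ by construction but correlated with labor.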
It is useful to split the unobservable into two components:
$$y_i = \beta_0 + \beta_1 k_i + \beta_2 l_i + \omega_i + \epsilon_i$$
where $\omega_i$ is an unobservable that is predictable (or partially predictable) to the firm when it makes its input decisions, and $\epsilon_i$ is an unobservable that the firm has no information about when making input decisions (e.g. $\omega_i$ represents average weather conditions on a particular farm, $\epsilon_i$ represents deviations from that average in a given year (after inputs are chosen)). $\epsilon_i$ could also represent measurement error in output.
In this formulation, $\omega_i$ is causing the endogeneity problem, not $\epsilon_i$. Let's call $\omega_i$ the "productivity shock".
4 Traditional Solutions
Two traditional solutions to endogeneity problems can be used here: instrumental variables and fixed effects models. I will discuss these before moving to more recent methodological approaches.
4.1 Instrumental Variables
Want to find "instruments" that are correlated with the endogenous inputs, but do not directly determine $y_i$ and are not correlated with $\omega_i$ (and $\epsilon_i$).
Specifically, consider input and output prices $w_i$, $r_i$, and $p_i$. Theory tells us that these prices will affect firms' optimal choices of $k_i$ and $l_i$. Theory also says that these prices are excluded from the production function as they do not directly determine output $y_i$ conditional on the inputs.
Last requirement is that $w_i$, $r_i$, and $p_i$ are not correlated with the productivity shock $\omega_i$. When will this be the case (or not be the case)?
One key issue is the form of competition in input and output markets.
– If output markets are imperfectly competitive (i.e. firms face downward sloping demand curves), then a higher $\omega_i$ will increase a firm's output, driving $p_i$ down. In other words, $p_i$ will be negatively correlated with $\omega_i$, invalidating $p_i$ as an instrument.
– If input markets are imperfectly competitive (i.e. firms face upward sloping supply curves), then a higher $\omega_i$ will increase a firm's input demand, driving $w_i$ and/or $r_i$ up. So $w_i$ and/or $r_i$ are now correlated with $\omega_i$, invalidating them as instruments.
So for these instruments want firms operating in perfectly competitive input or output markets. Typically, this is more believable for input markets than for output markets.
Unfortunately, even if willing to make these assumptions, IV solutions haven't been that broadly used in practice. First, one needs data on $w_i$ and $r_i$. Second, there is often very little variation in $w_i$ and $r_i$ across firms (often there is a real question of whether firms actually operate in different input markets). Third, one often wonders whether observed variation in e.g. $w_i$ actually represents firms facing different input prices, or whether it represents things like variation in unobserved labor quality (i.e. the firm with the higher $w_i$ is employing workers of higher quality). If the latter, then $w_i$ is not a valid instrument.
While there might be "true" variation in input prices across time, this is usually not helpful, because if one has data across time, one often wants to allow the production function to change across time, e.g.
$$y_{it} = \beta_{0t} + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it}$$
(though there could be exceptions)
That said, I think if one can find a market where there is convincing exogenous input price variation, the IV approach is probably more convincing than the approaches I will talk about in the rest of this lecture note, as there seem to be fewer auxiliary assumptions.
Notes:
– As is typically done in this literature, I have implicitly made a "homogeneous treatment effects" assumption. A heterogeneous treatment effects model would be
$$y_i = \beta_0 + \beta_{1i} k_i + \beta_{2i} l_i + \omega_i + \epsilon_i$$
This affects the interpretation of IV estimators, e.g. Heckman and Robb (1985), Imbens and Angrist (1994, Ecta)
– If there are unobserved firm choice variables in $\omega_i$, it becomes quite hard to find valid instruments, even with the above assumptions.
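Assume that $\epsilon_{it}$ is independent across $t$ (this is consistent with $\epsilon_{it}$ not being predictable by the firm when choosing $k_{it}$ and $l_{it}$).
Suppose one is willing to assume that the productivity shock is constant over time (fixed effect assumption), i.e.
$$\omega_{it} = \omega_i$$
Then one can mean difference
$$(y_{it} - \bar{y}_i) = \beta_1 (k_{it} - \bar{k}_i) + \beta_2 (l_{it} - \bar{l}_i) + (\epsilon_{it} - \bar{\epsilon}_i)$$
or first difference
$$(y_{it} - y_{it-1}) = \beta_1 (k_{it} - k_{it-1}) + \beta_2 (l_{it} - l_{it-1}) + (\epsilon_{it} - \epsilon_{it-1})$$
Since the problematic unobservable $\omega_i$ has been differenced out of these expressions (recall that we have assumed that the $\epsilon_{it}$'s are uncorrelated with input choices), these equations can be estimated with OLS.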
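A minimal sketch of the mean difference (within) estimator on simulated data, compared with pooled OLS. Everything numeric here is an assumption made for illustration (true coefficients 0.3 and 0.6, the way inputs load on $\omega_i$, the shock variances), and `ols` and `fe_demo` are hypothetical helper names.

```python
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[r][k] for r in range(k)]

def fe_demo(n_firms=500, T=5, seed=0):
    """Return (pooled OLS coefs [const, k, l], within coefs [k, l])."""
    rng = random.Random(seed)
    b0, b1, b2 = 1.0, 0.3, 0.6                         # assumed true parameters
    pooled_X, pooled_y, within_X, within_y = [], [], [], []
    for _ in range(n_firms):
        omega = rng.gauss(0, 0.5)                      # time-invariant productivity omega_i
        rows = []
        for _ in range(T):
            k = 3.0 + omega + rng.gauss(0, 0.5)        # inputs respond to omega_i
            l = 3.0 + 2.0 * omega + rng.gauss(0, 0.5)
            y = b0 + b1 * k + b2 * l + omega + rng.gauss(0, 0.2)
            rows.append((y, k, l))
        means = [sum(r[j] for r in rows) / T for j in range(3)]
        for y, k, l in rows:
            pooled_X.append([1.0, k, l])
            pooled_y.append(y)
            # Subtracting firm means removes omega_i (and the constant)
            within_X.append([k - means[1], l - means[2]])
            within_y.append(y - means[0])
    return ols(pooled_X, pooled_y), ols(within_X, within_y)

pooled, within = fe_demo()
print("pooled:", pooled[1:], "within:", within)
```

Under these assumptions the pooled coefficients are pushed upward, while the within estimates sit close to the assumed values. The simulated inputs are measured without error and $\omega_i$ really is constant, so the sketch says nothing about the practical problems listed next.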
Problems:
– 1) $\omega_{it} = \omega_i$ is a strong assumption
– 2) These estimators often produce strange estimates. In particular, they often generate very small (or even negative) capital coefficients. Perhaps this is due to measurement error in capital (Griliches and Hausman (1986, JoE))?
Other notes:
– The mean difference approach requires all the input choices to be uncorrelated with all the $\epsilon_{it}$ (strict exogeneity). The first difference approach only requires current and lagged inputs to be uncorrelated with current and lagged $\epsilon_{it}$. Using $k_{it-1}$ and $l_{it-1}$ (or other lags) as instruments for $(k_{it} - k_{it-1})$ and $(l_{it} - l_{it-1})$, one can allow current inputs to be arbitrarily correlated with past $\epsilon_{it}$'s (sequential exogeneity)
– Panel data approach can be extended to richer error structures (Arellano and Bond (1991, ReStud), Arellano and Bover (1995, JoE), Blundell and Bond (1998, JoE, 2000, ER), Arellano and Honore (2001, Handbook)) e.g.
$$\omega_{it} = \rho \omega_{it-1} + \xi_{it}$$
or
$$\omega_{it} = \alpha_i + \lambda_{it} \quad \text{where} \quad \lambda_{it} = \rho \lambda_{it-1} + \xi_{it}$$
For example, for a firm operating in perfectly competitive input and output markets, static cost minimization implies that
$$\frac{\partial Y}{\partial L} \frac{L}{Y} = \frac{wL}{pY}$$
$$\frac{\partial Y}{\partial K} \frac{K}{Y} = \frac{rK}{pY}$$
i.e. the output elasticity w.r.t. an input must equal its (cost) share in revenue.
In a Cobb-Douglas context, these output elasticities are the production function coefficients $\beta_1$ and $\beta_2$, so observations on these revenue shares across firms could provide estimates of the coefficients.
Note that $r$ can often be assumed known and often one directly observes $wL$ and $pY$ (rather than $L$ and $Y$, i.e. labor input and output are measured in terms of dollar units (that are implicitly assumed to be comparable across firms)).
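As a rough sketch of the share approach, one could simply average the observed cost shares in revenue across firms. The function name `share_estimates`, the field names, and the two data points below are made up for illustration; averaging (rather than, say, taking medians) is also just one possible choice.

```python
def share_estimates(firms):
    """Average cost shares in revenue, read as Cobb-Douglas coefficients.

    firms: list of dicts with wage bill "wL", capital cost "rK" and revenue "pY"
    (hypothetical field names).
    """
    n = len(firms)
    beta_k = sum(f["rK"] / f["pY"] for f in firms) / n   # capital coefficient
    beta_l = sum(f["wL"] / f["pY"] for f in firms) / n   # labor coefficient
    return beta_k, beta_l

# Made-up data: two firms, dollar values
firms = [
    {"wL": 60.0, "rK": 30.0, "pY": 100.0},
    {"wL": 33.0, "rK": 15.0, "pY": 50.0},
]
beta_k_hat, beta_l_hat = share_estimates(firms)
print(beta_k_hat, beta_l_hat)
```

This is only valid under the static, perfectly competitive assumptions stated above.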
But:
– This assumes static cost minimization, i.e. it assumes away dynamics, adjustment costs, etc. At the very least we often think about the capital input being subject to a dynamic accumulation process, e.g. $K_{it} = \delta K_{it-1} + i_{it-1}$
– There are additional terms when firms are not operating in perfectly competitive markets, e.g. when firms face a downward sloping demand curve
$$\frac{\partial Y}{\partial L} \frac{L}{Y} = \mu \frac{wL}{pY}$$
$$\frac{\partial Y}{\partial K} \frac{K}{Y} = \mu \frac{rK}{pY}$$
where $\mu = \frac{p}{mc}$, i.e. the markup of price over marginal cost. Note that profit maximization implies $\frac{p}{mc} = \frac{\eta}{1+\eta}$, where $\eta$ is the elasticity of demand. So, for example, one could still identify production coefficients using this method if the elasticity of demand was known (this is done in Hsieh and Klenow (2009, QJE)). Or, one might be able to identify both with additional restrictions, e.g. Constant Returns to Scale (related to Hall (1988, JPE)).
5 Olley and Pakes (1996, Ecta)
Alternative approach to estimating production functions. I will argue that key assumptions are timing/information set assumptions, a scalar unobservable assumption, and a monotonicity assumption.
Setup:
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it} \qquad (1)$$
Again, the unobserved productivity shocks $\omega_{it}$ are potentially correlated with $k_{it}$ and $l_{it}$, but the unobservables $\epsilon_{it}$ are measurement errors or unforecastable shocks that are not correlated with inputs $k_{it}$ and $l_{it}$.
Basic Idea: Endogeneity problem is due to the fact that $\omega_{it}$ is unobserved by the econometrician. If some other equation can tell us what $\omega_{it}$ is (i.e. making it "observable"), then the endogeneity problem would be eliminated.
Olley and Pakes will use observed firms' investment decisions $i_{it}$ to "tell us" about $\omega_{it}$.
Assumptions:
1) The productivity shock follows a first order Markov process, i.e.
$$\omega_{it+1} = g(\omega_{it}) + \xi_{it+1}, \qquad E[\xi_{it+1} \mid I_{it}] = 0$$
where $I_{it}$ is firm $i$'s information set at $t$ (which includes current and past $\omega_{it}$'s). Note:
– $g(\omega_{it})$ can be thought of as the "predictable" component of $\omega_{it+1}$, $\xi_{it+1}$ can be thought of as the "innovation" component, i.e. the part that the firm doesn't observe until $t+1$.
– This can be extended to higher order Markov processes (see ABBP Handbook article and Ackerberg and Hahn (2015))
2) Labor is a perfectly variable input, i.e. $l_{it}$ is chosen by the firm at time $t$ (after observing $\omega_{it}$).
3) Labor has no dynamic implications. In other words, the firm's choice of $l_{it}$ at $t$ only affects profits at period $t$, not future profits. This rules out, e.g. labor adjustment costs like firing or hiring costs.
4) On the other hand, $k_{it}$ is accumulated according to a dynamic investment process. Specifically
$$K_{it} = \delta K_{it-1} + i_{it-1}$$
where $i_{it}$ is the investment level chosen by the firm in period $t$ (after observing $\omega_{it}$). Importantly, note that $k_{it}$ depends on last period's investment, not current investment. The assumption here is that it takes a full time period for new capital to be ordered, delivered, and installed. This also implies that $k_{it}$ was actually decided by the firm at time $t-1$. This is what I refer to as a "timing assumption".
In summary:
Given this setup, let's think about a firm's optimal investment choice $i_{it}$. Given that $i_{it}$ will affect future capital levels, a profit maximizing firm will choose $i_{it}$ to maximize the PDV of its future profits. This is a dynamic programming problem, and will result in a dynamic investment demand function of the form:
$$i_{it} = f_t(k_{it}, \omega_{it}) \qquad (2)$$
Note that:
– $k_{it}$ and $\omega_{it}$ are part of the state space, but $l_{it}$ does not enter the state space. Why?
– $f_t$ is indexed by $t$. This implicitly allows investment decisions to depend on other state variables (e.g. input prices, demand conditions, industry structure) that are constant across firms.
– $f_t$ will likely be a complicated function because it is the solution to a dynamic programming problem. Fortunately, we can estimate the production function parameters without actually solving this DP problem (this is helpful not only computationally, but also allows us to estimate the production function without having to specify large parts of the firm's optimization problem (semiparametric)). This is a nice example of how semiparametrics can help in terms of computation. The literature based on Hotz and Miller (1993, ReStud) is similar in nature.
One of the key ideas behind OP is that under some conditions, the investment demand equation (2) can be inverted to obtain
$$\omega_{it} = f_t^{-1}(k_{it}, i_{it})$$
i.e. we can write the productivity shock $\omega_{it}$ as a function of variables that are observed by the econometrician (though the function is unknown).
– 2) (scalar unobservable) $\omega_{it}$ is the only econometric unobservable in the investment equation. This means:
    Essentially no unobserved input prices that vary across firms (if there were observed input prices that varied across firms, they could be included as arguments of $f_t$). There is one exception to this: labor input price shocks across firms that are not correlated across time.
    No other structural unobservables that affect firms' optimal investment levels (e.g. efficiency at doing investment, heterogeneity in adjustment costs, other heterogeneity in the production function (e.g. random coefficients))
    No optimization or measurement error in $i$
Substituting the inverse investment function $\omega_{it} = f_t^{-1}(k_{it}, i_{it})$ into the production function (1) gives
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + f_t^{-1}(k_{it}, i_{it}) + \epsilon_{it}$$
Since we don't know the form of the function $f_t^{-1}$ (and it is a complicated solution to a dynamic programming problem), let's just treat it non-parametrically, e.g. as a high order polynomial in $i_{it}$ and $k_{it}$, e.g.
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \gamma_{0t} + \gamma_{1t} k_{it} + \gamma_{2t} i_{it} + \gamma_{3t} k_{it}^2 + \gamma_{4t} i_{it}^2 + \gamma_{5t} k_{it} i_{it} + \epsilon_{it} \qquad (5)$$
Main point is that under the OP assumptions, we have eliminated the unobservable causing the endogeneity problem.
In this literature, $i_{it}$ is sometimes called a control variable and sometimes called a proxy variable. Neither is perfect terminology.
So we can think about estimating this equation with a simple OLS regression of $y_{it}$ on $k_{it}$, $l_{it}$, and a polynomial in $k_{it}$ and $i_{it}$.
Problem: $\beta_1 k_{it}$ is collinear with the linear term in the polynomial, so we can't separately identify $\beta_1$ from $\gamma_{1t}$. Intuitively, there is no way to separate out the effect of $k_{it}$ on $y_{it}$ through the production function from the effect of $k_{it}$ on $y_{it}$ through $f_t^{-1}$.
But, there is no $l_{it}$ in the polynomial, so $\beta_2$ can in principle be identified (though see discussion of Ackerberg, Caves, and Frazer (ACF, 2015, Ecta) below). The first stage regression is therefore
$$y_{it} = \tilde{\gamma}_{0t} + \tilde{\gamma}_{1t} k_{it} + \beta_2 l_{it} + \gamma_{2t} i_{it} + \gamma_{3t} k_{it}^2 + \gamma_{4t} i_{it}^2 + \gamma_{5t} k_{it} i_{it} + \epsilon_{it}$$
where $\tilde{\gamma}_{0t} = \beta_0 + \gamma_{0t}$ and $\tilde{\gamma}_{1t} = \beta_1 + \gamma_{1t}$. This produces an estimate of the labor coefficient $\hat{\beta}_2$.
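To estimate the coefficient on capital, $\beta_1$, we need a "second stage". Recall that $k_{it}$ was decided at $t-1$ (so $k_{it} \in I_{it-1}$), while the innovation $\xi_{it}$ in $\omega_{it} = g(\omega_{it-1}) + \xi_{it}$ is unpredictable given $I_{it-1}$. Hence
$$E[\xi_{it} \mid k_{it}] = 0$$
and therefore
$$E[\xi_{it} k_{it}] = 0$$
This moment condition can be used to estimate the capital coefficient:
– 1) Guess a candidate $\beta_1$
– 2) Compute
$$\hat{\omega}_{it}(\beta_1) = \hat{\Phi}_{it} - \beta_1 k_{it}$$
for all $i$ and $t$, where $\hat{\Phi}_{it}$ is the first stage estimate of the composite term $\beta_0 + \beta_1 k_{it} + \omega_{it}$ (the predicted value of $y_{it}$ net of $\hat{\beta}_2 l_{it}$). The $\hat{\omega}_{it}(\beta_1)$ are the "implied" $\omega_{it}$'s given the guess of $\beta_1$. If our guess is the true $\beta_1$, $\hat{\omega}_{it}(\beta_1)$ will be the true $\omega_{it}$'s (asymptotically). If our guess is not the true $\beta_1$, the $\hat{\omega}_{it}(\beta_1)$'s will not be the true $\omega_{it}$'s asymptotically. (Note: Actually, $\hat{\omega}_{it}(\beta_1)$ is really $\omega_{it} + \beta_0$, but the constant term ends up not mattering)
– 3) Given the implied $\hat{\omega}_{it}(\beta_1)$'s, we now want to compute the implied innovations in $\omega_{it}$, i.e. implied $\xi_{it}$'s. To do this, consider the equation
$$\omega_{it} = g(\omega_{it-1}) + \xi_{it}$$
Think about estimating this equation, i.e. non-parametrically regressing the implied $\hat{\omega}_{it}(\beta_1)$'s (from step 2) on the implied $\hat{\omega}_{it-1}(\beta_1)$'s (also from step 2). Again, we can think of representing $g$ non-parametrically using a polynomial in $\hat{\omega}_{it-1}(\beta_1)$. Call the residuals from this regression
$$\hat{\xi}_{it}(\beta_1)$$
These are the implied innovations in $\omega_{it}$. Again, if our guess is the true $\beta_1$, $\hat{\xi}_{it}(\beta_1)$ will be the true $\xi_{it}$'s (asymptotically). If our guess is not the true $\beta_1$, then the $\hat{\xi}_{it}(\beta_1)$'s will not be the true $\xi_{it}$'s.
– 4) Lastly, evaluate the sample analogue of the moment condition $E[\xi_{it} k_{it}] = 0$, i.e.
$$\frac{1}{N} \frac{1}{T} \sum_t \sum_i \hat{\xi}_{it}(\beta_1) k_{it} = 0$$
The estimate $\hat{\beta}_1$ is the value of $\beta_1$ that sets this sample moment to zero:
$$\frac{1}{N} \frac{1}{T} \sum_t \sum_i \hat{\xi}_{it}(\hat{\beta}_1) k_{it} = 0$$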
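The first stage and steps 1) to 4) can be sketched on simulated data as follows. This is not OP's own code or data: the data-generating process is a hypothetical one built to satisfy the assumptions (an ad hoc log-linear investment rule, AR(1) productivity, an iid wage shock that moves labor), all numbers are assumed, and `simulate`, `op_estimate` and `ols` are made-up names. Because the simulated investment rule and $g$ are linear, first order polynomials are used; real applications would use higher order terms.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[r][k] for r in range(k)]

def simulate(n_firms=1000, T=10, burn=20, seed=0):
    """Hypothetical DGP (all values assumed); returns per-firm lists of (y, k, l, i)."""
    rng = random.Random(seed)
    b0, b1, b2, rho = 1.0, 0.3, 0.6, 0.7
    data = []
    for _ in range(n_firms):
        K, w, rows = 32.0 * math.exp(rng.gauss(0, 0.5)), 0.0, []
        for t in range(-burn, T):
            w = rho * w + rng.gauss(0, 0.3)        # omega_it, first order Markov
            k = math.log(K)
            i = math.log(0.2) + 0.8 * k + w        # log investment, increasing in omega
            v = rng.gauss(0, 0.2)                  # iid log-wage shock
            l = (math.log(b2) - v + b0 + b1 * k + w) / (1 - b2)   # static labor FOC
            y = b0 + b1 * k + b2 * l + w + rng.gauss(0, 0.1)
            if t >= 0:
                rows.append((y, k, l, i))
            K = 0.9 * K + math.exp(i)              # capital set one period ahead
        data.append(rows)
    return data

def op_estimate(data):
    flat = [r for rows in data for r in rows]
    # First stage: y on l and a (here first order) polynomial in (k, i)
    g = ols([[1.0, l, k, i] for (y, k, l, i) in flat], [r[0] for r in flat])
    b2_hat = g[1]
    phi = lambda k, i: g[0] + g[2] * k + g[3] * i   # fitted composite term Phi_it
    # Within-firm pairs (Phi_t, k_t, Phi_{t-1}, k_{t-1})
    pairs = [(phi(r1[1], r1[3]), r1[1], phi(r0[1], r0[3]), r0[1])
             for rows in data for r0, r1 in zip(rows, rows[1:])]
    n = len(pairs)

    def moment(b1):
        w1 = [p - b1 * k for p, k, _, _ in pairs]   # step 2: implied omega_it
        w0 = [p - b1 * k for _, _, p, k in pairs]   # implied omega_it-1
        m1, m0 = sum(w1) / n, sum(w0) / n
        slope = (sum((a - m0) * (b - m1) for a, b in zip(w0, w1))
                 / sum((a - m0) ** 2 for a in w0))
        xi = [(b - m1) - slope * (a - m0) for a, b in zip(w0, w1)]  # step 3
        return sum(x * p[1] for x, p in zip(xi, pairs)) / n        # step 4

    grid = [j / 50 for j in range(51)]              # step 1: candidates in [0, 1]
    b1_hat = min(grid, key=lambda b: abs(moment(b)))
    return b1_hat, b2_hat

b1_hat, b2_hat = op_estimate(simulate())
print(b1_hat, b2_hat)
```

The grid search over $[0, 1]$ is a deliberately crude stand-in for a proper GMM minimization. Restricting the candidate values to an economically sensible range is itself an assumption of the sketch.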
– This is a version of the second stage of OP. It is essentially a non-linear GMM estimator.
Notes:
– 2) Alternatively, one can use the moment
$$E[(\xi_{it} + \epsilon_{it}) k_{it}] = 0$$
to estimate $\beta_1$.
– 3) There are other formulations as well. For example, Wooldridge (2009, EcLet) suggests estimating both first stage and second stage simultaneously. This has two potential advantages: 1) efficiency (though this is not always the case, see, e.g. Ackerberg, Chen, Hahn, and Liao (2014, ReStud)), and 2) it makes it easier to compute standard errors (with a two-step procedure, it is typically easiest to bootstrap). On the other hand, a disadvantage is that it requires a non-linear search over a larger set of parameters ($\beta_1$ plus the parameters of $g$ and $f_t^{-1}$), whereas the above two step formulations only require a non-linear search for $\beta_1$ (or $\beta_1$ and $g$)
– 4) Note that there are additional moments generated by the model. The assumptions of the model imply that $E[\xi_{it} \mid I_{it-1}] = 0$. This means that the implied $\xi_{it}$'s should not only be uncorrelated with $k_{it}$, but with everything else in $I_{it-1}$, e.g. $k_{it-1}$, $k_{it-2}$, $l_{it-1}$, $k_{it}^2$, ... (though not $l_{it}$). These additional moments can potentially add efficiency, but also result in an overidentified model, which can lead to small sample bias. The extent to which one utilizes these additional moments is typically a matter of taste.
– 5) Intuitive description of identification
    First stage: Compare output of firms with the same $i_{it}$ and $k_{it}$ (which imply the same $\omega_{it}$), but different $l_{it}$. This variation in $l_{it}$ is uncorrelated with the remaining unobservables determining $y_{it}$ ($\epsilon_{it}$), and so it identifies the labor coefficient. (But again, see ACF section below)
    Second stage: Compare output of firms with the same $\omega_{it-1}$, but different $k_{it}$'s (note that firms can have the same $\omega_{it-1}$, but different $i_{it-1}$ and $k_{it-1}$).
$$y_{it} - \hat{\beta}_2 l_{it} = \beta_0 + \beta_1 k_{it} + g(\omega_{it-1}) + \xi_{it} + \epsilon_{it}$$
$$= \beta_0 + \beta_1 k_{it} + g(\hat{\Phi}_{it-1} - \beta_0 - \beta_1 k_{it-1}) + \xi_{it} + \epsilon_{it}$$
Anyway, given these problems with 0 investment and an unwillingness to throw away data, the basic idea of LP is to use a different "control" variable to learn about $\omega_{it}$, one that is more likely to be strictly monotonic in $\omega_{it}$. They use an intermediate input, e.g. inputs like materials, fuel, or electricity. These types of inputs rarely take the value 0.
Production Function:
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \beta_3 m_{it} + \omega_{it} + \epsilon_{it}$$
where $m_{it}$ is an intermediate input. $m_{it}$ is assumed to be a variable, non-dynamic input, like labor.
Consider a firm's optimal choice of $m_{it}$. Like investment in OP, $m_{it}$ will be chosen as a function of the state variables $k_{it}$ and $\omega_{it}$, i.e.
$$m_{it} = f_t(k_{it}, \omega_{it})$$
Assuming strict monotonicity, this can be inverted and substituted into the production function
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \beta_3 m_{it} + f_t^{-1}(k_{it}, m_{it}) + \epsilon_{it} \qquad (13)$$
Our question: Under what conditions is the labor coefficient $\beta_2$ identified in the first stage?
The LP first stage regresses $y_{it}$ on $l_{it}$ and a non-parametric function of $k_{it}$ and $m_{it}$ (call this non-parametric function $np(k_{it}, m_{it})$).
There is no endogeneity problem here (since $\epsilon_{it}$ is assumed uncorrelated with everything). But our question is whether the labor input $l_{it}$ moves around independently of $np(k_{it}, m_{it})$. In other words, can two firms with the same $k_{it}$ and $m_{it}$ have different $l_{it}$? To analyze this, we need to think about a model of how firms choose $l_{it}$.
Most natural model seems as follows. Since $m_{it}$ and $l_{it}$ are both non-dynamic, variable inputs, and since LP have already assumed that
$$m_{it} = f_t(k_{it}, \omega_{it})$$
it seems logical to treat $l_{it}$ symmetrically and assume
$$l_{it} = h_t(k_{it}, \omega_{it}) = h_t(k_{it}, f_t^{-1}(k_{it}, m_{it})) = \tilde{h}_t(k_{it}, m_{it})$$
The last line implies that $l_{it}$ is a deterministic function of $k_{it}$ and $m_{it}$. This is a problem for the first stage, since it implies that $l_{it}$ is functionally dependent on ("collinear" with) $np(k_{it}, m_{it})$, i.e. $l_{it}$ doesn't move independently of $np(k_{it}, m_{it})$.
Another way of saying this is as follows: LP want to condition on $k_{it}$ and $m_{it}$ (i.e. condition on $\omega_{it}$) and look at remaining variation in $l_{it}$ to identify $\beta_2$. But according to the above, $l_{it}$ is a deterministic function of $k_{it}$ and $m_{it}$. Hence, there is no remaining variation in $l_{it}$ once we condition on $k_{it}$ and $m_{it}$!
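The functional dependence can be seen in a small simulation. In the sketch below (assumed parameters and prices; `labor_r2` and `ols` are hypothetical names) both $l_{it}$ and $m_{it}$ come from static Cobb-Douglas first order conditions with common prices, so $l_{it}$ is an exact linear function of $m_{it}$ and a regression of $l_{it}$ on $(1, k_{it}, m_{it})$ leaves nothing over. Adding an assumed optimization error to $l_{it}$ restores independent variation.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[r][k] for r in range(k)]

def labor_r2(opt_error_sd, n=2000, seed=0):
    """R^2 from regressing l on (1, k, m); 1 means no independent variation in l."""
    rng = random.Random(seed)
    b0, b1, b2, b3 = 1.0, 0.3, 0.4, 0.2       # assumed Cobb-Douglas parameters
    log_w, log_pm = 0.0, 0.0                  # assumed common log input prices (p = 1)
    X, l_obs = [], []
    for _ in range(n):
        k, omega = rng.gauss(3.0, 0.5), rng.gauss(0, 0.3)
        # Static FOCs: l = ln(b2) - log_w + y and m = ln(b3) - log_pm + y,
        # where y solves y = b0 + b1*k + b2*l + b3*m + omega
        y = (b0 + b1 * k + omega + b2 * (math.log(b2) - log_w)
             + b3 * (math.log(b3) - log_pm)) / (1 - b2 - b3)
        m = math.log(b3) - log_pm + y
        l = math.log(b2) - log_w + y + rng.gauss(0, opt_error_sd)  # optional error
        X.append([1.0, k, m])
        l_obs.append(l)
    c = ols(X, l_obs)
    fit = [sum(ci * xi for ci, xi in zip(c, x)) for x in X]
    mean_l = sum(l_obs) / n
    ssr = sum((a - f) ** 2 for a, f in zip(l_obs, fit))
    tss = sum((a - mean_l) ** 2 for a in l_obs)
    return 1 - ssr / tss

print(labor_r2(0.0), labor_r2(0.2))
```

In general the dependence need not be linear. It is linear here only because of the assumed Cobb-Douglas first order conditions, which is why a linear regression is enough to detect it.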
So if (16), i.e. the functional dependence of $l_{it}$ on $(k_{it}, m_{it})$, is correct, then $\beta_2$ should not be identified in the first stage. If OLS does in fact produce an estimate $\hat{\beta}_2$, then some assumption of the model must be incorrect.
To get the first stage of LP to work, we need to find something that moves around $l_{it}$ independently of $np(k_{it}, m_{it})$. Unfortunately, this is hard to do within the context of LP's other maintained assumptions. For example, suppose one assumes that there is some firm-specific unobserved shock to the price of labor, $v_{it}$. This will clearly affect firms' optimal labor choices, i.e.
$$l_{it} = h_t(k_{it}, \omega_{it}, v_{it})$$
The problem is that this will also generally affect the firms' optimal choice of materials
$$m_{it} = f_t(k_{it}, \omega_{it}, v_{it})$$
which then violates the scalar unobservable assumption necessary to invert $f_t$ and write $\omega_{it}$ as a function of observables.
Our paper thinks about various alternative models of $l_{it}$ (i.e. various data-generating processes (DGPs)) that might "break" this functional dependence problem. We can only come up with 2 such DGPs, and neither seems all that general.
1) Suppose there is "optimization" error in $l_{it}$, i.e.
$$l_{it} = h_t(k_{it}, \omega_{it}) + u_{it}$$
where $u_{it}$ is independent of $(k_{it}, \omega_{it})$. In other words, for some exogenous reason firms do not get the optimal choice of labor correct. This breaks the functional dependence problem (and does not seem completely unreasonable). On the other hand, one simultaneously needs to assume that there is 0 optimization error in $m_{it}$ (otherwise, the scalar unobservable assumption is violated). It seems challenging to argue that there is a significant amount of optimization error in $l_{it}$, but almost no optimization error in $m_{it}$. (One example might be if one's data measures "planned" or ordered materials, but actual labor (e.g. subject to sick days or unexpected quits))
2) Suppose that $m_{it}$ is chosen at some point in time prior to $l_{it}$, and that an unobserved firm-specific shock to the price of labor, $v_{it}$, is realized between these two points in time.
This second DGP also allows the labor coefficient to be identified in the first stage, because the shock $v_{it}$ moves $l_{it}$ around conditional on $m_{it}$ and $k_{it}$. Note that $v_{it}$ needs to be independent across time, otherwise the choice of $m_{it}$ at $t$ will optimally depend on $v_{it-1}$, violating the scalar unobservable assumption needed for invertibility.
Again, this DGP does not seem very general. Why is $m_{it}$ chosen before $l_{it}$ (if anything, I would tend to think the reverse)? And it seems like a stretch to assume that there are no unobserved firm specific input price shocks except for this very special $v_{it}$ shock that must be realized between these two points in time.
Notes:
1) Parametric treatment of the intermediate input demand function does not rescue the LP first stage identification, see ACF for details (though unlike with $i_{it}$ it is not hard to do this, and it likely adds efficiency)
2) The OP estimator is also affected by this critique. However, there is a 3rd DGP that breaks the functional dependence problem. This involves $l_{it}$ being chosen with incomplete knowledge of $\omega_{it}$, e.g. prior to the realization of $\omega_{it}$, see ACF for details.
3) Bond and Soderbom (2005) make a related argument that criticizes the second stage identification of $\beta_3$ (the coefficient on the intermediate input) in LP. The crux of the implications of their argument is that under the assumptions of the LP model the moment condition $E[(\xi_{it} + \epsilon_{it}) m_{it-1}] = 0$ or $E[\xi_{it} m_{it-1}] = 0$ is not informative about the coefficient $\beta_3$. The intuition is that under the assumptions of the LP model, $m_{it-1}$ is not correlated with $m_{it}$ conditional on $k_{it}$ and $\omega_{it-1}$, hence it is uninformative as an instrument. A serially correlated, firm-specific, unobserved price shock to the intermediate input could generate such correlation, but it violates the LP scalar unobservable assumption. More generally, Bond and Soderbom show that without firm specific input price variation, coefficients on perfectly variable, non-dynamic inputs are not identified in Cobb-Douglas production functions. GNR extend this argument to more general production functions, essentially showing that if a perfectly variable, non-dynamic input is being used as a proxy/control variable in the context of an OP/LP like procedure, its effect on output cannot generally be identified using the above moments (and they argue that FOC approaches are needed for these inputs, see below).
To do this, assume that $m_{it}$ is chosen either at the same time or after $l_{it}$ is chosen. This implies we can write
$$m_{it} = f_t(k_{it}, \omega_{it}, l_{it}) \qquad (19)$$
This can be thought of as a "conditional" (on $l_{it}$) intermediate input demand function, in contrast to LP's "unconditional" (on $l_{it}$) intermediate input demand function $m_{it} = f_t(k_{it}, \omega_{it})$.
Assuming strict monotonicity, this conditional intermediate input demand function can be inverted and substituted into the production function to get
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + f_t^{-1}(k_{it}, m_{it}, l_{it}) + \epsilon_{it} \qquad (20)$$
Treating $f_t^{-1}$ non-parametrically, it is obvious that now not even $\beta_2$ can be identified in the first stage. However, we can still identify the composite term:
$$\hat{\Phi}_{it} = \widehat{\beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it}}$$
(these are just the predicted values of $y_{it}$ from the regression)
Just like in OP, given guesses of $\beta_1$ and $\beta_2$, we can a) compute implied $\hat{\omega}_{it}(\beta_1, \beta_2)$'s, then b) regress the $\hat{\omega}_{it}(\beta_1, \beta_2)$'s on the $\hat{\omega}_{it-1}(\beta_1, \beta_2)$'s to obtain implied $\hat{\xi}_{it}(\beta_1, \beta_2)$'s (the residuals from the regression), and c) compute the sample moment
$$\frac{1}{N} \frac{1}{T} \sum_t \sum_i \hat{\xi}_{it}(\beta_1, \beta_2) k_{it} = 0$$
But since there is an additional parameter to estimate in the second stage, we need another moment condition. We suggest
$$\frac{1}{N} \frac{1}{T} \sum_t \sum_i \hat{\xi}_{it}(\beta_1, \beta_2) l_{it-1} = 0$$
which should be approximately zero at the true $\beta_1$ and $\beta_2$ since $l_{it-1} \in I_{it-1}$ and hence $E[\xi_{it} l_{it-1}] = 0$.
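So in summary, the second stage estimates $\hat{\beta}_1$ and $\hat{\beta}_2$ are defined by
$$\frac{1}{N} \frac{1}{T} \sum_t \sum_i \begin{pmatrix} \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2) k_{it} \\ \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2) l_{it-1} \end{pmatrix} = 0$$
Alternatively, if we are willing to assume that like capital, $l_{it}$ is "fixed" and decided at $t-1$ (and thus $l_{it} \in I_{it-1}$), we could use
$$\frac{1}{N} \frac{1}{T} \sum_t \sum_i \begin{pmatrix} \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2) k_{it} \\ \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2) l_{it} \end{pmatrix} = 0$$
One can see how various other timing/information assumptions would determine the different moment conditions here (e.g. Ackerberg (2016)).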
Notes:
1) Even though the first stage does not identify any parameters, it is still crucial in that it "separates" $\omega_{it}$ from $\epsilon_{it}$.
2) The procedure does not rely on labor being a non-dynamic input, i.e. labor choices could have dynamic implications (e.g. hiring or firing costs).
3) The procedure allows firm specific, serially correlated, unobserved shocks to the price of labor (as well as to capital costs). We cannot allow such shocks to the price of intermediate inputs (it would violate the scalar unobservable assumption necessary for the inversion), but in many cases intermediate inputs are commodities where we would expect very little price variation across firms.
– OP: rules out serially correlated, unobserved, firm specific shocks to all input prices ($i_{it}$, $l_{it}$, $m_{it}$) (note: can allow non-serially correlated shocks to prices of $l_{it}$ and $m_{it}$)
– LP: allows serially correlated, unobserved, firm specific input price shocks to $i_{it}$, but not to ($l_{it}$, $m_{it}$)
– ACF (with intermediate input proxy): allows serially correlated, unobserved, firm specific input price shocks to $i_{it}$ and $l_{it}$, but not to $m_{it}$
4) The Bond and Soderbom (2005) and GNR arguments imply that we actually need some degree of 2) or 3) for identification of the labor coefficient.
5) Can use $i_{it}$ rather than $m_{it}$ as the proxy variable in the ACF procedure, but lose the ability to allow serially correlated, unobserved, firm specific input price shocks to $i_{it}$ and $l_{it}$.
6) Can also overidentify the model by adding further lags of inputs as instruments.
7) Also note that the production function (20) does not include $m_{it}$. This is because the Bond-Soderbom (and GNR) arguments imply that we cannot use $m_{it-1}$ as an instrument to identify the coefficient on $m_{it}$. Not including $m_{it}$ is implicitly using a value-added production function (i.e. $y_{it}$ is deflated revenues minus deflated costs of intermediate inputs). One structural interpretation of this is that the production function is Leontief in the intermediate input (e.g. materials), i.e.
$$Y_{it} = \min\left\{ \beta_0 K_{it}^{\beta_1} L_{it}^{\beta_2} e^{\omega_{it}}, \; \beta_3 M_{it} \right\} e^{\epsilon_{it}}$$
which implies that (in logs, with the constant redefined)
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it} \qquad (21)$$
An alternative would be to follow Appendix B of Levinsohn and Petrin, or Gandhi, Navarro and Rivers (2015), and use a first order condition to obtain the coefficient on the intermediate input.
– Ackerberg and Hahn (2015) show that in this "value added model"
9) More generally can mix-and-match the different identification strategies (for different inputs), i.e.
– timing/information set assumptions (though as detailed above, this does not work for a non-dynamic, variable input that is being used to proxy for unobserved productivity).
– first order conditions (at least for static inputs)
– observed firm specific input price shocks as instruments
"Dynamic Panel" is somewhat of a misnomer in the context in which I am using these methodologies, as there is no lagged dependent r.h.s. variable. Many of these methods were developed in that context.
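I will focus on one very simple example, to try to highlight the similarities and differences between this literature and the literature stemming from OP.
Production function
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it} \qquad (22)$$
where
$$\omega_{it} = \rho \omega_{it-1} + \xi_{it}$$
Suppose that $\epsilon_{it}$ satisfies strict exogeneity, i.e. the $\epsilon_{it}$'s are uncorrelated with all input choices. Suppose that $\omega_{it}$ is not observed until $t$, that $k_{it}$ is chosen at $t-1$, and that $l_{it}$ is chosen at $t$.
These assumptions are analogous to the timing/information set assumptions made in OP, and imply orthogonality conditions such as
$$E\left[ \left( \xi_{it} + (\epsilon_{it} - \rho \epsilon_{it-1}) \right) \begin{pmatrix} 1 \\ k_{it} \\ k_{it-1} \\ l_{it-1} \end{pmatrix} \right] = 0$$
Now, given a guess of the parameters $(\rho, \beta_0, \beta_1, \beta_2)$ one can compute the implied values of the term $\xi_{it} + (\epsilon_{it} - \rho \epsilon_{it-1})$, i.e.
$$\xi_{it} + (\epsilon_{it} - \rho \epsilon_{it-1}) = (y_{it} - \rho y_{it-1}) - \beta_0 (1 - \rho) - \beta_1 (k_{it} - \rho k_{it-1}) - \beta_2 (l_{it} - \rho l_{it-1})$$
Then estimation can proceed, e.g. by setting the sample moment
$$\frac{1}{N} \frac{1}{T} \sum_i \sum_t \left( \widehat{\xi_{it} + (\epsilon_{it} - \rho \epsilon_{it-1})}(\rho, \beta_0, \beta_1, \beta_2) \begin{pmatrix} 1 \\ k_{it} \\ k_{it-1} \\ l_{it-1} \end{pmatrix} \right) = 0$$
Again, there are actually many more potential moment conditions, since all values of $k$, $l$, and $y$ prior to $(k_{it}, l_{it-1}, y_{it-2})$ are also uncorrelated with $\xi_{it} + (\epsilon_{it} - \rho \epsilon_{it-1})$.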
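To make the construction concrete, the sketch below computes the $\rho$-differenced residual and its sample moments with the instruments $(1, k_{it}, k_{it-1}, l_{it-1})$. The panel is simulated from a hypothetical process that respects the stated timing ($k_{it}$ set at $t-1$, $l_{it}$ at $t$). The input rules, all parameter values, and the names `simulate` and `sample_moments` are assumptions for illustration only.

```python
import random

def simulate(n_firms=2000, T=6, burn=10, seed=0):
    """Hypothetical panel; returns per-firm lists of (y, k, l)."""
    rng = random.Random(seed)
    rho, b0, b1, b2 = 0.7, 1.0, 0.3, 0.6          # assumed true values
    panel = []
    for _ in range(n_firms):
        k, w, rows = 3.5, 0.0, []
        for t in range(-burn, T):
            w = rho * w + rng.gauss(0, 0.2)       # omega_it = rho*omega_it-1 + xi_it
            l = 1.0 + 0.75 * k + 2.5 * w + rng.gauss(0, 0.5)   # chosen at t, knows omega_it
            y = b0 + b1 * k + b2 * l + w + rng.gauss(0, 0.1)   # epsilon_it is iid
            if t >= 0:
                rows.append((y, k, l))
            # Next period's capital is decided now, using only current information
            k = 0.7 + 0.8 * k + 0.5 * w + rng.gauss(0, 0.1)
        panel.append(rows)
    return panel

def sample_moments(panel, rho, b0, b1, b2):
    """Sample average of residual times instruments (1, k_t, k_{t-1}, l_{t-1})."""
    sums, n = [0.0] * 4, 0
    for rows in panel:
        for (y0, k0, l0), (y1, k1, l1) in zip(rows, rows[1:]):
            # Implied xi_it + (eps_it - rho*eps_it-1) at the candidate parameters
            u = ((y1 - rho * y0) - b0 * (1 - rho)
                 - b1 * (k1 - rho * k0) - b2 * (l1 - rho * l0))
            for j, z in enumerate((1.0, k1, k0, l0)):
                sums[j] += u * z
            n += 1
    return [s / n for s in sums]

panel = simulate()
print(sample_moments(panel, 0.7, 1.0, 0.3, 0.6))   # at the assumed true parameters
print(sample_moments(panel, 0.7, 1.0, 0.5, 0.6))   # at a wrong capital coefficient
```

A GMM estimator would search over $(\rho, \beta_0, \beta_1, \beta_2)$ to bring these four moments as close to zero as possible. The sketch only evaluates them at two candidate points.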
Note:
– There are implicit assumptions here about what makes lags strong instruments. This depends on dynamic issues (e.g. adjustment costs) and issues regarding serial correlation in input prices. See Blundell and Bond (1999) and Bond and Soderbom (2005) for discussion of when they are strong instruments, and when they are not.
– One can extend the model to allow for an additional unobservable that is fixed across time, i.e.
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \alpha_i + \omega_{it} + \epsilon_{it} \qquad (23)$$
where $\alpha_i$ is allowed to be correlated with all input choices. This model requires "double differencing", i.e.
$$(y_{it} - \rho y_{it-1}) - (y_{it-1} - \rho y_{it-2}) = \beta_1 [(k_{it} - \rho k_{it-1}) - (k_{it-1} - \rho k_{it-2})] + \beta_2 [(l_{it} - \rho l_{it-1}) - (l_{it-1} - \rho l_{it-2})]$$
$$+ \; \xi_{it} - \xi_{it-1} + (\epsilon_{it} - \rho \epsilon_{it-1}) - (\epsilon_{it-1} - \rho \epsilon_{it-2})$$
Again, one can appropriately lag $k$, $l$, and $y$ to find valid moments. One problem is that double differencing can be demanding on the data, and estimates can be imprecise. Arellano and Bover (1995) and Blundell and Bond (1998, 2000) suggest some additional moment conditions based on stationarity assumptions that can help here, though these assumptions may be strong.
So how do these dynamic panel approaches compare to the Olley-Pakes literature? I'd say the main tradeoff is the following:
– The dynamic panel approach does not require the scalar unobservable and strict monotonicity assumptions that are required for the OP/LP/ACF inversions. So in the dynamic panel literature, for example, one does not need to worry about unobserved firm specific input prices, nor other sorts of unobservables like optimization error.
– On the other hand, the dynamic panel approach requires that the serial correlation in $\omega_{it}$ is linear, e.g. an AR or MA process. This is essential to being able to construct usable moments. In contrast, the OP/LP/ACF literature can allow the productivity shock to follow a completely general first order Markov process.
– Dynamic panel literature can also allow additional fixed effects, although precise estimation appears to be challenging. The intermediate assumptions of Arellano-Bover/Blundell-Bond may be helpful.
Given that theory may provide little guidance for choosing between these two sets of assumptions (and because they are both strong), I would suggest trying both approaches.
9 Other Issues
1) Often observe firm revenues as the output measure, not physical quantities. As pointed out by Klette and Griliches (1996, JAE), this can be problematic when firms operate in distinct imperfectly competitive output markets (and one does not observe output price).
– Intuition: Suppose we observe that firms that (exogenously) use double the inputs of others produce less than double the revenue of the others. There are two explanations: 1) decreasing returns to scale (but, e.g., perfect competition), 2) constant returns to scale with a downward sloping demand curve.
– Even if one doesn't care about separating the above two effects, there may now be two distinct sources of unobservables in the (revenue) production function, which can be problematic for the proxy based approaches.
– See, e.g. Klette (1999, JIE), Foster, Haltiwanger, and Syverson (2007, AER), DeLoecker (2011, Ecta), DeLoecker and Warzynski (2012, AER).
– Note that just observing quantities is not a complete panacea: the units need to be equivalent across firms for these quantities to be meaningful.
3) Additional inputs (e.g. Griliches Knowledge Capital model, Doraszelski and Jaumandreu (2014, ReStud))
where
$$c_{it} = \sum_{\tau=0}^{t} \delta_c^{\,t-\tau} d_{i\tau}$$
where
$$\omega_{it} = g(\omega_{it-1}, d_{it-1})$$
– Advantages DJ
    Doesn't have the initial $t = 0$ R&D stock issue that Griliches has (assuming use of OP/LP/ACF related methods to estimate)
    Explicitly has uncertainty in the contribution of R&D to productivity
– Disadvantages DJ
    Unobserved component of "productivity" and R&D component of "productivity" affect the future through a scalar. This is restrictive, e.g. with
$$\omega_{it} = \rho \omega_{it-1} + \gamma d_{it-1} + \xi_{it}$$
    $d_{it-1}$ and $\xi_{it}$ are forced to depreciate at the same rate. This is not the case in Griliches, i.e. $\delta_c$ can be different than $\rho$.
Alternative model of physical capital stock
where
$$\omega_{it} = g(\omega_{it-1}, d_{it-1}, i_{it-1})$$
4) Empirical questions
– Determinants of productivity: deregulation (OP), trade openness, exporting (similar to above)
    generally best to include these factors as inputs in the production function
– Allocative efficiency: recently Hsieh and Klenow (2009, QJE), Asker, Collard-Wexler, and DeLoecker (2014, JPE)