Lecture Notes - Production Functions - 2017


1 Lecture Notes - Production Functions - 1/5/2017 D.A.

2 Introduction

Production functions are one of the most basic components of economics

They are important in themselves, e.g.

– What is the level of returns to scale?
– How do input coefficients on capital and labor change over time?
– How does adoption of a new technology affect production?
– How much heterogeneity is there in measured productivity across firms, and what explains it?
– How does the allocation of firm inputs relate to productivity?

Also can be important as inputs into other interesting questions, e.g. dynamic models of industry evolution, evaluation of firm conduct (e.g. collusion)

For this lecture note, we will work with a simple two-input Cobb-Douglas production function

$$Y_i = e^{\beta_0} K_i^{\beta_1} L_i^{\beta_2} e^{\varepsilon_i}$$

where $i$ indexes firms, $K_i$ is units of capital, $L_i$ is units of labor, and $Y_i$ is units of output. $(\beta_0, \beta_1, \beta_2)$ are parameters and $\varepsilon_i$ captures unobservables that affect output (e.g. weather, soil quality, management quality)

Take natural logs to get:

$$y_i = \beta_0 + \beta_1 k_i + \beta_2 l_i + \varepsilon_i$$

This can be extended to

– Additional inputs, e.g. R&D (knowledge capital), dummies representing discrete technologies, different types of labor/capital, intermediate inputs.
– Later we will see more flexible models

$$y_i = f(k_i, l_i; \beta) + \varepsilon_i$$

$$y_i = f(k_i, l_i, \varepsilon_i; \beta) \quad \text{(with scalar/monotonic } \varepsilon_i\text{)}$$

$$y_i = \beta_0 + \beta_{1i} k_i + \beta_{2i} l_i + \varepsilon_i$$

3 Endogeneity Issues

The problem is that inputs $k_i$, $l_i$ are typically choice variables of the firm. Typically, these choices are made to maximize profits, and hence will often depend on unobservables $\varepsilon_i$.

Of course, this dependence depends on what the firm knows about $\varepsilon_i$ when it makes these input choices.

Example: Suppose a firm operating in perfectly competitive output and input markets (with respective prices $p_i$, $r_i$, and $w_i$) perfectly observes $\varepsilon_i$ before optimally choosing inputs. The profit maximization problem is:

$$\max_{K_i, L_i} \; p_i e^{\beta_0} K_i^{\beta_1} L_i^{\beta_2} e^{\varepsilon_i} - r_i K_i - w_i L_i$$

The FOCs imply that the optimal choices of $K_i$ and $L_i$ ($k_i$ and $l_i$) will depend on $\varepsilon_i$. Intuition: $\varepsilon_i$ positively affects the marginal product of inputs. Hence firms with higher $\varepsilon_i$'s will want to use more inputs.
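
The dependence of inputs on the unobservable can be worked out explicitly for this example. The following is a sketch under one added assumption that the notes do not state: decreasing returns, $\beta_1 + \beta_2 < 1$, so that the competitive profit maximization problem has an interior solution.

```latex
\begin{align*}
&\text{FOCs:} &
\beta_1 \frac{p_i Y_i}{K_i} &= r_i, &
\beta_2 \frac{p_i Y_i}{L_i} &= w_i \\
&\text{In logs:} &
k_i &= \ln\beta_1 + \ln p_i + y_i - \ln r_i, &
l_i &= \ln\beta_2 + \ln p_i + y_i - \ln w_i
\end{align*}
% Substitute into y_i = \beta_0 + \beta_1 k_i + \beta_2 l_i + \varepsilon_i and solve for y_i:
\[
y_i = \frac{\beta_0 + \beta_1(\ln\beta_1 + \ln p_i - \ln r_i)
          + \beta_2(\ln\beta_2 + \ln p_i - \ln w_i) + \varepsilon_i}
         {1 - \beta_1 - \beta_2}
\]
% Hence both optimal log inputs move one-for-one with y_i and
\[
\frac{\partial k_i}{\partial \varepsilon_i}
= \frac{\partial l_i}{\partial \varepsilon_i}
= \frac{1}{1 - \beta_1 - \beta_2} > 0 .
\]
```

So under this assumption both log inputs are strictly increasing in $\varepsilon_i$, which is the source of the correlation between regressors and the error term.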

As a result, one cannot estimate

$$y_i = \beta_0 + \beta_1 k_i + \beta_2 l_i + \varepsilon_i$$

using OLS because $k_i$ and $l_i$ are correlated with $\varepsilon_i$. Generally one would expect coefficients to be positively biased.
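
A small simulation can make the bias concrete. The sketch below is illustrative only: it assumes decreasing returns ($\beta_1 + \beta_2 < 1$), a common normalized output price, and exogenous firm-specific log input prices with made-up variances (the function name, parameter values, and price distributions are all hypothetical, not from the notes). Inputs are set from the first order conditions of the competitive profit maximization problem, $\beta_1 p_i Y_i / K_i = r_i$ and $\beta_2 p_i Y_i / L_i = w_i$, after the firm observes $\varepsilon_i$.

```python
import numpy as np


def simulate_ols_bias(n=200_000, beta0=1.0, beta1=0.3, beta2=0.5, seed=0):
    """OLS of y on (1, k, l) when inputs are chosen after observing eps."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, 0.20, n)    # unobservable, known to the firm
    log_r = rng.normal(0.0, 0.30, n)  # hypothetical exogenous log rental rate
    log_w = rng.normal(0.0, 0.25, n)  # hypothetical exogenous log wage
    log_p = 0.0                       # common output price, normalized
    d = 1.0 - beta1 - beta2           # > 0 under decreasing returns

    # Log output implied by solving the two first order conditions.
    y = (beta0
         + beta1 * (np.log(beta1) + log_p - log_r)
         + beta2 * (np.log(beta2) + log_p - log_w)
         + eps) / d
    # Optimal log inputs: each moves one-for-one with y (and hence with eps).
    k = np.log(beta1) + log_p + y - log_r
    l = np.log(beta2) + log_p + y - log_w

    X = np.column_stack([np.ones(n), k, l])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # (intercept, capital coefficient, labor coefficient)


if __name__ == "__main__":
    print(simulate_ols_bias())
```

With these particular settings both slope estimates come out above the true values (0.3 and 0.5). In this DGP the sum of the two slope coefficients is always biased upward, although with other input price variances an individual coefficient need not be.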

Similar problems would arise in more complicated models (e.g. non-perfectly competitive output or input markets, $\varepsilon_i$ only partially observed), except the special case where the firm has no knowledge of $\varepsilon_i$ when choosing inputs

If $k_i$ is a "less variable" input than $l_i$, one might expect the firm to have less knowledge about $\varepsilon_i$ when choosing $k_i$ (relative to $l_i$). Generally, this will imply $k_i$ will be less correlated with $\varepsilon_i$ than $l_i$ is. So one might expect more bias in the labor coefficient.

Note: we will generally assume that the unobservables $\varepsilon_i$ are generated or evolve exogenously, i.e. they are not choice variables of the firm. Things get considerably harder when the unobservables are choice variables of the firm.

WLOG, let's think about $\varepsilon_i$ having two components, i.e.

$$y_i = \beta_0 + \beta_1 k_i + \beta_2 l_i + \omega_i + \epsilon_i$$

where $\omega_i$ is an unobservable that is predictable (or partially predictable) to the firm when it makes its input decisions, and $\epsilon_i$ is an unobservable that the firm has no information about when making input decisions (e.g. $\omega_i$ represents average weather conditions on a particular farm, $\epsilon_i$ represents deviations from that average in a given year (after inputs are chosen)). $\epsilon_i$ could also represent measurement error in output.

In this formulation, $\omega_i$ is causing the endogeneity problem, not $\epsilon_i$. Let's call $\omega_i$ the "productivity shock".

4 Traditional Solutions
Two traditional solutions to endogeneity problems can be used here: instrumental variables and fixed effects models. I will discuss these before moving to more recent methodological approaches.

4.1 Instrumental Variables
Want to find "instruments" that are correlated with the endogenous inputs, but do not directly determine $y_i$ and are not correlated with $\omega_i$ (and $\epsilon_i$).

Good news is that theory can provide us with such instruments.

Specifically, consider input and output prices $w_i$, $r_i$, and $p_i$. Theory tells us that these prices will affect firms' optimal choices of $k_i$ and $l_i$. Theory also says that these prices are excluded from the production function as they do not directly determine output $y_i$ conditional on the inputs.

Last requirement is that $w_i$, $r_i$, and $p_i$ are not correlated with the productivity shock $\omega_i$. When will this be the case (or not be the case)?

One key issue is the form of competition in input and output markets.

– If output markets are imperfectly competitive (i.e. firms face downward sloping demand curves), then a higher $\omega_i$ will increase a firm's output, driving $p_i$ down. In other words, $p_i$ will be negatively correlated with $\omega_i$, invalidating $p_i$ as an instrument.
– If input markets are imperfectly competitive (i.e. firms face upward sloping supply curves), then a higher $\omega_i$ will increase a firm's input demand, driving $w_i$ and/or $r_i$ up. So $w_i$ and/or $r_i$ are now correlated with $\omega_i$, invalidating them as instruments.

So for these instruments to be valid, we want firms operating in perfectly competitive input or output markets. Typically, this is more believable for input markets than for output markets.

Unfortunately, even if one is willing to make these assumptions, IV solutions haven't been that broadly used in practice. First, one needs data on $w_i$ and $r_i$. Second, there is often very little variation in $w_i$ and $r_i$ across firms (often there is a real question of whether firms actually operate in different input markets). Third, one often wonders whether observed variation in e.g. $w_i$ actually represents firms facing different input prices, or whether it represents things like variation in unobserved labor quality (i.e. the firm with the higher $w_i$ is employing workers of higher quality). If the latter, then $w_i$ is not a valid instrument.

While there might be "true" variation in input prices across time, this is usually not helpful, because if one has data across time, one often wants to allow the production function to change across time, e.g.
$$y_{it} = \beta_{0t} + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it}$$
(though there could be exceptions)

That said, I think if one can find a market where there is convincing exogenous input price variation, the IV approach is probably more convincing than the approaches I will talk about in the rest of this lecture note, as there seem to be fewer auxiliary assumptions.

Notes:

– Randomized experiments - either directly manipulating inputs, or manipulating input prices.
– As is typically done in this literature, I have implicitly made a "homogeneous treatment effects" assumption. A heterogeneous treatment effects model would be

$$y_i = \beta_0 + \beta_{1i} k_i + \beta_{2i} l_i + \omega_i + \epsilon_i$$

This affects the interpretation of IV estimators, e.g. Heckman and Robb (1985), Imbens and Angrist (1994, Ecta)
– If there are unobserved firm choice variables in $\omega_i$, it becomes quite hard to find valid instruments, even with the above assumptions.

4.2 Fixed Effects


This approach relies on having panel data on firms across time, i.e.

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it}$$

Assume that $\epsilon_{it}$ is independent across $t$ (this is consistent with $\epsilon_{it}$ not being predictable by the firm when choosing $k_{it}$ and $l_{it}$)

Suppose one is willing to assume that the productivity shock is constant over time (fixed effect assumption), i.e.
$$\omega_{it} = \omega_i$$

Then one can either mean difference

$$y_{it} - \bar{y}_i = \beta_1 (k_{it} - \bar{k}_i) + \beta_2 (l_{it} - \bar{l}_i) + (\epsilon_{it} - \bar{\epsilon}_i)$$

or first difference

$$y_{it} - y_{it-1} = \beta_1 (k_{it} - k_{it-1}) + \beta_2 (l_{it} - l_{it-1}) + (\epsilon_{it} - \epsilon_{it-1})$$

Since the problematic unobservable $\omega_{it}$ has been differenced out of these expressions (recall that we have assumed that the $\epsilon_{it}$'s are uncorrelated with input choices), these equations can be estimated with OLS.
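
As an illustration of the mean-difference (within) estimator, the sketch below simulates a panel in which the fixed effect assumption $\omega_{it} = \omega_i$ holds exactly and inputs load on $\omega_i$. The data-generating process, its loadings, and the function name `fe_demo` are hypothetical choices made for this example, not part of the notes.

```python
import numpy as np


def fe_demo(n_firms=2000, n_periods=5, beta1=0.3, beta2=0.6, seed=0):
    """Compare pooled OLS with the within estimator when omega_it = omega_i."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 1.0, (n_firms, 1))  # time-invariant productivity
    # Inputs respond to omega (hypothetical loadings) plus exogenous variation.
    k = 0.5 * omega + rng.normal(0.0, 1.0, (n_firms, n_periods))
    l = 1.0 * omega + rng.normal(0.0, 1.0, (n_firms, n_periods))
    eps = rng.normal(0.0, 0.5, (n_firms, n_periods))  # unforecastable shock
    y = 1.0 + beta1 * k + beta2 * l + omega + eps

    def ols(X, target):
        return np.linalg.lstsq(X, target, rcond=None)[0]

    # Pooled OLS with a constant: omega is left in the error term.
    X_pooled = np.column_stack([np.ones(y.size), k.ravel(), l.ravel()])
    pooled = ols(X_pooled, y.ravel())[1:]

    # Within estimator: subtract firm-level means, which removes omega_i.
    def demean(a):
        return (a - a.mean(axis=1, keepdims=True)).ravel()

    within = ols(np.column_stack([demean(k), demean(l)]), demean(y))
    return pooled, within


if __name__ == "__main__":
    pooled, within = fe_demo()
    print("pooled OLS:", pooled)
    print("within    :", within)
```

In this design pooled OLS overstates both coefficients, while the within estimates are close to the true values (0.3, 0.6). The simulation builds in strict exogeneity and no measurement error, so it does not reproduce the practical problems listed next.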

Problems:

– 1) $\omega_{it} = \omega_i$ is a strong assumption
– 2) These estimators often produce strange estimates. In particular, they often generate very small (or even negative) capital coefficients. Perhaps this is due to measurement error in capital (Griliches and Hausman (1986, JoE))?

Other notes:

– The mean difference approach requires all the input choices to be uncorrelated with all the $\epsilon_{it}$ (strict exogeneity). The first difference approach only requires current and lagged inputs to be uncorrelated with current and lagged $\epsilon_{it}$. Using $k_{it-1}$ and $l_{it-1}$ (or other lags) as instruments for $(k_{it} - k_{it-1})$ and $(l_{it} - l_{it-1})$, one can allow current inputs to be arbitrarily correlated with past $\epsilon_{it}$'s (sequential exogeneity)

– The panel data approach can be extended to richer error structures (Arellano and Bond (1991, ReStud), Arellano and Bover (1995, JoE), Blundell and Bond (1998, JoE; 2000, ER), Arellano and Honore (2001, Handbook)), e.g.

$$\omega_{it} = \rho\,\omega_{it-1} + \xi_{it}$$

or

$$\omega_{it} = \alpha_i + \lambda_{it} \quad \text{where} \quad \lambda_{it} = \rho\,\lambda_{it-1} + \xi_{it}$$

I will talk further about these later.

4.3 First Order Conditions


A third approach to estimating production functions is based on information in first order conditions of optimizing firms.

For example, for a firm operating in perfectly competitive input and output markets, static cost minimization implies that
$$\frac{\partial Y}{\partial L}\frac{L}{Y} = \frac{wL}{pY}$$
$$\frac{\partial Y}{\partial K}\frac{K}{Y} = \frac{rK}{pY}$$
i.e. the output elasticity w.r.t. an input must equal its (cost) share in revenue.

In a Cobb-Douglas context, these output elasticities are the production function coefficients $\beta_1$ and $\beta_2$, so observations on these revenue shares across firms could provide estimates of the coefficients.
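
For concreteness, the Cobb-Douglas case can be written out as follows (a short worked step using only the two-input production function from the introduction and the share conditions above):

```latex
\begin{align*}
Y = e^{\beta_0} K^{\beta_1} L^{\beta_2} e^{\varepsilon}
  \;&\Rightarrow\;
  \frac{\partial Y}{\partial K} = \beta_1 \frac{Y}{K},
  \qquad
  \frac{\partial Y}{\partial L} = \beta_2 \frac{Y}{L} \\[4pt]
\frac{\partial Y}{\partial K}\frac{K}{Y} = \beta_1 &= \frac{rK}{pY},
  \qquad
\frac{\partial Y}{\partial L}\frac{L}{Y} = \beta_2 = \frac{wL}{pY}
\end{align*}
```

So, under the stated assumptions, each coefficient equals the corresponding input's expenditure as a share of revenue, and no regression is needed.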

Note that $r$ can often be assumed known and often one directly observes $wL$ and $pY$ (rather than $L$ and $Y$), i.e. labor input and output are measured in terms of dollar units (that are implicitly assumed to be comparable across firms)

But:

– This assumes static cost minimization, i.e. it assumes away dynamics, adjustment costs, etc. At the very least we often think about the capital input being subject to a dynamic accumulation process, e.g. $K_{it} = K_{it-1} + i_{it-1}$
– There are additional terms when firms are not operating in perfectly competitive markets, e.g. when firms face a downward sloping demand curve
$$\frac{\partial Y}{\partial L}\frac{L}{Y} = \mu\,\frac{wL}{pY}$$
$$\frac{\partial Y}{\partial K}\frac{K}{Y} = \mu\,\frac{rK}{pY}$$
where $\mu = \frac{p}{mc}$, i.e. percentage markup. Note that profit maximization implies $\frac{p}{mc} = \frac{\eta}{1+\eta}$, where $\eta$ is the elasticity of demand. So, for example, one could still identify production coefficients using this method if the elasticity of demand was known (this is done in Hsieh and Klenow (2009, QJE)). Or, one might be able to identify both with additional restrictions, e.g. Constant Returns to Scale (related to Hall (1988, JPE)).

5 Olley and Pakes (1996, Ecta)
Alternative approach to estimating production functions. I will argue that key assumptions are timing/information set assumptions, a scalar unobservable assumption, and a monotonicity assumption.

Setup:
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it} \tag{1}$$
Again, the unobserved productivity shocks $\omega_{it}$ are potentially correlated with $k_{it}$ and $l_{it}$, but the unobservables $\epsilon_{it}$ are measurement errors or unforecastable shocks that are not correlated with inputs $k_{it}$ and $l_{it}$.

Basic Idea: Endogeneity problem is due to the fact that $\omega_{it}$ is unobserved by the econometrician. If some other equation can tell us what $\omega_{it}$ is (i.e. making it "observable"), then the endogeneity problem would be eliminated.

Olley and Pakes will use observed firms' investment decisions $i_{it}$ to "tell us" about $\omega_{it}$.

Assumptions:

1) The productivity shock $\omega_{it}$ follows a first order Markov process, i.e.

$$p(\omega_{it+1} \mid I_{it}) = p(\omega_{it+1} \mid \omega_{it})$$

where $I_{it}$ is firm $i$'s information set at $t$ (which includes current and past $\omega_{it}$'s). Note:

– This is both an assumption on the stochastic process governing $\omega_{it}$ and an assumption on firms' information sets at various points in time. Essentially, firms are moving through time, observing $\omega_{it}$ at $t$, and forming expectations about future $\omega_{it}$ using $p(\omega_{it+1} \mid \omega_{it})$.
– The form of this first order Markov process is completely general, e.g. it is more general than $\omega_{it} = \omega_i$ or $\omega_{it}$ following an AR(1) process.
– This assumption implies that

$$E[\omega_{it+1} \mid I_{it}] = g(\omega_{it})$$

and that we can write

$$\omega_{it+1} = g(\omega_{it}) + \xi_{it+1} \quad \text{where by construction} \quad E[\xi_{it+1} \mid I_{it}] = 0$$

– $g(\omega_{it})$ can be thought of as the "predictable" component of $\omega_{it+1}$; $\xi_{it+1}$ can be thought of as the "innovation" component, i.e. the part that the firm doesn't observe until $t+1$.
– This can be extended to higher order Markov processes (see ABBP Handbook article and Ackerberg and Hahn (2015))

2) Labor is a perfectly variable input, i.e. $l_{it}$ is chosen by the firm at time $t$ (after observing $\omega_{it}$).

3) Labor has no dynamic implications. In other words, the firm's choice of $l_{it}$ at $t$ only affects profits at period $t$, not future profits. This rules out, e.g. labor adjustment costs like firing or hiring costs.

4) On the other hand, $k_{it}$ is accumulated according to a dynamic investment process. Specifically
$$K_{it} = K_{it-1} + i_{it-1}$$
where $i_{it}$ is the investment level chosen by the firm in period $t$ (after observing $\omega_{it}$). Importantly, note that $k_{it}$ depends on last period's investment, not current investment. The assumption here is that it takes a full time period for new capital to be ordered, delivered, and installed. This also implies that $k_{it}$ was actually decided by the firm at time $t-1$. This is what I refer to as a "timing assumption".

In summary:

– labor is a variable (decided at $t$), non-dynamic input
– capital is a fixed (decided at $t-1$), dynamic input
– We could also think about including fixed, non-dynamic inputs, or variable, dynamic inputs (see ABBP)

Given this setup, let's think about a firm's optimal investment choice $i_{it}$. Given that $i_{it}$ will affect future capital levels, a profit maximizing firm will choose $i_{it}$ to maximize the PDV of its future profits. This is a dynamic programming problem, and will result in a dynamic investment demand function of the form:
$$i_{it} = f_t(k_{it}, \omega_{it}) \tag{2}$$
Note that:

– $k_{it}$ and $\omega_{it}$ are part of the state space, but $l_{it}$ does not enter the state space. Why?
– $f_t$ is indexed by $t$. This implicitly allows investment decisions to depend on other state variables (e.g. input prices, demand conditions, industry structure) that are constant across firms.
– $f_t$ will likely be a complicated function because it is the solution to a dynamic programming problem. Fortunately, we can estimate the production function parameters without actually solving this DP problem (this is helpful not only computationally, but also allows us to estimate the production function without having to specify large parts of the firm's optimization problem (semiparametric)). This is a nice example of how semiparametrics can help in terms of computation - literature based on Hotz and Miller (1993, ReStud) is similar in nature.

One of the key ideas behind OP is that under some conditions, the investment demand equation (2) can be inverted to obtain

$$\omega_{it} = f_t^{-1}(k_{it}, i_{it}) \tag{3}$$

i.e. we can write the productivity shock $\omega_{it}$ as a function of variables that are observed by the econometrician (though the function is unknown)

What are these conditions/assumptions?

– 1) (strict monotonicity) $f_t$ is strictly monotonic in $\omega_{it}$. OP prove this formally under a set of assumptions that include the assumption that $p(\omega_{it+1} \mid \omega_{it})$ is stochastically increasing in $\omega_{it}$. This result is fairly intuitive.

– 2) (scalar unobservable) $\omega_{it}$ is the only econometric unobservable in the investment equation, i.e.
    * Essentially no unobserved input prices that vary across firms (if there were observed input prices that varied across firms, they could be included as arguments of $f_t$). There is one exception to this - labor input price shocks across firms that are not correlated across time.
    * No other structural unobservables that affect firms' optimal investment levels (e.g. efficiency at doing investment, heterogeneity in adjustment costs, other heterogeneity in the production function (e.g. random coefficients))
    * No optimization or measurement error in $i$

2) is a fairly strong assumption, but it is crucial to being able to write $\omega_{it}$ as an (unknown) function of observables.

Suppose these conditions hold. Substitute (3) into (1) to get

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + f_t^{-1}(k_{it}, i_{it}) + \epsilon_{it} \tag{4}$$

Since we don't know the form of the function $f_t^{-1}$ (and it is a complicated solution to a dynamic programming problem), let's just treat it non-parametrically, e.g. a high order polynomial in $i_{it}$ and $k_{it}$, e.g.
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \gamma_{0t} + \gamma_{1t} k_{it} + \gamma_{2t} i_{it} + \gamma_{3t} k_{it}^2 + \gamma_{4t} i_{it}^2 + \gamma_{5t} k_{it} i_{it} + \epsilon_{it} \tag{5}$$

Main point is that under the OP assumptions, we have eliminated the unobservable causing the endogeneity problem

In this literature, $i_{it}$ is sometimes called a control variable and sometimes called a proxy variable. Neither is perfect terminology.

So we can think about estimating this equation with a simple OLS regression of $y_{it}$ on $k_{it}$, $l_{it}$, and a polynomial in $k_{it}$ and $i_{it}$.

Problem: $\beta_1 k_{it}$ is collinear with the linear term in the polynomial, so we can't separately identify $\beta_1$ from $\gamma_{1t}$. Intuitively, there is no way to separate out the effect of $k_{it}$ on $y_{it}$ through the production function, from the effect of $k_{it}$ on $y_{it}$ through $f_t^{-1}$.

But, there is no $l_{it}$ in the polynomial, so $\beta_2$ can in principle be identified (though see discussion of Ackerberg, Caves, and Frazer (ACF, 2015, Ecta) below)

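In summary, the "first stage" of OP involves OLS estimation of

$$y_{it} = \beta_2 l_{it} + \tilde{\gamma}_{0t} + \tilde{\gamma}_{1t} k_{it} + \gamma_{2t} i_{it} + \gamma_{3t} k_{it}^2 + \gamma_{4t} i_{it}^2 + \gamma_{5t} k_{it} i_{it} + \epsilon_{it} \tag{6}$$

where $\tilde{\gamma}_{0t} = \beta_0 + \gamma_{0t}$ and $\tilde{\gamma}_{1t} = \beta_1 + \gamma_{1t}$. This produces an estimate of the labor coefficient $\hat{\beta}_2$ and an estimate of the "composite" term $\beta_0 + \beta_1 k_{it} + \omega_{it}$:

$$\hat{\Phi}_{it} = \hat{\tilde{\gamma}}_{0t} + \hat{\tilde{\gamma}}_{1t} k_{it} + \hat{\gamma}_{2t} i_{it} + \hat{\gamma}_{3t} k_{it}^2 + \hat{\gamma}_{4t} i_{it}^2 + \hat{\gamma}_{5t} k_{it} i_{it} = \widehat{\beta_0 + \beta_1 k_{it} + \omega_{it}}$$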

To estimate the coefficient on capital, $\beta_1$, we need a "second stage".

Recall that we can write

$$\omega_{it} = g(\omega_{it-1}) + \xi_{it} \quad \text{where} \quad E[\xi_{it} \mid I_{it-1}] = 0$$

Since $k_{it}$ was decided at $t-1$, $k_{it} \in I_{it-1}$. Hence

$$E[\xi_{it} \mid k_{it}] = 0$$

and therefore
$$E[\xi_{it} k_{it}] = 0$$
This moment condition can be used to estimate the capital coefficient

More specifically, consider the following procedure:

– 1) Guess a candidate $\beta_1$
– 2) Compute
$$\hat{\omega}_{it}(\beta_1) = \hat{\Phi}_{it} - \beta_1 k_{it}$$
for all $i$ and $t$. The $\hat{\omega}_{it}(\beta_1)$ are the "implied" $\omega_{it}$'s given the guess of $\beta_1$. If our guess is the true $\beta_1$, $\hat{\omega}_{it}(\beta_1)$ will be the true $\omega_{it}$'s (asymptotically). If our guess is not the true $\beta_1$, the $\hat{\omega}_{it}(\beta_1)$'s will not be the true $\omega_{it}$'s asymptotically. (Note: Actually, $\hat{\omega}_{it}(\beta_1)$ is really $\omega_{it} + \beta_0$, but the constant term ends up not mattering)
– 3) Given the implied $\hat{\omega}_{it}(\beta_1)$'s, we now want to compute the implied innovations in $\omega_{it}$, i.e. implied $\xi_{it}$'s. To do this, consider the equation

$$\omega_{it} = g(\omega_{it-1}) + \xi_{it}$$

Think about estimating this equation, i.e. non-parametrically regressing the implied $\hat{\omega}_{it}(\beta_1)$'s (from step 2) on the implied $\hat{\omega}_{it-1}(\beta_1)$'s (also from step 2). Again, we can think of representing $g$ non-parametrically using a polynomial in $\hat{\omega}_{it-1}(\beta_1)$. Call the residuals from this regression
$$\hat{\xi}_{it}(\beta_1)$$
These are the implied innovations in $\omega_{it}$. Again, if our guess is the true $\beta_1$, $\hat{\xi}_{it}(\beta_1)$ will be the true $\xi_{it}$'s (asymptotically). If our guess is not the true $\beta_1$, then the $\hat{\xi}_{it}(\beta_1)$'s will not be the true $\xi_{it}$'s.
– 4) Lastly, evaluate the sample analogue of the moment condition $E[\xi_{it} k_{it}] = 0$, i.e.
$$\frac{1}{N}\frac{1}{T}\sum_t \sum_i \hat{\xi}_{it}(\beta_1) k_{it} = 0$$
Since $E[\xi_{it} k_{it}] = 0$, this sample analogue should be approximately zero if we have guessed the true $\beta_1$. For other $\beta_1$, this will generally not equal zero (identification)
– 5) Use a computer to do a non-linear search for the $\hat{\beta}_1$ that sets
$$\frac{1}{N}\frac{1}{T}\sum_t \sum_i \hat{\xi}_{it}(\hat{\beta}_1) k_{it} = 0$$
– This is a version of the second stage of OP. It is essentially a non-linear GMM estimator
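
The two stages can be sketched in code on simulated data. Everything about the data-generating process below is a hypothetical stand-in chosen for illustration, not OP's model or data: productivity is AR(1); log investment is a linear, strictly increasing function of $\omega_{it}$; log capital is a log-linear function of last period's capital and investment plus an i.i.d. shock realized at $t-1$ (so $k_{it}$ is still in $I_{it-1}$); and labor carries an i.i.d. firm-specific price shock so that it varies independently of $(k_{it}, i_{it})$. The investment function does not vary with $t$ here, so the first-stage polynomial is pooled across periods rather than indexed by $t$, and the non-linear search is replaced by a simple grid search.

```python
import numpy as np


def simulate_panel(n_firms=2000, n_periods=8, burn=20,
                   beta=(1.0, 0.3, 0.6), rho=0.7, seed=0):
    """Hypothetical log-linear DGP satisfying the OP timing assumptions."""
    rng = np.random.default_rng(seed)
    b0, b1, b2 = beta
    total = n_periods + burn
    omega = np.zeros((n_firms, total))
    k = np.zeros((n_firms, total))
    inv = np.zeros((n_firms, total))
    omega[:, 0] = rng.normal(0.0, 1.0, n_firms)
    k[:, 0] = rng.normal(0.0, 1.0, n_firms)
    for t in range(total):
        if t > 0:
            innovation = rng.normal(0.0, np.sqrt(1.0 - rho**2), n_firms)
            omega[:, t] = rho * omega[:, t - 1] + innovation
            # k_t is fixed at t-1: lagged capital, lagged investment, and an
            # i.i.d. shock that is independent of the innovation in omega_t.
            k[:, t] = (0.4 * k[:, t - 1] + 0.4 * inv[:, t - 1]
                       + rng.normal(0.0, 1.0, n_firms))
        inv[:, t] = 0.5 * k[:, t] + omega[:, t]  # strictly increasing in omega
    # Labor: function of (k, omega) plus an i.i.d. labor price shock.
    l = 0.5 * k + omega + rng.normal(0.0, 0.5, k.shape)
    y = b0 + b1 * k + b2 * l + omega + rng.normal(0.0, 0.5, k.shape)
    keep = slice(burn, total)
    return y[:, keep], k[:, keep], l[:, keep], inv[:, keep]


def first_stage(y, k, l, inv):
    """OLS of y on l and a second-order polynomial in (k, i)."""
    yv, kv, lv, iv = (a.ravel() for a in (y, k, l, inv))
    poly = np.column_stack([np.ones_like(kv), kv, iv, kv**2, iv**2, kv * iv])
    coef, *_ = np.linalg.lstsq(np.column_stack([lv, poly]), yv, rcond=None)
    beta2_hat = coef[0]
    phi_hat = (poly @ coef[1:]).reshape(y.shape)  # estimate of b0 + b1*k + omega
    return beta2_hat, phi_hat


def moment(b1, phi_hat, k):
    """Sample analogue of E[xi_it * k_it] at a candidate beta_1."""
    omega_hat = phi_hat - b1 * k                 # implied omega (plus constant)
    cur = omega_hat[:, 1:].ravel()
    lag = omega_hat[:, :-1].ravel()
    G = np.column_stack([np.ones_like(lag), lag, lag**2, lag**3])
    coef, *_ = np.linalg.lstsq(G, cur, rcond=None)
    xi_hat = cur - G @ coef                      # implied innovations
    return np.mean(xi_hat * k[:, 1:].ravel())


def second_stage(phi_hat, k, grid=np.linspace(0.0, 1.0, 201)):
    """Grid search for the beta_1 that sets the sample moment closest to zero."""
    moments = np.array([moment(b, phi_hat, k) for b in grid])
    return grid[np.argmin(np.abs(moments))]


if __name__ == "__main__":
    y, k, l, inv = simulate_panel()
    beta2_hat, phi_hat = first_stage(y, k, l, inv)
    print("beta2_hat:", beta2_hat, "beta1_hat:", second_stage(phi_hat, k))
```

In practice one would replace the grid with a proper optimizer and bootstrap the two steps for standard errors; the grid is used here only to keep the sketch short and transparent.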

Notes

– 1) Recap of key assumptions:
    * First order Markov assumption on $\omega_{it}$ (again can be relaxed to higher order (but Markov)) - note, for example, that the sum of two Markov processes is not generally first order Markov (e.g. sum of two AR(1) processes with different AR coefficients)
    * Timing assumptions on when inputs are chosen and information set assumptions regarding when the firm observes $\omega_{it}$ (this can be strengthened or relaxed - see Ackerberg (2016))
    * Strict monotonicity of investment demand in $\omega_{it}$ (can be relaxed to weak monotonicity - see below)
    * Scalar unobservable in investment demand (tough to relax, though one can allow other observables to enter investment demand, e.g. input prices)
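– 2) Alternative formulation of the second stage (more like OP paper)

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \epsilon_{it} \tag{7}$$
$$y_{it} - \beta_1 k_{it} - \beta_2 l_{it} = \beta_0 + g(\omega_{it-1}) + \xi_{it} + \epsilon_{it} \tag{8}$$
$$y_{it} - \beta_1 k_{it} - \hat{\beta}_2 l_{it} = \beta_0 + g(\hat{\Phi}_{it-1} - \beta_0 - \beta_1 k_{it-1}) + \xi_{it} + \epsilon_{it} \tag{9}$$
$$y_{it} - \beta_1 k_{it} - \hat{\beta}_2 l_{it} = g(\hat{\Phi}_{it-1} - \beta_1 k_{it-1}) + \xi_{it} + \epsilon_{it} \tag{10}$$

(in the last line the constants $\beta_0$ are absorbed into the non-parametric function $g$). So given a guess of $\beta_1$, one can regress $y_{it} - \beta_1 k_{it} - \hat{\beta}_2 l_{it}$ on a polynomial in $(\hat{\Phi}_{it-1} - \beta_1 k_{it-1})$ to recover implied $\xi_{it} + \epsilon_{it}$'s, i.e. $\widehat{\xi_{it} + \epsilon_{it}}(\beta_1)$, and then use the moment condition

$$E[(\xi_{it} + \epsilon_{it}) k_{it}] = 0$$

and sample analogue

$$\frac{1}{N}\frac{1}{T}\sum_t \sum_i \widehat{\xi_{it} + \epsilon_{it}}(\beta_1)\, k_{it} = 0$$

to estimate $\beta_1$.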
– 3) There are other formulations as well. For example, Wooldridge (2009, EcLet) suggests estimating both first stage and second stage simultaneously. This has two potential advantages: 1) efficiency (though this is not always the case, see, e.g. Ackerberg, Chen, Hahn, and Liao (2014, ReStud)), and 2) it makes it easier to compute standard errors (with two-step procedure, it is typically easiest to bootstrap). On the other hand, a disadvantage is that it requires a non-linear search over a larger set of parameters ($\beta_1$ plus the parameters of $g$ and $f_t^{-1}$), whereas the above two step formulations only require a non-linear search for $\beta_1$ (or $\beta_1$ and $g$)
– 4) Note that there are additional moments generated by the model. The assumptions of the model imply that $E[\xi_{it} \mid I_{it-1}] = 0$. This means that the implied $\xi_{it}$'s should not only be uncorrelated with $k_{it}$, but everything else in $I_{it-1}$, e.g. $k_{it-1}$, $k_{it-2}$, $l_{it-1}$, $k_{it}^2$, ... (though not $l_{it}$). These additional moments can potentially add efficiency, but also result in an overidentified model, which can lead to small sample bias. The extent to which one utilizes these additional moments is typically a matter of taste.

– 5) Intuitive description of identification
    * First stage: Compare output of firms with same $i_{it}$ and $k_{it}$ (which imply the same $\omega_{it}$), but different $l_{it}$. This variation in $l_{it}$ is uncorrelated with the remaining unobservables determining $y_{it}$ ($\epsilon_{it}$), and so it identifies the labor coefficient. (But again, see ACF section below)
    * Second stage: Compare output of firms with same $\omega_{it-1}$, but different $k_{it}$'s (note that firms can have the same $\omega_{it-1}$, but different $i_{it-1}$ and $k_{it-1}$).
$$y_{it} - \hat{\beta}_2 l_{it} = \beta_0 + \beta_1 k_{it} + g(\omega_{it-1}) + \xi_{it} + \epsilon_{it}$$
$$\phantom{y_{it} - \hat{\beta}_2 l_{it}} = \beta_0 + \beta_1 k_{it} + g(\hat{\Phi}_{it-1} - \beta_0 - \beta_1 k_{it-1}) + \xi_{it} + \epsilon_{it}$$
This variation in $k_{it}$ is uncorrelated with the remaining unobservables determining $y_{it}$ ($\xi_{it}$ and $\epsilon_{it}$), so it identifies the capital coefficient (However, note that the "comparison of firms with same $\omega_{it-1}$" depends on the parameters themselves, so this is not completely transparent intuition)
– 6) OP also deal with a selection problem due to the fact that unproductive firms may exit the market. The problem is that even if
$$E[\xi_{it} \mid k_{it}] = 0$$
in the entire population of firms,
$$E[\xi_{it} \mid k_{it}, \text{still in sample at } t]$$
may not equal 0 and may be a function of $k_{it}$. Specifically, if a firm's exit decision at $t$ depends on $\omega_{it}$ (and thus $\xi_{it}$), then this second expectation is likely > 0 and depends negatively on $k_{it}$ (since firms with higher $k_{it}$'s may be more apt to stay in the market for a given $\omega_{it}$ or $\xi_{it}$). OP develop a selection correction to correct for this, which I don't think I will go through (see ABBP for discussion). On the other hand, if exit decisions at $t$ are made at time $t-1$ (a timing assumption like that already being made on capital), then there is no selection problem, since in this case the exit decision is just a function of $I_{it-1}$.

6 Levinsohn and Petrin (2003, ReStud)


Levinsohn and Petrin worry about the assumption that investment is strictly monotonic in $\omega_{it}$. Intuitively, this assumption implies that any two firms with the same $k_{it}$ and $i_{it}$ must have the same $\omega_{it}$.

But in many datasets, especially in developing countries, $i_{it}$ is often 0 (e.g. in LP's Chilean dataset, approximately 50% of observations have 0 investment)

It seems like a strong assumption that all these firms have the same $\omega_{it}$ (given $k_{it}$). It seems more likely that there is some threshold $\omega_{it}$ below which firms invest 0.

One can extend OP to allow weak monotonicity for the observations where $i_{it} = 0$, but this requires discarding these observations from the analysis (Aside: in this case, there is no selection issue as long as one uses the second stage moment $E[(\xi_{it} + \epsilon_{it}) k_{it}] = 0$ rather than $E[\xi_{it} k_{it}] = 0$ (see Gandhi, Navarro and Rivers (GNR, 2015)). This is because one cannot compute implied $\xi_{it}$'s for observations for which $i_{it} = 0$ (but one can compute implied $(\xi_{it} + \epsilon_{it})$ for these observations))

Anyway, given these problems with 0 investment and an unwillingness to throw away data, the basic idea of LP is to use a different "control" variable to learn about $\omega_{it}$, one that is more likely to be strictly monotonic in $\omega_{it}$. They use an intermediate input, e.g. inputs like materials, fuel, or electricity. These types of inputs rarely take the value 0.

Production Function:

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \beta_3 m_{it} + \omega_{it} + \epsilon_{it} \tag{11}$$

where $m_{it}$ is an intermediate input. $m_{it}$ is assumed to be a variable, non-dynamic input, like labor.

Consider a firm's optimal choice of $m_{it}$. Like investment in OP, $m_{it}$ will be chosen as a function of the state variables $k_{it}$ and $\omega_{it}$, i.e.

$$m_{it} = f_t(k_{it}, \omega_{it}) \tag{12}$$

Assuming strict monotonicity, this can be inverted and substituted into the production function
$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \beta_3 m_{it} + f_t^{-1}(k_{it}, m_{it}) + \epsilon_{it} \tag{13}$$

The rest follows exactly as in OP

– Estimate $\hat{\beta}_2$ in first stage ($\beta_1$ and $\beta_3$ cannot be identified because $k_{it}$ and $m_{it}$ also enter $f_t^{-1}$)
– Estimate $\hat{\beta}_1$ and $\hat{\beta}_3$ in second stage (Need additional moment here to identify the second parameter. LP use $E[(\xi_{it} + \epsilon_{it}) m_{it-1}] = 0$ or $E[\xi_{it} m_{it-1}] = 0$, though see Bond and Soderbom (2005) and GNR)

7 Ackerberg, Caves, and Frazer (2015)


7.1 Critique
This paper examines the first stage of LP and OP

Our question: Under what conditions is the labor coefficient $\beta_2$ identified in the first stage?

The LP first stage regresses $y_{it}$ on $l_{it}$ and a non-parametric function of $k_{it}$ and $m_{it}$ (call this non-parametric function $np(k_{it}, m_{it})$)

$$y_{it} = \beta_2 l_{it} + np(k_{it}, m_{it}) + \epsilon_{it} \tag{14}$$

There is no endogeneity problem here (since $\epsilon_{it}$ is assumed uncorrelated with everything). But our question is whether the labor input $l_{it}$ moves around independently of $np(k_{it}, m_{it})$. In other words, can two firms with the same $k_{it}$ and $m_{it}$ have different $l_{it}$? To analyze this, we need to think about a model of how firms choose $l_{it}$

Most natural model seems as follows. Since $m_{it}$ and $l_{it}$ are both non-dynamic, variable inputs, and since LP have already assumed that

$$m_{it} = f_t(k_{it}, \omega_{it}) \tag{15}$$

it seems logical to treat $l_{it}$ symmetrically and assume

$$l_{it} = h_t(k_{it}, \omega_{it}) \tag{16}$$

Of course, these will be different functions.

If this is the case, then note that

$$l_{it} = h_t(k_{it}, \omega_{it}) = h_t(k_{it}, f_t^{-1}(k_{it}, m_{it})) = \tilde{h}_t(k_{it}, m_{it})$$

The last equality implies that $l_{it}$ is a deterministic function of $k_{it}$ and $m_{it}$

But this is a problem for the first stage estimating equation

$$y_{it} = \beta_2 l_{it} + np(k_{it}, m_{it}) + \epsilon_{it} \tag{17}$$

since it implies that $l_{it}$ is functionally dependent on ("collinear" with) $np(k_{it}, m_{it})$, i.e. $l_{it}$ doesn't move independently of $np(k_{it}, m_{it})$.

Another way of saying this is as follows: LP want to condition on $k_{it}$ and $m_{it}$ (i.e. condition on $\omega_{it}$) and look at remaining variation in $l_{it}$ to identify $\beta_2$. But according to the above, $l_{it}$ is a deterministic function of $k_{it}$ and $m_{it}$. Hence, there is no remaining variation in $l_{it}$ once we condition on $k_{it}$ and $m_{it}$!

Can also think about both $\tilde{h}_t(k_{it}, m_{it})$ and $np(k_{it}, m_{it})$ being polynomials.

So if (16) is correct, then $\beta_2$ should not be identified in the first stage. If OLS does in fact produce an estimate $\hat{\beta}_2$, then some assumption of the model must be incorrect.
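
The functional dependence can be seen numerically in a stylized case. The sketch below assumes, purely for illustration, that both input demand functions are linear in $(k_{it}, \omega_{it})$ with made-up coefficients and that $np(\cdot)$ is a first-order polynomial; the function name and the optional noise term in labor are hypothetical. With no noise, the first-stage regressor matrix is rank deficient, so OLS cannot separate $\beta_2$ from $np(k_{it}, m_{it})$; adding independent noise to labor (a departure from (16)) restores full rank.

```python
import numpy as np


def first_stage_rank(labor_noise_sd, n=5000, seed=0):
    """Rank of the regressor matrix [l, 1, k, m] in a stylized LP first stage."""
    rng = np.random.default_rng(seed)
    k = rng.normal(0.0, 1.0, n)
    omega = rng.normal(0.0, 1.0, n)
    m = 0.4 * k + 1.2 * omega                    # m = f(k, omega), invertible in omega
    noise = labor_noise_sd * rng.standard_normal(n)
    l = 0.6 * k + 0.9 * omega + noise            # l = h(k, omega) (+ optional noise)
    X = np.column_stack([l, np.ones(n), k, m])
    return np.linalg.matrix_rank(X)


if __name__ == "__main__":
    print("labor a deterministic function of (k, m):", first_stage_rank(0.0))
    print("labor with independent noise:            ", first_stage_rank(0.3))
```

With four columns, a rank of 3 means the labor column is an exact linear combination of the constant, $k_{it}$, and $m_{it}$ (here $l_{it} = 0.3 k_{it} + 0.75 m_{it}$), which is the linear version of the collinearity described above.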

To get the first stage of LP to work, we need to find something that moves around $l_{it}$ independently of $np(k_{it}, m_{it})$. Unfortunately, this is hard to do within the context of LP's other maintained assumptions. For example, suppose one assumes that there is some firm-specific unobserved shock to the price of labor, $v_{it}$. This will clearly affect firms' optimal labor choices, i.e.
$$l_{it} = h_t(k_{it}, \omega_{it}, v_{it})$$
The problem is that this will also generally affect the firms' optimal choice of materials

$$m_{it} = f_t(k_{it}, \omega_{it}, v_{it}) \tag{18}$$

which then violates the scalar unobservable assumption necessary to invert $f_t$ and write $\omega_{it}$ as a function of observables.

Our paper thinks about various alternative models of $l_{it}$ (i.e. various data-generating processes (DGPs)) that might "break" this functional dependence problem. We can only come up with 2 such DGPs, and neither seems all that general.

1) Suppose there is "optimization" error in lit , i.e.

lit = ht (kit ; ! it ) + uit

where uit is independent of (kit ; ! it ). In other words i.e. for some exogenous reason …rms
do not get the optimal choice of labor correct. This breaks the functional dependence prob-
lem (and does not seem completely unreasonable). On the other hand, one simultaneously
needs to assume that there is 0 optimization error in mit (otherwise, the scalar unobservable
assumption is violated). It seems challenging to argue that there is a signi…cant amount of
optimization error in lit , but almost no optimization error in mit . (One example might be if
one’s data measures "planned" or ordered materials, but actual labor (e.g. subject to sick
days or unexpected quits))

2) Suppose that mit is chosen at some point in time prior to lit , and that:

– a) The …rm knows ! it when choosing mit


– b) Between these two points in time, there is a shock to the price of labor, vit ; that varies
across …rms.
– c) vit is independent across time (and other variables in the model)

This second DGP also allows the labor coefficient to be identified in the first stage, because the
shock $v_{it}$ moves $l_{it}$ around conditional on $m_{it}$ and $k_{it}$. Note that $v_{it}$ needs to be independent
across time; otherwise the choice of $m_{it}$ at $t$ will optimally depend on $v_{it-1}$, violating the
scalar unobservable assumption needed for invertibility.

Again, this DGP does not seem very general. Why is $m_{it}$ chosen before $l_{it}$ (if anything, I
would tend to think the reverse)? And it seems like a stretch to assume that there are no
unobserved firm-specific input price shocks except for this very special $v_{it}$ shock that must be
realized between these two points in time.
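
The functional dependence problem, and the way optimization error in labor (the first DGP above) breaks it, can be illustrated with a small simulation. This is only a sketch: the linear forms chosen for $h_t$ and $f_t$, all coefficient values, the noise scale, and the helper `residual_share` are hypothetical and are not taken from LP or ACF.

```python
import numpy as np

def residual_share(l, k, m):
    """Share of the variance of l left over after regressing l on a
    quadratic polynomial in (k, m), a crude stand-in for a non-parametric fit."""
    X = np.column_stack([np.ones_like(k), k, m, k * m, k**2, m**2])
    coef, *_ = np.linalg.lstsq(X, l, rcond=None)
    resid = l - X @ coef
    return resid.var() / l.var()

rng = np.random.default_rng(0)
n = 5000
k = rng.normal(size=n)          # capital
omega = rng.normal(size=n)      # scalar unobserved productivity

# Hypothetical linear input demands: both depend only on (k, omega)
m = 0.5 * k + 1.0 * omega               # materials, invertible in omega
l_exact = 0.3 * k + 0.8 * omega         # labor chosen with no optimization error
l_noisy = l_exact + rng.normal(scale=0.5, size=n)  # labor with optimization error u

# Without u, labor is an exact function of (k, m): nothing is left to identify
# the labor coefficient once we condition on (k, m).
share_exact = residual_share(l_exact, k, m)
# With u, labor still varies conditional on (k, m).
share_noisy = residual_share(l_noisy, k, m)

print(f"residual share, no optimization error:   {share_exact:.2e}")
print(f"residual share, with optimization error: {share_noisy:.2f}")
```

In the first case labor is perfectly collinear with the function of $(k_{it}, m_{it})$ used in the first stage, which is the collinearity the critique points to; in the second case the independent error leaves usable variation.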

Notes:

1) Parametric treatment of the intermediate input demand function does not rescue the LP
first stage identification - see ACF for details (though unlike with $i_{it}$ it is not hard to do this,
and it likely adds efficiency)

2) The OP estimator is also affected by this critique. However, there is a 3rd DGP that
breaks the functional dependence problem. This involves $l_{it}$ being chosen with incomplete
knowledge of $\omega_{it}$, e.g. prior to the realization of $\omega_{it}$ - see ACF for details.

3) Bond and Soderbom (2005) make a related argument that criticizes the second stage
identification of $\beta_3$ (the coefficient on the intermediate input) in LP. The crux of
their argument is that under the assumptions of the LP model the moment condition
$E[(\xi_{it} + \varepsilon_{it}) m_{it-1}] = 0$ or $E[\xi_{it} m_{it-1}] = 0$ is not informative about the coefficient $\beta_3$. The
intuition is that under the assumptions of the LP model, $m_{it-1}$ is not correlated with $m_{it}$
conditional on $k_{it}$ and $\omega_{it-1}$; hence it is uninformative as an instrument. A serially
correlated, firm-specific, unobserved price shock to the intermediate input could generate such
correlation, but it violates the LP scalar unobservable assumption. More generally, Bond
and Soderbom show that without firm-specific input price variation, coefficients on perfectly
variable, non-dynamic inputs are not identified in Cobb-Douglas production functions. GNR
extend this argument to more general production functions, essentially showing that if a
perfectly variable, non-dynamic input is being used as a proxy/control variable in the context of
an OP/LP-like procedure, its effect on output cannot generally be identified using the above
moments (and they argue that FOC approaches are needed for these inputs, see below).

7.2 Alternative estimator suggested by ACF


Given that we feel these identification arguments rely on very particular data generating
processes, we suggest a slightly different approach where we abandon trying to identify the
labor coefficient in the first stage. Instead, we try to identify $\beta_2$ along with the capital
coefficient in the second stage.

To do this, assume that $m_{it}$ is chosen either at the same time as or after $l_{it}$ is chosen. This
implies we can write

$$m_{it} = f_t(k_{it}, \omega_{it}, l_{it}) \tag{19}$$

This can be thought of as a "conditional" (on $l_{it}$) intermediate input demand function, in
contrast to LP's "unconditional" (on $l_{it}$) intermediate input demand function:

Unconditional interm. input demand (LP): $m_{it} = f_t(k_{it}, \omega_{it})$

vs. Conditional interm. input demand (ACF): $m_{it} = f_t(k_{it}, \omega_{it}, l_{it})$

Assuming strict monotonicity, this conditional intermediate input demand function can be
inverted and substituted into the production function to get

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + f_t^{-1}(k_{it}, m_{it}, l_{it}) + \varepsilon_{it} \tag{20}$$

Treating $f_t^{-1}$ non-parametrically, it is obvious that now not even $\beta_2$ can be identified in the
first stage. However, we can still identify the composite term:

$$\hat{\Phi}_{it} = \widehat{\beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it}}$$

(these are just the predicted values of $y_{it}$ from the regression)

Just like in OP, given guesses of $\beta_1$ and $\beta_2$, we can a) compute the implied $\hat{\omega}_{it}(\beta_1, \beta_2)$'s, then b)
regress the $\hat{\omega}_{it}(\beta_1, \beta_2)$'s on the $\hat{\omega}_{it-1}(\beta_1, \beta_2)$'s to obtain the implied $\hat{\xi}_{it}(\beta_1, \beta_2)$'s (the residuals
from the regression), and c) compute the sample moment

$$\frac{1}{N}\frac{1}{T}\sum_t \sum_i \hat{\xi}_{it}(\beta_1, \beta_2)\, k_{it} = 0$$

But since there is an additional parameter to estimate in the second stage, we need another
moment condition. We suggest

$$\frac{1}{N}\frac{1}{T}\sum_t \sum_i \hat{\xi}_{it}(\beta_1, \beta_2)\, l_{it-1} = 0$$

which should be approximately zero at the true $\beta_1$ and $\beta_2$ since $l_{it-1} \in I_{it-1}$ and hence
$E[\xi_{it} l_{it-1}] = 0$.

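So in summary, the second stage estimates $\hat{\beta}_1$ and $\hat{\beta}_2$ are defined by

$$\frac{1}{N}\frac{1}{T}\sum_t \sum_i \begin{pmatrix} \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2)\, k_{it} \\ \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2)\, l_{it-1} \end{pmatrix} = 0$$

Alternatively, if we are willing to assume that like capital, $l_{it}$ is "fixed" and decided at $t-1$
(and thus $l_{it} \in I_{it-1}$), we could use

$$\frac{1}{N}\frac{1}{T}\sum_t \sum_i \begin{pmatrix} \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2)\, k_{it} \\ \hat{\xi}_{it}(\hat{\beta}_1, \hat{\beta}_2)\, l_{it} \end{pmatrix} = 0$$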

One can see how various other timing/information assumptions would determine the different
moment conditions here (e.g. Ackerberg (2016)).
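
The second stage steps a) to c) can be sketched in code. This is a minimal illustration, not the ACF implementation: it assumes the first stage composite term is recovered exactly, approximates the regression of implied productivity on its lag with a linear fit (the text allows a general first order Markov process), and uses a made-up data generating process. The function name `acf_moments` and all parameter values are hypothetical.

```python
import numpy as np

def acf_moments(beta1, beta2, phi, phi_lag, k, k_lag, l, l_lag):
    """Sample moments mean(xi * k_t) and mean(xi * l_{t-1}) at a guess (beta1, beta2)."""
    # a) implied productivity (up to the constant beta0) at the guess
    omega = phi - beta1 * k - beta2 * l
    omega_lag = phi_lag - beta1 * k_lag - beta2 * l_lag
    # b) regress omega_t on omega_{t-1}; the linear fit is a simplifying assumption
    X = np.column_stack([np.ones_like(omega_lag), omega_lag])
    coef, *_ = np.linalg.lstsq(X, omega, rcond=None)
    xi = omega - X @ coef                      # implied innovations
    # c) the two sample moment conditions
    return np.array([np.mean(xi * k), np.mean(xi * l_lag)])

rng = np.random.default_rng(1)
n = 20000
b0, b1, b2, rho = 1.0, 0.4, 0.6, 0.7           # hypothetical true values

omega_lag = rng.normal(size=n)
xi_true = rng.normal(scale=0.3, size=n)
omega = rho * omega_lag + xi_true
k_lag = rng.normal(size=n)
# k_t is decided at t-1, so it depends on omega_{t-1} but not on xi_t
k = 0.8 * k_lag + 0.3 * omega_lag + rng.normal(scale=0.3, size=n)
# labor responds to current productivity plus an idiosyncratic shock
l_lag = 0.5 * omega_lag + 0.2 * k_lag + rng.normal(scale=0.5, size=n)
l = 0.5 * omega + 0.2 * k + rng.normal(scale=0.5, size=n)

# Pretend the first stage recovered the composite term exactly
phi = b0 + b1 * k + b2 * l + omega
phi_lag = b0 + b1 * k_lag + b2 * l_lag + omega_lag

m_true = acf_moments(b1, b2, phi, phi_lag, k, k_lag, l, l_lag)
m_wrong = acf_moments(b1, 0.2, phi, phi_lag, k, k_lag, l, l_lag)
print("moments at true parameters: ", m_true)
print("moments at a wrong beta2:   ", m_wrong)
```

In practice one would minimize a quadratic form in these moments over $(\beta_1, \beta_2)$; the sketch only evaluates them at two parameter guesses to show that they are close to zero at the truth and not at a wrong value.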

Notes:

1) Even though the first stage does not identify any parameters, it is still crucial in that it
"separates" $\omega_{it}$ from $\varepsilon_{it}$.

2) The procedure does not rely on labor being a non-dynamic input, i.e. labor choices could
have dynamic implications (e.g. hiring or firing costs).

3) The procedure allows firm-specific, serially correlated, unobserved shocks to the price of
labor (as well as to capital costs). We cannot allow such shocks to the price of intermediate
inputs (it would violate the scalar unobservable assumption necessary for the inversion), but
in many cases intermediate inputs are commodities where we would expect very little price
variation across firms.

– OP - rules out serially correlated, unobserved, firm-specific shocks to all input prices
($i_{it}$, $l_{it}$, $m_{it}$) (note: can allow non-serially correlated shocks to prices of $l_{it}$ and $m_{it}$)
– LP - allows serially correlated, unobserved, firm-specific input price shocks to $i_{it}$, but
not to ($l_{it}$, $m_{it}$)
– ACF (with intermediate input proxy) - allows serially correlated, unobserved, firm-specific
input price shocks to $i_{it}$ and $l_{it}$, but not to $m_{it}$

4) The Bond and Soderbom (2005) and GNR arguments imply that we actually need some degree
of 2) or 3) for identification of the labor coefficient.

5) Can use $i_{it}$ rather than $m_{it}$ as the proxy variable in the ACF procedure, but lose the ability to
allow serially correlated, unobserved, firm-specific input price shocks to $i_{it}$ and $l_{it}$.

6) Can also overidentify the model by adding further lags of inputs as instruments.

7) Also note that the production function (20) does not include $m_{it}$. This is because the
Bond-Soderbom (and GNR) arguments imply that we cannot use $m_{it-1}$ as an instrument to identify
the coefficient on $m_{it}$. Not including $m_{it}$ is implicitly using a value-added production function
(i.e. $y_{it}$ is deflated revenues minus deflated costs of intermediate inputs). One structural
interpretation of this is that the production function is Leontief in the intermediate input
(e.g. materials), i.e.

$$Y_{it} = \min\left\{ \beta_0 K_{it}^{\beta_1} L_{it}^{\beta_2} e^{\omega_{it}},\ \beta_3 M_{it} \right\} e^{\varepsilon_{it}}$$

which implies that

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \varepsilon_{it} \tag{21}$$

An alternative would be to follow Appendix B of Levinsohn and Petrin, or Gandhi, Navarro
and Rivers (2015) and use a first order condition to obtain the coefficient on the intermediate
input.

8) Less parametric generalizations

– Ackerberg and Hahn (2015) show that in this "value added model"

$$Y_{it} = \min\left\{ F(K_{it}, L_{it}, \omega_{it}),\ \beta_3 M_{it} \right\} e^{\varepsilon_{it}}$$

$$y_{it} = \min\left\{ f(k_{it}, l_{it}, \omega_{it}),\ \ln(\beta_3) + m_{it} \right\} + \varepsilon_{it}$$

if $f$ is strictly monotonic in the scalar Markov process $\omega_{it}$, it can be fully non-parametrically
identified. We formally consider the generic model $y_{it} = f(x_{it}, \omega_{it})$ and show conditions
on timing and information sets under which $f$ is non-parametrically identified. We
describe the result as showing how the timing/information set assumptions crucial to
OP have power in a non-parametric context. Note that these assumptions are starting
to be used in other literatures, e.g. demand with endogenous product characteristics.
– Gandhi, Navarro and Rivers (2015) extend the first order condition approach to
estimating the effect of the proxy variable ($m_{it}$) to a non-parametric setting and show that
in

$$y_{it} = f(k_{it}, l_{it}, m_{it}) + \omega_{it} + \varepsilon_{it}$$

$f$ is non-parametrically identified. I would describe this approach as combining the
timing/information set assumption identification approach (for $k_{it}$ and $l_{it}$) with the first
order condition identification approach (for $m_{it}$)

9) More generally, one can mix-and-match the different identification strategies (for different
inputs), i.e.

– timing/information set assumptions (though as detailed above, this does not work for a
non-dynamic, variable input that is being used to proxy for unobserved productivity).
– first order conditions (at least for static inputs)
– observed firm-specific input price shocks as instruments

8 Dynamic Panel Approaches


These are econometric procedures (Arellano and Bond (1991, ReStud), Arellano and Bover
(1995, JoE), Blundell and Bond (1998, JoE; 2000, ER), Arellano and Honore (2001,
Handbook)) that generalize the fixed effects model to allow $\omega_{it}$ to vary across time. These have
been used in many different applied contexts, including production functions (e.g. Blundell and
Bond 2000, and empirical work by John Van Reenen, Nick Bloom and coauthors).

"Dynamic Panel" is somewhat of a misnomer in the context that I am using these methodolo-
gies, as there is no lagged dependent r.h.s. variable. Many of these methods were developed
in that context.

I will focus on one very simple example, to try to highlight the similarities and differences
between this literature and the literature stemming from OP.

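Production function

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \varepsilon_{it} \tag{22}$$

where

$$\omega_{it} = \rho\, \omega_{it-1} + \xi_{it}$$

Suppose that $\varepsilon_{it}$ satisfies strict exogeneity, i.e. the $\varepsilon_{it}$'s are uncorrelated with all input choices.

Suppose that $\omega_{it}$ is not observed until $t$, that $k_{it}$ is chosen at $t-1$, and that $l_{it}$ is chosen at $t$.

These assumptions are analogous to the timing/information set assumptions made in OP,
and imply the following orthogonality conditions

$$E[\varepsilon_{it} k_{is}] = E[\varepsilon_{it} l_{is}] = 0 \quad \forall t, s$$

$$E[\xi_{it} k_{is}] = 0 \quad \text{for } s \le t$$

$$E[\xi_{it} l_{is}] = 0 \quad \text{for } s < t$$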

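Consider "$\rho$-differencing" the production function, i.e.

$$y_{it} - \rho y_{it-1} = (1-\rho)\beta_0 + \beta_1 (k_{it} - \rho k_{it-1}) + \beta_2 (l_{it} - \rho l_{it-1}) + \xi_{it} + (\varepsilon_{it} - \rho \varepsilon_{it-1})$$

or

$$y_{it} = \rho y_{it-1} + (1-\rho)\beta_0 + \beta_1 (k_{it} - \rho k_{it-1}) + \beta_2 (l_{it} - \rho l_{it-1}) + \xi_{it} + (\varepsilon_{it} - \rho \varepsilon_{it-1})$$

Now, given a guess of the parameters $(\rho, \beta_0, \beta_1, \beta_2)$ one can compute the implied values of
the term $\xi_{it} + (\varepsilon_{it} - \rho \varepsilon_{it-1})$, i.e.

$$\widehat{\xi_{it} + (\varepsilon_{it} - \rho \varepsilon_{it-1})}(\rho, \beta_0, \beta_1, \beta_2) = y_{it} - \rho y_{it-1} - (1-\rho)\beta_0 - \beta_1 (k_{it} - \rho k_{it-1}) - \beta_2 (l_{it} - \rho l_{it-1})$$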

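Estimation can then proceed by, e.g., setting the sample moment

$$\frac{1}{N}\frac{1}{T}\sum_i \sum_t \left( \widehat{\xi_{it} + (\varepsilon_{it} - \rho \varepsilon_{it-1})}(\rho, \beta_0, \beta_1, \beta_2) \begin{pmatrix} 1 \\ k_{it} \\ k_{it-1} \\ l_{it-1} \end{pmatrix} \right) = 0$$

Again, there are actually many more potential moment conditions, since all values of $k$, $l$, and
$y$ prior to $(k_{it}, l_{it-1}, y_{it-2})$ are also uncorrelated with $\xi_{it} + (\varepsilon_{it} - \rho \varepsilon_{it-1})$.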
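
The $\rho$-differenced moment conditions can be sketched as follows. This is an illustrative simulation under assumed input rules, not an estimator from the dynamic panel literature; the function `rho_diff_moments`, the input decision rules, and all parameter values are hypothetical. Capital is set at $t-1$ and labor at $t$, matching the timing assumptions in the text.

```python
import numpy as np

def rho_diff_moments(theta, y, y_lag, k, k_lag, l, l_lag):
    """Sample moments of the rho-differenced residual against (1, k_t, k_{t-1}, l_{t-1})."""
    rho, b0, b1, b2 = theta
    # implied value of xi_t + (eps_t - rho * eps_{t-1}) at the guess theta
    resid = (y - rho * y_lag - (1 - rho) * b0
             - b1 * (k - rho * k_lag) - b2 * (l - rho * l_lag))
    Z = np.column_stack([np.ones_like(k), k, k_lag, l_lag])   # instruments
    return Z.T @ resid / len(resid)

rng = np.random.default_rng(2)
n = 50000
theta_true = (0.7, 1.0, 0.4, 0.6)              # hypothetical (rho, beta0, beta1, beta2)
rho, b0, b1, b2 = theta_true

omega_lag = rng.normal(size=n)
omega = rho * omega_lag + rng.normal(scale=0.3, size=n)       # AR(1) productivity
k_lag = rng.normal(size=n)
k = 0.8 * k_lag + 0.3 * omega_lag + rng.normal(scale=0.3, size=n)  # chosen at t-1
l_lag = 0.5 * omega_lag + 0.2 * k_lag + rng.normal(scale=0.5, size=n)
l = 0.5 * omega + 0.2 * k + rng.normal(scale=0.5, size=n)          # chosen at t
eps_lag = rng.normal(scale=0.2, size=n)        # strictly exogenous errors
eps = rng.normal(scale=0.2, size=n)

y_lag = b0 + b1 * k_lag + b2 * l_lag + omega_lag + eps_lag
y = b0 + b1 * k + b2 * l + omega + eps

g_true = rho_diff_moments(theta_true, y, y_lag, k, k_lag, l, l_lag)
# Ignoring the serial correlation (rho = 0) leaves omega_t in the residual,
# which is correlated with the inputs.
g_wrong = rho_diff_moments((0.0, b0, b1, b2), y, y_lag, k, k_lag, l, l_lag)
print("moments at true parameters:", g_true)
print("moments with rho set to 0: ", g_wrong)
```

With four parameters and four instruments the model is just identified; adding further lags as instruments, as the text notes, would overidentify it.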

Note:

– There are implicit assumptions here about what makes lags strong instruments. This
depends on dynamic issues (e.g. adjustment costs) and issues regarding serial correlation
in input prices. See Blundell and Bond (1999) and Bond and Soderbom (2005)
for discussion of when lags are strong instruments and when they are not.

– One can extend the model to allow for an additional unobservable that is fixed across
time, i.e.

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \alpha_i + \omega_{it} + \varepsilon_{it} \tag{23}$$

where $\alpha_i$ is allowed to be correlated with all input choices. This model requires "double
differencing", i.e.

$$(y_{it} - \rho y_{it-1}) - (y_{it-1} - \rho y_{it-2}) = \beta_1 \left[ (k_{it} - \rho k_{it-1}) - (k_{it-1} - \rho k_{it-2}) \right] + \beta_2 \left[ (l_{it} - \rho l_{it-1}) - (l_{it-1} - \rho l_{it-2}) \right]$$
$$\qquad + \xi_{it} - \xi_{it-1} + (\varepsilon_{it} - \rho \varepsilon_{it-1}) - (\varepsilon_{it-1} - \rho \varepsilon_{it-2})$$

Again, one can appropriately lag $k$, $l$, and $y$ to find valid moments. One problem is
that double differencing can be demanding on the data, and estimates can be imprecise.
Arellano and Bover (1995) and Blundell and Bond (1998, 2000) suggest some additional
moment conditions based on stationarity assumptions that can help here, though these
assumptions may be strong.
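
As a check on the double differencing algebra, the following sketch simulates three periods of data with a large firm fixed effect and verifies that the double-differenced residual at the true parameters contains only the innovation and error terms. The helper names and all numerical values are hypothetical; inputs are drawn at random here because the identity holds for any input values.

```python
import numpy as np

def double_rho_diff(x, rho):
    """x has shape (n_firms, 3) with columns for periods t-2, t-1, t."""
    return (x[:, 2] - rho * x[:, 1]) - (x[:, 1] - rho * x[:, 0])

def double_diff_resid(rho, b1, b2, y, k, l):
    """Double-differenced residual; beta0 and the fixed effect drop out."""
    return (double_rho_diff(y, rho)
            - b1 * double_rho_diff(k, rho)
            - b2 * double_rho_diff(l, rho))

rng = np.random.default_rng(3)
n = 1000
rho, b0, b1, b2 = 0.7, 1.0, 0.4, 0.6           # hypothetical true values

alpha = rng.normal(scale=5.0, size=n)          # firm fixed effect
xi = rng.normal(scale=0.3, size=(n, 3))        # productivity innovations
eps = rng.normal(scale=0.2, size=(n, 3))       # transitory errors
omega = np.empty((n, 3))
omega[:, 0] = rng.normal(size=n)
for t in (1, 2):
    omega[:, t] = rho * omega[:, t - 1] + xi[:, t]

k = rng.normal(size=(n, 3))
l = rng.normal(size=(n, 3))
y = b0 + b1 * k + b2 * l + alpha[:, None] + omega + eps

resid = double_diff_resid(rho, b1, b2, y, k, l)
# xi_t - xi_{t-1} + (eps_t - rho*eps_{t-1}) - (eps_{t-1} - rho*eps_{t-2})
expected = (xi[:, 2] - xi[:, 1]) + double_rho_diff(eps, rho)
print(np.allclose(resid, expected))
```

The residual is a difference of several noise terms, which is one way to see why double differencing tends to produce imprecise estimates.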

So how do these dynamic panel approaches compare to the Olley-Pakes literature? I'd say
the main tradeoff is the following

– The dynamic panel approach does not require the scalar unobservable and strict
monotonicity assumptions that are required for the OP/LP/ACF inversions. So in the dynamic
panel literature, for example, one does not need to worry about unobserved firm-specific
input prices, nor other sorts of unobservables like optimization error.
– On the other hand, the dynamic panel approach requires that the serial correlation in
$\omega_{it}$ is linear, e.g. an AR or MA process. This is essential to being able to construct
usable moments. In contrast, the OP/LP/ACF literature can allow the productivity
shock to follow a completely general first order Markov process.
– The dynamic panel literature can also allow additional fixed effects, although precise
estimation appears to be challenging. The intermediate assumptions of Arellano-Bover and
Blundell-Bond may be helpful.

Given that theory may provide little guidance in choosing between these two sets of
assumptions (and because they are both strong), I would suggest trying both approaches.

9 Other Issues
1) Often observe firm revenues as the output measure, not physical quantities. As pointed out
by Klette and Griliches (1996, JAE), this can be problematic when firms operate in distinct
imperfectly competitive output markets (and one does not observe output price).

– Intuition: Suppose one observes that firms that (exogenously) use double the inputs of others
produce less than double the revenue of others. There are two explanations - 1) decreasing
returns to scale (but, e.g., perfect competition), 2) constant returns to scale with a
downward sloping demand curve.
– Even if one doesn't care about separating the above two effects, there may now be two
distinct sources of unobservables in the (revenue) production function, which can be
problematic for the proxy based approaches.

– See, e.g. Klette (1999, JIE), Foster, Haltiwanger, and Syverson (2007, AER), DeLoecker
(2011, Ecta), DeLoecker and Warzynski (2012, AER).
– Note that just observing quantities is not a complete panacea - these quantities need to be
equivalent across firms to be meaningful.
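
The intuition in 1) can be made concrete with a two-line calculation. The sketch below assumes an iso-elastic inverse demand curve $P = Q^{-1/\eta}$, which is an illustrative assumption rather than something stated in these notes; the function name and the numerical values are hypothetical. Under that assumption, decreasing returns with perfect competition and constant returns with market power can produce exactly the same revenue response to a doubling of inputs.

```python
import numpy as np

def log_revenue_change(input_scale, returns_to_scale, demand_elasticity=np.inf):
    """Change in log revenue when all inputs are scaled by `input_scale`.

    Assumes output scales as input_scale**returns_to_scale and an iso-elastic
    inverse demand P = Q**(-1/demand_elasticity); an infinite elasticity
    corresponds to a price-taking firm.
    """
    d_log_q = returns_to_scale * np.log(input_scale)
    return (1.0 - 1.0 / demand_elasticity) * d_log_q

# Explanation 1: decreasing returns to scale, perfect competition
drs_competitive = log_revenue_change(2.0, returns_to_scale=0.75)
# Explanation 2: constant returns to scale, downward sloping demand
crs_market_power = log_revenue_change(2.0, returns_to_scale=1.0, demand_elasticity=4.0)

print(drs_competitive, crs_market_power, np.log(2.0))
```

Both cases give a log revenue change below $\ln 2$, so revenue data alone cannot tell them apart without information on output prices or demand.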

2) Other types of information structures (e.g. Greenstreet (2007)).

3) Additional inputs (e.g. Griliches Knowledge Capital model, Doraszelski and Jaumandreu
(2014, ReStud))

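– Griliches Knowledge capital

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \beta_3 c_{it} + \omega_{it} + \varepsilon_{it} \tag{24}$$

where

$$c_{it} = \sum_{\tau=0}^{t} \delta_c^{\,t-\tau} d_{i\tau}$$

where $d_{i\tau}$ is the firm's chosen R&D expenditures in period $\tau$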


– Doraszelski and Jaumandreu (2014)

$$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \omega_{it} + \varepsilon_{it} \tag{25}$$

where

$$\omega_{it} = g(\omega_{it-1}, d_{it-1})$$

– Advantages DJ
  Doesn't have the initial ($t = 0$) R&D stock issue that Griliches has (assuming one uses
  OP/LP/ACF related methods to estimate)
  Explicitly has uncertainty in the contribution of R&D to productivity
– Disadvantages DJ
  Unobserved component of "productivity" and R&D component of "productivity"
  affect the future through a scalar - this is restrictive, e.g. in

$$\omega_{it} = \rho\, \omega_{it-1} + d_{it-1} + \xi_{it}$$

  $d_{it-1}$ and $\xi_{it}$ are forced to depreciate at the same rate. This is not the case in Griliches -
  i.e. $\delta_c$ can be different than $\rho$.
  Alternative model of physical capital stock

$$y_{it} = \beta_0 + \beta_2 l_{it} + \omega_{it} + \varepsilon_{it} \tag{26}$$

  where

$$\omega_{it} = g(\omega_{it-1}, d_{it-1}, i_{it-1})$$

– Note: both "endogenize productivity" if one thinks about defining productivity as $(\beta_3 c_{it} + \omega_{it})$
in the Griliches model
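
The knowledge capital stock in the Griliches model is a geometrically weighted sum of past R&D, which can be built recursively. The sketch below is illustrative only: the function name, the `retention` parameter (standing in for the per-period weight on past R&D in the sum above), and the R&D series are hypothetical, and the stock is computed in levels exactly as the sum is written.

```python
import numpy as np

def knowledge_capital(rd, retention):
    """Return c_t = sum_{tau <= t} retention**(t - tau) * rd[tau] for each t,
    using the equivalent recursion c_t = retention * c_{t-1} + rd[t]."""
    c = np.empty(len(rd))
    stock = 0.0   # R&D before the first observed period is treated as zero
    for t, d in enumerate(rd):
        stock = retention * stock + d
        c[t] = stock
    return c

rd = np.array([1.0, 2.0, 0.0, 4.0])            # hypothetical R&D expenditures
c = knowledge_capital(rd, retention=0.5)

# Direct evaluation of the sum for the last period, as a cross-check
t_last = len(rd) - 1
direct = sum(0.5 ** (t_last - tau) * rd[tau] for tau in range(len(rd)))
print(c, direct)
```

Starting the recursion at zero is where the initial R&D stock issue mentioned above enters: any R&D undertaken before the sample begins is unobserved, so the early values of the stock are understated unless an initial condition is assumed.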

4) Empirical questions -

– Determinants of productivity - deregulation (OP), trade openness, exporting - similar to
above
  generally best to include these factors as inputs in the production function
– Allocative efficiency - recently Hsieh and Klenow (2009, QJE), Asker, Collard-Wexler,
and DeLoecker (2014, JPE)
