0% found this document useful (0 votes)

44 views1,074 pages

Quantecon Python Advanced

The document is a comprehensive guide on Advanced Quantitative Economics using Python, authored by Thomas J. Sargent and John Stachurski. It covers various topics including orthogonal projections, continuous state Markov chains, reverse engineering, discrete state dynamic programming, and LQ control. Each section includes theoretical foundations, applications, and exercises to reinforce learning.

Uploaded by

Prakash Singh Koranga

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

44 views1,074 pages

Quantecon Python Advanced

Uploaded by

Prakash Singh Koranga

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 1074

Advanced Quantitative Economics with

Python

Thomas J. Sargent & John Stachurski

Feb 17, 2025

CONTENTS

I Tools and Techniques 3

1 Orthogonal Projections and Their Applications 5
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Key Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 The Orthogonal Projection Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Orthonormal Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Projection Via Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6 Least Squares Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.7 Orthogonalization and Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2 Continuous State Markov Chains 23

2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 The Density Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3 Beyond Densities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

3 Reverse Engineering a la Muth 45

3.1 Friedman (1956) and Muth (1960) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 A Process for Which Adaptive Expectations are Optimal . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3 Some Useful State-Space Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Estimates of Unobservables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.5 Relationship of Unobservables to Observables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.6 MA and AR Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4 Discrete State Dynamic Programming 53

4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 Discrete DPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3 Solving Discrete DPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.4 Example: A Growth Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.6 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.7 Appendix: Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

II LQ Control 75
5 Information and Consumption Smoothing 77

i
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.2 Two Representations of One Nonfinancial Income Process . . . . . . . . . . . . . . . . . . . . . . . 78
5.3 Application of Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.4 News Shocks and Less Informative Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.5 Representation of 𝜖𝑡 Shock in Terms of Future 𝑦𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.6 Representation in Terms of 𝑎𝑡 Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.7 Permanent Income Consumption-Smoothing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.8 State Space Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.9 Computations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.10 Simulating Income Process and Two Associated Shock Processes . . . . . . . . . . . . . . . . . . . . 89
5.11 Calculating Innovations in Another Way . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.12 Another Invertibility Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

6 Consumption Smoothing with Complete and Incomplete Markets 91

6.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.3 Linear State Space Version of Complete Markets Model . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4 Model 1 (Complete Markets) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.5 Model 2 (One-Period Risk-Free Debt Only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

7 Tax Smoothing with Complete and Incomplete Markets 105

7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2 Tax Smoothing with Complete Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.3 Returns on State-Contingent Debt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.4 More Finite Markov Chain Tax-Smoothing Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 115

8 Markov Jump Linear Quadratic Dynamic Programming 143

8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.2 Review of useful LQ dynamic programming formulas . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.3 Linked Riccati equations for Markov LQ dynamic programming . . . . . . . . . . . . . . . . . . . . 144
8.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.5 Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
8.6 Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.7 More examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

9 How to Pay for a War: Part 1 197

9.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.2 Public Finance Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.3 Barro (1979) Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
9.4 Python Class to Solve Markov Jump Linear Quadratic Control Problems . . . . . . . . . . . . . . . . 203
9.5 Barro Model with a Time-varying Interest Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

10 How to Pay for a War: Part 2 207

10.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
10.2 Two example specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
10.3 One- and Two-period Bonds but No Restructuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
10.4 Mapping into an LQ Markov Jump Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
10.5 Penalty on Different Issues Across Maturities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
10.6 A Model with Restructuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
10.7 Restructuring as a Markov Jump Linear Quadratic Control Problem . . . . . . . . . . . . . . . . . . . 216

11 How to Pay for a War: Part 3 221

11.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
11.2 Roll-Over Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
11.3 A Dead End . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

ii
11.4 Better Representation of Roll-Over Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

12 Optimal Taxation in an LQ Economy 227

12.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
12.2 The Ramsey Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
12.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
12.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
12.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

III Multiple Agent Models 249

13 Default Risk and Income Fluctuations 251
13.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
13.2 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
13.3 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
13.4 Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
13.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
13.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264

14 Globalization and Cycles 271

14.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
14.2 Key Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
14.3 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
14.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
14.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285

15 Coase’s Theory of the Firm 289

15.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
15.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
15.3 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
15.4 Existence, Uniqueness and Computation of Equilibria . . . . . . . . . . . . . . . . . . . . . . . . . . 294
15.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
15.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

16 Composite Sorting 303

16.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
16.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
16.3 Characterization of primal solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
16.4 Solving primal problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
16.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
16.6 Dual Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
16.7 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346

IV Dynamic Linear Economies 357

17 Recursive Models of Dynamic Linear Economies 359
17.1 A Suite of Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
17.2 Econometrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
17.3 Dynamic Demand Curves and Canonical Household Technologies . . . . . . . . . . . . . . . . . . . 378
17.4 Gorman Aggregation and Engel Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
17.5 Partial Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
17.6 Equilibrium Investment Under Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
17.7 A Rosen-Topel Housing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382

iii
17.8 Cattle Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
17.9 Models of Occupational Choice and Pay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
17.10 Permanent Income Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
17.11 Gorman Heterogeneous Households . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
17.12 Non-Gorman Heterogeneous Households . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386

18 Growth in Dynamic Linear Economies 389

18.1 Common Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
18.2 A Planning Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
18.3 Example Economies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391

19 Lucas Asset Pricing Using DLE 401

19.1 Asset Pricing Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
19.2 Asset Pricing Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402

20 IRFs in Hall Models 409

20.1 Example 1: Hall (1978) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
20.2 Example 2: Higher Adjustment Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
20.3 Example 3: Durable Consumption Goods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415

21 Permanent Income Model using the DLE Class 419

21.1 The Permanent Income Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419

22 Rosen Schooling Model 423

22.1 A One-Occupation Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
22.2 Mapping into HS2013 Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424

23 Cattle Cycles 429

23.1 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
23.2 Mapping into HS2013 Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430

24 Shock Non Invertibility 437

24.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
24.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
24.3 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440

V Risk, Model Uncertainty, and Robustness 443

25 Risk and Model Uncertainty 445
25.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
25.2 Basic objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
25.3 Five preference specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
25.4 Expected utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
25.5 Constraint preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
25.6 Multiplier preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
25.7 Risk-sensitive preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
25.8 Ex post Bayesian preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
25.9 Comparing preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
25.10 Risk aversion and misspecification aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
25.11 Indifference curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
25.12 State price deflators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
25.13 Iso-utility and iso-entropy curves and expansion paths . . . . . . . . . . . . . . . . . . . . . . . . . . 464
25.14 Bounds on expected utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
25.15 Why entropy? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467

iv
26 Etymology of Entropy 469
26.1 Information Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
26.2 A Measure of Unpredictability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
26.3 Mathematical Properties of Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
26.4 Conditional Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
26.5 Independence as Maximum Conditional Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
26.6 Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
26.7 Statistical Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
26.8 Continuous distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
26.9 Relative entropy and Gaussian distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
26.10 Von Neumann Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
26.11 Backus-Chernov-Zin Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
26.12 Wiener-Kolmogorov Prediction Error Formula as Entropy . . . . . . . . . . . . . . . . . . . . . . . . 474
26.13 Multivariate Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
26.14 Frequency Domain Robust Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
26.15 Relative Entropy for a Continuous Random Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . 476

27 Robustness 479
27.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
27.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
27.3 Constructing More Robust Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
27.4 Robustness as Outcome of a Two-Person Zero-Sum Game . . . . . . . . . . . . . . . . . . . . . . . 485
27.5 The Stochastic Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
27.6 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
27.7 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
27.8 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496

28 Robust Markov Perfect Equilibrium 499

28.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
28.2 Linear Markov Perfect Equilibria with Robust Agents . . . . . . . . . . . . . . . . . . . . . . . . . . 500
28.3 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503

VI Time Series Models 519

29 Covariance Stationary Processes 521
29.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521
29.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522
29.3 Spectral Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
29.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534

30 Estimation of Spectra 543

30.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
30.2 Periodograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
30.3 Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
30.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552

31 Additive and Multiplicative Functionals 559

31.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
31.2 A Particular Additive Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
31.3 Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
31.4 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
31.5 More About the Multiplicative Martingale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576

32 Classical Control with Linear Algebra 583

v
32.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583
32.2 A Control Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584
32.3 Finite Horizon Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
32.4 Infinite Horizon Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
32.5 Undiscounted Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
32.6 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
32.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601

33 Classical Prediction and Filtering With Linear Algebra 603

33.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
33.2 Finite Dimensional Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
33.3 Combined Finite Dimensional Control and Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . 615
33.4 Infinite Horizon Prediction and Filtering Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
33.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620

34 Knowing the Forecasts of Others 623

34.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
34.2 The Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
34.3 Tactics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
34.4 Equilibrium Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
34.5 Equilibrium with 𝜃𝑡 stochastic but observed at 𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
34.6 Guess-and-Verify Tactic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
34.7 Equilibrium with One Noisy Signal on 𝜃𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
34.8 Equilibrium with Two Noisy Signals on 𝜃𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
34.9 Key Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
34.10 An observed common shock benchmark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
34.11 Comparison of All Signal Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
34.12 Notes on History of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645

VII Asset Pricing and Finance 647

35 Asset Pricing II: The Lucas Asset Pricing Model 649
35.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
35.2 The Lucas Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
35.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656

36 Elementary Asset Pricing Theory 659

36.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
36.2 Key Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
36.3 Implications of Key Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
36.4 Expected Return - Beta Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661
36.5 Mean-Variance Frontier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
36.6 Sharpe Ratios and the Price of Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
36.7 Mathematical Structure of Frontier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
36.8 Multi-factor Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
36.9 Empirical Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667
36.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668

37 Two Modifications of Mean-Variance Portfolio Theory 673

37.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
37.2 Mean-Variance Portfolio Choice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
37.3 Estimating Mean and Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
37.4 Black-Litterman Starting Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
37.5 Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676

vi
37.6 Adding Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678
37.7 Bayesian Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 680
37.8 Curve Decolletage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
37.9 Black-Litterman Recommendation as Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . 685
37.10 A Robust Control Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687
37.11 A Robust Mean-Variance Portfolio Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 688
37.12 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
37.13 Special Case – IID Sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690
37.14 Dependence and Sampling Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690
37.15 Frequency and the Mean Estimator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 692

38 Irrelevance of Capital Structures with Complete Markets 697

38.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 697
38.2 Competitive equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 701
38.3 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707

39 Equilibrium Capital Structures with Incomplete Markets 717

39.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717
39.2 Asset Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 720
39.3 Equilibrium verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
39.4 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
39.5 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
39.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
39.7 A picture worth a thousand words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753

VIII Dynamic Programming Squared 755

40 Optimal Unemployment Insurance 757
40.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
40.2 Shavell and Weiss’s Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
40.3 Private Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760
40.4 Outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 767

41 Stackelberg Plans 771

41.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 771
41.2 Duopoly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 771
41.3 Stackelberg Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 774
41.4 Two Bellman Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 776
41.5 Stackelberg Plan for Duopoly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777
41.6 Recursive Representation of Stackelberg Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 779
41.7 Dynamic Programming and Time Consistency of Follower’s Problem . . . . . . . . . . . . . . . . . . 780
41.8 Computing Stackelberg Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 782
41.9 Time Series for Price and Quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 783
41.10 Time Inconsistency of Stackelberg Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785
41.11 Recursive Formulation of Follower’s Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 786
41.12 Markov Perfect Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
41.13 Comparing Markov Perfect Equilibrium and Stackelberg Outcome . . . . . . . . . . . . . . . . . . . 793

42 Machine Learning a Ramsey Plan 795

42.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 795
42.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796
42.3 Model Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796
42.4 Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 798
42.5 Approximation and Truncation parameter 𝑇 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 799

vii
42.6 A Gradient Descent Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 800
42.7 A More Structured ML Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 804
42.8 Continuation Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811
42.9 Adding Some Human Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 814
42.10 What has Machine Learning Taught Us? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 820

43 Time Inconsistency of Ramsey Plans 823

43.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 823
43.2 Model Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 824
43.3 Friedman’s Optimal Rate of Deflation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 825
43.4 Calvo’s Distortion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 826
43.5 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 827
43.6 Three Timing Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 828
43.7 Note on Dynamic Programming Squared . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 828
43.8 A Ramsey Planner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 829
43.9 Representation of Ramsey Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 831
43.10 Multiple roles of 𝜃𝑡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 832
43.11 Time inconsistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 832
43.12 Constrained-to-Constant-Growth-Rate Ramsey Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . 832
43.13 Markov Perfect Governments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 833
43.14 Outcomes under Three Timing Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 834
43.15 Ramsey Planner’s Value Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 839
43.16 Perturbing Model Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 841

44 Sustainable Plans for a Calvo Model 847

44.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
44.2 Model Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
44.3 Another Timing Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 848
44.4 Sustainable or Credible Plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 850
44.5 Whose Plan is It? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 857

45 Optimal Taxation with State-Contingent Debt 859

45.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 859
45.2 A Competitive Equilibrium with Distorting Taxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 860
45.3 Recursive Formulation of the Ramsey Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 871
45.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 878

46 Optimal Taxation without State-Contingent Debt 891

46.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 891
46.2 Competitive Equilibrium with Distorting Taxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 892
46.3 Recursive Version of AMSS Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 900
46.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 907

47 Fluctuating Interest Rates Deliver Fiscal Insurance 919

47.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 919
47.2 Forces at Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 920
47.3 Logical Flow of Lecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 921
47.4 Example Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 923
47.5 Reverse Engineering Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 933
47.6 Code for Reverse Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934
47.7 Short Simulation for Reverse-engineered: Initial Debt . . . . . . . . . . . . . . . . . . . . . . . . . . 935
47.8 Long Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944
47.9 BEGS Approximations of Limiting Debt and Convergence Rate . . . . . . . . . . . . . . . . . . . . . 946

48 Fiscal Risk and Government Debt 951

viii
48.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 951
48.2 The Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 952
48.3 Long Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 953
48.4 Asymptotic Mean and Rate of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 978

49 Competitive Equilibria of a Model of Chang 987

49.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 987
49.2 Decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 989
49.3 Competitive Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 991
49.4 Inventory of Objects in Play . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 992
49.5 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 993
49.6 Calculating all Promise-Value Pairs in CE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 996
49.7 Solving a Continuation Ramsey Planner’s Bellman Equation . . . . . . . . . . . . . . . . . . . . . . . 1012

50 Credible Government Policies in a Model of Chang 1021

50.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1021
50.2 The Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1022
50.3 Calculating the Set of Sustainable Promise-Value Pairs . . . . . . . . . . . . . . . . . . . . . . . . . 1028

IX Other 1045
51 Troubleshooting 1047
51.1 Fixing Your Local Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1047
51.2 Reporting an Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1048

52 References 1049

53 Execution Statistics 1051

Bibliography 1053

Proof Index 1059

Index 1061

ix
x
Advanced Quantitative Economics with Python

This website presents a set of advanced lectures on quantitative economic modeling.

• Tools and Techniques
– Orthogonal Projections and Their Applications
– Continuous State Markov Chains
– Reverse Engineering a la Muth
– Discrete State Dynamic Programming
• LQ Control
– Information and Consumption Smoothing
– Consumption Smoothing with Complete and Incomplete Markets
– Tax Smoothing with Complete and Incomplete Markets
– Markov Jump Linear Quadratic Dynamic Programming
– How to Pay for a War: Part 1
– How to Pay for a War: Part 2
– How to Pay for a War: Part 3
– Optimal Taxation in an LQ Economy
• Multiple Agent Models
– Default Risk and Income Fluctuations
– Globalization and Cycles
– Coase’s Theory of the Firm
– Composite Sorting
• Dynamic Linear Economies
– Recursive Models of Dynamic Linear Economies
– Growth in Dynamic Linear Economies
– Lucas Asset Pricing Using DLE
– IRFs in Hall Models
– Permanent Income Model using the DLE Class
– Rosen Schooling Model
– Cattle Cycles
– Shock Non Invertibility
• Risk, Model Uncertainty, and Robustness
– Risk and Model Uncertainty
– Etymology of Entropy
– Robustness
– Robust Markov Perfect Equilibrium
• Time Series Models
– Covariance Stationary Processes

CONTENTS 1
Advanced Quantitative Economics with Python

– Estimation of Spectra
– Additive and Multiplicative Functionals
– Classical Control with Linear Algebra
– Classical Prediction and Filtering With Linear Algebra
– Knowing the Forecasts of Others
• Asset Pricing and Finance
– Asset Pricing II: The Lucas Asset Pricing Model
– Elementary Asset Pricing Theory
– Two Modifications of Mean-Variance Portfolio Theory
– Irrelevance of Capital Structures with Complete Markets
– Equilibrium Capital Structures with Incomplete Markets
• Dynamic Programming Squared
– Optimal Unemployment Insurance
– Stackelberg Plans
– Machine Learning a Ramsey Plan
– Time Inconsistency of Ramsey Plans
– Sustainable Plans for a Calvo Model
– Optimal Taxation with State-Contingent Debt
– Optimal Taxation without State-Contingent Debt
– Fluctuating Interest Rates Deliver Fiscal Insurance
– Fiscal Risk and Government Debt
– Competitive Equilibria of a Model of Chang
– Credible Government Policies in a Model of Chang
• Other
– Troubleshooting
– References
– Execution Statistics

2 CONTENTS
Part I

Tools and Techniques

3
CHAPTER

ONE

ORTHOGONAL PROJECTIONS AND THEIR APPLICATIONS

1.1 Overview

Orthogonal projection is a cornerstone of vector space methods, with many diverse applications.
These include
• Least squares projection, also known as linear regression
• Conditional expectations for multivariate normal (Gaussian) distributions
• Gram–Schmidt orthogonalization
• QR decomposition
• Orthogonal polynomials
• etc
In this lecture, we focus on
• key ideas
• least squares regression
We’ll require the following imports:

import numpy as np
from scipy.linalg import qr

1.1.1 Further Reading

For background and foundational concepts, see our lecture on linear algebra.
For more proofs and greater theoretical detail, see A Primer in Econometric Theory.
For a complete set of proofs in a general setting, see, for example, [Roman, 2005].
For an advanced treatment of projection in the context of least squares prediction, see this book chapter.

5
Advanced Quantitative Economics with Python

1.2 Key Definitions

Assume 𝑥, 𝑧 ∈ ℝ𝑛 .
Define ⟨𝑥, 𝑧⟩ = ∑𝑖 𝑥𝑖 𝑧𝑖 .
Recall ‖𝑥‖2 = ⟨𝑥, 𝑥⟩.
The law of cosines states that ⟨𝑥, 𝑧⟩ = ‖𝑥‖‖𝑧‖ cos(𝜃) where 𝜃 is the angle between the vectors 𝑥 and 𝑧.
When ⟨𝑥, 𝑧⟩ = 0, then cos(𝜃) = 0 and 𝑥 and 𝑧 are said to be orthogonal and we write 𝑥 ⟂ 𝑧.

For a linear subspace 𝑆 ⊂ ℝ𝑛 , we call 𝑥 ∈ ℝ𝑛 orthogonal to 𝑆 if 𝑥 ⟂ 𝑧 for all 𝑧 ∈ 𝑆, and write 𝑥 ⟂ 𝑆.

The orthogonal complement of linear subspace 𝑆 ⊂ ℝ𝑛 is the set 𝑆 ⟂ ∶= {𝑥 ∈ ℝ𝑛 ∶ 𝑥 ⟂ 𝑆}.
𝑆 ⟂ is a linear subspace of ℝ𝑛
• To see this, fix 𝑥, 𝑦 ∈ 𝑆 ⟂ and 𝛼, 𝛽 ∈ ℝ.
• Observe that if 𝑧 ∈ 𝑆, then
⟨𝛼𝑥 + 𝛽𝑦, 𝑧⟩ = 𝛼⟨𝑥, 𝑧⟩ + 𝛽⟨𝑦, 𝑧⟩ = 𝛼 × 0 + 𝛽 × 0 = 0
• Hence 𝛼𝑥 + 𝛽𝑦 ∈ 𝑆 ⟂ , as was to be shown

6 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

1.2. Key Definitions 7

Advanced Quantitative Economics with Python

8 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

A set of vectors {𝑥1 , … , 𝑥𝑘 } ⊂ ℝ𝑛 is called an orthogonal set if 𝑥𝑖 ⟂ 𝑥𝑗 whenever 𝑖 ≠ 𝑗.

If {𝑥1 , … , 𝑥𝑘 } is an orthogonal set, then the Pythagorean Law states that

‖𝑥1 + ⋯ + 𝑥𝑘 ‖2 = ‖𝑥1 ‖2 + ⋯ + ‖𝑥𝑘 ‖2

For example, when 𝑘 = 2, 𝑥1 ⟂ 𝑥2 implies

‖𝑥1 + 𝑥2 ‖2 = ⟨𝑥1 + 𝑥2 , 𝑥1 + 𝑥2 ⟩ = ⟨𝑥1 , 𝑥1 ⟩ + 2⟨𝑥2 , 𝑥1 ⟩ + ⟨𝑥2 , 𝑥2 ⟩ = ‖𝑥1 ‖2 + ‖𝑥2 ‖2

1.2.1 Linear Independence vs Orthogonality

If 𝑋 ⊂ ℝ𝑛 is an orthogonal set and 0 ∉ 𝑋, then 𝑋 is linearly independent.

Proving this is a nice exercise.
While the converse is not true, a kind of partial converse holds, as we’ll see below.

1.3 The Orthogonal Projection Theorem

What vector within a linear subspace of ℝ𝑛 best approximates a given vector in ℝ𝑛 ?

The next theorem answers this question.
Theorem (OPT) Given 𝑦 ∈ ℝ𝑛 and linear subspace 𝑆 ⊂ ℝ𝑛 , there exists a unique solution to the minimization problem

𝑦 ̂ ∶= arg min ‖𝑦 − 𝑧‖
𝑧∈𝑆

The minimizer 𝑦 ̂ is the unique vector in ℝ𝑛 that satisfies

• 𝑦̂ ∈ 𝑆
• 𝑦 − 𝑦̂ ⟂ 𝑆
The vector 𝑦 ̂ is called the orthogonal projection of 𝑦 onto 𝑆.
The next figure provides some intuition

1.3.1 Proof of Sufficiency

We’ll omit the full proof.

But we will prove sufficiency of the asserted conditions.
To this end, let 𝑦 ∈ ℝ𝑛 and let 𝑆 be a linear subspace of ℝ𝑛 .
Let 𝑦 ̂ be a vector in ℝ𝑛 such that 𝑦 ̂ ∈ 𝑆 and 𝑦 − 𝑦 ̂ ⟂ 𝑆.
Let 𝑧 be any other point in 𝑆 and use the fact that 𝑆 is a linear subspace to deduce

‖𝑦 − 𝑧‖2 = ‖(𝑦 − 𝑦)̂ + (𝑦 ̂ − 𝑧)‖2 = ‖𝑦 − 𝑦‖̂ 2 + ‖𝑦 ̂ − 𝑧‖2

Hence ‖𝑦 − 𝑧‖ ≥ ‖𝑦 − 𝑦‖,
̂ which completes the proof.

1.3. The Orthogonal Projection Theorem 9

Advanced Quantitative Economics with Python

10 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

1.3.2 Orthogonal Projection as a Mapping

For a linear space 𝑌 and a fixed linear subspace 𝑆, we have a functional relationship

𝑦 ∈ 𝑌 ↦ its orthogonal projection 𝑦 ̂ ∈ 𝑆

By the OPT, this is a well-defined mapping or operator from ℝ𝑛 to ℝ𝑛 .

In what follows we denote this operator by a matrix 𝑃
• 𝑃 𝑦 represents the projection 𝑦.̂
• This is sometimes expressed as 𝐸𝑆̂ 𝑦 = 𝑃 𝑦, where 𝐸̂ denotes a wide-sense expectations operator and the sub-
script 𝑆 indicates that we are projecting 𝑦 onto the linear subspace 𝑆.
The operator 𝑃 is called the orthogonal projection mapping onto 𝑆.

It is immediate from the OPT that for any 𝑦 ∈ ℝ𝑛

1. 𝑃 𝑦 ∈ 𝑆 and
2. 𝑦 − 𝑃 𝑦 ⟂ 𝑆
From this, we can deduce additional useful properties, such as
1. ‖𝑦‖2 = ‖𝑃 𝑦‖2 + ‖𝑦 − 𝑃 𝑦‖2 and
2. ‖𝑃 𝑦‖ ≤ ‖𝑦‖
For example, to prove 1, observe that 𝑦 = 𝑃 𝑦 + 𝑦 − 𝑃 𝑦 and apply the Pythagorean law.

1.3. The Orthogonal Projection Theorem 11

Advanced Quantitative Economics with Python

Orthogonal Complement

Let 𝑆 ⊂ ℝ𝑛 .
The orthogonal complement of 𝑆 is the linear subspace 𝑆 ⟂ that satisfies 𝑥1 ⟂ 𝑥2 for every 𝑥1 ∈ 𝑆 and 𝑥2 ∈ 𝑆 ⟂ .
Let 𝑌 be a linear space with linear subspace 𝑆 and its orthogonal complement 𝑆 ⟂ .
We write

𝑌 = 𝑆 ⊕ 𝑆⟂

to indicate that for every 𝑦 ∈ 𝑌 there is unique 𝑥1 ∈ 𝑆 and a unique 𝑥2 ∈ 𝑆 ⟂ such that 𝑦 = 𝑥1 + 𝑥2 .
Moreover, 𝑥1 = 𝐸𝑆̂ 𝑦 and 𝑥2 = 𝑦 − 𝐸𝑆̂ 𝑦.
This amounts to another version of the OPT:
Theorem. If 𝑆 is a linear subspace of ℝ𝑛 , 𝐸𝑆̂ 𝑦 = 𝑃 𝑦 and 𝐸𝑆̂ ⟂ 𝑦 = 𝑀 𝑦, then

𝑃 𝑦 ⟂ 𝑀𝑦 and 𝑦 = 𝑃 𝑦 + 𝑀 𝑦 for all 𝑦 ∈ ℝ𝑛

The next figure illustrates

12 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

1.4 Orthonormal Basis

An orthogonal set of vectors 𝑂 ⊂ ℝ𝑛 is called an orthonormal set if ‖𝑢‖ = 1 for all 𝑢 ∈ 𝑂.

Let 𝑆 be a linear subspace of ℝ𝑛 and let 𝑂 ⊂ 𝑆.
If 𝑂 is orthonormal and span 𝑂 = 𝑆, then 𝑂 is called an orthonormal basis of 𝑆.
𝑂 is necessarily a basis of 𝑆 (being independent by orthogonality and the fact that no element is the zero vector).
One example of an orthonormal set is the canonical basis {𝑒1 , … , 𝑒𝑛 } that forms an orthonormal basis of ℝ𝑛 , where 𝑒𝑖
is the 𝑖 th unit vector.
If {𝑢1 , … , 𝑢𝑘 } is an orthonormal basis of linear subspace 𝑆, then
𝑘
𝑥 = ∑⟨𝑥, 𝑢𝑖 ⟩𝑢𝑖 for all 𝑥∈𝑆
𝑖=1

To see this, observe that since 𝑥 ∈ span{𝑢1 , … , 𝑢𝑘 }, we can find scalars 𝛼1 , … , 𝛼𝑘 that verify
𝑘
𝑥 = ∑ 𝛼𝑗 𝑢𝑗 (1.1)
𝑗=1

Taking the inner product with respect to 𝑢𝑖 gives

𝑘
⟨𝑥, 𝑢𝑖 ⟩ = ∑ 𝛼𝑗 ⟨𝑢𝑗 , 𝑢𝑖 ⟩ = 𝛼𝑖
𝑗=1

Combining this result with (1.1) verifies the claim.

1.4.1 Projection onto an Orthonormal Basis

When a subspace onto which we project is orthonormal, computing the projection simplifies:
Theorem If {𝑢1 , … , 𝑢𝑘 } is an orthonormal basis for 𝑆, then
𝑘
𝑃 𝑦 = ∑⟨𝑦, 𝑢𝑖 ⟩𝑢𝑖 , ∀ 𝑦 ∈ ℝ𝑛 (1.2)
𝑖=1

Proof: Fix 𝑦 ∈ ℝ𝑛 and let 𝑃 𝑦 be defined as in (1.2).

Clearly, 𝑃 𝑦 ∈ 𝑆.
We claim that 𝑦 − 𝑃 𝑦 ⟂ 𝑆 also holds.
It sufficies to show that 𝑦 − 𝑃 𝑦 ⟂ any basis vector 𝑢𝑖 .
This is true because
𝑘 𝑘
⟨𝑦 − ∑⟨𝑦, 𝑢𝑖 ⟩𝑢𝑖 , 𝑢𝑗 ⟩ = ⟨𝑦, 𝑢𝑗 ⟩ − ∑⟨𝑦, 𝑢𝑖 ⟩⟨𝑢𝑖 , 𝑢𝑗 ⟩ = 0
𝑖=1 𝑖=1

(Why is this sufficient to establish the claim that 𝑦 − 𝑃 𝑦 ⟂ 𝑆?)

1.4. Orthonormal Basis 13

Advanced Quantitative Economics with Python

1.5 Projection Via Matrix Algebra

Let 𝑆 be a linear subspace of ℝ𝑛 and let 𝑦 ∈ ℝ𝑛 .

We want to compute the matrix 𝑃 that verifies

𝐸𝑆̂ 𝑦 = 𝑃 𝑦

Evidently 𝑃 𝑦 is a linear function from 𝑦 ∈ ℝ𝑛 to 𝑃 𝑦 ∈ ℝ𝑛 .

This reference is useful.
Theorem. Let the columns of 𝑛 × 𝑘 matrix 𝑋 form a basis of 𝑆. Then

𝑃 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′

Proof: Given arbitrary 𝑦 ∈ ℝ𝑛 and 𝑃 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ , our claim is that

1. 𝑃 𝑦 ∈ 𝑆, and
2. 𝑦 − 𝑃 𝑦 ⟂ 𝑆
Claim 1 is true because

𝑃 𝑦 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 = 𝑋𝑎 when 𝑎 ∶= (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦

An expression of the form 𝑋𝑎 is precisely a linear combination of the columns of 𝑋 and hence an element of 𝑆.
Claim 2 is equivalent to the statement

𝑦 − 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 ⟂ 𝑋𝑏 for all 𝑏 ∈ ℝ𝐾

To verify this, notice that if 𝑏 ∈ ℝ𝐾 , then

(𝑋𝑏)′ [𝑦 − 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦] = 𝑏′ [𝑋 ′ 𝑦 − 𝑋 ′ 𝑦] = 0

The proof is now complete.

1.5.1 Starting with the Basis

It is common in applications to start with 𝑛 × 𝑘 matrix 𝑋 with linearly independent columns and let

𝑆 ∶= span 𝑋 ∶= span{col1 𝑋, … , col𝑘 𝑋}

Then the columns of 𝑋 form a basis of 𝑆.

From the preceding theorem, 𝑃 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 projects 𝑦 onto 𝑆.
In this context, 𝑃 is often called the projection matrix
• The matrix 𝑀 = 𝐼 − 𝑃 satisfies 𝑀 𝑦 = 𝐸𝑆̂ ⟂ 𝑦 and is sometimes called the annihilator matrix.

1.5.2 The Orthonormal Case

Suppose that 𝑈 is 𝑛 × 𝑘 with orthonormal columns.

Let 𝑢𝑖 ∶= col 𝑈𝑖 for each 𝑖, let 𝑆 ∶= span 𝑈 and let 𝑦 ∈ ℝ𝑛 .

14 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

We know that the projection of 𝑦 onto 𝑆 is

𝑃 𝑦 = 𝑈 (𝑈 ′ 𝑈 )−1 𝑈 ′ 𝑦

Since 𝑈 has orthonormal columns, we have 𝑈 ′ 𝑈 = 𝐼.

Hence
𝑘
𝑃 𝑦 = 𝑈 𝑈 ′ 𝑦 = ∑⟨𝑢𝑖 , 𝑦⟩𝑢𝑖
𝑖=1

We have recovered our earlier result about projecting onto the span of an orthonormal basis.

1.5.3 Application: Overdetermined Systems of Equations

Let 𝑦 ∈ ℝ𝑛 and let 𝑋 be 𝑛 × 𝑘 with linearly independent columns.

Given 𝑋 and 𝑦, we seek 𝑏 ∈ ℝ𝑘 that satisfies the system of linear equations 𝑋𝑏 = 𝑦.
If 𝑛 > 𝑘 (more equations than unknowns), then 𝑏 is said to be overdetermined.
Intuitively, we may not be able to find a 𝑏 that satisfies all 𝑛 equations.
The best approach here is to
• Accept that an exact solution may not exist.
• Look instead for an approximate solution.
By approximate solution, we mean a 𝑏 ∈ ℝ𝑘 such that 𝑋𝑏 is close to 𝑦.
The next theorem shows that a best approximation is well defined and unique.
The proof uses the OPT.
Theorem The unique minimizer of ‖𝑦 − 𝑋𝑏‖ over 𝑏 ∈ ℝ𝐾 is

𝛽 ̂ ∶= (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦

Proof: Note that

𝑋 𝛽 ̂ = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦 = 𝑃 𝑦

Since 𝑃 𝑦 is the orthogonal projection onto span(𝑋) we have

‖𝑦 − 𝑃 𝑦‖ ≤ ‖𝑦 − 𝑧‖ for any 𝑧 ∈ span(𝑋)

Because 𝑋𝑏 ∈ span(𝑋)

‖𝑦 − 𝑋 𝛽‖̂ ≤ ‖𝑦 − 𝑋𝑏‖ for any 𝑏 ∈ ℝ𝐾

This is what we aimed to show.

1.6 Least Squares Regression

Let’s apply the theory of orthogonal projection to least squares regression.

This approach provides insights about many geometric properties of linear regression.
We treat only some examples.

1.6. Least Squares Regression 15

Advanced Quantitative Economics with Python

1.6.1 Squared Risk Measures

Given pairs (𝑥, 𝑦) ∈ ℝ𝐾 × ℝ, consider choosing 𝑓 ∶ ℝ𝐾 → ℝ to minimize the risk

𝑅(𝑓) ∶= 𝔼 [(𝑦 − 𝑓(𝑥))2 ]

If probabilities and hence 𝔼 are unknown, we cannot solve this problem directly.
However, if a sample is available, we can estimate the risk with the empirical risk:

1 𝑁
min ∑(𝑦𝑛 − 𝑓(𝑥𝑛 ))2
𝑓∈ℱ 𝑁
𝑛=1

Minimizing this expression is called empirical risk minimization.

The set ℱ is sometimes called the hypothesis space.
The theory of statistical learning tells us that to prevent overfitting we should take the set ℱ to be relatively simple.
If we let ℱ be the class of linear functions, the problem is
𝑁
min ∑(𝑦𝑛 − 𝑏′ 𝑥𝑛 )2
𝑏∈ℝ𝐾
𝑛=1

This is the sample linear least squares problem.

1.6.2 Solution

Define the matrices

𝑦1 𝑥𝑛1
⎛
⎜ 𝑦2 ⎞
⎟ ⎛
⎜ 𝑥𝑛2 ⎞
⎟
𝑦 ∶= ⎜
⎜ ⋮ ⎟
⎟, 𝑥𝑛 ∶= ⎜
⎜ ⋮ ⎟
⎟ = 𝑛-th obs on all regressors
⎝ 𝑦𝑁 ⎠ ⎝ 𝑥𝑛𝐾 ⎠
and
𝑥′1 𝑥11 𝑥12 ⋯ 𝑥1𝐾
⎛ 𝑥′2 ⎞ ⎛ 𝑥21 𝑥22 ⋯ 𝑥2𝐾 ⎞
𝑋 ∶= ⎜
⎜
⎜ ⋮
⎟
⎟ ⎜
⎟ ∶=∶ ⎜
⎜ ⋮
⎟
⎟
⎟
⋮ ⋮
⎝ 𝑥′𝑁 ⎠ ⎝ 𝑥𝑁1 𝑥𝑁2 ⋯ 𝑥𝑁𝐾 ⎠
We assume throughout that 𝑁 > 𝐾 and 𝑋 is full column rank.
𝑁
If you work through the algebra, you will be able to verify that ‖𝑦 − 𝑋𝑏‖2 = ∑𝑛=1 (𝑦𝑛 − 𝑏′ 𝑥𝑛 )2 .
Since monotone transforms don’t affect minimizers, we have
𝑁
arg min ∑(𝑦𝑛 − 𝑏′ 𝑥𝑛 )2 = arg min ‖𝑦 − 𝑋𝑏‖
𝑏∈ℝ𝐾 𝑛=1 𝑏∈ℝ𝐾

By our results about overdetermined linear systems of equations, the solution is

𝛽 ̂ ∶= (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦

Let 𝑃 and 𝑀 be the projection and annihilator associated with 𝑋:

𝑃 ∶= 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ and 𝑀 ∶= 𝐼 − 𝑃

16 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

The vector of fitted values is

𝑦 ̂ ∶= 𝑋 𝛽 ̂ = 𝑃 𝑦

The vector of residuals is

𝑢̂ ∶= 𝑦 − 𝑦 ̂ = 𝑦 − 𝑃 𝑦 = 𝑀 𝑦

Here are some more standard definitions:

• The total sum of squares is ∶= ‖𝑦‖2 .
• The sum of squared residuals is ∶= ‖𝑢‖̂ 2 .
• The explained sum of squares is ∶= ‖𝑦‖̂ 2 .
TSS = ESS + SSR
We can prove this easily using the OPT.
From the OPT we have 𝑦 = 𝑦 ̂ + 𝑢̂ and 𝑢̂ ⟂ 𝑦.̂
Applying the Pythagorean law completes the proof.

1.7 Orthogonalization and Decomposition

Let’s return to the connection between linear independence and orthogonality touched on above.
A result of much interest is a famous algorithm for constructing orthonormal sets from linearly independent sets.
The next section gives details.

1.7.1 Gram-Schmidt Orthogonalization

Theorem For each linearly independent set {𝑥1 , … , 𝑥𝑘 } ⊂ ℝ𝑛 , there exists an orthonormal set {𝑢1 , … , 𝑢𝑘 } with

span{𝑥1 , … , 𝑥𝑖 } = span{𝑢1 , … , 𝑢𝑖 } for 𝑖 = 1, … , 𝑘

The Gram-Schmidt orthogonalization procedure constructs an orthogonal set {𝑢1 , 𝑢2 , … , 𝑢𝑛 }.

One description of this procedure is as follows:
• For 𝑖 = 1, … , 𝑘, form 𝑆𝑖 ∶= span{𝑥1 , … , 𝑥𝑖 } and 𝑆𝑖⟂
• Set 𝑣1 = 𝑥1
• For 𝑖 ≥ 2 set 𝑣𝑖 ∶= 𝐸𝑆̂ 𝑖−1
⟂ 𝑥𝑖 and 𝑢𝑖 ∶= 𝑣𝑖 /‖𝑣𝑖 ‖

The sequence 𝑢1 , … , 𝑢𝑘 has the stated properties.

A Gram-Schmidt orthogonalization construction is a key idea behind the Kalman filter described in A First Look at the
Kalman filter.
In some exercises below, you are asked to implement this algorithm and test it using projection.

1.7. Orthogonalization and Decomposition 17

Advanced Quantitative Economics with Python

1.7.2 QR Decomposition

The following result uses the preceding algorithm to produce a useful decomposition.
Theorem If 𝑋 is 𝑛 × 𝑘 with linearly independent columns, then there exists a factorization 𝑋 = 𝑄𝑅 where
• 𝑅 is 𝑘 × 𝑘, upper triangular, and nonsingular
• 𝑄 is 𝑛 × 𝑘 with orthonormal columns
Proof sketch: Let
• 𝑥𝑗 ∶= col𝑗 (𝑋)
• {𝑢1 , … , 𝑢𝑘 } be orthonormal with the same span as {𝑥1 , … , 𝑥𝑘 } (to be constructed using Gram–Schmidt)
• 𝑄 be formed from cols 𝑢𝑖
Since 𝑥𝑗 ∈ span{𝑢1 , … , 𝑢𝑗 }, we have

𝑗
𝑥𝑗 = ∑⟨𝑢𝑖 , 𝑥𝑗 ⟩𝑢𝑖 for 𝑗 = 1, … , 𝑘
𝑖=1

Some rearranging gives 𝑋 = 𝑄𝑅.

1.7.3 Linear Regression via QR Decomposition

For matrices 𝑋 and 𝑦 that overdetermine 𝛽 in the linear equation system 𝑦 = 𝑋𝛽, we found the least squares approximator
𝛽 ̂ = (𝑋 ′ 𝑋)−1 𝑋 ′ 𝑦.
Using the QR decomposition 𝑋 = 𝑄𝑅 gives

𝛽 ̂ = (𝑅′ 𝑄′ 𝑄𝑅)−1 𝑅′ 𝑄′ 𝑦
= (𝑅′ 𝑅)−1 𝑅′ 𝑄′ 𝑦
= 𝑅−1 (𝑅′ )−1 𝑅′ 𝑄′ 𝑦 = 𝑅−1 𝑄′ 𝑦

Numerical routines would in this case use the alternative form 𝑅𝛽 ̂ = 𝑄′ 𝑦 and back substitution.

1.8 Exercises

Exercise 1.8.1
Show that, for any linear subspace 𝑆 ⊂ ℝ𝑛 , 𝑆 ∩ 𝑆 ⟂ = {0}.

Solution to Exercise 1.8.1

If 𝑥 ∈ 𝑆 and 𝑥 ∈ 𝑆 ⟂ , then we have in particular that ⟨𝑥, 𝑥⟩ = 0, but then 𝑥 = 0.

Exercise 1.8.2
Let 𝑃 = 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ and let 𝑀 = 𝐼 − 𝑃 . Show that 𝑃 and 𝑀 are both idempotent and symmetric. Can you give
any intuition as to why they should be idempotent?

18 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

Solution to Exercise 1.8.2

Symmetry and idempotence of 𝑀 and 𝑃 can be established using standard rules for matrix algebra. The intuition behind
idempotence of 𝑀 and 𝑃 is that both are orthogonal projections. After a point is projected into a given subspace, applying
the projection again makes no difference (A point inside the subspace is not shifted by orthogonal projection onto that
space because it is already the closest point in the subspace to itself).

Exercise 1.8.3
Using Gram-Schmidt orthogonalization, produce a linear projection of 𝑦 onto the column space of 𝑋 and verify this
using the projection matrix 𝑃 ∶= 𝑋(𝑋 ′ 𝑋)−1 𝑋 ′ and also using QR decomposition, where:

1
𝑦 ∶= ⎛
⎜ 3 ⎞⎟,
⎝ −3 ⎠
and
1 0
𝑋 ∶= ⎛
⎜ 0 −6 ⎞
⎟
⎝ 2 2 ⎠

Solution to Exercise 1.8.3

Here’s a function that computes the orthonormal vectors using the GS algorithm given in the lecture

def gram_schmidt(X):
"""
Implements Gram-Schmidt orthogonalization.

Parameters
----------
X : an n x k array with linearly independent columns

Returns
-------
U : an n x k array with orthonormal columns

"""

# Set up
n, k = X.shape
U = np.empty((n, k))
I = np.eye(n)

# The first col of U is just the normalized first col of X

v1 = X[:,0]
U[:, 0] = v1 / np.sqrt(np.sum(v1 * v1))

for i in range(1, k):

# Set up
b = X[:, i] # The vector we're going to project
Z = X[:, 0:i] # First i-1 columns of X
(continues on next page)

1.8. Exercises 19
Advanced Quantitative Economics with Python

(continued from previous page)

# Project onto the orthogonal complement of the col span of Z

M = I - Z @ np.linalg.inv(Z.T @ Z) @ Z.T
u = M @ b

# Normalize
U[:, i] = u / np.sqrt(np.sum(u * u))

return U

Here are the arrays we’ll work with

y = [1, 3, -3]

X = [[1, 0],
[0, -6],
[2, 2]]

X, y = [np.asarray(z) for z in (X, y)]

First, let’s try projection of 𝑦 onto the column space of 𝑋 using the ordinary matrix expression:

Py1 = X @ np.linalg.inv(X.T @ X) @ X.T @ y

Py1

array([-0.56521739, 3.26086957, -2.2173913 ])

Now let’s do the same using an orthonormal basis created from our gram_schmidt function

U = gram_schmidt(X)
U

array([[ 0.4472136 , -0.13187609],

[ 0. , -0.98907071],
[ 0.89442719, 0.06593805]])

Py2 = U @ U.T @ y
Py2

array([-0.56521739, 3.26086957, -2.2173913 ])

This is the same answer. So far so good. Finally, let’s try the same thing but with the basis obtained via QR decomposition:

Q, R = qr(X, mode='economic')
Q

array([[-0.4472136 , -0.13187609],
[-0. , -0.98907071],
[-0.89442719, 0.06593805]])

20 Chapter 1. Orthogonal Projections and Their Applications

Advanced Quantitative Economics with Python

Py3 = Q @ Q.T @ y
Py3

array([-0.56521739, 3.26086957, -2.2173913 ])

Again, we obtain the same answer.

1.8. Exercises 21
Advanced Quantitative Economics with Python

22 Chapter 1. Orthogonal Projections and Their Applications

CHAPTER

TWO

CONTINUOUS STATE MARKOV CHAINS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

2.1 Overview

In a previous lecture, we learned about finite Markov chains, a relatively elementary class of stochastic dynamic models.
The present lecture extends this analysis to continuous (i.e., uncountable) state Markov chains.
Most stochastic dynamic models studied by economists either fit directly into this class or can be represented as continuous
state Markov chains after minor modifications.
In this lecture, our focus will be on continuous Markov models that
• evolve in discrete-time
• are often nonlinear
The fact that we accommodate nonlinear models here is significant, because linear stochastic models have their own highly
developed toolset, as we’ll see later on.
The question that interests us most is: Given a particular stochastic dynamic model, how will the state of the system
evolve over time?
In particular,
• What happens to the distribution of the state variables?
• Is there anything we can say about the “average behavior” of these variables?
• Is there a notion of “steady state” or “long-run equilibrium” that’s applicable to the model?
– If so, how can we compute it?
Answering these questions will lead us to revisit many of the topics that occupied us in the finite state case, such as
simulation, distribution dynamics, stability, ergodicity, etc.

Note: For some people, the term “Markov chain” always refers to a process with a finite or discrete state space. We
follow the mainstream mathematical literature (e.g., [Meyn and Tweedie, 2009]) in using the term to refer to any discrete
time Markov process.

Let’s begin with some imports:

23
Advanced Quantitative Economics with Python

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import lognorm, beta
from quantecon import LAE
from scipy.stats import norm, gaussian_kde

2.2 The Density Case

You are probably aware that some distributions can be represented by densities and some cannot.
(For example, distributions on the real numbers ℝ that put positive probability on individual points have no density
representation)
We are going to start our analysis by looking at Markov chains where the one-step transition probabilities have density
representations.
The benefit is that the density case offers a very direct parallel to the finite case in terms of notation and intuition.
Once we’ve built some intuition we’ll cover the general case.

2.2.1 Definitions and Basic Properties

In our lecture on finite Markov chains, we studied discrete-time Markov chains that evolve on a finite state space 𝑆.
In this setting, the dynamics of the model are described by a stochastic matrix — a nonnegative square matrix 𝑃 = 𝑃 [𝑖, 𝑗]
such that each row 𝑃 [𝑖, ⋅] sums to one.
The interpretation of 𝑃 is that 𝑃 [𝑖, 𝑗] represents the probability of transitioning from state 𝑖 to state 𝑗 in one unit of time.
In symbols,

ℙ{𝑋𝑡+1 = 𝑗 | 𝑋𝑡 = 𝑖} = 𝑃 [𝑖, 𝑗]

Equivalently,
• 𝑃 can be thought of as a family of distributions 𝑃 [𝑖, ⋅], one for each 𝑖 ∈ 𝑆
• 𝑃 [𝑖, ⋅] is the distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑖
(As you probably recall, when using NumPy arrays, 𝑃 [𝑖, ⋅] is expressed as P[i,:])
In this section, we’ll allow 𝑆 to be a subset of ℝ, such as
• ℝ itself
• the positive reals (0, ∞)
• a bounded interval (𝑎, 𝑏)
The family of discrete distributions 𝑃 [𝑖, ⋅] will be replaced by a family of densities 𝑝(𝑥, ⋅), one for each 𝑥 ∈ 𝑆.
Analogous to the finite state case, 𝑝(𝑥, ⋅) is to be understood as the distribution (density) of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥.
More formally, a stochastic kernel on 𝑆 is a function 𝑝 ∶ 𝑆 × 𝑆 → ℝ with the property that
1. 𝑝(𝑥, 𝑦) ≥ 0 for all 𝑥, 𝑦 ∈ 𝑆
2. ∫ 𝑝(𝑥, 𝑦)𝑑𝑦 = 1 for all 𝑥 ∈ 𝑆

24 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

(Integrals are over the whole space unless otherwise specified)

For example, let 𝑆 = ℝ and consider the particular stochastic kernel 𝑝𝑤 defined by

1 (𝑦 − 𝑥)2
𝑝𝑤 (𝑥, 𝑦) ∶= √ exp {− } (2.1)
2𝜋 2

What kind of model does 𝑝𝑤 represent?

The answer is, the (normally distributed) random walk
IID
𝑋𝑡+1 = 𝑋𝑡 + 𝜉𝑡+1 where {𝜉𝑡 } ∼ 𝑁 (0, 1) (2.2)

To see this, let’s find the stochastic kernel 𝑝 corresponding to (2.2).

Recall that 𝑝(𝑥, ⋅) represents the distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥.
Letting 𝑋𝑡 = 𝑥 in (2.2) and considering the distribution of 𝑋𝑡+1 , we see that 𝑝(𝑥, ⋅) = 𝑁 (𝑥, 1).
In other words, 𝑝 is exactly 𝑝𝑤 , as defined in (2.1).

2.2.2 Connection to Stochastic Difference Equations

In the previous section, we made the connection between stochastic difference equation (2.2) and stochastic kernel (2.1).
In economics and time-series analysis we meet stochastic difference equations of all different shapes and sizes.
It will be useful for us if we have some systematic methods for converting stochastic difference equations into stochastic
kernels.
To this end, consider the generic (scalar) stochastic difference equation given by

𝑋𝑡+1 = 𝜇(𝑋𝑡 ) + 𝜎(𝑋𝑡 ) 𝜉𝑡+1 (2.3)

Here we assume that

IID
• {𝜉𝑡 } ∼ 𝜙, where 𝜙 is a given density on ℝ
• 𝜇 and 𝜎 are given functions on 𝑆, with 𝜎(𝑥) > 0 for all 𝑥
Example 1: The random walk (2.2) is a special case of (2.3), with 𝜇(𝑥) = 𝑥 and 𝜎(𝑥) = 1.
Example 2: Consider the ARCH model

𝑋𝑡+1 = 𝛼𝑋𝑡 + 𝜎𝑡 𝜉𝑡+1 , 𝜎𝑡2 = 𝛽 + 𝛾𝑋𝑡2 , 𝛽, 𝛾 > 0

Alternatively, we can write the model as

𝑋𝑡+1 = 𝛼𝑋𝑡 + (𝛽 + 𝛾𝑋𝑡2 )1/2 𝜉𝑡+1 (2.4)

This is a special case of (2.3) with 𝜇(𝑥) = 𝛼𝑥 and 𝜎(𝑥) = (𝛽 + 𝛾𝑥2 )1/2 .
Example 3: With stochastic production and a constant savings rate, the one-sector neoclassical growth model leads to a
law of motion for capital per worker such as

𝑘𝑡+1 = 𝑠𝐴𝑡+1 𝑓(𝑘𝑡 ) + (1 − 𝛿)𝑘𝑡 (2.5)

Here
• 𝑠 is the rate of savings
• 𝐴𝑡+1 is a production shock

2.2. The Density Case 25

Advanced Quantitative Economics with Python

– The 𝑡 + 1 subscript indicates that 𝐴𝑡+1 is not visible at time 𝑡

• 𝛿 is a depreciation rate
• 𝑓 ∶ ℝ+ → ℝ+ is a production function satisfying 𝑓(𝑘) > 0 whenever 𝑘 > 0
(The fixed savings rate can be rationalized as the optimal policy for a particular set of technologies and preferences (see
[Ljungqvist and Sargent, 2018], section 3.1.2), although we omit the details here).
Equation (2.5) is a special case of (2.3) with 𝜇(𝑥) = (1 − 𝛿)𝑥 and 𝜎(𝑥) = 𝑠𝑓(𝑥).
Now let’s obtain the stochastic kernel corresponding to the generic model (2.3).
To find it, note first that if 𝑈 is a random variable with density 𝑓𝑈 , and 𝑉 = 𝑎 + 𝑏𝑈 for some constants 𝑎, 𝑏 with 𝑏 > 0,
then the density of 𝑉 is given by
1 𝑣−𝑎
𝑓𝑉 (𝑣) = 𝑓 ( ) (2.6)
𝑏 𝑈 𝑏
(The proof is below. For a multidimensional version see EDTC, theorem 8.1.3).
Taking (2.6) as given for the moment, we can obtain the stochastic kernel 𝑝 for (2.3) by recalling that 𝑝(𝑥, ⋅) is the
conditional density of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥.
In the present case, this is equivalent to stating that 𝑝(𝑥, ⋅) is the density of 𝑌 ∶= 𝜇(𝑥) + 𝜎(𝑥) 𝜉𝑡+1 when 𝜉𝑡+1 ∼ 𝜙.
Hence, by (2.6),

1 𝑦 − 𝜇(𝑥)
𝑝(𝑥, 𝑦) = 𝜙( ) (2.7)
𝜎(𝑥) 𝜎(𝑥)

For example, the growth model in (2.5) has stochastic kernel

1 𝑦 − (1 − 𝛿)𝑥
𝑝(𝑥, 𝑦) = 𝜙( ) (2.8)
𝑠𝑓(𝑥) 𝑠𝑓(𝑥)

where 𝜙 is the density of 𝐴𝑡+1 .

(Regarding the state space 𝑆 for this model, a natural choice is (0, ∞) — in which case 𝜎(𝑥) = 𝑠𝑓(𝑥) is strictly positive
for all 𝑠 as required)

2.2.3 Distribution Dynamics

In this section of our lecture on finite Markov chains, we asked the following question: If
1. {𝑋𝑡 } is a Markov chain with stochastic matrix 𝑃
2. the distribution of 𝑋𝑡 is known to be 𝜓𝑡
then what is the distribution of 𝑋𝑡+1 ?
Letting 𝜓𝑡+1 denote the distribution of 𝑋𝑡+1 , the answer we gave was that

𝜓𝑡+1 [𝑗] = ∑ 𝑃 [𝑖, 𝑗]𝜓𝑡 [𝑖]

𝑖∈𝑆

This intuitive equality states that the probability of being at 𝑗 tomorrow is the probability of visiting 𝑖 today and then
going on to 𝑗, summed over all possible 𝑖.
In the density case, we just replace the sum with an integral and probability mass functions with densities, yielding

𝜓𝑡+1 (𝑦) = ∫ 𝑝(𝑥, 𝑦)𝜓𝑡 (𝑥) 𝑑𝑥, ∀𝑦 ∈ 𝑆 (2.9)

26 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

It is convenient to think of this updating process in terms of an operator.

(An operator is just a function, but the term is usually reserved for a function that sends functions into functions)
Let 𝒟 be the set of all densities on 𝑆, and let 𝑃 be the operator from 𝒟 to itself that takes density 𝜓 and sends it into
new density 𝜓𝑃 , where the latter is defined by

(𝜓𝑃 )(𝑦) = ∫ 𝑝(𝑥, 𝑦)𝜓(𝑥)𝑑𝑥 (2.10)

This operator is usually called the Markov operator corresponding to 𝑝

Note: Unlike most operators, we write 𝑃 to the right of its argument, instead of to the left (i.e., 𝜓𝑃 instead of 𝑃 𝜓).
This is a common convention, with the intention being to maintain the parallel with the finite case — see here

With this notation, we can write (2.9) more succinctly as 𝜓𝑡+1 (𝑦) = (𝜓𝑡 𝑃 )(𝑦) for all 𝑦, or, dropping the 𝑦 and letting
“=” indicate equality of functions,

𝜓𝑡+1 = 𝜓𝑡 𝑃 (2.11)

Equation (2.11) tells us that if we specify a distribution for 𝜓0 , then the entire sequence of future distributions can be
obtained by iterating with 𝑃 .
It’s interesting to note that (2.11) is a deterministic difference equation.
Thus, by converting a stochastic difference equation such as (2.3) into a stochastic kernel 𝑝 and hence an operator 𝑃 , we
convert a stochastic difference equation into a deterministic one (albeit in a much higher dimensional space).

Note: Some people might be aware that discrete Markov chains are in fact a special case of the continuous Markov
chains we have just described. The reason is that probability mass functions are densities with respect to the counting
measure.

2.2.4 Computation

To learn about the dynamics of a given process, it’s useful to compute and study the sequences of densities generated by
the model.
One way to do this is to try to implement the iteration described by (2.10) and (2.11) using numerical integration.
However, to produce 𝜓𝑃 from 𝜓 via (2.10), you would need to integrate at every 𝑦, and there is a continuum of such 𝑦.
Another possibility is to discretize the model, but this introduces errors of unknown size.
A nicer alternative in the present setting is to combine simulation with an elegant estimator called the look-ahead estimator.
Let’s go over the ideas with reference to the growth model discussed above, the dynamics of which we repeat here for
convenience:

𝑘𝑡+1 = 𝑠𝐴𝑡+1 𝑓(𝑘𝑡 ) + (1 − 𝛿)𝑘𝑡 (2.12)

Our aim is to compute the sequence {𝜓𝑡 } associated with this model and fixed initial condition 𝜓0 .
To approximate 𝜓𝑡 by simulation, recall that, by definition, 𝜓𝑡 is the density of 𝑘𝑡 given 𝑘0 ∼ 𝜓0 .
If we wish to generate observations of this random variable, all we need to do is
1. draw 𝑘0 from the specified initial condition 𝜓0

2.2. The Density Case 27

Advanced Quantitative Economics with Python

2. draw the shocks 𝐴1 , … , 𝐴𝑡 from their specified density 𝜙

3. compute 𝑘𝑡 iteratively via (2.12)
If we repeat this 𝑛 times, we get 𝑛 independent observations 𝑘𝑡1 , … , 𝑘𝑡𝑛 .
With these draws in hand, the next step is to generate some kind of representation of their distribution 𝜓𝑡 .
A naive approach would be to use a histogram, or perhaps a smoothed histogram using SciPy’s gaussian_kde function.
However, in the present setting, there is a much better way to do this, based on the look-ahead estimator.
With this estimator, to construct an estimate of 𝜓𝑡 , we actually generate 𝑛 observations of 𝑘𝑡−1 , rather than 𝑘𝑡 .
1 𝑛
Now we take these 𝑛 observations 𝑘𝑡−1 , … , 𝑘𝑡−1 and form the estimate

1 𝑛
𝜓𝑡𝑛 (𝑦) = 𝑖
∑ 𝑝(𝑘𝑡−1 , 𝑦) (2.13)
𝑛 𝑖=1

where 𝑝 is the growth model stochastic kernel in (2.8).

What is the justification for this slightly surprising estimator?
The idea is that, by the strong law of large numbers,

1 𝑛 𝑖 𝑖
∑ 𝑝(𝑘𝑡−1 , 𝑦) → 𝔼𝑝(𝑘𝑡−1 , 𝑦) = ∫ 𝑝(𝑥, 𝑦)𝜓𝑡−1 (𝑥) 𝑑𝑥 = 𝜓𝑡 (𝑦)
𝑛 𝑖=1

with probability one as 𝑛 → ∞.

Here the first equality is by the definition of 𝜓𝑡−1 , and the second is by (2.9).
We have just shown that our estimator 𝜓𝑡𝑛 (𝑦) in (2.13) converges almost surely to 𝜓𝑡 (𝑦), which is just what we want to
compute.
In fact, much stronger convergence results are true (see, for example, this paper).

2.2.5 Implementation

A class called LAE for estimating densities by this technique can be found in lae.py.
Given our use of the __call__ method, an instance of LAE acts as a callable object, which is essentially a function that
can store its own data (see this discussion).
This function returns the right-hand side of (2.13) using
• the data and stochastic kernel that it stores as its instance data
• the value 𝑦 as its argument
The function is vectorized, in the sense that if psi is such an instance and y is an array, then the call psi(y) acts
elementwise.
(This is the reason that we reshaped X and y inside the class — to make vectorization work)
Because the implementation is fully vectorized, it is about as efficient as it would be in C or Fortran.

28 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

2.2.6 Example

The following code is an example of usage for the stochastic growth model described above

# == Define parameters == #
s = 0.2
δ = 0.1
a_σ = 0.4 # A = exp(B) where B ~ N(0, a_σ)
α = 0.4 # We set f(k) = k**α
ψ_0 = beta(5, 5, scale=0.5) # Initial distribution
ϕ = lognorm(a_σ)

def p(x, y):

"""
Stochastic kernel for the growth model with Cobb-Douglas production.
Both x and y must be strictly positive.
"""
d = s * x**α
return ϕ.pdf((y - (1 - δ) * x) / d) / d

n = 10000 # Number of observations at each date t

T = 30 # Compute density of k_t at 1,...,T+1

# == Generate matrix s.t. t-th column is n observations of k_t == #

k = np.empty((n, T))
A = ϕ.rvs((n, T))
k[:, 0] = ψ_0.rvs(n) # Draw first column from initial distribution
for t in range(T-1):
k[:, t+1] = s * A[:, t] * k[:, t]**α + (1 - δ) * k[:, t]

# == Generate T instances of LAE using this data, one for each date t == #
laes = [LAE(p, k[:, t]) for t in range(T)]

# == Plot == #
fig, ax = plt.subplots()
ygrid = np.linspace(0.01, 4.0, 200)
greys = [str(g) for g in np.linspace(0.0, 0.8, T)]
greys.reverse()
for ψ, g in zip(laes, greys):
ax.plot(ygrid, ψ(ygrid), color=g, lw=2, alpha=0.6)
ax.set_xlabel('capital')
ax.set_title(f'Density of $k_1$ (lighter) to $k_T$ (darker) for $T={T}$')
plt.show()

2.2. The Density Case 29

Advanced Quantitative Economics with Python

The figure shows part of the density sequence {𝜓𝑡 }, with each density computed via the look-ahead estimator.
Notice that the sequence of densities shown in the figure seems to be converging — more on this in just a moment.
Another quick comment is that each of these distributions could be interpreted as a cross-sectional distribution (recall
this discussion).

2.3 Beyond Densities

Up until now, we have focused exclusively on continuous state Markov chains where all conditional distributions 𝑝(𝑥, ⋅)
are densities.
As discussed above, not all distributions can be represented as densities.
If the conditional distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥 cannot be represented as a density for some 𝑥 ∈ 𝑆, then we need
a slightly different theory.
The ultimate option is to switch from densities to probability measures, but not all readers will be familiar with measure
theory.
We can, however, construct a fairly general theory using distribution functions.

30 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

2.3.1 Example and Definitions

To illustrate the issues, recall that Hopenhayn and Rogerson [Hopenhayn and Rogerson, 1993] study a model of firm
dynamics where individual firm productivity follows the exogenous process
IID
𝑋𝑡+1 = 𝑎 + 𝜌𝑋𝑡 + 𝜉𝑡+1 , where {𝜉𝑡 } ∼ 𝑁 (0, 𝜎2 )

As is, this fits into the density case we treated above.

However, the authors wanted this process to take values in [0, 1], so they added boundaries at the endpoints 0 and 1.
One way to write this is

𝑋𝑡+1 = ℎ(𝑎 + 𝜌𝑋𝑡 + 𝜉𝑡+1 ) where ℎ(𝑥) ∶= 𝑥 1{0 ≤ 𝑥 ≤ 1} + 1{𝑥 > 1}

If you think about it, you will see that for any given 𝑥 ∈ [0, 1], the conditional distribution of 𝑋𝑡+1 given 𝑋𝑡 = 𝑥 puts
positive probability mass on 0 and 1.
Hence it cannot be represented as a density.
What we can do instead is use cumulative distribution functions (cdfs).
To this end, set

𝐺(𝑥, 𝑦) ∶= ℙ{ℎ(𝑎 + 𝜌𝑥 + 𝜉𝑡+1 ) ≤ 𝑦} (0 ≤ 𝑥, 𝑦 ≤ 1)

This family of cdfs 𝐺(𝑥, ⋅) plays a role analogous to the stochastic kernel in the density case.
The distribution dynamics in (2.9) are then replaced by

𝐹𝑡+1 (𝑦) = ∫ 𝐺(𝑥, 𝑦)𝐹𝑡 (𝑑𝑥) (2.14)

Here 𝐹𝑡 and 𝐹𝑡+1 are cdfs representing the distribution of the current state and next period state.
The intuition behind (2.14) is essentially the same as for (2.9).

2.3.2 Computation

If you wish to compute these cdfs, you cannot use the look-ahead estimator as before.
Indeed, you should not use any density estimator, since the objects you are estimating/computing are not densities.
One good option is simulation as before, combined with the empirical distribution function.

2.4 Stability

In our lecture on finite Markov chains, we also studied stationarity, stability and ergodicity.
Here we will cover the same topics for the continuous case.
We will, however, treat only the density case (as in this section), where the stochastic kernel is a family of densities.
The general case is relatively similar — references are given below.

2.4. Stability 31
Advanced Quantitative Economics with Python

2.4.1 Theoretical Results

Analogous to the finite case, given a stochastic kernel 𝑝 and corresponding Markov operator as defined in (2.10), a density
𝜓∗ on 𝑆 is called stationary for 𝑃 if it is a fixed point of the operator 𝑃 .
In other words,

𝜓∗ (𝑦) = ∫ 𝑝(𝑥, 𝑦)𝜓∗ (𝑥) 𝑑𝑥, ∀𝑦 ∈ 𝑆 (2.15)

As with the finite case, if 𝜓∗ is stationary for 𝑃 , and the distribution of 𝑋0 is 𝜓∗ , then, in view of (2.11), 𝑋𝑡 will have
this same distribution for all 𝑡.
Hence 𝜓∗ is the stochastic equivalent of a steady state.
In the finite case, we learned that at least one stationary distribution exists, although there may be many.
When the state space is infinite, the situation is more complicated.
Even existence can fail very easily.
For example, the random walk model has no stationary density (see, e.g., EDTC, p. 210).
However, there are well-known conditions under which a stationary density 𝜓∗ exists.
With additional conditions, we can also get a unique stationary density (𝜓 ∈ 𝒟 and 𝜓 = 𝜓𝑃 ⟹ 𝜓 = 𝜓∗ ), and also
global convergence in the sense that

∀ 𝜓 ∈ 𝒟, 𝜓𝑃 𝑡 → 𝜓∗ as 𝑡 → ∞ (2.16)

This combination of existence, uniqueness and global convergence in the sense of (2.16) is often referred to as global
stability.
Under very similar conditions, we get ergodicity, which means that

1 𝑛
∑ ℎ(𝑋𝑡 ) → ∫ ℎ(𝑥)𝜓∗ (𝑥)𝑑𝑥 as 𝑛 → ∞ (2.17)
𝑛 𝑡=1

for any (measurable) function ℎ ∶ 𝑆 → ℝ such that the right-hand side is finite.
Note that the convergence in (2.17) does not depend on the distribution (or value) of 𝑋0 .
This is actually very important for simulation — it means we can learn about 𝜓∗ (i.e., approximate the right-hand side of
(2.17) via the left-hand side) without requiring any special knowledge about what to do with 𝑋0 .
So what are these conditions we require to get global stability and ergodicity?
In essence, it must be the case that
1. Probability mass does not drift off to the “edges” of the state space.
2. Sufficient “mixing” obtains.
For one such set of conditions see theorem 8.2.14 of EDTC.
In addition
• [Stokey et al., 1989] contains a classic (but slightly outdated) treatment of these topics.
• From the mathematical literature, [Lasota and MacKey, 1994] and [Meyn and Tweedie, 2009] give outstanding
in-depth treatments.
• Section 8.1.2 of EDTC provides detailed intuition, and section 8.3 gives additional references.
• EDTC, section 11.3.4 provides a specific treatment for the growth model we considered in this lecture.

32 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

2.4.2 An Example of Stability

As stated above, the growth model treated here is stable under mild conditions on the primitives.
• See EDTC, section 11.3.4 for more details.
We can see this stability in action — in particular, the convergence in (2.16) — by simulating the path of densities from
various initial conditions.
Here is such a figure.

All sequences are converging towards the same limit, regardless of their initial condition.
The details regarding initial conditions and so on are given in this exercise, where you are asked to replicate the figure.

2.4.3 Computing Stationary Densities

In the preceding figure, each sequence of densities is converging towards the unique stationary density 𝜓∗ .
Even from this figure, we can get a fair idea what 𝜓∗ looks like, and where its mass is located.
However, there is a much more direct way to estimate the stationary density, and it involves only a slight modification of
the look-ahead estimator.
Let’s say that we have a model of the form (2.3) that is stable and ergodic.
Let 𝑝 be the corresponding stochastic kernel, as given in (2.7).

2.4. Stability 33
Advanced Quantitative Economics with Python

To approximate the stationary density 𝜓∗ , we can simply generate a long time-series 𝑋0 , 𝑋1 , … , 𝑋𝑛 and estimate 𝜓∗ via

1 𝑛
𝜓𝑛∗ (𝑦) = ∑ 𝑝(𝑋𝑡 , 𝑦) (2.18)
𝑛 𝑡=1

This is essentially the same as the look-ahead estimator (2.13), except that now the observations we generate are a single
time-series, rather than a cross-section.
The justification for (2.18) is that, with probability one as 𝑛 → ∞,

1 𝑛
∑ 𝑝(𝑋𝑡 , 𝑦) → ∫ 𝑝(𝑥, 𝑦)𝜓∗ (𝑥) 𝑑𝑥 = 𝜓∗ (𝑦)
𝑛 𝑡=1

where the convergence is by (2.17) and the equality on the right is by (2.15).
The right-hand side is exactly what we want to compute.
On top of this asymptotic result, it turns out that the rate of convergence for the look-ahead estimator is very good.
The first exercise helps illustrate this point.

2.5 Exercises

Exercise 2.5.1
Consider the simple threshold autoregressive model
IID
𝑋𝑡+1 = 𝜃|𝑋𝑡 | + (1 − 𝜃2 )1/2 𝜉𝑡+1 where {𝜉𝑡 } ∼ 𝑁 (0, 1) (2.19)

This is one of those rare nonlinear stochastic models where an analytical expression for the stationary density is available.
In particular, provided that |𝜃| < 1, there is a unique stationary density 𝜓∗ given by

𝜃𝑦
𝜓∗ (𝑦) = 2 𝜙(𝑦) Φ [ ] (2.20)
(1 − 𝜃2 )1/2

Here 𝜙 is the standard normal density and Φ is the standard normal cdf.
As an exercise, compute the look-ahead estimate of 𝜓∗ , as defined in (2.18), and compare it with 𝜓∗ in (2.20) to see
whether they are indeed close for large 𝑛.
In doing so, set 𝜃 = 0.8 and 𝑛 = 500.
The next figure shows the result of such a computation
The additional density (black line) is a nonparametric kernel density estimate, added to the solution for illustration.
(You can try to replicate it before looking at the solution if you want to)
As you can see, the look-ahead estimator is a much tighter fit than the kernel density estimator.
If you repeat the simulation you will see that this is consistently the case.

Solution to Exercise 2.5.1

Look-ahead estimation of a TAR stationary density, where the TAR model is

𝑋𝑡+1 = 𝜃|𝑋𝑡 | + (1 − 𝜃2 )1/2 𝜉𝑡+1

34 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

2.5. Exercises 35
Advanced Quantitative Economics with Python

and 𝜉𝑡 ∼ 𝑁 (0, 1).

Try running at n = 10, 100, 1000, 10000 to get an idea of the speed of convergence

ϕ = norm()
n = 500
θ = 0.8
# == Frequently used constants == #
d = np.sqrt(1 - θ**2)
δ = θ / d

def ψ_star(y):
"True stationary density of the TAR Model"
return 2 * norm.pdf(y) * norm.cdf(δ * y)

def p(x, y):

"Stochastic kernel for the TAR model."
return ϕ.pdf((y - θ * np.abs(x)) / d) / d

Z = ϕ.rvs(n)
X = np.empty(n)
for t in range(n-1):
X[t+1] = θ * np.abs(X[t]) + d * Z[t]
ψ_est = LAE(p, X)
k_est = gaussian_kde(X)

fig, ax = plt.subplots(figsize=(10, 7))

ys = np.linspace(-3, 3, 200)
ax.plot(ys, ψ_star(ys), 'b-', lw=2, alpha=0.6, label='true')
ax.plot(ys, ψ_est(ys), 'g-', lw=2, alpha=0.6, label='look-ahead estimate')
ax.plot(ys, k_est(ys), 'k-', lw=2, alpha=0.6, label='kernel based estimate')
ax.legend(loc='upper left')
plt.show()

36 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

Exercise 2.5.2
Replicate the figure on global convergence shown above.
The densities come from the stochastic growth model treated at the start of the lecture.
Begin with the code found above.
Use the same parameters.
For the four initial distributions, use the shifted beta distributions

ψ_0 = beta(5, 5, scale=0.5, loc=i*2)

Solution to Exercise 2.5.2

Here’s one program that does the job

# == Define parameters == #
s = 0.2
δ = 0.1
a_σ = 0.4 # A = exp(B) where B ~ N(0, a_σ)
α = 0.4 # f(k) = k**α

ϕ = lognorm(a_σ)
(continues on next page)

2.5. Exercises 37
Advanced Quantitative Economics with Python

(continued from previous page)

def p(x, y):

"Stochastic kernel, vectorized in x. Both x and y must be positive."
d = s * x**α
return ϕ.pdf((y - (1 - δ) * x) / d) / d

n = 1000 # Number of observations at each date t

T = 40 # Compute density of k_t at 1,...,T

fig, axes = plt.subplots(2, 2, figsize=(11, 8))

axes = axes.flatten()
xmax = 6.5

for i in range(4):
ax = axes[i]
ax.set_xlim(0, xmax)
ψ_0 = beta(5, 5, scale=0.5, loc=i*2) # Initial distribution

# == Generate matrix s.t. t-th column is n observations of k_t == #

k = np.empty((n, T))
A = ϕ.rvs((n, T))
k[:, 0] = ψ_0.rvs(n)
for t in range(T-1):
k[:, t+1] = s * A[:,t] * k[:, t]**α + (1 - δ) * k[:, t]

# == Generate T instances of lae using this data, one for each t == #

laes = [LAE(p, k[:, t]) for t in range(T)]

ygrid = np.linspace(0.01, xmax, 150)

greys = [str(g) for g in np.linspace(0.0, 0.8, T)]
greys.reverse()
for ψ, g in zip(laes, greys):
ax.plot(ygrid, ψ(ygrid), color=g, lw=2, alpha=0.6)
ax.set_xlabel('capital')
plt.show()

38 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

Exercise 2.5.3
A common way to compare distributions visually is with boxplots.
To illustrate, let’s generate three artificial data sets and compare them with a boxplot.
The three data sets we will use are:

{𝑋1 , … , 𝑋𝑛 } ∼ 𝐿𝑁 (0, 1), {𝑌1 , … , 𝑌𝑛 } ∼ 𝑁 (2, 1), and {𝑍1 , … , 𝑍𝑛 } ∼ 𝑁 (4, 1),

Here is the code and figure:

n = 500
x = np.random.randn(n) # N(0, 1)
x = np.exp(x) # Map x to lognormal
y = np.random.randn(n) + 2.0 # N(2, 1)
z = np.random.randn(n) + 4.0 # N(4, 1)

fig, ax = plt.subplots(figsize=(10, 6.6))

ax.boxplot([x, y, z])
ax.set_xticks((1, 2, 3))
ax.set_ylim(-2, 14)
ax.set_xticklabels(('$X$', '$Y$', '$Z$'), fontsize=16)
plt.show()

2.5. Exercises 39
Advanced Quantitative Economics with Python

Each data set is represented by a box, where the top and bottom of the box are the third and first quartiles of the data,
and the red line in the center is the median.
The boxes give some indication as to
• the location of probability mass for each sample
• whether the distribution is right-skewed (as is the lognormal distribution), etc
Now let’s put these ideas to use in a simulation.
Consider the threshold autoregressive model in (2.19).
We know that the distribution of 𝑋𝑡 will converge to (2.20) whenever |𝜃| < 1.
Let’s observe this convergence from different initial conditions using boxplots.
In particular, the exercise is to generate J boxplot figures, one for each initial condition 𝑋0 in

initial_conditions = np.linspace(8, 0, J)

For each 𝑋0 in this set,

1. Generate 𝑘 time-series of length 𝑛, each starting at 𝑋0 and obeying (2.19).
2. Create a boxplot representing 𝑛 distributions, where the 𝑡-th distribution shows the 𝑘 observations of 𝑋𝑡 .
Use 𝜃 = 0.9, 𝑛 = 20, 𝑘 = 5000, 𝐽 = 8

Solution to Exercise 2.5.3

Here’s a possible solution.

40 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

Note the way we use vectorized code to simulate the 𝑘 time series for one boxplot all at once

n = 20
k = 5000
J = 8

θ = 0.9
d = np.sqrt(1 - θ**2)
δ = θ / d

fig, axes = plt.subplots(J, 1, figsize=(10, 4*J))

initial_conditions = np.linspace(8, 0, J)
X = np.empty((k, n))

for j in range(J):

axes[j].set_ylim(-4, 8)
axes[j].set_title(f'time series from t = {initial_conditions[j]}')

Z = np.random.randn(k, n)
X[:, 0] = initial_conditions[j]
for t in range(1, n):
X[:, t] = θ * np.abs(X[:, t-1]) + d * Z[:, t]
axes[j].boxplot(X)

plt.show()

2.5. Exercises 41
Advanced Quantitative Economics with Python

42 Chapter 2. Continuous State Markov Chains

Advanced Quantitative Economics with Python

2.6 Appendix

Here’s the proof of (2.6).

Let 𝐹𝑈 and 𝐹𝑉 be the cumulative distributions of 𝑈 and 𝑉 respectively.
By the definition of 𝑉 , we have 𝐹𝑉 (𝑣) = ℙ{𝑎 + 𝑏𝑈 ≤ 𝑣} = ℙ{𝑈 ≤ (𝑣 − 𝑎)/𝑏}.
In other words, 𝐹𝑉 (𝑣) = 𝐹𝑈 ((𝑣 − 𝑎)/𝑏).
Differentiating with respect to 𝑣 yields (2.6).

2.6. Appendix 43
Advanced Quantitative Economics with Python

44 Chapter 2. Continuous State Markov Chains

CHAPTER

THREE

REVERSE ENGINEERING A LA MUTH

In addition to what’s in Anaconda, this lecture uses the quantecon library.

!pip install --upgrade quantecon

We’ll also need the following imports:

import matplotlib.pyplot as plt

import numpy as np

from quantecon import Kalman

from quantecon import LinearStateSpace
np.set_printoptions(linewidth=120, precision=4, suppress=True)

This lecture uses the Kalman filter to reformulate John F. Muth’s first paper [Muth, 1960] about rational expectations.
Muth used classical prediction methods to reverse engineer a stochastic process that renders optimal Milton Friedman’s
[Friedman, 1956] “adaptive expectations” scheme.

3.1 Friedman (1956) and Muth (1960)

Milton Friedman [Friedman, 1956] (1956) posited that consumer’s forecast their future disposable income with the adap-
tive expectations scheme
∞
∗
𝑦𝑡+𝑖,𝑡 = 𝐾 ∑(1 − 𝐾)𝑗 𝑦𝑡−𝑗 (3.1)
𝑗=0

∗
where 𝐾 ∈ (0, 1) and 𝑦𝑡+𝑖,𝑡 is a forecast of future 𝑦 over horizon 𝑖.
Milton Friedman justified the exponential smoothing forecasting scheme (3.1) informally, noting that it seemed a plau-
sible way to use past income to forecast future income.
In his first paper about rational expectations, John F. Muth [Muth, 1960] reverse-engineered a univariate stochastic
∞
process {𝑦𝑡 }𝑡=−∞ for which Milton Friedman’s adaptive expectations scheme gives linear least forecasts of 𝑦𝑡+𝑗 for any
horizon 𝑖.
Muth sought a setting and a sense in which Friedman’s forecasting scheme is optimal.
That is, Muth asked for what optimal forecasting question is Milton Friedman’s adaptive expectation scheme the answer.
Muth (1960) used classical prediction methods based on lag-operators and 𝑧-transforms to find the answer to his question.
Please see lectures Classical Control with Linear Algebra and Classical Filtering and Prediction with Linear Algebra for an
introduction to the classical tools that Muth used.

45
Advanced Quantitative Economics with Python

Rather than using those classical tools, in this lecture we apply the Kalman filter to express the heart of Muth’s analysis
concisely.
The lecture First Look at Kalman Filter describes the Kalman filter.
We’ll use limiting versions of the Kalman filter corresponding to what are called stationary values in that lecture.

3.2 A Process for Which Adaptive Expectations are Optimal

Suppose that an observable 𝑦𝑡 is the sum of an unobserved random walk 𝑥𝑡 and an IID shock 𝜖2,𝑡 :

𝑥𝑡+1 = 𝑥𝑡 + 𝜎𝑥 𝜖1,𝑡+1
(3.2)
𝑦𝑡 = 𝑥𝑡 + 𝜎𝑦 𝜖2,𝑡

where
𝜖
[ 1,𝑡+1 ] ∼ 𝒩(0, 𝐼)
𝜖2,𝑡

is an IID process.

Note: A property of the state-space representation (3.2) is that in general neither 𝜖1,𝑡 nor 𝜖2,𝑡 is in the space spanned by
square-summable linear combinations of 𝑦𝑡 , 𝑦𝑡−1 , ….

𝜖
In general [ 1,𝑡 ] has more information about future 𝑦𝑡+𝑗 ’s than is contained in 𝑦𝑡 , 𝑦𝑡−1 , ….
𝜖2𝑡
We can use the asymptotic or stationary values of the Kalman gain and the one-step-ahead conditional state covariance
matrix to compute a time-invariant innovations representation

𝑥𝑡+1
̂ = 𝑥𝑡̂ + 𝐾𝑎𝑡
(3.3)
𝑦𝑡 = 𝑥𝑡̂ + 𝑎𝑡

where 𝑥𝑡̂ = 𝐸[𝑥𝑡 |𝑦𝑡−1 , 𝑦𝑡−2 , …] and 𝑎𝑡 = 𝑦𝑡 − 𝐸[𝑦𝑡 |𝑦𝑡−1 , 𝑦𝑡−2 , …].

Note: A key property about an innovations representation is that 𝑎𝑡 is in the space spanned by square summable linear
combinations of 𝑦𝑡 , 𝑦𝑡−1 , ….

For more ramifications of this property, see the lectures Shock Non-Invertibility and Recursive Models of Dynamic Linear
Economies.
Later we’ll stack these state-space systems (3.2) and (3.3) to display some classic findings of Muth.
But first, let’s create an instance of the state-space system (3.2) then apply the quantecon Kalman class, then uses it to
construct the associated “innovations representation”

# Make some parameter choices

# sigx/sigy are state noise std err and measurement noise std err
μ_0, σ_x, σ_y = 10, 1, 5

# Create a LinearStateSpace object

A, C, G, H = 1, σ_x, 1, σ_y
ss = LinearStateSpace(A, C, G, H, mu_0=μ_0)
(continues on next page)

46 Chapter 3. Reverse Engineering a la Muth

Advanced Quantitative Economics with Python

(continued from previous page)

# Set prior and initialize the Kalman type

x_hat_0, Σ_0 = 10, 1
kmuth = Kalman(ss, x_hat_0, Σ_0)

# Computes stationary values which we need for the innovation

# representation
S1, K1 = kmuth.stationary_values()

# Extract scalars from nested arrays

S1, K1 = S1.item(), K1.item()

# Form innovation representation state-space

Ak, Ck, Gk, Hk = A, K1, G, 1

ssk = LinearStateSpace(Ak, Ck, Gk, Hk, mu_0=x_hat_0)

3.3 Some Useful State-Space Math

Now we want to map the time-invariant innovations representation (3.3) and the original state-space system (3.2) into a
convenient form for deducing the impulse responses from the original shocks to the 𝑥𝑡 and 𝑥𝑡̂ .
Putting both of these representations into a single state-space system is yet another application of the insight that “finding
the state is an art”.
We’ll define a state vector and appropriate state-space matrices that allow us to represent both systems in one fell swoop.
Note that

𝑎𝑡 = 𝑥𝑡 + 𝜎𝑦 𝜖2,𝑡 − 𝑥𝑡̂

so that
𝑥𝑡+1
̂ = 𝑥𝑡̂ + 𝐾(𝑥𝑡 + 𝜎𝑦 𝜖2,𝑡 − 𝑥𝑡̂ )
= (1 − 𝐾)𝑥𝑡̂ + 𝐾𝑥𝑡 + 𝐾𝜎𝑦 𝜖2,𝑡

The stacked system

𝑥𝑡+1 1 0 0 𝑥𝑡 𝜎𝑥 0
⎤ = ⎡𝐾 𝜖1,𝑡+1
⎡ 𝑥̂
⎢ 𝑡+1 ⎥ ⎢ (1 − 𝐾) 𝐾𝜎𝑦 ⎤ ⎡ 𝑥̂ ⎤ + ⎡ 0
⎥⎢ 𝑡 ⎥ ⎢ 0⎤
⎥ [𝜖 ]
⎣𝜖2,𝑡+1 ⎦ ⎣ 0 0 0 ⎦ ⎣𝜖2,𝑡 ⎦ ⎣ 0 1⎦ 2,𝑡+1

𝑥
𝑦 1 0 𝜎𝑦 ⎡ 𝑡 ⎤
[ 𝑡] = [ ] ⎢ 𝑥𝑡̂ ⎥
𝑎𝑡 1 −1 𝜎𝑦
⎣𝜖2,𝑡 ⎦
𝜖
is a state-space system that tells us how the shocks [ 1,𝑡+1 ] affect states 𝑥𝑡+1
̂ , 𝑥𝑡 , the observable 𝑦𝑡 , and the innovation
𝜖2,𝑡+1
𝑎𝑡 .
With this tool at our disposal, let’s form the composite system and simulate it

3.3. Some Useful State-Space Math 47

Advanced Quantitative Economics with Python

# Create grand state-space for y_t, a_t as observed vars -- Use

# stacking trick above
Af = np.array([[ 1, 0, 0],
[K1, 1 - K1, K1 * σ_y],
[ 0, 0, 0]])
Cf = np.array([[σ_x, 0],
[ 0, K1 * σ_y],
[ 0, 1]])
Gf = np.array([[1, 0, σ_y],
[1, -1, σ_y]])

μ_true, μ_prior = 10, 10

μ_f = np.array([μ_true, μ_prior, 0]).reshape(3, 1)

# Create the state-space

ssf = LinearStateSpace(Af, Cf, Gf, mu_0=μ_f)

# Draw observations of y from the state-space model

N = 50
xf, yf = ssf.simulate(N)

print(f"Kalman gain = {K1}")

print(f"Conditional variance = {S1}")

Kalman gain = 0.1809975124224177

Conditional variance = 5.524937810560442

Now that we have simulated our joint system, we have 𝑥𝑡 , 𝑥𝑡̂ , and 𝑦𝑡 .
We can now investigate how these variables are related by plotting some key objects.

3.4 Estimates of Unobservables

First, let’s plot the hidden state 𝑥𝑡 and the filtered version 𝑥𝑡̂ that is linear-least squares projection of 𝑥𝑡 on the history
𝑦𝑡−1 , 𝑦𝑡−2 , …

fig, ax = plt.subplots()
ax.plot(xf[0, :], label="$x_t$")
ax.plot(xf[1, :], label="Filtered $x_t$")
ax.legend()
ax.set_xlabel("Time")
ax.set_title(r"$x$ vs $\hat{x}$")
plt.show()

48 Chapter 3. Reverse Engineering a la Muth

Advanced Quantitative Economics with Python

Note how 𝑥𝑡 and 𝑥𝑡̂ differ.

For Friedman, 𝑥𝑡̂ and not 𝑥𝑡 is the consumer’s idea about her/his permanent income.

3.5 Relationship of Unobservables to Observables

Now let’s plot 𝑥𝑡 and 𝑦𝑡 .

Recall that 𝑦𝑡 is just 𝑥𝑡 plus white noise

fig, ax = plt.subplots()
ax.plot(yf[0, :], label="y")
ax.plot(xf[0, :], label="x")
ax.legend()
ax.set_title(r"$x$ and $y$")
ax.set_xlabel("Time")
plt.show()

3.5. Relationship of Unobservables to Observables 49

Advanced Quantitative Economics with Python

We see above that 𝑦 seems to look like white noise around the values of 𝑥.

3.5.1 Innovations

Recall that we wrote down the innovation representation that depended on 𝑎𝑡 . We now plot the innovations {𝑎𝑡 }:

fig, ax = plt.subplots()
ax.plot(yf[1, :], label="a")
ax.legend()
ax.set_title(r"Innovation $a_t$")
ax.set_xlabel("Time")
plt.show()

50 Chapter 3. Reverse Engineering a la Muth

Advanced Quantitative Economics with Python

3.6 MA and AR Representations

Now we shall extract from the Kalman instance kmuth coefficients of

• a fundamental moving average representation that represents 𝑦𝑡 as a one-sided moving sum of current and past 𝑎𝑡 s
that are square summable linear combinations of 𝑦𝑡 , 𝑦𝑡−1 , ….
• a univariate autoregression representation that depicts the coefficients in a linear least square projection of 𝑦𝑡 on
the semi-infinite history 𝑦𝑡−1 , 𝑦𝑡−2 , ….
Then we’ll plot each of them

# Kalman Methods for MA and VAR

coefs_ma = kmuth.stationary_coefficients(5, "ma")
coefs_var = kmuth.stationary_coefficients(5, "var")

# Coefficients come in a list of arrays, but we

# want to plot them and so need to stack into an array
coefs_ma_array = np.vstack(coefs_ma)
coefs_var_array = np.vstack(coefs_var)

fig, ax = plt.subplots(2)
ax[0].plot(coefs_ma_array, label="MA")
ax[0].legend()
ax[1].plot(coefs_var_array, label="VAR")
(continues on next page)

3.6. MA and AR Representations 51

Advanced Quantitative Economics with Python

(continued from previous page)

ax[1].legend()

plt.show()

The moving average coefficients in the top panel show tell-tale signs of 𝑦𝑡 being a process whose first difference is a
first-order autoregression.
The autoregressive coefficients decline geometrically with decay rate (1 − 𝐾).
These are exactly the target outcomes that Muth (1960) aimed to reverse engineer

print(f'decay parameter 1 - K1 = {1 - K1}')

decay parameter 1 - K1 = 0.8190024875775823

52 Chapter 3. Reverse Engineering a la Muth

CHAPTER

FOUR

DISCRETE STATE DYNAMIC PROGRAMMING

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

4.1 Overview

In this lecture we discuss a family of dynamic programming problems with the following features:
1. a discrete state space and discrete choices (actions)
2. an infinite horizon
3. discounted rewards
4. Markov state transitions
We call such problems discrete dynamic programs or discrete DPs.
Discrete DPs are the workhorses in much of modern quantitative economics, including
• monetary economics
• search and labor economics
• household savings and consumption theory
• investment theory
• asset pricing
• industrial organization, etc.
When a given model is not inherently discrete, it is common to replace it with a discretized version in order to use discrete
DP techniques.
This lecture covers
• the theory of dynamic programming in a discrete setting, plus examples and applications
• a powerful set of routines for solving discrete DPs from the QuantEcon code library
Let’s start with some imports:

import numpy as np
import matplotlib.pyplot as plt
import quantecon as qe
import scipy.sparse as sparse
(continues on next page)

53
Advanced Quantitative Economics with Python

(continued from previous page)

from quantecon import compute_fixed_point
from quantecon.markov import DiscreteDP

4.1.1 How to Read this Lecture

We use dynamic programming many applied lectures, such as

• The shortest path lecture
• The McCall search model lecture
The objective of this lecture is to provide a more systematic and theoretical treatment, including algorithms and imple-
mentation while focusing on the discrete case.

4.1.2 Code

Among other things, it offers

• a flexible, well-designed interface
• multiple solution methods, including value function and policy function iteration
• high-speed operations via carefully optimized JIT-compiled functions
• the ability to scale to large problems by minimizing vectorized operators and allowing operations on sparse matrices
JIT compilation relies on Numba, which should work seamlessly if you are using Anaconda as suggested.

4.1.3 References

For background reading on dynamic programming and additional applications, see, for example,
• [Ljungqvist and Sargent, 2018]
• [Hernandez-Lerma and Lasserre, 1996], section 3.5
• [Puterman, 2005]
• [Stokey et al., 1989]
• [Rust, 1996]
• [Miranda and Fackler, 2002]
• EDTC, chapter 5

4.2 Discrete DPs

Loosely speaking, a discrete DP is a maximization problem with an objective function of the form
∞
𝔼 ∑ 𝛽 𝑡 𝑟(𝑠𝑡 , 𝑎𝑡 ) (4.1)
𝑡=0

where
• 𝑠𝑡 is the state variable

54 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

• 𝑎𝑡 is the action
• 𝛽 is a discount factor
• 𝑟(𝑠𝑡 , 𝑎𝑡 ) is interpreted as a current reward when the state is 𝑠𝑡 and the action chosen is 𝑎𝑡
Each pair (𝑠𝑡 , 𝑎𝑡 ) pins down transition probabilities 𝑄(𝑠𝑡 , 𝑎𝑡 , 𝑠𝑡+1 ) for the next period state 𝑠𝑡+1 .
Thus, actions influence not only current rewards but also the future time path of the state.
The essence of dynamic programming problems is to trade off current rewards vs favorable positioning of the future state
(modulo randomness).
Examples:
• consuming today vs saving and accumulating assets
• accepting a job offer today vs seeking a better one in the future
• exercising an option now vs waiting

4.2.1 Policies

The most fruitful way to think about solutions to discrete DP problems is to compare policies.
In general, a policy is a randomized map from past actions and states to current action.
In the setting formalized below, it suffices to consider so-called stationary Markov policies, which consider only the current
state.
In particular, a stationary Markov policy is a map 𝜎 from states to actions
• 𝑎𝑡 = 𝜎(𝑠𝑡 ) indicates that 𝑎𝑡 is the action to be taken in state 𝑠𝑡
It is known that, for any arbitrary policy, there exists a stationary Markov policy that dominates it at least weakly.
• See section 5.5 of [Puterman, 2005] for discussion and proofs.
In what follows, stationary Markov policies are referred to simply as policies.
The aim is to find an optimal policy, in the sense of one that maximizes (4.1).
Let’s now step through these ideas more carefully.

4.2.2 Formal Definition

Formally, a discrete dynamic program consists of the following components:

1. A finite set of states 𝑆 = {0, … , 𝑛 − 1}.
2. A finite set of feasible actions 𝐴(𝑠) for each state 𝑠 ∈ 𝑆, and a corresponding set of feasible state-action pairs.

SA ∶= {(𝑠, 𝑎) ∣ 𝑠 ∈ 𝑆, 𝑎 ∈ 𝐴(𝑠)}

3. A reward function 𝑟 ∶ SA → ℝ.
4. A transition probability function 𝑄 ∶ SA → Δ(𝑆), where Δ(𝑆) is the set of probability distributions over 𝑆.
5. A discount factor 𝛽 ∈ [0, 1).

4.2. Discrete DPs 55

Advanced Quantitative Economics with Python

We also use the notation 𝐴 ∶= ⋃𝑠∈𝑆 𝐴(𝑠) = {0, … , 𝑚 − 1} and call this set the action space.
A policy is a function 𝜎 ∶ 𝑆 → 𝐴.
A policy is called feasible if it satisfies 𝜎(𝑠) ∈ 𝐴(𝑠) for all 𝑠 ∈ 𝑆.
Denote the set of all feasible policies by Σ.
If a decision-maker uses a policy 𝜎 ∈ Σ, then
• the current reward at time 𝑡 is 𝑟(𝑠𝑡 , 𝜎(𝑠𝑡 ))
• the probability that 𝑠𝑡+1 = 𝑠′ is 𝑄(𝑠𝑡 , 𝜎(𝑠𝑡 ), 𝑠′ )
For each 𝜎 ∈ Σ, define
• 𝑟𝜎 by 𝑟𝜎 (𝑠) ∶= 𝑟(𝑠, 𝜎(𝑠)))
• 𝑄𝜎 by 𝑄𝜎 (𝑠, 𝑠′ ) ∶= 𝑄(𝑠, 𝜎(𝑠), 𝑠′ )
Notice that 𝑄𝜎 is a stochastic matrix on 𝑆.
It gives transition probabilities of the controlled chain when we follow policy 𝜎.
If we think of 𝑟𝜎 as a column vector, then so is 𝑄𝑡𝜎 𝑟𝜎 , and the 𝑠-th row of the latter has the interpretation

(𝑄𝑡𝜎 𝑟𝜎 )(𝑠) = 𝔼[𝑟(𝑠𝑡 , 𝜎(𝑠𝑡 )) ∣ 𝑠0 = 𝑠] when {𝑠𝑡 } ∼ 𝑄𝜎 (4.2)

Comments
• {𝑠𝑡 } ∼ 𝑄𝜎 means that the state is generated by stochastic matrix 𝑄𝜎 .
• See this discussion on computing expectations of Markov chains for an explanation of the expression in (4.2).
Notice that we’re not really distinguishing between functions from 𝑆 to ℝ and vectors in ℝ𝑛 .
This is natural because they are in one to one correspondence.

4.2.3 Value and Optimality

Let 𝑣𝜎 (𝑠) denote the discounted sum of expected reward flows from policy 𝜎 when the initial state is 𝑠.
To calculate this quantity we pass the expectation through the sum in (4.1) and use (4.2) to get
∞
𝑣𝜎 (𝑠) = ∑ 𝛽 𝑡 (𝑄𝑡𝜎 𝑟𝜎 )(𝑠) (𝑠 ∈ 𝑆)
𝑡=0

This function is called the policy value function for the policy 𝜎.
The optimal value function, or simply value function, is the function 𝑣∗ ∶ 𝑆 → ℝ defined by

𝑣∗ (𝑠) = max 𝑣𝜎 (𝑠) (𝑠 ∈ 𝑆)

𝜎∈Σ

(We can use max rather than sup here because the domain is a finite set)
A policy 𝜎 ∈ Σ is called optimal if 𝑣𝜎 (𝑠) = 𝑣∗ (𝑠) for all 𝑠 ∈ 𝑆.
Given any 𝑤 ∶ 𝑆 → ℝ, a policy 𝜎 ∈ Σ is called 𝑤-greedy if

𝜎(𝑠) ∈ arg max {𝑟(𝑠, 𝑎) + 𝛽 ∑ 𝑤(𝑠′ )𝑄(𝑠, 𝑎, 𝑠′ )} (𝑠 ∈ 𝑆)

𝑎∈𝐴(𝑠) 𝑠′ ∈𝑆

As discussed in detail below, optimal policies are precisely those that are 𝑣∗ -greedy.

56 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

4.2.4 Two Operators

It is useful to define the following operators:

• The Bellman operator 𝑇 ∶ ℝ𝑆 → ℝ𝑆 is defined by

(𝑇 𝑣)(𝑠) = max {𝑟(𝑠, 𝑎) + 𝛽 ∑ 𝑣(𝑠′ )𝑄(𝑠, 𝑎, 𝑠′ )} (𝑠 ∈ 𝑆)

𝑎∈𝐴(𝑠)
𝑠′ ∈𝑆

• For any policy function 𝜎 ∈ Σ, the operator 𝑇𝜎 ∶ ℝ𝑆 → ℝ𝑆 is defined by

(𝑇𝜎 𝑣)(𝑠) = 𝑟(𝑠, 𝜎(𝑠)) + 𝛽 ∑ 𝑣(𝑠′ )𝑄(𝑠, 𝜎(𝑠), 𝑠′ ) (𝑠 ∈ 𝑆)
𝑠′ ∈𝑆
This can be written more succinctly in operator notation as
𝑇𝜎 𝑣 = 𝑟𝜎 + 𝛽𝑄𝜎 𝑣
The two operators are both monotone
• 𝑣 ≤ 𝑤 implies 𝑇 𝑣 ≤ 𝑇 𝑤 pointwise on 𝑆, and similarly for 𝑇𝜎
They are also contraction mappings with modulus 𝛽
• ‖𝑇 𝑣 − 𝑇 𝑤‖ ≤ 𝛽‖𝑣 − 𝑤‖ and similarly for 𝑇𝜎 , where ‖⋅‖ is the max norm
For any policy 𝜎, its value 𝑣𝜎 is the unique fixed point of 𝑇𝜎 .
For proofs of these results and those in the next section, see, for example, EDTC, chapter 10.

4.2.5 The Bellman Equation and the Principle of Optimality

The main principle of the theory of dynamic programming is that

• the optimal value function 𝑣∗ is a unique solution to the Bellman equation

𝑣(𝑠) = max {𝑟(𝑠, 𝑎) + 𝛽 ∑ 𝑣(𝑠′ )𝑄(𝑠, 𝑎, 𝑠′ )} (𝑠 ∈ 𝑆)

𝑎∈𝐴(𝑠)
𝑠′ ∈𝑆
or in other words, 𝑣∗ is the unique fixed point of 𝑇 , and
• 𝜎∗ is an optimal policy function if and only if it is 𝑣∗ -greedy
By the definition of greedy policies given above, this means that

𝜎∗ (𝑠) ∈ arg max {𝑟(𝑠, 𝑎) + 𝛽 ∑ 𝑣∗ (𝑠′ )𝑄(𝑠, 𝑎, 𝑠′ )} (𝑠 ∈ 𝑆)

𝑎∈𝐴(𝑠) 𝑠′ ∈𝑆

4.3 Solving Discrete DPs

Now that the theory has been set out, let’s turn to solution methods.
The code for solving discrete DPs is available in ddp.py from the QuantEcon.py code library.
It implements the three most important solution methods for discrete dynamic programs, namely
• value function iteration
• policy function iteration
• modified policy function iteration
Let’s briefly review these algorithms and their implementation.

4.3. Solving Discrete DPs 57

Advanced Quantitative Economics with Python

4.3.1 Value Function Iteration

Perhaps the most familiar method for solving all manner of dynamic programs is value function iteration.
This algorithm uses the fact that the Bellman operator 𝑇 is a contraction mapping with fixed point 𝑣∗ .
Hence, iterative application of 𝑇 to any initial function 𝑣0 ∶ 𝑆 → ℝ converges to 𝑣∗ .
The details of the algorithm can be found in the appendix.

4.3.2 Policy Function Iteration

This routine, also known as Howard’s policy improvement algorithm, exploits more closely the particular structure of a
discrete DP problem.
Each iteration consists of
1. A policy evaluation step that computes the value 𝑣𝜎 of a policy 𝜎 by solving the linear equation 𝑣 = 𝑇𝜎 𝑣.
2. A policy improvement step that computes a 𝑣𝜎 -greedy policy.
In the current setting, policy iteration computes an exact optimal policy in finitely many iterations.
• See theorem 10.2.6 of EDTC for a proof.
The details of the algorithm can be found in the appendix.

4.3.3 Modified Policy Function Iteration

Modified policy iteration replaces the policy evaluation step in policy iteration with “partial policy evaluation”.
The latter computes an approximation to the value of a policy 𝜎 by iterating 𝑇𝜎 for a specified number of times.
This approach can be useful when the state space is very large and the linear system in the policy evaluation step of policy
iteration is correspondingly difficult to solve.
The details of the algorithm can be found in the appendix.

4.4 Example: A Growth Model

Let’s consider a simple consumption-saving model.

A single household either consumes or stores its own output of a single consumption good.
The household starts each period with current stock 𝑠.
Next, the household chooses a quantity 𝑎 to store and consumes 𝑐 = 𝑠 − 𝑎
• Storage is limited by a global upper bound 𝑀 .
• Flow utility is 𝑢(𝑐) = 𝑐𝛼 .
Output is drawn from a discrete uniform distribution on {0, … , 𝐵}.
The next period stock is therefore

𝑠′ = 𝑎 + 𝑈 where 𝑈 ∼ 𝑈 [0, … , 𝐵]

The discount factor is 𝛽 ∈ [0, 1).

58 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

4.4.1 Discrete DP Representation

We want to represent this model in the format of a discrete dynamic program.

To this end, we take
• the state variable to be the stock 𝑠
• the state space to be 𝑆 = {0, … , 𝑀 + 𝐵}
– hence 𝑛 = 𝑀 + 𝐵 + 1
• the action to be the storage quantity 𝑎
• the set of feasible actions at 𝑠 to be 𝐴(𝑠) = {0, … , min{𝑠, 𝑀 }}
– hence 𝐴 = {0, … , 𝑀 } and 𝑚 = 𝑀 + 1
• the reward function to be 𝑟(𝑠, 𝑎) = 𝑢(𝑠 − 𝑎)
• the transition probabilities to be
1
if 𝑎 ≤ 𝑠′ ≤ 𝑎 + 𝐵
𝑄(𝑠, 𝑎, 𝑠′ ) ∶= { 𝐵+1 (4.3)
0 otherwise

4.4.2 Defining a DiscreteDP Instance

This information will be used to create an instance of DiscreteDP by passing the following information
1. An 𝑛 × 𝑚 reward array 𝑅.
2. An 𝑛 × 𝑚 × 𝑛 transition probability array 𝑄.
3. A discount factor 𝛽.
For 𝑅 we set 𝑅[𝑠, 𝑎] = 𝑢(𝑠 − 𝑎) if 𝑎 ≤ 𝑠 and −∞ otherwise.
For 𝑄 we follow the rule in (4.3).

Note:
• The feasibility constraint is embedded into 𝑅 by setting 𝑅[𝑠, 𝑎] = −∞ for 𝑎 ∉ 𝐴(𝑠).
• Probability distributions for (𝑠, 𝑎) with 𝑎 ∉ 𝐴(𝑠) can be arbitrary.

The following code sets up these objects for us

class SimpleOG:

def init(self, B=10, M=5, α=0.5, β=0.9):

"""
Set up R, Q and β, the three elements that define an instance of
the DiscreteDP class.
"""

self.B, self.M, self.α, self.β = B, M, α, β

self.n = B + M + 1
self.m = M + 1

self.R = np.empty((self.n, self.m))

(continues on next page)

4.4. Example: A Growth Model 59

Advanced Quantitative Economics with Python

(continued from previous page)

self.Q = np.zeros((self.n, self.m, self.n))

self.populate_Q()
self.populate_R()

def u(self, c):

return c**self.α

def populate_R(self):
"""
Populate the R matrix, with R[s, a] = -np.inf for infeasible
state-action pairs.
"""
for s in range(self.n):
for a in range(self.m):
self.R[s, a] = self.u(s - a) if a <= s else -np.inf

def populate_Q(self):
"""
Populate the Q matrix by setting

Q[s, a, s'] = 1 / (1 + B) if a <= s' <= a + B

and zero otherwise.

"""

for a in range(self.m):
self.Q[:, a, a:(a + self.B + 1)] = 1.0 / (self.B + 1)

Let’s run this code and create an instance of SimpleOG.

g = SimpleOG() # Use default parameters

Instances of DiscreteDP are created using the signature DiscreteDP(R, Q, β).

Let’s create an instance using the objects stored in g

ddp = qe.markov.DiscreteDP(g.R, g.Q, g.β)

Now that we have an instance ddp of DiscreteDP we can solve it as follows

results = ddp.solve(method='policy_iteration')

Let’s see what we’ve got here

dir(results)

['max_iter', 'mc', 'method', 'num_iter', 'sigma', 'v']

(In IPython version 4.0 and above you can also type results. and hit the tab key)
The most important attributes are v, the value function, and σ, the optimal policy

results.v

60 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

array([19.01740222, 20.01740222, 20.43161578, 20.74945302, 21.04078099,

21.30873018, 21.54479816, 21.76928181, 21.98270358, 22.18824323,
22.3845048 , 22.57807736, 22.76109127, 22.94376708, 23.11533996,
23.27761762])

results.sigma

array([0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5, 5])

Since we’ve used policy iteration, these results will be exact unless we hit the iteration bound max_iter.
Let’s make sure this didn’t happen

results.max_iter

250

results.num_iter

Another interesting object is results.mc, which is the controlled chain defined by 𝑄𝜎∗ , where 𝜎∗ is the optimal policy.
In other words, it gives the dynamics of the state when the agent follows the optimal policy.
Since this object is an instance of MarkovChain from QuantEcon.py (see this lecture for more discussion), we can easily
simulate it, compute its stationary distribution and so on.

results.mc.stationary_distributions

array([[0.01732187, 0.04121063, 0.05773956, 0.07426848, 0.08095823,

0.09090909, 0.09090909, 0.09090909, 0.09090909, 0.09090909,
0.09090909, 0.07358722, 0.04969846, 0.03316953, 0.01664061,
0.00995086]])

Here’s the same information in a bar graph

What happens if the agent is more patient?

ddp = qe.markov.DiscreteDP(g.R, g.Q, 0.99) # Increase β to 0.99

results = ddp.solve(method='policy_iteration')
results.mc.stationary_distributions

array([[0.00546913, 0.02321342, 0.03147788, 0.04800681, 0.05627127,

0.09090909, 0.09090909, 0.09090909, 0.09090909, 0.09090909,
0.09090909, 0.08543996, 0.06769567, 0.05943121, 0.04290228,
0.03463782]])

If we look at the bar graph we can see the rightward shift in probability mass

4.4. Example: A Growth Model 61

Advanced Quantitative Economics with Python

62 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

4.4.3 State-Action Pair Formulation

The DiscreteDP class in fact, provides a second interface to set up an instance.

One of the advantages of this alternative set up is that it permits the use of a sparse matrix for Q.
(An example of using sparse matrices is given in the exercises below)
The call signature of the second formulation is DiscreteDP(R, Q, β, s_indices, a_indices) where
• s_indices and a_indices are arrays of equal length L enumerating all feasible state-action pairs
• R is an array of length L giving corresponding rewards
• Q is an L x n transition probability array
Here’s how we could set up these objects for the preceding example

B, M, α, β = 10, 5, 0.5, 0.9

n = B + M + 1
m = M + 1

def u(c):
return c**α

s_indices = []
a_indices = []
Q = []
R = []
b = 1.0 / (B + 1)

for s in range(n):
for a in range(min(M, s) + 1): # All feasible a at this s
s_indices.append(s)
a_indices.append(a)
q = np.zeros(n)
q[a:(a + B + 1)] = b # b on these values, otherwise 0
Q.append(q)
R.append(u(s - a))

ddp = qe.markov.DiscreteDP(R, Q, β, s_indices, a_indices)

For larger problems, you might need to write this code more efficiently by vectorizing or using Numba.

4.5 Exercises

In the stochastic optimal growth lecture from our introductory lecture series, we solve a benchmark model that has an
analytical solution.
The exercise is to replicate this solution using DiscreteDP.

4.5. Exercises 63
Advanced Quantitative Economics with Python

4.6 Solutions

4.6.1 Setup

Details of the model can be found in the lecture on optimal growth.

We let 𝑓(𝑘) = 𝑘𝛼 with 𝛼 = 0.65, 𝑢(𝑐) = log 𝑐, and 𝛽 = 0.95

α = 0.65
f = lambda k: k**α
u = np.log
β = 0.95

Here we want to solve a finite state version of the continuous state model above.
We discretize the state space into a grid of size grid_size=500, from 10−6 to grid_max=2

grid_max = 2
grid_size = 500
grid = np.linspace(1e-6, grid_max, grid_size)

We choose the action to be the amount of capital to save for the next period (the state is the capital stock at the beginning
of the period).
Thus the state indices and the action indices are both 0, …, grid_size-1.
Action (indexed by) a is feasible at state (indexed by) s if and only if grid[a] < f([grid[s]) (zero consumption
is not allowed because of the log utility).
Thus the Bellman equation is:

𝑣(𝑘) = max 𝑢(𝑓(𝑘) − 𝑘′ ) + 𝛽𝑣(𝑘′ ),

0<𝑘′ <𝑓(𝑘)

where 𝑘′ is the capital stock in the next period.

The transition probability array Q will be highly sparse (in fact it is degenerate as the model is deterministic), so we
formulate the problem with state-action pairs, to represent Q in scipy sparse matrix format.
We first construct indices for state-action pairs:

# Consumption matrix, with nonpositive consumption included

C = f(grid).reshape(grid_size, 1) - grid.reshape(1, grid_size)

# State-action indices
s_indices, a_indices = np.where(C > 0)

# Number of state-action pairs

L = len(s_indices)

print(L)
print(s_indices)
print(a_indices)

118841
[ 0 1 1 ... 499 499 499]
[ 0 0 1 ... 389 390 391]

64 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

Reward vector R (of length L):

R = u(C[s_indices, a_indices])

(Degenerate) transition probability matrix Q (of shape (L, grid_size)), where we choose the scipy.sparse.lil_matrix
format, while any format will do (internally it will be converted to the csr format):

Q = sparse.lil_matrix((L, grid_size))
Q[np.arange(L), a_indices] = 1

(If you are familiar with the data structure of scipy.sparse.csr_matrix, the following is the most efficient way to create the
Q matrix in the current case)

# data = np.ones(L)
# indptr = np.arange(L+1)
# Q = sparse.csr_matrix((data, a_indices, indptr), shape=(L, grid_size))

Discrete growth model:

ddp = DiscreteDP(R, Q, β, s_indices, a_indices)

Notes
Here we intensively vectorized the operations on arrays to simplify the code.
As noted, however, vectorization is memory consumptive, and it can be prohibitively so for grids with large size.

4.6.2 Solving the Model

Solve the dynamic optimization problem:

res = ddp.solve(method='policy_iteration')
v, σ, num_iter = res.v, res.sigma, res.num_iter
num_iter

Note that sigma contains the indices of the optimal capital stocks to save for the next period. The following translates
sigma to the corresponding consumption vector.

# Optimal consumption in the discrete version

c = f(grid) - grid[σ]

# Exact solution of the continuous version

ab = α * β
c1 = (np.log(1 - ab) + np.log(ab) * ab / (1 - ab)) / (1 - β)
c2 = α / (1 - ab)

def v_star(k):
return c1 + c2 * np.log(k)

def c_star(k):
return (1 - ab) * k**α

Let us compare the solution of the discrete model with that of the original continuous model

4.6. Solutions 65
Advanced Quantitative Economics with Python

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

ax[0].set_ylim(-40, -32)
ax[0].set_xlim(grid[0], grid[-1])
ax[1].set_xlim(grid[0], grid[-1])

lb0 = 'discrete value function'

ax[0].plot(grid, v, lw=2, alpha=0.6, label=lb0)

lb0 = 'continuous value function'

ax[0].plot(grid, v_star(grid), 'k-', lw=1.5, alpha=0.8, label=lb0)
ax[0].legend(loc='upper left')

lb1 = 'discrete optimal consumption'

ax[1].plot(grid, c, 'b-', lw=2, alpha=0.6, label=lb1)

lb1 = 'continuous optimal consumption'

ax[1].plot(grid, c_star(grid), 'k-', lw=1.5, alpha=0.8, label=lb1)
ax[1].legend(loc='upper left')
plt.show()

The outcomes appear very close to those of the continuous version.

Except for the “boundary” point, the value functions are very close:

np.abs(v - v_star(grid)).max()

121.49819147053378

np.abs(v - v_star(grid))[1:].max()

0.012681735127500815

The optimal consumption functions are close as well:

np.abs(c - c_star(grid)).max()

0.003826523100010082

In fact, the optimal consumption obtained in the discrete version is not really monotone, but the decrements are quite
small:

66 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

diff = np.diff(c)
(diff >= 0).all()

False

dec_ind = np.where(diff < 0)[0]

len(dec_ind)

174

np.abs(diff[dec_ind]).max()

0.001961853339766839

The value function is monotone:

(np.diff(v) > 0).all()

True

4.6.3 Comparison of the Solution Methods

Let us solve the problem with the other two methods.

Value Iteration

ddp.epsilon = 1e-4
ddp.max_iter = 500
res1 = ddp.solve(method='value_iteration')
res1.num_iter

294

np.array_equal(σ, res1.sigma)

True

4.6. Solutions 67
Advanced Quantitative Economics with Python

Modified Policy Iteration

res2 = ddp.solve(method='modified_policy_iteration')
res2.num_iter

np.array_equal(σ, res2.sigma)

True

Speed Comparison

%timeit ddp.solve(method='value_iteration')
%timeit ddp.solve(method='policy_iteration')
%timeit ddp.solve(method='modified_policy_iteration')

94.9 ms ± 360 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

9.34 ms ± 16.9 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

11.3 ms ± 59.9 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

As is often the case, policy iteration and modified policy iteration are much faster than value iteration.

4.6.4 Replication of the Figures

Using DiscreteDP we replicate the figures shown in the lecture.

Convergence of Value Iteration

Let us first visualize the convergence of the value iteration algorithm as in the lecture, where we use ddp.
bellman_operator implemented as a method of DiscreteDP

w = 5 * np.log(grid) - 25 # Initial condition

n = 35
fig, ax = plt.subplots(figsize=(8,5))
ax.set_ylim(-40, -20)
ax.set_xlim(np.min(grid), np.max(grid))
lb = 'initial condition'
ax.plot(grid, w, color=plt.cm.jet(0), lw=2, alpha=0.6, label=lb)
for i in range(n):
w = ddp.bellman_operator(w)
ax.plot(grid, w, color=plt.cm.jet(i / n), lw=2, alpha=0.6)
lb = 'true value function'
ax.plot(grid, v_star(grid), 'k-', lw=2, alpha=0.8, label=lb)
ax.legend(loc='upper left')
(continues on next page)

68 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

(continued from previous page)

plt.show()

We next plot the consumption policies along with the value iteration

w = 5 * u(grid) - 25 # Initial condition

fig, ax = plt.subplots(3, 1, figsize=(8, 10))

true_c = c_star(grid)

for i, n in enumerate((2, 4, 6)):

ax[i].set_ylim(0, 1)
ax[i].set_xlim(0, 2)
ax[i].set_yticks((0, 1))
ax[i].set_xticks((0, 2))

w = 5 * u(grid) - 25 # Initial condition

compute_fixed_point(ddp.bellman_operator, w, max_iter=n, print_skip=1)
σ = ddp.compute_greedy(w) # Policy indices
c_policy = f(grid) - grid[σ]

ax[i].plot(grid, c_policy, 'b-', lw=2, alpha=0.8,

label='approximate optimal consumption policy')
ax[i].plot(grid, true_c, 'k-', lw=2, alpha=0.8,
label='true optimal consumption policy')
ax[i].legend(loc='upper left')
ax[i].set_title(f'{n} value function iterations')
plt.show()

4.6. Solutions 69
Advanced Quantitative Economics with Python

Iteration Distance Elapsed (seconds)

---------------------------------------------
1 5.518e+00 6.032e-04
2 4.070e+00 9.885e-04
Iteration Distance Elapsed (seconds)
---------------------------------------------
1 5.518e+00 3.893e-04
2 4.070e+00 7.424e-04
3 3.866e+00 1.085e-03
4 3.673e+00 1.462e-03
Iteration Distance Elapsed (seconds)
---------------------------------------------
1 5.518e+00 3.693e-04
2 4.070e+00 7.381e-04
3 3.866e+00 1.083e-03
4 3.673e+00 1.423e-03
5 3.489e+00 1.779e-03
6 3.315e+00 2.120e-03

/home/runner/miniconda3/envs/quantecon/lib/python3.12/site-packages/quantecon/_
↪compute_fp.py:152: RuntimeWarning: max_iter attained before convergence in␣

↪compute_fixed_point

warnings.warn(_non_convergence_msg, RuntimeWarning)

70 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

4.6. Solutions 71
Advanced Quantitative Economics with Python

Dynamics of the Capital Stock

Finally, let us work on Exercise 2, where we plot the trajectories of the capital stock for three different discount factors,
0.9, 0.94, and 0.98, with initial condition 𝑘0 = 0.1.

discount_factors = (0.9, 0.94, 0.98)

k_init = 0.1

# Search for the index corresponding to k_init

k_init_ind = np.searchsorted(grid, k_init)

sample_size = 25

fig, ax = plt.subplots(figsize=(8,5))
ax.set_xlabel("time")
ax.set_ylabel("capital")
ax.set_ylim(0.10, 0.30)

# Create a new instance, not to modify the one used above

ddp0 = DiscreteDP(R, Q, β, s_indices, a_indices)

for beta in discount_factors:

ddp0.beta = beta
res0 = ddp0.solve()
k_path_ind = res0.mc.simulate(init=k_init_ind, ts_length=sample_size)
k_path = grid[k_path_ind]
ax.plot(k_path, 'o-', lw=2, alpha=0.75, label=fr'$\beta = {beta}$')

ax.legend(loc='lower right')
plt.show()

72 Chapter 4. Discrete State Dynamic Programming

Advanced Quantitative Economics with Python

4.7 Appendix: Algorithms

This appendix covers the details of the solution algorithms implemented for DiscreteDP.
We will make use of the following notions of approximate optimality:
• For 𝜀 > 0, 𝑣 is called an 𝜀-approximation of 𝑣∗ if ‖𝑣 − 𝑣∗ ‖ < 𝜀.
• A policy 𝜎 ∈ Σ is called 𝜀-optimal if 𝑣𝜎 is an 𝜀-approximation of 𝑣∗ .

4.7.1 Value Iteration

The DiscreteDP value iteration method implements value function iteration as follows
1. Choose any 𝑣0 ∈ ℝ𝑛 , and specify 𝜀 > 0; set 𝑖 = 0.
2. Compute 𝑣𝑖+1 = 𝑇 𝑣𝑖 .
3. If ‖𝑣𝑖+1 − 𝑣𝑖 ‖ < [(1 − 𝛽)/(2𝛽)]𝜀, then go to step 4; otherwise, set 𝑖 = 𝑖 + 1 and go to step 2.
4. Compute a 𝑣𝑖+1 -greedy policy 𝜎, and return 𝑣𝑖+1 and 𝜎.
Given 𝜀 > 0, the value iteration algorithm
• terminates in a finite number of iterations
• returns an 𝜀/2-approximation of the optimal value function and an 𝜀-optimal policy function (unless iter_max
is reached)
(While not explicit, in the actual implementation each algorithm is terminated if the number of iterations reaches
iter_max)

4.7.2 Policy Iteration

The DiscreteDP policy iteration method runs as follows

1. Choose any 𝑣0 ∈ ℝ𝑛 and compute a 𝑣0 -greedy policy 𝜎0 ; set 𝑖 = 0.
2. Compute the value 𝑣𝜎𝑖 by solving the equation 𝑣 = 𝑇𝜎𝑖 𝑣.
3. Compute a 𝑣𝜎𝑖 -greedy policy 𝜎𝑖+1 ; let 𝜎𝑖+1 = 𝜎𝑖 if possible.
4. If 𝜎𝑖+1 = 𝜎𝑖 , then return 𝑣𝜎𝑖 and 𝜎𝑖+1 ; otherwise, set 𝑖 = 𝑖 + 1 and go to step 2.
The policy iteration algorithm terminates in a finite number of iterations.
It returns an optimal value function and an optimal policy function (unless iter_max is reached).

4.7.3 Modified Policy Iteration

The DiscreteDP modified policy iteration method runs as follows:

1. Choose any 𝑣0 ∈ ℝ𝑛 , and specify 𝜀 > 0 and 𝑘 ≥ 0; set 𝑖 = 0.
2. Compute a 𝑣𝑖 -greedy policy 𝜎𝑖+1 ; let 𝜎𝑖+1 = 𝜎𝑖 if possible (for 𝑖 ≥ 1).
3. Compute 𝑢 = 𝑇 𝑣𝑖 (= 𝑇𝜎𝑖+1 𝑣𝑖 ). If span(𝑢 − 𝑣𝑖 ) < [(1 − 𝛽)/𝛽]𝜀, then go to step 5; otherwise go to step 4.
• Span is defined by span(𝑧) = max(𝑧) − min(𝑧).
4. Compute 𝑣𝑖+1 = (𝑇𝜎𝑖+1 )𝑘 𝑢 (= (𝑇𝜎𝑖+1 )𝑘+1 𝑣𝑖 ); set 𝑖 = 𝑖 + 1 and go to step 2.

4.7. Appendix: Algorithms 73

Advanced Quantitative Economics with Python

5. Return 𝑣 = 𝑢 + [𝛽/(1 − 𝛽)][(min(𝑢 − 𝑣𝑖 ) + max(𝑢 − 𝑣𝑖 ))/2]1 and 𝜎𝑖+1 .

Given 𝜀 > 0, provided that 𝑣0 is such that 𝑇 𝑣0 ≥ 𝑣0 , the modified policy iteration algorithm terminates in a finite number
of iterations.
It returns an 𝜀/2-approximation of the optimal value function and an 𝜀-optimal policy function (unless iter_max is
reached).
See also the documentation for DiscreteDP.

74 Chapter 4. Discrete State Dynamic Programming

Part II

LQ Control

75
CHAPTER

FIVE

INFORMATION AND CONSUMPTION SMOOTHING

In addition to what’s in Anaconda, this lecture employs the following libraries:

!pip install --upgrade quantecon

5.1 Overview

In the linear-quadratic permanent income of consumption smoothing model described in this quantecon lecture, a scalar
parameter 𝛽 ∈ (0, 1) plays two roles:
• it is a discount factor that the consumer applies to future utilities from consumption
• it is the reciprocal of the gross interest rate on risk-free one-period loans
That 𝛽 plays these two roles is essential in delivering the outcome that, regardless of the stochastic process that describes
his non-financial income, the consumer chooses to make consumption follow a random walk (see [Hall, 1978]).
In this lecture, we assign a third role to 𝛽:
• it describes a first-order moving average process for the growth in non-financial income

5.1.1 Same non-financial incomes, different information

We study two consumers who have exactly the same nonfinancial income process and who both conform to the linear-
quadratic permanent income of consumption smoothing model described here.
The two consumers have different information about their future nonfinancial incomes.
A better informed consumer each period receives news in the form of a shock that simultaneously affects both today’s
nonfinancial income and the present value of future nonfinancial incomes in a particular way.
A less informed consumer each period receives a shock that equals the part of today’s nonfinancial income that could not
be forecast from past values of nonfinancial income.
Even though they receive exactly the same nonfinancial incomes each period, our two consumers behave differently
because they have different information about their future nonfinancial incomes.
The second consumer receives less information about future nonfinancial incomes in a sense that we shall make precise.
This difference in their information sets manifests itself in their responding differently to what they regard as time 𝑡
information shocks.
Thus, although at each date they receive exactly the same histories of nonfinancial income, our two consumers receive
different shocks or news about their future nonfinancial incomes.

77
Advanced Quantitative Economics with Python

We use the different behaviors of our consumers as a way to learn about

• operating characteristics of a linear-quadratic permanent income model
• how the Kalman filter introduced in this lecture and/or another representation of the theory of optimal forecasting
introduced in this lecture embody lessons that can be applied to the news and noise literature
• ways of representing and computing optimal decision rules in the linear-quadratic permanent income model
• a Ricardian equivalence outcome that describes effects on optimal consumption of a tax cut at time 𝑡 accompanied
by a foreseen permanent increases in taxes that is just sufficient to cover the interest payments used to service the
risk-free government bonds that are issued to finance the tax cut
• a simple application of alternative ways to factor a covariance generating function along lines described in this
lecture
This lecture can be regarded as an introduction to invertibility issues that take center stage in the analysis of fiscal
foresight by Eric Leeper, Todd Walker, and Susan Yang [Leeper et al., 2013], as well as in chapter 4 of [Sargent et al.,
1991].

5.2 Two Representations of One Nonfinancial Income Process

We study consequences of endowing a consumer with one of two alternative representations for the change in the con-
sumer’s nonfinancial income 𝑦𝑡+1 − 𝑦𝑡 .
For both types of consumer, a parameter 𝛽 ∈ (0, 1) plays three roles.
It appears
• as a discount factor applied to future expected one-period utilities,
• as the reciprocal of a gross interest rate on one-period loans, and
• as a parameter in a first-order moving average that equals the increment in a consumer’s non-financial income
The first representation, which we shall sometimes refer to as the more informative representation, is

𝑦𝑡+1 − 𝑦𝑡 = 𝜖𝑡+1 − 𝛽 −1 𝜖𝑡 (5.1)

where {𝜖𝑡 } is an i.i.d. normally distributed scalar process with means of zero and contemporaneous variances 𝜎𝜖2 .
This representation of the process is used by a consumer who at time 𝑡 knows both 𝑦𝑡 and the shock 𝜖𝑡 and can use both
of them to forecast future 𝑦𝑡+𝑗 ’s.
As we’ll see below, representation (5.1) has the peculiar property that a positive shock 𝜖𝑡+1 leaves the discounted present
value of the consumer’s financial income at time 𝑡 + 1 unaltered.
The second representation of the same {𝑦𝑡 } process is

𝑦𝑡+1 − 𝑦𝑡 = 𝑎𝑡+1 − 𝛽𝑎𝑡 (5.2)

where {𝑎𝑡 } is another i.i.d. normally distributed scalar process, with means of zero and now variances 𝜎𝑎2 > 𝜎𝜖2 .
The i.i.d. shock variances are related by

𝜎𝑎2 = 𝛽 −2 𝜎𝜖2 > 𝜎𝜖2

so that the variance of the innovation exceeds the variance of the original shock by a multiplicative factor 𝛽 −2 .
Representation (5.2) is the innovations representation of equation (5.1) associated with Kalman filtering theory.

78 Chapter 5. Information and Consumption Smoothing

Advanced Quantitative Economics with Python

To see how this works, note that equating representations (5.1) and (5.2) for 𝑦𝑡+1 −𝑦𝑡 implies 𝜖𝑡+1 −𝛽 −1 𝜖𝑡 = 𝑎𝑡+1 −𝛽𝑎𝑡 ,
which in turn implies

𝑎𝑡+1 = 𝛽𝑎𝑡 + 𝜖𝑡+1 − 𝛽 −1 𝜖𝑡 .

Solving this difference equation backwards for 𝑎𝑡+1 gives, after a few lines of algebra,
∞
𝑎𝑡+1 = 𝜖𝑡+1 + (𝛽 − 𝛽 −1 ) ∑ 𝛽 𝑗 𝜖𝑡−𝑗 (5.3)
𝑗=0

which we can also write as

∞
𝑎𝑡+1 = ∑ ℎ𝑗 𝜖𝑡+1−𝑗 ≡ ℎ(𝐿)𝜖𝑡+1
𝑗=0

∞
where 𝐿 is the one-period lag operator, ℎ(𝐿) = ∑𝑗=0 ℎ𝑗 𝐿𝑗 , 𝐼 is the identity operator, and

𝐼 − 𝛽 −1 𝐿
ℎ(𝐿) =
𝐼 − 𝛽𝐿
Let 𝑔𝑗 ≡ 𝐸𝑧𝑡 𝑧𝑡−𝑗 be the 𝑗th autocovariance of the {𝑦𝑡 − 𝑦𝑡−1 } process.
Using calculations in the quantecon lecture, where 𝑧 ∈ 𝐶 is a complex variable, the covariance generating function
∞
𝑔(𝑧) = ∑𝑗=−∞ 𝑔𝑗 𝑧 𝑗 of the {𝑦𝑡 − 𝑦𝑡−1 } process equals

𝑔(𝑧) = 𝜎𝜖2 ℎ(𝑧)ℎ(𝑧 −1 ) = 𝛽 −2 𝜎𝜖2 > 𝜎𝜖2 ,

which confirms that {𝑎𝑡 } is a serially uncorrelated process with variance

𝜎𝑎2 = 𝛽 −1 𝜎𝜖2 .

To verify these claims, just notice that 𝑔(𝑧) = 𝛽 −2 𝜎𝜖2 implies that
• 𝑔0 = 𝛽 −2 𝜎𝜖2 , and
• 𝑔𝑗 = 0 for 𝑗 ≠ 0.
Alternatively, if you are uncomfortable with covariance generating functions, note that we can directly calculate 𝜎𝑎2 from
formula (5.3) according to
∞
𝜎𝑎2 = 𝜎𝜖2 + [1 + (𝛽 − 𝛽 −1 )2 ∑ 𝛽 2𝑗 ] = 𝛽 −1 𝜎𝜖2 .
𝑗=0

5.3 Application of Kalman filter

We can also use the the Kalman filter to obtain representation (5.2) from representation (5.1).
Thus, from equations associated with the Kalman filter, it can be verified that the steady-state Kalman gain 𝐾 = 𝛽 2 and
the steady state conditional covariance

Σ = 𝐸[(𝜖𝑡 − 𝜖𝑡̂ )2 |𝑦𝑡−1 , 𝑦𝑡−2 , …] = (1 − 𝛽 2 )𝜎𝜖2

In a little more detail, let 𝑧𝑡 = 𝑦𝑡 − 𝑦𝑡−1 and form the state-space representation

𝜖𝑡+1 = 0𝜖𝑡 + 𝜖𝑡+1

𝑧𝑡+1 = −𝛽 −1 𝜖𝑡 + 𝜖𝑡+1

5.3. Application of Kalman filter 79

Advanced Quantitative Economics with Python

and assume that 𝜎𝜖 = 1 for convenience

Let’s compute the steady-state Kalman filter for this system.
Let 𝐾 be the steady-state gain and 𝑎𝑡+1 the one-step ahead innovation.
The steady-state innovations representation is

𝜖𝑡+1
̂ = 0𝜖𝑡̂ + 𝐾𝑎𝑡+1
𝑧𝑡+1 = −𝛽𝑎𝑡 + 𝑎𝑡+1

By applying formulas for the steady-state Kalman filter, by hand it is possible to verify that 𝐾 = 𝛽 2 , 𝜎𝑎2 = 𝛽 −2 𝜎𝜖2 = 𝛽 −2 ,
and Σ = (1 − 𝛽 2 )𝜎𝜖2 .
Alternatively, we can obtain these formulas via the classical filtering theory described in this lecture.

5.4 News Shocks and Less Informative Shocks

Representation (5.1) is cast in terms of a news shock 𝜖𝑡+1 that represents a shock to nonfinancial income coming from
taxes, transfers, and other random sources of income changes known to a well-informed person who perhaps has all sorts
of information about the income process.
Representation (5.2) for the same income process is driven by shocks 𝑎𝑡 that contain less information than the news shock
𝜖𝑡 .
Representation (5.2) is called the innovations representation for the {𝑦𝑡 − 𝑦𝑡−1 } process.
It is cast in terms of what time series statisticians call the innovation or fundamental shock that emerges from apply-
ing the theory of optimally predicting nonfinancial income based solely on the information in past levels of growth in
nonfinancial income.
Fundamental for the 𝑦𝑡 process means that the shock 𝑎𝑡 can be expressed as a square-summable linear combination of
𝑦𝑡 , 𝑦𝑡−1 , ….
The shock 𝜖𝑡 is not fundamental because it has more information about the future of the {𝑦𝑡 − 𝑦𝑡−1 } process than is
contained in 𝑎𝑡 .
Representation (5.3) reveals the important fact that the original shock 𝜖𝑡 contains more information about future 𝑦’s than
is contained in the semi-infinite history 𝑦𝑡 = [𝑦𝑡 , 𝑦𝑡−1 , …].
Staring at representation (5.3) for 𝑎𝑡+1 shows that it consists both of new news 𝜖𝑡+1 as well as a long moving average
∞
(𝛽 − 𝛽 −1 ) ∑𝑗=0 𝛽 𝑗 𝜖𝑡−𝑗 of old news.
The more information representation (5.1) asserts that a shock 𝜖𝑡 results in an impulse response to nonfinancial income
of 𝜖𝑡 times the sequence

1, 1 − 𝛽 −1 , 1 − 𝛽 −1 , …

so that a shock that increases nonfinancial income 𝑦𝑡 by 𝜖𝑡 at time 𝑡 is followed by a change in future 𝑦 of 𝜖𝑡 times
1 − 𝛽 −1 < 0 in all subsequent periods.
Because 1 − 𝛽 −1 < 0, this means that a positive shock of 𝜖𝑡 today raises income at time 𝑡 by 𝜖𝑡 and then permanently
decreases all future incomes by (𝛽 −1 − 1)𝜖𝑡 .
This pattern precisely describes the following mental experiment:
• The consumer receives a government transfer of 𝜖𝑡 at time 𝑡.
• The government finances the transfer by issuing a one-period bond on which it pays a gross one-period risk-free
interest rate equal to 𝛽 −1 .

80 Chapter 5. Information and Consumption Smoothing

Advanced Quantitative Economics with Python

• In each future period, the government rolls over the one-period bond and so continues to borrow 𝜖𝑡 forever.
• The government imposes a lump-sum tax on the consumer in order to pay just the current interest on the original
bond and its rolled over successors.
• Thus, in periods 𝑡 + 1, 𝑡 + 2, …, the government levies a lump-sum tax on the consumer of 𝛽 −1 − 1 that is just
enough to pay the interest on the bond.
0
The present value of the impulse response or moving average coefficients equals 𝑑𝜖 (𝐿) = 1−𝛽 = 0, a fact that we’ll see
again below.
Representation (5.2), i.e., the innovations representation, asserts that a shock 𝑎𝑡 results in an impulse response to nonfi-
nancial income of 𝑎𝑡 times

1, 1 − 𝛽, 1 − 𝛽, …

so that a shock that increases income 𝑦𝑡 by 𝑎𝑡 at time 𝑡 can be expected to be followed by an increase in 𝑦𝑡+𝑗 of 𝑎𝑡 times
1 − 𝛽 > 0 in all future periods 𝑗 = 1, 2, ….
1−𝛽2
The present value of the impulse response or moving average coefficients for representation (5.2) is 𝑑𝑎 (𝛽) = 1−𝛽 =
(1 + 𝛽), another fact that will be important below.

5.5 Representation of 𝜖𝑡 Shock in Terms of Future 𝑦𝑡

Notice that reprentation (5.1), namely, 𝑦𝑡+1 − 𝑦𝑡 = −𝛽 −1 𝜖𝑡 + 𝜖𝑡+1 implies the linear difference equation

𝜖𝑡 = 𝛽𝜖𝑡+1 − 𝛽(𝑦𝑡+1 − 𝑦𝑡 ).

Solving forward we obtain

∞
𝜖𝑡 = 𝛽(𝑦𝑡 − (1 − 𝛽) ∑ 𝛽 𝑗 𝑦𝑡+𝑗+1 )
𝑗=0

This equation shows that 𝜖𝑡 equals 𝛽 times the one-step-backwards error in optimally backcasting 𝑦𝑡 based on the semi-
𝑡
infinite future 𝑦+ ≡ [𝑦𝑡+1 , 𝑦𝑡+2 , …] via the optimal backcasting formula
∞
𝑡
𝐸[𝑦𝑡 |𝑦+ ] = (1 − 𝛽) ∑ 𝛽 𝑗 𝑦𝑡+𝑗+1
𝑗=0

𝑡
Thus, 𝜖𝑡 exactly reveals the gap between 𝑦𝑡 and 𝐸[𝑦𝑡 |𝑦+ ].

5.6 Representation in Terms of 𝑎𝑡 Shocks

Next notice that representation (5.2), namely, 𝑦𝑡+1 − 𝑦𝑡 = −𝛽𝑎𝑡 + 𝑎𝑡+1 implies the linear difference equation

𝑎𝑡+1 = 𝛽𝑎𝑡 + (𝑦𝑡+1 − 𝑦𝑡 )

Solving this equation backward establishes that the one-step-prediction error 𝑎𝑡+1 is
∞
𝑎𝑡+1 = 𝑦𝑡+1 − (1 − 𝛽) ∑ 𝛽 𝑗 𝑦𝑡−𝑗 .
𝑗=0

Here the information set is 𝑦𝑡 = [𝑦𝑡 , 𝑦𝑡−1 , …] and a one step-ahead optimal prediction is
∞
𝐸[𝑦𝑡+1 |𝑦𝑡 ] = (1 − 𝛽) ∑ 𝛽 𝑗 𝑦𝑡−𝑗
𝑗=0

5.5. Representation of 𝜖𝑡 Shock in Terms of Future 𝑦𝑡 81

Advanced Quantitative Economics with Python

5.7 Permanent Income Consumption-Smoothing Model

When we computed optimal consumption-saving policies for our two representations (5.1) and (5.2) by using formulas
obtained with the difference equation approach described in quantecon lecture, we obtained:
for a consumer having the information assumed in the news representation (5.1):

𝑐𝑡+1 − 𝑐𝑡 = 0
𝑏𝑡+1 − 𝑏𝑡 = −𝛽 −1 𝜖𝑡

for a consumer having the more limited information associated with the innovations representation (5.2):

𝑐𝑡+1 − 𝑐𝑡 = (1 − 𝛽 2 )𝑎𝑡+1
𝑏𝑡+1 − 𝑏𝑡 = −𝛽𝑎𝑡

These formulas agree with outcomes from Python programs below that deploy state-space representations and dynamic
programming.
Evidently, although they receive exactly the same histories of nonfinancial incomethe two consumers behave differently.
The better informed consumer who has the information sets associated with representation (5.1) responds to each shock
𝜖𝑡+1 by leaving his consumption unaltered and saving all of 𝜖𝑡+1 in anticipation of the permanently increased taxes that he
will bear in order to service the permanent interest payments on the risk-free bonds that the government has presumably
issued to pay for the one-time addition 𝜖𝑡+1 to his time 𝑡 + 1 nonfinancial income.
The less well informed consumer who has information sets associated with representation (5.2) responds to a shock 𝑎𝑡+1
by increasing his consumption by what he perceives to be the permanent part of the increase in consumption and by
increasing his saving by what he perceives to be the temporary part.
The behavior of the better informed consumer sharply illustrates the behavior predicted in a classic Ricardian equivalence
experiment.

5.8 State Space Representations

We now cast our representations (5.1) and (5.2), respectively, in terms of the following two state space systems:

𝑦𝑡+1 1 −𝛽 −1 𝑦𝑡 𝜎
[ ]=[ ] [ ] + [ 𝜖 ] 𝑣𝑡+1
𝜖𝑡+1 0 0 𝜖𝑡 𝜎𝜖
(5.4)
𝑦
𝑦𝑡 = [1 0] [ 𝑡 ]
𝜖𝑡

and
𝑦𝑡+1 1 −𝛽 𝑦𝑡 𝜎
[ ]=[ ] [ ] + [ 𝑎 ] 𝑢𝑡+1
𝑎𝑡+1 0 0 𝑎𝑡 𝜎𝑎
(5.5)
𝑦
𝑦𝑡 = [1 0] [ 𝑡 ]
𝑎𝑡

where {𝑣𝑡 } and {𝑢𝑡 } are both i.i.d. sequences of univariate standardized normal random variables.
These two alternative income processes are ready to be used in the framework presented in the section “Comparison with
the Difference Equation Approach” in thid quantecon lecture.
All the code that we shall use below is presented in that lecture.

82 Chapter 5. Information and Consumption Smoothing

Advanced Quantitative Economics with Python

5.9 Computations

We shall use Python to form two state-space representations (5.4) and (5.5).
We set the following parameter values 𝜎𝜖 = 1, 𝜎𝑎 = 𝛽 −1 𝜎𝜖 = 𝛽 −1 where 𝛽 is the same value as the discount factor in
the household’s problem in the LQ savings problem in the lecture.
For these two representations, we use the code in this lecture to
• compute optimal decision rules for 𝑐𝑡 , 𝑏𝑡 for the two types of consumers associated with our two representations
of nonfinancial income
• use the value function objects 𝑃 , 𝑑 returned by the code to compute optimal values for the two representations
when evaluated at the initial condition
10
𝑥0 = [ ]
0
for each representation.
• create instances of the LinearStateSpace class for the two representations of the {𝑦𝑡 } process and use them to
obtain impulse response functions of 𝑐𝑡 and 𝑏𝑡 to the respective shocks 𝜖𝑡 and 𝑎𝑡 for the two representations.
• run simulations of {𝑦𝑡 , 𝑐𝑡 , 𝑏𝑡 } of length 𝑇 under both of the representations
We formulae the problem:
∞
2
min ∑ 𝛽 𝑡 (𝑐𝑡 − 𝛾)
𝑡=0

subject to a sequence of constraints

1
𝑐𝑡 + 𝑏 𝑡 = 𝑏 + 𝑦𝑡 , 𝑡≥0
1 + 𝑟 𝑡+1
where 𝑦𝑡 follows one of the representations defined above.
Define the control as 𝑢𝑡 ≡ 𝑐𝑡 − 𝛾.
(For simplicity we can assume 𝛾 = 0 below because 𝛾 has no effect on the impulse response functions that interest us.)
The state transition equations under our two representations for the nonfinancial income process {𝑦𝑡 } can be written as

𝑦𝑡+1 1 −𝛽 −1 0 𝑦𝑡 0 𝜎𝜖
⎡ 𝜖 ⎤= ⎡ 0 0 0 ⎤ ⎡ 𝜖 ⎤ + ⎡ 0 ⎤ [ 𝑐 ] + ⎡ 𝜎 ⎤𝜈 ,
⎢ 𝑡+1 ⎥ ⎢ ⎥⎢ 𝑡 ⎥ ⎢ ⎥ 𝑡 ⎢ 𝜖 ⎥ 𝑡+1
⎣ − (1 + 𝑟)
⎣ 𝑏𝑡+1 ⎦ ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟ 0 1 + 𝑟 ⎦ ⎣ 𝑏𝑡 ⎦ ⏟⎣⏟1⏟
+⏟𝑟⏟
⎦ ⎣
⏟ 0 ⎦
≡𝐴1 ≡𝐵1 ≡𝐶1

and
𝑦𝑡+1 1 −𝛽 0 𝑦𝑡 0 𝜎𝑎
⎡ 𝑎 ⎤ ⎡ 0 0 0 ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤
⎢ 𝑡+1 ⎥ = ⎢ ⎥ ⎢ 𝑎𝑡 ⎥ + ⎢ 0 ⎥ [ 𝑐𝑡 ] + ⎢ 𝜎𝑎 ⎥𝑢𝑡+1 .
⎣ 𝑏𝑡+1 ⎦ ⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟⏟
⎣ − (1 + 𝑟) 0 1 + 𝑟 ⎦ ⎣ 𝑏𝑡 ⎦ ⏟ ⎣⏟1⏟
+⏟𝑟⏟
⎦ ⎣ 0 ⎦
⏟
≡𝐴2 ≡𝐵2 ≡𝐶2

As usual, we start by importing packages.

import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt

5.9. Computations 83
Advanced Quantitative Economics with Python

# Set parameters
β, σϵ = 0.95, 1
σa = σϵ / β

R = 1 / β

# Payoff matrices are the same for two representations

RLQ = np.array([[0, 0, 0],
[0, 0, 0],
[0, 0, 1e-12]]) # put penalty on debt
QLQ = np.array([1.])

# More informative representation state transition matrices

ALQ1 = np.array([[1, -R, 0],
[0, 0, 0],
[-R, 0, R]])
BLQ1 = np.array([[0, 0, R]]).T
CLQ1 = np.array([[σϵ, σϵ, 0]]).T

# Construct and solve the LQ problem

LQ1 = qe.LQ(QLQ, RLQ, ALQ1, BLQ1, C=CLQ1, beta=β)
P1, F1, d1 = LQ1.stationary_values()

# The optimal decision rule for c

-F1

array([[ 1. , -1. , -0.05]])

Evidently, optimal consumption and debt decision rules for the consumer having news representation (5.1) are

𝑐𝑡∗ = 𝑦𝑡 − 𝜖𝑡 − (1 − 𝛽) 𝑏𝑡 ,
∗
𝑏𝑡+1 = 𝛽 −1 𝑐𝑡∗ + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝛽 −1 𝑦𝑡 − 𝛽 −1 𝜖𝑡 − (𝛽 −1 − 1) 𝑏𝑡 + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝑏𝑡 − 𝛽 −1 𝜖𝑡 .

# Innovations representation
ALQ2 = np.array([[1, -β, 0],
[0, 0, 0],
[-R, 0, R]])
BLQ2 = np.array([[0, 0, R]]).T
CLQ2 = np.array([[σa, σa, 0]]).T

LQ2 = qe.LQ(QLQ, RLQ, ALQ2, BLQ2, C=CLQ2, beta=β)

P2, F2, d2 = LQ2.stationary_values()

-F2

array([[ 1. , -0.9025, -0.05 ]])

For a consumer having access only to the information associated with the innovations representation (5.2), the optimal

84 Chapter 5. Information and Consumption Smoothing

Advanced Quantitative Economics with Python

decision rules are

𝑐𝑡∗ = 𝑦𝑡 − 𝛽 2 𝑎𝑡 − (1 − 𝛽) 𝑏𝑡 ,
∗
𝑏𝑡+1 = 𝛽 −1 𝑐𝑡∗ + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝛽 −1 𝑦𝑡 − 𝛽𝑎𝑡 − (𝛽 −1 − 1) 𝑏𝑡 + 𝛽 −1 𝑏𝑡 − 𝛽 −1 𝑦𝑡
= 𝑏𝑡 − 𝛽𝑎𝑡 .

Now we construct two Linear State Space models that emerge from using optimal policies of the form 𝑢𝑡 = −𝐹 𝑥𝑡 .
Take the more informative original representation (5.1) as an example:

𝑦𝑡+1 𝑦𝑡
⎡ 𝜖 ⎤ = (𝐴 − 𝐵 𝐹 ) ⎡ 𝜖 ⎤ + 𝐶 𝜈
⎢ 𝑡+1 ⎥ 1 1 1 ⎢ 𝑡 ⎥ 1 𝑡+1
⎣ 𝑏𝑡+1 ⎦ ⎣ 𝑏𝑡 ⎦

𝑦
𝑐𝑡 −𝐹1 ⎡ 𝑡 ⎤
[ ]=[ ] ⎢ 𝜖𝑡 ⎥
𝑏𝑡 𝑆𝑏
⎣ 𝑏𝑡 ⎦
To have the Linear State Space model be of an innovations representation form (5.2), we can simply replace the corre-
sponding matrices.

# Construct two Linear State Space models

Sb = np.array([0, 0, 1])

ABF1 = ALQ1 - BLQ1 @ F1

G1 = np.vstack([-F1, Sb])
LSS1 = qe.LinearStateSpace(ABF1, CLQ1, G1)

ABF2 = ALQ2 - BLQ2 @ F2

G2 = np.vstack([-F2, Sb])
LSS2 = qe.LinearStateSpace(ABF2, CLQ2, G2)

The following code computes impulse response functions of 𝑐𝑡 and 𝑏𝑡 .

J = 5 # Number of coefficients that we want

x_res1, y_res1 = LSS1.impulse_response(j=J)

b_res1 = np.array([x_res1[i][2, 0] for i in range(J)])
c_res1 = np.array([y_res1[i][0, 0] for i in range(J)])

x_res2, y_res2 = LSS2.impulse_response(j=J)

b_res2 = np.array([x_res2[i][2, 0] for i in range(J)])
c_res2 = np.array([y_res2[i][0, 0] for i in range(J)])

c_res1 / σϵ, b_res1 / σϵ

(array([1.99998906e-11, 1.89473923e-11, 1.78947621e-11, 1.68421319e-11,

1.57895017e-11]),
array([ 0. , -1.05263158, -1.05263158, -1.05263158, -1.05263158]))

plt.title("more informative representation")

plt.plot(range(J), c_res1 / σϵ, label="c impulse response function")
plt.plot(range(J), b_res1 / σϵ, label="b impulse response function")
plt.legend()

5.9. Computations 85
Advanced Quantitative Economics with Python

<matplotlib.legend.Legend at 0x7fc14e8fe450>

The above two impulse response functions show that when the consumer has the information assumed in the more infor-
mative representation (5.1), his response to receiving a positive shock of 𝜖𝑡 is to leave his consumption unchanged and to
save the entire amount of his extra income and then forever roll over the extra bonds that he holds.
To see this notice, that starting from next period on, his debt permanently decreases by 𝛽 −1

c_res2 / σa, b_res2 / σa

(array([0.0975, 0.0975, 0.0975, 0.0975, 0.0975]),

array([ 0. , -0.95, -0.95, -0.95, -0.95]))

plt.title("innovations representation")
plt.plot(range(J), c_res2 / σa, label="c impulse response function")
plt.plot(range(J), b_res2 / σa, label="b impulse response function")
plt.plot([0, J-1], [0, 0], '--', color='k')
plt.legend()

<matplotlib.legend.Legend at 0x7fc14e6a64e0>

86 Chapter 5. Information and Consumption Smoothing

Advanced Quantitative Economics with Python

The above impulse responses show that when the consumer has only the information that is assumed to be available
under the innovations representation (5.2) for {𝑦𝑡 − 𝑦𝑡−1 }, he responds to a positive 𝑎𝑡 by permanently increasing his
consumption.
He accomplishes this by consuming a fraction (1 − 𝛽 2 ) of the increment 𝑎𝑡 to his nonfinancial income and saving the
rest, thereby lowering 𝑏𝑡+1 in order to finance the permanent increment in his consumption.
The preceding computations confirm what we had derived earlier using paper and pencil.
Now let’s simulate some paths of consumption and debt for our two types of consumers while always presenting both
types with the same {𝑦𝑡 } path.

# Set time length for simulation

T = 100

x1, y1 = LSS1.simulate(ts_length=T)
plt.plot(range(T), y1[0, :], label="c")
plt.plot(range(T), x1[2, :], label="b")
plt.plot(range(T), x1[0, :], label="y")
plt.title("more informative representation")
plt.legend()

<matplotlib.legend.Legend at 0x7fc14e6a6d20>

5.9. Computations 87
Advanced Quantitative Economics with Python

x2, y2 = LSS2.simulate(ts_length=T)
plt.plot(range(T), y2[0, :], label="c")
plt.plot(range(T), x2[2, :], label="b")
plt.plot(range(T), x2[0, :], label="y")
plt.title("innovations representation")
plt.legend()

<matplotlib.legend.Legend at 0x7fc14e1b6c00>

88 Chapter 5. Information and Consumption Smoothing

Advanced Quantitative Economics with Python

5.10 Simulating Income Process and Two Associated Shock Pro-

cesses

We now form a single {𝑦𝑡 }𝑇𝑡=0 realization that we will use to simulate decisions associated with our two types of consumer.
We accomplish this in the following steps.
1. We form a {𝑦𝑡 , 𝜖𝑡 } realization by drawing a long simulation of {𝜖𝑡 }𝑇𝑡=0 , where 𝑇 is a big integer, 𝜖𝑡 = 𝜎𝜖 𝑣𝑡 , 𝑣𝑡 is
a standard normal scalar, 𝑦0 = 100, and

𝑦𝑡+1 − 𝑦𝑡 = −𝛽 −1 𝜖𝑡 + 𝜖𝑡+1 .

2. We take the {𝑦𝑡 } realization generated in step 1 and form an innovation process {𝑎𝑡 } from the formulas
𝑎0 = 0
𝑡−1
𝑎𝑡 = ∑ 𝛽 𝑗 (𝑦𝑡−𝑗 − 𝑦𝑡−𝑗−1 ) + 𝛽 𝑡 𝑎0 , 𝑡≥1
𝑗=0

3. We throw away the first 𝑆 observations and form a sample {𝑦𝑡 , 𝜖𝑡 , 𝑎𝑡 }𝑇𝑆+1 as the realization that we’ll use in the
following steps.
4. We use the step 3 realization to evaluate and simulate the decision rules for 𝑐𝑡 , 𝑏𝑡 that Python has computed for
us above.
The above steps implement the experiment of comparing decisions made by two consumers having identical incomes at
each date but at each date having different information about their future incomes.

5.10. Simulating Income Process and Two Associated Shock Processes 89

Advanced Quantitative Economics with Python

5.11 Calculating Innovations in Another Way

Here we use formula (5.3) above to compute 𝑎𝑡+1 as a function of the history 𝜖𝑡+1 , 𝜖𝑡 , 𝜖𝑡−1 , …
Thus, we compute

𝑎𝑡+1 = 𝛽𝑎𝑡 + 𝜖𝑡+1 − 𝛽 −1 𝜖𝑡

= 𝛽 (𝛽𝑎𝑡−1 + 𝜖𝑡 − 𝛽 −1 𝜖𝑡−1 ) + 𝜖𝑡+1 − 𝛽 −1 𝜖𝑡
= 𝛽 2 𝑎𝑡−1 + 𝛽 (𝜖𝑡 − 𝛽 −1 𝜖𝑡−1 ) + 𝜖𝑡+1 − 𝛽 −1 𝜖𝑡
= ⋮ ⋮
𝑡
= 𝛽 𝑡+1 𝑎0 + ∑ 𝛽 𝑗 (𝜖𝑡+1−𝑗 − 𝛽 −1 𝜖𝑡−𝑗 )
𝑗=0
𝑡−1
= 𝛽 𝑡+1 𝑎0 + 𝜖𝑡+1 + (𝛽 − 𝛽 −1 ) ∑ 𝛽 𝑗 𝜖𝑡−𝑗 − 𝛽 𝑡−1 𝜖0 .
𝑗=0

We can verify that we recover the same {𝑎𝑡 } sequence computed earlier.

5.12 Another Invertibility Issue

This quantecon lecture contains another example of a shock-invertibility issue that is endemic to the LQ permanent income
or consumption smoothing model.
The technical issue discussed there is ultimately the source of the shock-invertibility issues discussed by Eric Leeper,
Todd Walker, and Susan Yang [Leeper et al., 2013] in their analysis of fiscal foresight.

90 Chapter 5. Information and Consumption Smoothing

CHAPTER

SIX

CONSUMPTION SMOOTHING WITH COMPLETE AND INCOMPLETE

MARKETS

In addition to what’s in Anaconda, this lecture uses the library:

!pip install --upgrade quantecon

6.1 Overview

This lecture describes two types of consumption-smoothing models.

• one is in the complete markets tradition of Kenneth Arrow
• the other is in the incomplete markets tradition of Hall [Hall, 1978]
Complete markets allow a consumer to buy and sell claims contingent on all possible states of the world.
Incomplete markets allow a consumer to buy and sell a limited set of securities, often only a single risk-free security.
Hall [Hall, 1978] worked in an incomplete markets tradition by assuming that the only asset that can be traded is a
risk-free one-period bond.
Hall assumed an exogenous stochastic process of nonfinancial income and an exogenous and time-invariant gross interest
rate on one-period risk-free debt that equals 𝛽 −1 , where 𝛽 ∈ (0, 1) is also a consumer’s intertemporal discount factor.
This is equivalent to saying that it costs 𝛽 of time 𝑡 consumption to buy one unit of consumption at time 𝑡 + 1 for sure.
So 𝛽 is the price of a one-period risk-free claim to consumption next period.
We preserve Hall’s assumption about the interest rate when we describe an incomplete markets version of our model.
In addition, we extend Hall’s assumption about the risk-free interest rate to an appropriate counterpart when we create
another model in which there are markets in a complete array of one-period Arrow state-contingent securities.
We’ll consider two closely related alternative assumptions about the consumer’s exogenous nonfinancial income process:
• that it is generated by a finite 𝑁 state Markov chain (setting 𝑁 = 2 most of the time in this lecture)
• that it is described by a linear state space model with a continuous state vector in ℝ𝑛 driven by a Gaussian vector
IID shock process
We’ll spend most of this lecture studying the finite-state Markov specification, but will begin by studying the linear state
space specification because it is so closely linked to earlier lectures.
Let’s start with some imports:

91
Advanced Quantitative Economics with Python

import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt
import scipy.linalg as la

6.1.1 Relationship to Other Lectures

This lecture can be viewed as a followup to Optimal Savings II: LQ Techniques

This lecture is also a prologomenon to a lecture on tax-smoothing Tax Smoothing with Complete and Incomplete Markets

6.2 Background

Outcomes in consumption-smoothing models emerge from two sources:

• a consumer who wants to maximize an intertemporal objective function that expresses its preference for paths of
consumption that are smooth in the sense of varying as little as possible both across time and across realized Markov
states
• opportunities that allow the consumer to transform an erratic nonfinancial income process into a smoother con-
sumption process by buying and selling one or more financial securities
In the complete markets version, each period the consumer can buy or sell a complete set of one-period ahead state-
contingent securities whose payoffs depend on next period’s realization of the Markov state.
• In the two-state Markov chain case, two such securities are traded each period.
• In an 𝑁 state Markov state version, 𝑁 such securities are traded each period.
• In a continuous state Markov state version, a continuum of such securities is traded each period.
These state-contingent securities are commonly called Arrow securities, after Kenneth Arrow.
In the incomplete markets version, the consumer can buy and sell only one security each period, a risk-free one-period
bond with gross one-period return 𝛽 −1 .

6.3 Linear State Space Version of Complete Markets Model

We’ll study a complete markets model adapted to a setting with a continuous Markov state like that in the first lecture on
the permanent income model.
In that model
• a consumer can trade only a single risk-free one-period bond bearing gross one-period risk-free interest rate equal
to 𝛽 −1 .
• a consumer’s exogenous nonfinancial income is governed by a linear state space model driven by Gaussian shocks,
the kind of model studied in an earlier lecture about linear state space models.
Let’s write down a complete markets counterpart of that model.
Suppose that nonfinancial income is governed by the state space system

𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐶𝑤𝑡+1

𝑦𝑡 = 𝑆𝑦 𝑥𝑡

92 Chapter 6. Consumption Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

where 𝑥𝑡 is an 𝑛 × 1 vector and 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼) is IID over time.

We want a natural counterpart of the Hall assumption that the one-period risk-free gross interest rate is 𝛽 −1 .
We make the good guess that prices of one-period ahead Arrow securities are described by the pricing kernel

𝑞𝑡+1 (𝑥𝑡+1 | 𝑥𝑡 ) = 𝛽𝜙(𝑥𝑡+1 | 𝐴𝑥𝑡 , 𝐶𝐶 ′ ) (6.1)

where 𝜙(⋅ | 𝜇, Σ) is a multivariate Gaussian distribution with mean vector 𝜇 and covariance matrix Σ.
With the pricing kernel 𝑞𝑡+1 (𝑥𝑡+1 | 𝑥𝑡 ) in hand, we can price claims to consumption at time 𝑡 + 1 consumption that pay
off when 𝑥𝑡+1 ∈ 𝑆 at time 𝑡 + 1:

∫ 𝑞𝑡+1 (𝑥𝑡+1 | 𝑥𝑡 )𝑑𝑥𝑡+1

𝑆

𝑛
where 𝑆 is a subset of ℝ .
The price ∫𝑆 𝑞𝑡+1 (𝑥𝑡+1 | 𝑥𝑡 )𝑑𝑥𝑡+1 of such a claim depends on state 𝑥𝑡 because the prices of the 𝑥𝑡+1 -contingent securities
depend on 𝑥𝑡 through the pricing kernel 𝑞(𝑥𝑡+1 | 𝑥𝑡 ).
Let 𝑏(𝑥𝑡+1 ) be a vector of state-contingent debt due at 𝑡 + 1 as a function of the 𝑡 + 1 state 𝑥𝑡+1 .
Using the pricing kernel assumed in (6.1), the value at 𝑡 of 𝑏(𝑥𝑡+1 ) is evidently

𝛽 ∫ 𝑏(𝑥𝑡+1 )𝜙(𝑥𝑡+1 | 𝐴𝑥𝑡 , 𝐶𝐶 ′ )𝑑𝑥𝑡+1 = 𝛽𝔼𝑡 𝑏𝑡+1

In our complete markets setting, the consumer faces a sequence of budget constraints

𝑐𝑡 + 𝑏𝑡 = 𝑦𝑡 + 𝛽𝔼𝑡 𝑏𝑡+1 , 𝑡≥0

Please note that

𝛽𝐸𝑡 𝑏𝑡+1 = 𝛽 ∫ 𝜙𝑡+1 (𝑥𝑡+1 |𝐴𝑥𝑡 , 𝐶𝐶 ′ )𝑏𝑡+1 (𝑥𝑡+1 )𝑑𝑥𝑡+1

𝛽𝐸𝑡 𝑏𝑡+1 = ∫ 𝑞𝑡+1 (𝑥𝑡+1 |𝑥𝑡 )𝑏𝑡+1 (𝑥𝑡+1 )𝑑𝑥𝑡+1

which verifies that 𝛽𝐸𝑡 𝑏𝑡+1 is the value of time 𝑡 + 1 state-contingent claims on time 𝑡 + 1 consumption issued by the
consumer at time 𝑡
We can solve the time 𝑡 budget constraint forward to obtain
∞
𝑏𝑡 = 𝔼𝑡 ∑ 𝛽 𝑗 (𝑦𝑡+𝑗 − 𝑐𝑡+𝑗 )
𝑗=0

The consumer cares about the expected value of

∞
∑ 𝛽 𝑡 𝑢(𝑐𝑡 ), 0<𝛽<1
𝑡=0

In the incomplete markets version of the model, we assumed that 𝑢(𝑐𝑡 ) = −(𝑐𝑡 − 𝛾)2 , so that the above utility functional
became
∞
− ∑ 𝛽 𝑡 (𝑐𝑡 − 𝛾)2 , 0<𝛽<1
𝑡=0

6.3. Linear State Space Version of Complete Markets Model 93

Advanced Quantitative Economics with Python

But in the complete markets version, it is tractable to assume a more general utility function that satisfies 𝑢′ > 0 and
𝑢″ < 0.
First-order conditions for the consumer’s problem with complete markets and our assumption about Arrow securities
prices are

𝑢′ (𝑐𝑡+1 ) = 𝑢′ (𝑐𝑡 ) for all 𝑡 ≥ 0

which implies 𝑐𝑡 = 𝑐 ̄ for some 𝑐.̄

So it follows that
∞
𝑏𝑡 = 𝔼𝑡 ∑ 𝛽 𝑗 (𝑦𝑡+𝑗 − 𝑐)̄
𝑗=0

or
1
𝑏𝑡 = 𝑆𝑦 (𝐼 − 𝛽𝐴)−1 𝑥𝑡 − 𝑐̄ (6.2)
1−𝛽
where 𝑐 ̄ satisfies
1
𝑏̄0 = 𝑆𝑦 (𝐼 − 𝛽𝐴)−1 𝑥0 − 𝑐̄ (6.3)
1−𝛽

where 𝑏̄0 is an initial level of the consumer’s debt due at time 𝑡 = 0, specified as a parameter of the problem.
Thus, in the complete markets version of the consumption-smoothing model, 𝑐𝑡 = 𝑐,̄ ∀𝑡 ≥ 0 is determined by (6.3) and
the consumer’s debt is the fixed function of the state 𝑥𝑡 described by (6.2).
Please recall that in the LQ permanent income model studied in permanent income model, the state is 𝑥𝑡 , 𝑏𝑡 , where 𝑏𝑡 is
a complicated function of past state vectors 𝑥𝑡−𝑗 .
Notice that in contrast to that incomplete markets model, at time 𝑡 the state vector is 𝑥𝑡 alone in our complete markets
model.
Here’s an example that shows how in this setting the availability of insurance against fluctuating nonfinancial income
allows the consumer completely to smooth consumption across time and across states of the world

def complete_ss(β, b0, x0, A, C, S_y, T=12):

"""
Computes the path of consumption and debt for the previously described
complete markets model where exogenous income follows a linear
state space
"""
# Create a linear state space for simulation purposes
# This adds "b" as a state to the linear state space system
# so that setting the seed places shocks in same place for
# both the complete and incomplete markets economy
# Atilde = np.vstack([np.hstack([A, np.zeros((A.shape[0], 1))]),
# np.zeros((1, A.shape[1] + 1))])
# Ctilde = np.vstack([C, np.zeros((1, 1))])
# S_ytilde = np.hstack([S_y, np.zeros((1, 1))])

lss = qe.LinearStateSpace(A, C, S_y, mu_0=x0)

# Add extra state to initial condition

# x0 = np.hstack([x0, np.zeros(1)])

# Compute the (I - β * A)^{-1}

(continues on next page)

94 Chapter 6. Consumption Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

rm = la.inv(np.eye(A.shape[0]) - β * A)

# Constant level of consumption

cbar = (1 - β) * (S_y @ rm @ x0 - b0)
c_hist = np.full(T, cbar)

# Debt
x_hist, y_hist = lss.simulate(T)
b_hist = np.squeeze(S_y @ rm @ x_hist - cbar / (1 - β))

return c_hist, b_hist, np.squeeze(y_hist), x_hist

# Define parameters
N_simul = 80
α, ρ1, ρ2 = 10.0, 0.9, 0.0
σ = 1.0

A = np.array([[1., 0., 0.],

[α, ρ1, ρ2],
[0., 1., 0.]])
C = np.array([[0.], [σ], [0.]])
S_y = np.array([[1, 1.0, 0.]])
β, b0 = 0.95, -10.0
x0 = np.array([1.0, α / (1 - ρ1), α / (1 - ρ1)])

# Do simulation for complete markets

s = np.random.randint(0, 10000)
np.random.seed(s) # Seeds get set the same for both economies
out = complete_ss(β, b0, x0, A, C, S_y, 80)
c_hist_com, b_hist_com, y_hist_com, x_hist_com = out

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

# Consumption plots
ax[0].set_title('Consumption and income')
ax[0].plot(np.arange(N_simul), c_hist_com, label='consumption')
ax[0].plot(np.arange(N_simul), y_hist_com, label='income', alpha=.6, linestyle='--')
ax[0].legend()
ax[0].set_xlabel('Periods')
ax[0].set_ylim([80, 120])

# Debt plots
ax[1].set_title('Debt and income')
ax[1].plot(np.arange(N_simul), b_hist_com, label='debt')
ax[1].plot(np.arange(N_simul), y_hist_com, label='Income', alpha=.6, linestyle='--')
ax[1].legend()
ax[1].axhline(0, color='k')
ax[1].set_xlabel('Periods')

plt.show()

6.3. Linear State Space Version of Complete Markets Model 95

Advanced Quantitative Economics with Python

6.3.1 Interpretation of Graph

In the above graph, please note that:

• nonfinancial income fluctuates in a stationary manner.
• consumption is completely constant.
• the consumer’s debt fluctuates in a stationary manner; in fact, in this case, because nonfinancial income is a first-
order autoregressive process, the consumer’s debt is an exact affine function (meaning linear plus a constant) of the
consumer’s nonfinancial income.

6.3.2 Incomplete Markets Version

The incomplete markets version of the model with nonfinancial income being governed by a linear state space system is
described in permanent income model.
In that incomplete markerts setting, consumption follows a random walk and the consumer’s debt follows a process with
a unit root.

6.3.3 Finite State Markov Income Process

We now turn to a finite-state Markov version of the model in which the consumer’s nonfinancial income is an exact
function of a Markov state that takes one of 𝑁 values.
We’ll start with a setting in which in each version of our consumption-smoothing model, nonfinancial income is governed
by a two-state Markov chain (it’s easy to generalize this to an 𝑁 state Markov chain).
In particular, the state 𝑠𝑡 ∈ {1, 2} follows a Markov chain with transition probability matrix

𝑃𝑖𝑗 = ℙ{𝑠𝑡+1 = 𝑗 | 𝑠𝑡 = 𝑖}

where ℙ means conditional probability

Nonfinancial income {𝑦𝑡 } obeys

𝑦1̄ if 𝑠𝑡 = 1
𝑦𝑡 = {
𝑦2̄ if 𝑠𝑡 = 2

96 Chapter 6. Consumption Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

A consumer wishes to maximize

∞
𝔼 [∑ 𝛽 𝑡 𝑢(𝑐𝑡 )] where 𝑢(𝑐𝑡 ) = −(𝑐𝑡 − 𝛾)2 and 0 < 𝛽 < 1 (6.4)
𝑡=0

Here 𝛾 > 0 is a bliss level of consumption

6.3.4 Market Structure

Our complete and incomplete markets models differ in how thoroughly the market structure allows a consumer to transfer
resources across time and Markov states, there being more transfer opportunities in the complete markets setting than in
the incomplete markets setting.
Watch how these differences in opportunities affect
• how smooth consumption is across time and Markov states
• how the consumer chooses to make his levels of indebtedness behave over time and across Markov states

6.4 Model 1 (Complete Markets)

At each date 𝑡 ≥ 0, the consumer trades a full array of one-period ahead Arrow securities.
We assume that prices of these securities are exogenous to the consumer.
Exogenous means that they are unaffected by the consumer’s decisions.
In Markov state 𝑠𝑡 at time 𝑡, one unit of consumption in state 𝑠𝑡+1 at time 𝑡 + 1 costs 𝑞(𝑠𝑡+1 | 𝑠𝑡 ) units of the time 𝑡
consumption good.
The prices 𝑞(𝑠𝑡+1 | 𝑠𝑡 ) are given and can be organized into a matrix 𝑄 with 𝑄𝑖𝑗 = 𝑞(𝑗|𝑖)
At time 𝑡 = 0, the consumer starts with an inherited level of debt due at time 0 of 𝑏0 units of time 0 consumption goods.
The consumer’s budget constraint at 𝑡 ≥ 0 in Markov state 𝑠𝑡 is

𝑐𝑡 + 𝑏𝑡 ≤ 𝑦(𝑠𝑡 ) + ∑ 𝑞(𝑗 | 𝑠𝑡 ) 𝑏𝑡+1 (𝑗 | 𝑠𝑡 ) (6.5)

𝑗

where 𝑏𝑡 is the consumer’s one-period debt that falls due at time 𝑡 and 𝑏𝑡+1 (𝑗 | 𝑠𝑡 ) are the consumer’s time 𝑡 sales of the
time 𝑡 + 1 consumption good in Markov state 𝑗.
Thus
• 𝑞(𝑗 | 𝑠𝑡 )𝑏𝑡+1 (𝑗 | 𝑠𝑡 ) is a source of time 𝑡 financial income for the consumer in Markov state 𝑠𝑡
• 𝑏𝑡 ≡ 𝑏𝑡 (𝑗 | 𝑠𝑡−1 ) is a source of time 𝑡 expenditures for the consumer when 𝑠𝑡 = 𝑗
Remark: We are ignoring an important technicality here, namely, that the consumer’s choice of 𝑏𝑡+1 (𝑗| 𝑠𝑡 ) must respect
so-called natural debt limits that assure that it is feasible for the consumer to repay debts due even if he consumers zero
forevermore. We shall discuss such debt limits in another lecture.
A natural analog of Hall’s assumption that the one-period risk-free gross interest rate is 𝛽 −1 is

𝑞(𝑗 | 𝑖) = 𝛽𝑃𝑖𝑗 (6.6)

To understand how this is a natural analogue, observe that in state 𝑖 it costs ∑𝑗 𝑞(𝑗 | 𝑖) to purchase one unit of consumption
next period for sure, i.e., meaning no matter what Markov state 𝑗 occurs at 𝑡 + 1.

6.4. Model 1 (Complete Markets) 97

Advanced Quantitative Economics with Python

Hence the implied price of a risk-free claim on one unit of consumption next period is

∑ 𝑞(𝑗 | 𝑖) = ∑ 𝛽𝑃𝑖𝑗 = 𝛽
𝑗 𝑗

This confirms the sense in which (6.6) is a natural counterpart to Hall’s assumption that the risk-free one-period gross
interest rate is 𝑅 = 𝛽 −1 .
It is timely please to recall that the gross one-period risk-free interest rate is the reciprocal of the price at time 𝑡 of a
risk-free claim on one unit of consumption tomorrow.
First-order necessary conditions for maximizing the consumer’s expected utility subject to the sequence of budget con-
straints (6.5) are
𝑢′ (𝑐𝑡+1 )
𝛽 ℙ{𝑠𝑡+1 | 𝑠𝑡 } = 𝑞(𝑠𝑡+1 | 𝑠𝑡 )
𝑢′ (𝑐𝑡 )
for all 𝑠𝑡 , 𝑠𝑡+1 or, under our assumption (6.6) about Arrow security prices,

𝑐𝑡+1 = 𝑐𝑡 (6.7)

Thus, our consumer sets 𝑐𝑡 = 𝑐 ̄ for all 𝑡 ≥ 0 for some value 𝑐 ̄ that it is our job now to determine along with values for
𝑏𝑡+1 (𝑗|𝑠𝑡 = 𝑖) for 𝑖 = 1, 2 and 𝑗 = 1, 2.
We’ll use a guess and verify method to determine these objects
Guess: We’ll make the plausible guess that

𝑏𝑡+1 (𝑠𝑡+1 = 𝑗 | 𝑠𝑡 = 𝑖) = 𝑏(𝑗), 𝑖 = 1, 2; 𝑗 = 1, 2 (6.8)

so that the amount borrowed today depends only on tomorrow’s Markov state. (Why is this is a plausible guess?)
To determine 𝑐,̄ we shall deduce implications of the consumer’s budget constraints in each Markov state today and our
guess (6.8) about the consumer’s debt level choices.
For 𝑡 ≥ 1, these imply

𝑐 ̄ + 𝑏(1) = 𝑦(1) + 𝑞(1 | 1)𝑏(1) + 𝑞(2 | 1)𝑏(2)

(6.9)
𝑐 ̄ + 𝑏(2) = 𝑦(2) + 𝑞(1 | 2)𝑏(1) + 𝑞(2 | 2)𝑏(2)
or
𝑏(1) 𝑐̄ 𝑦(1) 𝑃 𝑃12 𝑏(1)
[ ]+[ ]=[ ] + 𝛽 [ 11 ][ ]
𝑏(2) 𝑐̄ 𝑦(2) 𝑃21 𝑃22 𝑏(2)

These are 2 equations in the 3 unknowns 𝑐,̄ 𝑏(1), 𝑏(2)

To get a third equation, we assume that at time 𝑡 = 0, 𝑏0 is debt due; and we assume that at time 𝑡 = 0, the Markov state
𝑠0 = 1
(We could instead have assumed that at time 𝑡 = 0 the Markov state 𝑠0 = 2, which would affect our answer as we shall
see)
Since we have assumed that 𝑠0 = 1, the budget constraint at time 𝑡 = 0 is

𝑐 ̄ + 𝑏0 = 𝑦(1) + 𝑞(1 | 1)𝑏(1) + 𝑞(2 | 1)𝑏(2) (6.10)

where 𝑏0 is the (exogenous) debt the consumer is assumed to bring into period 0
If we substitute (6.10) into the first equation of (6.9) and rearrange, we discover that

𝑏(1) = 𝑏0 (6.11)

98 Chapter 6. Consumption Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

We can then use the second equation of (6.9) to deduce the restriction

𝑦(1) − 𝑦(2) + [𝑞(1 | 1) − 𝑞(1 | 2) − 1]𝑏0 + [𝑞(2 | 1) + 1 − 𝑞(2 | 2)]𝑏(2) = 0, (6.12)

an equation that we can solve for the unknown 𝑏(2).

Knowing 𝑏(1) and 𝑏(2), we can solve equation (6.10) for the constant level of consumption 𝑐.̄

6.4.1 Key Outcomes

The preceding calculations indicate that in the complete markets version of our model, we obtain the following striking
results:
• The consumer chooses to make consumption perfectly constant across time and across Markov states.
• State-contingent debt purchases 𝑏𝑡+1 (𝑠𝑡+1 = 𝑗|𝑠𝑡 = 𝑖) depend only on 𝑗
• If the initial Markov state is 𝑠0 = 𝑗 and initial consumer debt is 𝑏0 , then debt in Markov state 𝑗 satisfies 𝑏(𝑗) = 𝑏0
To summarize what we have achieved up to now, we have computed the constant level of consumption 𝑐 ̄ and indicated
how that level depends on the underlying specifications of preferences, Arrow securities prices, the stochastic process of
exogenous nonfinancial income, and the initial debt level 𝑏0
• The consumer’s debt neither accumulates, nor decumulates, nor drifts – instead, the debt level each period is an
exact function of the Markov state, so in the two-state Markov case, it switches between two values.
• We have verified guess (6.8).
• When the state 𝑠𝑡 returns to the initial state 𝑠0 , debt returns to the initial debt level.
• Debt levels in all other states depend on virtually all remaining parameters of the model.

6.4.2 Code

Here’s some code that, among other things, contains a function called consumption_complete().
This function computes {𝑏(𝑖)}𝑁
𝑖=1 , 𝑐 ̄ as outcomes given a set of parameters for the general case with 𝑁 Markov states
under the assumption of complete markets

class ConsumptionProblem:
"""
The data for a consumption problem, including some default values.
"""

def __init__(self,
β=.96,
y=[2, 1.5],
b0=3,
P=[[.8, .2],
[.4, .6]],
init=0):
"""
Parameters
----------

β : discount factor
y : list containing the two income levels
b0 : debt in period 0 (= initial state debt level)
(continues on next page)

6.4. Model 1 (Complete Markets) 99

Advanced Quantitative Economics with Python

(continued from previous page)

P : 2x2 transition matrix
init : index of initial state s0
"""
self.β = β
self.y = np.asarray(y)
self.b0 = b0
self.P = np.asarray(P)
self.init = init

def simulate(self, N_simul=80, random_state=1):

"""
Parameters
----------

N_simul : number of periods for simulation

random_state : random state for simulating Markov chain
"""
# For the simulation define a quantecon MC class
mc = qe.MarkovChain(self.P)
s_path = mc.simulate(N_simul, init=self.init, random_state=random_state)

return s_path

def consumption_complete(cp):
"""
Computes endogenous values for the complete market case.

Parameters
----------

cp : instance of ConsumptionProblem

Returns
-------

c_bar : constant consumption

b : optimal debt in each state

associated with the price system

Q = β * P
"""
β, P, y, b0, init = cp.β, cp.P, cp.y, cp.b0, cp.init # Unpack

Q = β * P # assumed price system

# construct matrices of augmented equation system

n = P.shape[0] + 1

y_aug = np.empty((n, 1))

y_aug[0, 0] = y[init] - b0
y_aug[1:, 0] = y

Q_aug = np.zeros((n, n))

Q_aug[0, 1:] = Q[init, :]

(continues on next page)

100 Chapter 6. Consumption Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

Q_aug[1:, 1:] = Q

A = np.zeros((n, n))
A[:, 0] = 1
A[1:, 1:] = np.eye(n-1)

x = np.linalg.inv(A - Q_aug) @ y_aug

c_bar = x[0, 0]
b = x[1:, 0]

return c_bar, b

def consumption_incomplete(cp, s_path):

"""
Computes endogenous values for the incomplete market case.

Parameters
----------

cp : instance of ConsumptionProblem
s_path : the path of states
"""
β, P, y, b0 = cp.β, cp.P, cp.y, cp.b0 # Unpack

N_simul = len(s_path)

# Useful variables
n = len(y)
y.shape = (n, 1)
v = np.linalg.inv(np.eye(n) - β * P) @ y

# Store consumption and debt path

b_path, c_path = np.ones(N_simul+1), np.ones(N_simul)
b_path[0] = b0

# Optimal decisions from (12) and (13)

db = ((1 - β) * v - y) / β

for i, s in enumerate(s_path):
c_path[i] = (1 - β) * (v - np.full((n, 1), b_path[i]))[s, 0]
b_path[i + 1] = b_path[i] + db[s, 0]

return c_path, b_path[:-1], y[s_path]

Let’s test by checking that 𝑐 ̄ and 𝑏2 satisfy the budget constraint

cp = ConsumptionProblem()
c_bar, b = consumption_complete(cp)
np.isclose(c_bar + b[1] - cp.y[1] - (cp.β * cp.P)[1, :] @ b, 0)

True

Below, we’ll take the outcomes produced by this code – in particular the implied consumption and debt paths – and
compare them with outcomes from an incomplete markets model in the spirit of Hall [Hall, 1978]

6.4. Model 1 (Complete Markets) 101

Advanced Quantitative Economics with Python

6.5 Model 2 (One-Period Risk-Free Debt Only)

This is a version of the original model of Hall (1978) in which the consumer’s ability to substitute intertemporally is
constrained by his ability to buy or sell only one security, a risk-free one-period bond bearing a constant gross interest
rate that equals 𝛽 −1 .
Given an initial debt 𝑏0 at time 0, the consumer faces a sequence of budget constraints

𝑐𝑡 + 𝑏𝑡 = 𝑦𝑡 + 𝛽𝑏𝑡+1 , 𝑡≥0

where 𝛽 is the price at time 𝑡 of a risk-free claim on one unit of time consumption at time 𝑡 + 1.
First-order conditions for the consumer’s problem are

∑ 𝑢′ (𝑐𝑡+1,𝑗 )𝑃𝑖𝑗 = 𝑢′ (𝑐𝑡,𝑖 )

𝑗

For our assumed quadratic utility function this implies

∑ 𝑐𝑡+1,𝑗 𝑃𝑖𝑗 = 𝑐𝑡,𝑖 (6.13)

𝑗

which for our finite-state Markov setting is Hall’s (1978) conclusion that consumption follows a random walk.
As we saw in our first lecture on the permanent income model, this leads to
∞
𝑏𝑡 = 𝔼𝑡 ∑ 𝛽 𝑗 𝑦𝑡+𝑗 − (1 − 𝛽)−1 𝑐𝑡 (6.14)
𝑗=0

and
∞
𝑐𝑡 = (1 − 𝛽) [𝔼𝑡 ∑ 𝛽 𝑗 𝑦𝑡+𝑗 − 𝑏𝑡 ] (6.15)
𝑗=0

Equation (6.15) expresses 𝑐𝑡 as a net interest rate factor 1 − 𝛽 times the sum of the expected present value of nonfinancial
∞
income 𝔼𝑡 ∑𝑗=0 𝛽 𝑗 𝑦𝑡+𝑗 and financial wealth −𝑏𝑡 .
Substituting (6.15) into the one-period budget constraint and rearranging leads to
∞
𝑏𝑡+1 − 𝑏𝑡 = 𝛽 −1 [(1 − 𝛽)𝔼𝑡 ∑ 𝛽 𝑗 𝑦𝑡+𝑗 − 𝑦𝑡 ] (6.16)
𝑗=0

∞
Now let’s calculate the key term 𝔼𝑡 ∑𝑗=0 𝛽 𝑗 𝑦𝑡+𝑗 in our finite Markov chain setting.
Define the expected discounted present value of non-financial income
∞
𝑣𝑡 ∶= 𝔼𝑡 ∑ 𝛽 𝑗 𝑦𝑡+𝑗
𝑗=0

which in the spirit of dynamic programming we can write as a Bellman equation

𝑣𝑡 ∶= 𝑦𝑡 + 𝛽𝔼𝑡 𝑣𝑡+1

In our two-state Markov chain setting, 𝑣𝑡 = 𝑣(1) when 𝑠𝑡 = 1 and 𝑣𝑡 = 𝑣(2) when 𝑠𝑡 = 2.
Therefore, we can write our Bellman equation as

𝑣(1) = 𝑦(1) + 𝛽𝑃11 𝑣(1) + 𝛽𝑃12 𝑣(2)

𝑣(2) = 𝑦(2) + 𝛽𝑃21 𝑣(1) + 𝛽𝑃22 𝑣(2)

102 Chapter 6. Consumption Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

𝑣 ⃗ = 𝑦 ⃗ + 𝛽𝑃 𝑣 ⃗

𝑣(1) 𝑦(1)
where 𝑣 ⃗ = [ ] and 𝑦 ⃗ = [ ].
𝑣(2) 𝑦(2)
We can also write the last expression as

𝑣 ⃗ = (𝐼 − 𝛽𝑃 )−1 𝑦 ⃗

In our finite Markov chain setting, from expression (6.15), consumption at date 𝑡 when debt is 𝑏𝑡 and the Markov state
today is 𝑠𝑡 = 𝑖 is evidently

𝑐(𝑏𝑡 , 𝑖) = (1 − 𝛽) ([(𝐼 − 𝛽𝑃 )−1 𝑦]⃗ 𝑖 − 𝑏𝑡 ) (6.17)

and the increment to debt is

𝑏𝑡+1 − 𝑏𝑡 = 𝛽 −1 [(1 − 𝛽)𝑣(𝑖) − 𝑦(𝑖)] (6.18)

6.5.1 Summary of Outcomes

In contrast to outcomes in the complete markets model, in the incomplete markets model
• consumption drifts over time as a random walk; the level of consumption at time 𝑡 depends on the level of debt that
the consumer brings into the period as well as the expected discounted present value of nonfinancial income at 𝑡.
• the consumer’s debt drifts upward over time in response to low realizations of nonfinancial income and drifts
downward over time in response to high realizations of nonfinancial income.
• the drift over time in the consumer’s debt and the dependence of current consumption on today’s debt level account
for the drift over time in consumption.

6.5.2 The Incomplete Markets Model

The code above also contains a function called consumption_incomplete() that uses (6.17) and (6.18) to
• simulate paths of 𝑦𝑡 , 𝑐𝑡 , 𝑏𝑡+1
• plot these against values of 𝑐,̄ 𝑏(𝑠1 ), 𝑏(𝑠2 ) found in a corresponding complete markets economy
Let’s try this, using the same parameters in both complete and incomplete markets economies

cp = ConsumptionProblem()
s_path = cp.simulate()
N_simul = len(s_path)

c_bar, debt_complete = consumption_complete(cp)

c_path, debt_path, y_path = consumption_incomplete(cp, s_path)

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

ax[0].set_title('Consumption paths')
ax[0].plot(np.arange(N_simul), c_path, label='incomplete market')
ax[0].plot(np.arange(N_simul), np.full(N_simul, c_bar),
(continues on next page)

6.5. Model 2 (One-Period Risk-Free Debt Only) 103

Advanced Quantitative Economics with Python

(continued from previous page)

label='complete market')
ax[0].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[0].legend()
ax[0].set_xlabel('Periods')

ax[1].set_title('Debt paths')
ax[1].plot(np.arange(N_simul), debt_path, label='incomplete market')
ax[1].plot(np.arange(N_simul), debt_complete[s_path],
label='complete market')
ax[1].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[1].legend()
ax[1].axhline(0, color='k', ls='--')
ax[1].set_xlabel('Periods')

plt.show()

In the graph on the left, for the same sample path of nonfinancial income 𝑦𝑡 , notice that
• consumption is constant when there are complete markets, but takes a random walk in the incomplete markets
version of the model.
• the consumer’s debt oscillates between two values that are functions of the Markov state in the complete markets
model, while the consumer’s debt drifts in a “unit root” fashion in the incomplete markets economy.

6.5.3 A sequel

In tax smoothing with complete and incomplete markets, we reinterpret the mathematics and Python code presented in this
lecture in order to construct tax-smoothing models in the incomplete markets tradition of Barro [Barro, 1979] as well as
in the complete markets tradition of Lucas and Stokey [Lucas and Stokey, 1983].

104 Chapter 6. Consumption Smoothing with Complete and Incomplete Markets

CHAPTER

SEVEN

TAX SMOOTHING WITH COMPLETE AND INCOMPLETE MARKETS

In addition to what’s in Anaconda, this lecture uses the library:

!pip install --upgrade quantecon

7.1 Overview

This lecture describes tax-smoothing models that are counterparts to consumption-smoothing models in Consumption
Smoothing with Complete and Incomplete Markets.
• one is in the complete markets tradition of Lucas and Stokey [Lucas and Stokey, 1983].
• the other is in the incomplete markets tradition of Barro [Barro, 1979].
Complete markets allow a government to buy or sell claims contingent on all possible Markov states.
Incomplete markets allow a government to buy or sell only a limited set of securities, often only a single risk-free security.
Barro [Barro, 1979] worked in an incomplete markets tradition by assuming that the only asset that can be traded is a
risk-free one period bond.
In his consumption-smoothing model, Hall [Hall, 1978] had assumed an exogenous stochastic process of nonfinancial
income and an exogenous gross interest rate on one period risk-free debt that equals 𝛽 −1 , where 𝛽 ∈ (0, 1) is also a
consumer’s intertemporal discount factor.
Barro [Barro, 1979] made an analogous assumption about the risk-free interest rate in a tax-smoothing model that turns
out to have the same mathematical structure as Hall’s consumption-smoothing model.
To get Barro’s model from Hall’s, all we have to do is to rename variables.
We maintain Hall’s and Barro’s assumption about the interest rate when we describe an incomplete markets version of
our model.
In addition, we extend their assumption about the interest rate to an appropriate counterpart to create a “complete markets”
model in the style of Lucas and Stokey [Lucas and Stokey, 1983].

105
Advanced Quantitative Economics with Python

7.1.1 Isomorphism between Consumption and Tax Smoothing

For each version of a consumption-smoothing model, a tax-smoothing counterpart can be obtained simply by relabeling
• consumption as tax collections
• a consumer’s one-period utility function as a government’s one-period loss function from collecting taxes that im-
pose deadweight welfare losses
• a consumer’s nonfinancial income as a government’s purchases
• a consumer’s debt as a government’s assets
Thus, we can convert the consumption-smoothing models in lecture Consumption Smoothing with Complete and Incomplete
Markets into tax-smoothing models by setting 𝑐𝑡 = 𝑇𝑡 , 𝑦𝑡 = 𝐺𝑡 , and −𝑏𝑡 = 𝑎𝑡 , where 𝑇𝑡 is total tax collections, {𝐺𝑡 }
is an exogenous government expenditures process, and 𝑎𝑡 is the government’s holdings of one-period risk-free bonds
coming maturing at the due at the beginning of time 𝑡.
For elaborations on this theme, please see Optimal Savings II: LQ Techniques and later parts of this lecture.
We’ll spend most of this lecture studying acquire finite-state Markov specification, but will also treat the linear state space
specification.

Link to History

For those who love history, President Thomas Jefferson’s Secretary of Treasury Albert Gallatin (1807) [Gallatin, 1837]
seems to have prescribed policies that come from Barro’s model [Barro, 1979]
Let’s start with some standard imports:

import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt

To exploit the isomorphism between consumption-smoothing and tax-smoothing models, we simply use code from Con-
sumption Smoothing with Complete and Incomplete Markets

7.1.2 Code

Among other things, this code contains a function called consumption_complete().

This function computes {𝑏(𝑖)}𝑁
𝑖=1 , 𝑐 ̄ as outcomes given a set of parameters for the general case with 𝑁 Markov states
under the assumption of complete markets

class ConsumptionProblem:
"""
The data for a consumption problem, including some default values.
"""

def __init__(self,
β=.96,
y=[2, 1.5],
b0=3,
P=[[.8, .2],
[.4, .6]],
init=0):
"""
(continues on next page)

106 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

Parameters
----------

β : discount factor
y : list containing the two income levels
b0 : debt in period 0 (= initial state debt level)
P : 2x2 transition matrix
init : index of initial state s0
"""
self.β = β
self.y = np.asarray(y)
self.b0 = b0
self.P = np.asarray(P)
self.init = init

def simulate(self, N_simul=80, random_state=1):

"""
Parameters
----------

N_simul : number of periods for simulation

return s_path

def consumption_complete(cp):
"""
Computes endogenous values for the complete market case.

Parameters
----------

cp : instance of ConsumptionProblem

Returns
-------

c_bar : constant consumption

b : optimal debt in each state

associated with the price system

Q = β * P
"""
β, P, y, b0, init = cp.β, cp.P, cp.y, cp.b0, cp.init # Unpack

Q = β * P # assumed price system

# construct matrices of augmented equation system

n = P.shape[0] + 1

(continues on next page)

7.1. Overview 107

Advanced Quantitative Economics with Python

(continued from previous page)

y_aug = np.empty((n, 1))
y_aug[0, 0] = y[init] - b0
y_aug[1:, 0] = y

Q_aug = np.zeros((n, n))

Q_aug[0, 1:] = Q[init, :]
Q_aug[1:, 1:] = Q

A = np.zeros((n, n))
A[:, 0] = 1
A[1:, 1:] = np.eye(n-1)

x = np.linalg.inv(A - Q_aug) @ y_aug

c_bar = x[0, 0]
b = x[1:, 0]

return c_bar, b

def consumption_incomplete(cp, s_path):

"""
Computes endogenous values for the incomplete market case.

Parameters
----------

cp : instance of ConsumptionProblem
s_path : the path of states
"""
β, P, y, b0 = cp.β, cp.P, cp.y, cp.b0 # Unpack

N_simul = len(s_path)

# Useful variables
n = len(y)
y.shape = (n, 1)
v = np.linalg.inv(np.eye(n) - β * P) @ y

# Store consumption and debt path

b_path, c_path = np.ones(N_simul+1), np.ones(N_simul)
b_path[0] = b0

# Optimal decisions from (12) and (13)

db = ((1 - β) * v - y) / β

for i, s in enumerate(s_path):
c_path[i] = (1 - β) * (v - np.full((n, 1), b_path[i]))[s, 0]
b_path[i + 1] = b_path[i] + db[s, 0]

return c_path, b_path[:-1], y[s_path]

108 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.1.3 Revisiting the consumption-smoothing model

cp = ConsumptionProblem()
s_path = cp.simulate()
N_simul = len(s_path)

c_bar, debt_complete = consumption_complete(cp)

c_path, debt_path, y_path = consumption_incomplete(cp, s_path)

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

ax[0].set_title('Consumption paths')
ax[0].plot(np.arange(N_simul), c_path, label='incomplete market')
ax[0].plot(np.arange(N_simul), np.full(N_simul, c_bar), label='complete market')
ax[0].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[0].legend()
ax[0].set_xlabel('Periods')

ax[1].set_title('Debt paths')
ax[1].plot(np.arange(N_simul), debt_path, label='incomplete market')
ax[1].plot(np.arange(N_simul), debt_complete[s_path], label='complete market')
ax[1].plot(np.arange(N_simul), y_path, label='income', alpha=.6, ls='--')
ax[1].legend()
ax[1].axhline(0, color='k', ls='--')
ax[1].set_xlabel('Periods')

plt.show()

In the graph on the left, for the same sample path of nonfinancial income 𝑦𝑡 , notice that
• consumption is constant when there are complete markets.
• consumption takes a random walk in the incomplete markets version of the model.
• the consumer’s debt oscillates between two values that are functions of the Markov state in the complete markets
model.

7.1. Overview 109

Advanced Quantitative Economics with Python

• the consumer’s debt drifts because it contains a unit root in the incomplete markets economy.

Relabeling variables to create tax-smoothing models

As indicated above, we relabel variables to acquire tax-smoothing interpretations of the complete markets and incomplete
markets consumption-smoothing models.

fig, ax = plt.subplots(1, 2, figsize=(14, 4))

ax[0].set_title('Tax collection paths')

ax[0].plot(np.arange(N_simul), c_path, label='incomplete market')
ax[0].plot(np.arange(N_simul), np.full(N_simul, c_bar), label='complete market')
ax[0].plot(np.arange(N_simul), y_path, label='govt expenditures', alpha=.6, ls='--')
ax[0].legend()
ax[0].set_xlabel('Periods')
ax[0].set_ylim([1.4, 2.1])

ax[1].set_title('Government assets paths')

ax[1].plot(np.arange(N_simul), debt_path, label='incomplete market')
ax[1].plot(np.arange(N_simul), debt_complete[s_path], label='complete market')
ax[1].plot(np.arange(N_simul), y_path, label='govt expenditures', ls='--')
ax[1].legend()
ax[1].axhline(0, color='k', ls='--')
ax[1].set_xlabel('Periods')

plt.show()

7.2 Tax Smoothing with Complete Markets

It is instructive to focus on a simple tax-smoothing example with complete markets.

This example illustrates how, in a complete markets model like that of Lucas and Stokey [Lucas and Stokey, 1983], the
government purchases insurance from the private sector.
Payouts from the insurance it had purchased allows the government to avoid raising taxes when emergencies make gov-
ernment expenditures surge.
We assume that government expenditures take one of two values 𝐺1 < 𝐺2 , where Markov state 1 means “peace” and
Markov state 2 means “war”.

110 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

The government budget constraint in Markov state 𝑖 is

𝑇𝑖 + 𝑏𝑖 = 𝐺𝑖 + ∑ 𝑄𝑖𝑗 𝑏𝑗
𝑗

where

𝑄𝑖𝑗 = 𝛽𝑃𝑖𝑗

is the price today of one unit of goods in Markov state 𝑗 tomorrow when the Markov state is 𝑖 today.
𝑏𝑖 is the government’s level of assets when it arrives in Markov state 𝑖.
That is, 𝑏𝑖 equals one-period state-contingent claims owed to the government that fall due at time 𝑡 when the Markov state
is 𝑖.
Thus, if 𝑏𝑖 < 0, it means the government is owed 𝑏𝑖 or owes −𝑏𝑖 when the economy arrives in Markov state 𝑖 at time 𝑡.
In our examples below, this happens when in a previous war-time period the government has sold an Arrow securities
paying off −𝑏𝑖 in peacetime Markov state 𝑖
It can be enlightening to express the government’s budget constraint in Markov state 𝑖 as

𝑇𝑖 = 𝐺𝑖 + (∑ 𝑄𝑖𝑗 𝑏𝑗 − 𝑏𝑖 )
𝑗

in which the term (∑𝑗 𝑄𝑖𝑗 𝑏𝑗 − 𝑏𝑖 ) equals the net amount that the government spends to purchase one-period Arrow
securities that will pay off next period in Markov states 𝑗 = 1, … , 𝑁 after it has received payments 𝑏𝑖 this period.

7.3 Returns on State-Contingent Debt

𝑁
Notice that ∑𝑗′ =1 𝑄𝑖𝑗′ 𝑏(𝑗′ ) is the amount that the government spends in Markov state 𝑖 at time 𝑡 to purchase one-period
state-contingent claims that will pay off in Markov state 𝑗′ at time 𝑡 + 1.
Then the ex post one-period gross return on the portfolio of government assets held from state 𝑖 at time 𝑡 to state 𝑗 at time
𝑡 + 1 is
𝑏(𝑗)
𝑅(𝑗|𝑖) = 𝑁
∑𝑗′ =1 𝑄𝑖𝑗′ 𝑏(𝑗′ )

The cumulative return earned from putting 1 unit of time 𝑡 goods into the government portfolio of state-contingent
securities at time 𝑡 and then rolling over the proceeds into the government portfolio each period thereafter is

𝑅𝑇 (𝑠𝑡+𝑇 , 𝑠𝑡+𝑇 −1 , … , 𝑠𝑡 ) ≡ 𝑅(𝑠𝑡+1 |𝑠𝑡 )𝑅(𝑠𝑡+2 |𝑠𝑡+1 ) ⋯ 𝑅(𝑠𝑡+𝑇 |𝑠𝑡+𝑇 −1 )

Here is some code that computes one-period and cumulative returns on the government portfolio in the finite-state Markov
version of our complete markets model.
Convention: In this code, when 𝑃𝑖𝑗 = 0, we arbitrarily set 𝑅(𝑗|𝑖) to be 0.

def ex_post_gross_return(b, cp):

"""
calculate the ex post one-period gross return on the portfolio
of government assets, given b and Q.
"""
Q = cp.β * cp.P
(continues on next page)

7.3. Returns on State-Contingent Debt 111

Advanced Quantitative Economics with Python

(continued from previous page)

values = Q @ b

n = len(b)
R = np.zeros((n, n))

for i in range(n):
ind = cp.P[i, :] != 0
R[i, ind] = b[ind] / values[i]

return R

def cumulative_return(s_path, R):

"""
compute cumulative return from holding 1 unit market portfolio
of government bonds, given some simulated state path.
"""
T = len(s_path)

RT_path = np.empty(T)
RT_path[0] = 1
RT_path[1:] = np.cumprod([R[s_path[t], s_path[t+1]] for t in range(T-1)])

return RT_path

7.3.1 An Example of Tax Smoothing

We’ll study a tax-smoothing model with two Markov states.

In Markov state 1, there is peace and government expenditures are low.
In Markov state 2, there is war and government expenditures are high.
We’ll compute optimal policies in both complete and incomplete markets settings.
Then we’ll feed in a particular assumed path of Markov states and study outcomes.
• We’ll assume that the initial Markov state is state 1, which means we start from a state of peace.
• The government then experiences 3 time periods of war and come back to peace again.
• The history of Markov states is therefore {𝑝𝑒𝑎𝑐𝑒, 𝑤𝑎𝑟, 𝑤𝑎𝑟, 𝑤𝑎𝑟, 𝑝𝑒𝑎𝑐𝑒}.
In addition, as indicated above, to simplify our example, we’ll set the government’s initial asset level to 1, so that 𝑏1 = 1.
Here’s code that itinitializes government assets to be unity in an initial peace time Markov state.

# Parameters
β = .96

# change notation y to g in the tax-smoothing example

g = [1, 2]
b0 = 1
P = np.array([[.8, .2],
[.4, .6]])

cp = ConsumptionProblem(β, g, b0, P)
(continues on next page)

112 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

Q = β * P

# change notation c_bar to T_bar in the tax-smoothing example

T_bar, b = consumption_complete(cp)
R = ex_post_gross_return(b, cp)
s_path = [0, 1, 1, 1, 0]
RT_path = cumulative_return(s_path, R)

print(f"P \n {P}")
print(f"Q \n {Q}")
print(f"Govt expenditures in peace and war = {g}")
print(f"Constant tax collections = {T_bar}")
print(f"Govt debts in two states = {-b}")

msg = """
Now let's check the government's budget constraint in peace and war.
Our assumptions imply that the government always purchases 0 units of the
Arrow peace security.
"""
print(msg)

AS1 = Q[0, :] @ b
# spending on Arrow security
# since the spending on Arrow peace security is not 0 anymore after we change b0 to 1
print(f"Spending on Arrow security in peace = {AS1}")
AS2 = Q[1, :] @ b
print(f"Spending on Arrow security in war = {AS2}")

print("")
# tax collections minus debt levels
print("Government tax collections minus debt levels in peace and war")
TB1 = T_bar + b[0]
print(f"T+b in peace = {TB1}")
TB2 = T_bar + b[1]
print(f"T+b in war = {TB2}")

print("")
print("Total government spending in peace and war")
G1 = g[0] + AS1
G2 = g[1] + AS2
print(f"Peace = {G1}")
print(f"War = {G2}")

print("")
print("Let's see ex-post and ex-ante returns on Arrow securities")

Π = np.reciprocal(Q)
exret = Π
print(f"Ex-post returns to purchase of Arrow securities = \n {exret}")
exant = Π * P
print(f"Ex-ante returns to purchase of Arrow securities \n {exant}")

print("")
print("The Ex-post one-period gross return on the portfolio of government assets")
print(R)

(continues on next page)

7.3. Returns on State-Contingent Debt 113

Advanced Quantitative Economics with Python

(continued from previous page)

print("")
print("The cumulative return earned from holding 1 unit market portfolio of␣
↪government bonds")

print(RT_path[-1])

P
[[0.8 0.2]
[0.4 0.6]]
Q
[[0.768 0.192]
[0.384 0.576]]
Govt expenditures in peace and war = [1, 2]
Constant tax collections = 1.2716883116883118
Govt debts in two states = [-1. -2.62337662]

Now let's check the government's budget constraint in peace and war.
Our assumptions imply that the government always purchases 0 units of the
Arrow peace security.

Spending on Arrow security in peace = 1.2716883116883118

Spending on Arrow security in war = 1.895064935064935

Government tax collections minus debt levels in peace and war

T+b in peace = 2.2716883116883118
T+b in war = 3.895064935064935

Total government spending in peace and war

Peace = 2.2716883116883118
War = 3.895064935064935

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities =
[[1.30208333 5.20833333]
[2.60416667 1.73611111]]
Ex-ante returns to purchase of Arrow securities
[[1.04166667 1.04166667]
[1.04166667 1.04166667]]

The Ex-post one-period gross return on the portfolio of government assets

[[0.78635621 2.0629085 ]
[0.5276864 1.38432018]]

The cumulative return earned from holding 1 unit market portfolio of government␣
↪bonds

2.0860704239993675

114 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.3.2 Explanation

In this example, the government always purchase 1 units of the Arrow security that pays off in peace time (Markov state
1).
And it purchases a higher amount of the security that pays off in war time (Markov state 2).
Thus, this is an example in which
• during peacetime, the government purchases insurance against the possibility that war breaks out next period
• during wartime, the government purchases insurance against the possibility that war continues another period
• so long as peace continues, the ex post return on insurance against war is low
• when war breaks out or continues, the ex post return on insurance against war is high
• given the history of states that we assumed, the value of one unit of the portfolio of government assets eventually
doubles in the end because of high returns during wartime.
We recommend plugging the quantities computed above into the government budget constraints in the two Markov states
and staring.

Exercise 7.3.1
Try changing the Markov transition matrix so that

1 0
𝑃 =[ ]
.2 .8

Also, start the system in Markov state 2 (war) with initial government assets −10, so that the government starts the war
in debt and 𝑏2 = −10.

7.4 More Finite Markov Chain Tax-Smoothing Examples

To interpret some episodes in the fiscal history of the United States, we find it interesting to study a few more examples.
We compute examples in an 𝑁 state Markov setting under both complete and incomplete markets.
These examples differ in how Markov states are jumping between peace and war.
To wrap procedures for solving models, relabeling graphs so that we record government debt rather than government
assets, and displaying results, we construct a Python class.

class TaxSmoothingExample:
"""
construct a tax-smoothing example, by relabeling consumption problem class.
"""
def __init__(self, g, P, b0, states, β=.96,
init=0, s_path=None, N_simul=80, random_state=1):

self.states = states # state names

# if the path of states is not specified

if s_path is None:
self.cp = ConsumptionProblem(β, g, b0, P, init=init)
self.s_path = self.cp.simulate(N_simul=N_simul, random_state=random_state)
(continues on next page)

7.4. More Finite Markov Chain Tax-Smoothing Examples 115

Advanced Quantitative Economics with Python

(continued from previous page)

# if the path of states is specified
else:
self.cp = ConsumptionProblem(β, g, b0, P, init=s_path[0])
self.s_path = s_path

# solve for complete market case

self.T_bar, self.b = consumption_complete(self.cp)
self.debt_value = - (β * P @ self.b).T

# solve for incomplete market case

self.T_path, self.asset_path, self.g_path = \
consumption_incomplete(self.cp, self.s_path)

# calculate returns on state-contingent debt

self.R = ex_post_gross_return(self.b, self.cp)
self.RT_path = cumulative_return(self.s_path, self.R)

def display(self):

# plot graphs
N = len(self.T_path)

plt.figure()
plt.title('Tax collection paths')
plt.plot(np.arange(N), self.T_path, label='incomplete market')
plt.plot(np.arange(N), np.full(N, self.T_bar), label='complete market')
plt.plot(np.arange(N), self.g_path, label='govt expenditures', alpha=.6, ls='-
↪-')
plt.legend()
plt.xlabel('Periods')
plt.show()

plt.title('Government debt paths')

plt.plot(np.arange(N), -self.asset_path, label='incomplete market')
plt.plot(np.arange(N), -self.b[self.s_path], label='complete market')
plt.plot(np.arange(N), self.g_path, label='govt expenditures', ls='--')
plt.plot(np.arange(N), self.debt_value[self.s_path], label="value of debts␣
↪today")

plt.legend()
plt.axhline(0, color='k', ls='--')
plt.xlabel('Periods')
plt.show()

fig, ax = plt.subplots()
ax.set_title('Cumulative return path (complete markets)')
line1 = ax.plot(np.arange(N), self.RT_path, color='blue')[0]
c1 = line1.get_color()
ax.set_xlabel('Periods')
ax.set_ylabel('Cumulative return', color=c1)

ax_ = ax.twinx()
line2 = ax_.plot(np.arange(N), self.g_path, ls='--', color='green')[0]
c2 = line2.get_color()
ax_.set_ylabel('Government expenditures', color=c2)

plt.show()

(continues on next page)

116 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

# plot detailed information

Q = self.cp.β * self.cp.P

print(f"P \n {self.cp.P}")
print(f"Q \n {Q}")
print(f"Govt expenditures in {', '.join(self.states)} = {self.cp.y.flatten()}
↪")
print(f"Constant tax collections = {self.T_bar}")
print(f"Govt debt in {len(self.states)} states = {-self.b}")

print("")
print(f"Government tax collections minus debt levels in {', '.join(self.
↪states)}")

for i in range(len(self.states)):
TB = self.T_bar + self.b[i]
print(f" T+b in {self.states[i]} = {TB}")

print("")
print(f"Total government spending in {', '.join(self.states)}")
for i in range(len(self.states)):
G = self.cp.y[i, 0] + Q[i, :] @ self.b
print(f" {self.states[i]} = {G}")

print("")
print("Let's see ex-post and ex-ante returns on Arrow securities \n")

print(f"Ex-post returns to purchase of Arrow securities:")

for i in range(len(self.states)):
for j in range(len(self.states)):
if Q[i, j] != 0.:
print(f" π({self.states[j]}|{self.states[i]}) = {1/Q[i, j]}")

print("")
exant = 1 / self.cp.β
print(f"Ex-ante returns to purchase of Arrow securities = {exant}")

print("")
print("The Ex-post one-period gross return on the portfolio of government␣
↪assets")

print(self.R)

print("")
print("The cumulative return earned from holding 1 unit market portfolio of␣
↪government bonds")

print(self.RT_path[-1])

7.4. More Finite Markov Chain Tax-Smoothing Examples 117

Advanced Quantitative Economics with Python

7.4.1 Parameters

γ = .1
λ = .1
ϕ = .1
θ = .1
ψ = .1
g_L = .5
g_M = .8
g_H = 1.2
β = .96

7.4.2 Example 1

This example is designed to produce some stylized versions of tax, debt, and deficit paths followed by the United States
during and after the Civil War and also during and after World War I.
We set the Markov chain to have three states
1−𝜆 𝜆 0
𝑃 =⎡
⎢ 0 1 − 𝜙 𝜙⎤⎥
⎣ 0 0 1⎦

where the government expenditure vector 𝑔 = [𝑔𝐿 𝑔𝐻 𝑔𝑀 ] where 𝑔𝐿 < 𝑔𝑀 < 𝑔𝐻 .

We set 𝑏0 = 1 and assume that the initial Markov state is state 1 so that the system starts off in peace.
These parameters have government expenditure beginning at a low level, surging during the war, then decreasing after
the war to a level that exceeds its prewar level.
(This type of pattern occurred in the US Civil War and World War I experiences.)

g_ex1 = [g_L, g_H, g_M]

P_ex1 = np.array([[1-λ, λ, 0],
[0, 1-ϕ, ϕ],
[0, 0, 1]])
b0_ex1 = 1
states_ex1 = ['peace', 'war', 'postwar']

ts_ex1 = TaxSmoothingExample(g_ex1, P_ex1, b0_ex1, states_ex1, random_state=1)

ts_ex1.display()

118 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.4. More Finite Markov Chain Tax-Smoothing Examples 119

Advanced Quantitative Economics with Python

120 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

P
[[0.9 0.1 0. ]
[0. 0.9 0.1]
[0. 0. 1. ]]
Q
[[0.864 0.096 0. ]
[0. 0.864 0.096]
[0. 0. 0.96 ]]
Govt expenditures in peace, war, postwar = [0.5 1.2 0.8]
Constant tax collections = 0.7548096885813149
Govt debt in 3 states = [-1. -4.07093426 -1.12975779]

Government tax collections minus debt levels in peace, war, postwar

T+b in peace = 1.754809688581315
T+b in war = 4.825743944636679
T+b in postwar = 1.8845674740484442

Total government spending in peace, war, postwar

peace = 1.754809688581315
war = 4.825743944636679
postwar = 1.8845674740484442

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:

π(peace|peace) = 1.1574074074074074
π(war|peace) = 10.416666666666666
(continues on next page)

7.4. More Finite Markov Chain Tax-Smoothing Examples 121

Advanced Quantitative Economics with Python

(continued from previous page)

π(war|war) = 1.1574074074074074
π(postwar|war) = 10.416666666666666
π(postwar|postwar) = 1.0416666666666667

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets

[[0.7969336 3.24426428 0. ]
[0. 1.12278592 0.31159337]
[0. 0. 1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government␣
↪bonds

0.17908622141460231

# The following shows the use of the wrapper class when a specific state path is given
s_path = [0, 0, 1, 1, 2]
ts_s_path = TaxSmoothingExample(g_ex1, P_ex1, b0_ex1, states_ex1, s_path=s_path)
ts_s_path.display()

122 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.4. More Finite Markov Chain Tax-Smoothing Examples 123

Advanced Quantitative Economics with Python

Government tax collections minus debt levels in peace, war, postwar

T+b in peace = 1.754809688581315
T+b in war = 4.825743944636679
T+b in postwar = 1.8845674740484442

Total government spending in peace, war, postwar

peace = 1.754809688581315
war = 4.825743944636679
postwar = 1.8845674740484442

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:

π(peace|peace) = 1.1574074074074074
π(war|peace) = 10.416666666666666
(continues on next page)

124 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

π(war|war) = 1.1574074074074074
π(postwar|war) = 10.416666666666666
π(postwar|postwar) = 1.0416666666666667

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets

[[0.7969336 3.24426428 0. ]
[0. 1.12278592 0.31159337]
[0. 0. 1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government␣
↪bonds

0.9045311615620277

7.4.3 Example 2

This example captures a peace followed by a war, eventually followed by a permanent peace .
Here we set
1 0 0
𝑃 =⎡
⎢0 1−𝛾 𝛾 ⎤⎥
⎣𝜙 0 1 − 𝜙⎦

where the government expenditure vector 𝑔 = [𝑔𝐿 𝑔𝐿 𝑔𝐻 ] and where 𝑔𝐿 < 𝑔𝐻 .

We assume 𝑏0 = 1 and that the initial Markov state is state 2 so that the system starts off in a temporary peace.

g_ex2 = [g_L, g_L, g_H]

P_ex2 = np.array([[1, 0, 0],
[0, 1-γ, γ],
[ϕ, 0, 1-ϕ]])
b0_ex2 = 1
states_ex2 = ['peace', 'temporary peace', 'war']

ts_ex2 = TaxSmoothingExample(g_ex2, P_ex2, b0_ex2, states_ex2, init=1, random_state=1)

ts_ex2.display()

7.4. More Finite Markov Chain Tax-Smoothing Examples 125

Advanced Quantitative Economics with Python

126 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.4. More Finite Markov Chain Tax-Smoothing Examples 127

Advanced Quantitative Economics with Python

P
[[1. 0. 0. ]
[0. 0.9 0.1]
[0.1 0. 0.9]]
Q
[[0.96 0. 0. ]
[0. 0.864 0.096]
[0.096 0. 0.864]]
Govt expenditures in peace, temporary peace, war = [0.5 0.5 1.2]
Constant tax collections = 0.6053287197231834
Govt debt in 3 states = [ 2.63321799 -1. -2.51384083]

Government tax collections minus debt levels in peace, temporary peace, war
T+b in peace = -2.0278892733564
T+b in temporary peace = 1.6053287197231834
T+b in war = 3.1191695501730106

Total government spending in peace, temporary peace, war

peace = -2.0278892733564
temporary peace = 1.6053287197231834
war = 3.1191695501730106

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:

π(peace|peace) = 1.0416666666666667
π(temporary peace|temporary peace) = 1.1574074074074074
(continues on next page)

128 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

π(war|temporary peace) = 10.416666666666666
π(peace|war) = 10.416666666666666
π(war|war) = 1.1574074074074074

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets

[[ 1.04166667 0. 0. ]
[ 0. 0.90470824 2.27429251]
[-1.37206116 0. 1.30985865]]

The cumulative return earned from holding 1 unit market portfolio of government␣
↪bonds

-9.368991732594216

7.4.4 Example 3

This example features a situation in which one of the states is a war state with no hope of peace next period, while another
state is a war state with a positive probability of peace next period.
The Markov chain is:
1−𝜆 𝜆 0 0
⎡ 0 1−𝜙 𝜙 0 ⎤
𝑃 =⎢ ⎥
⎢ 0 0 1−𝜓 𝜓 ⎥
⎣ 𝜃 0 0 1 − 𝜃⎦

with government expenditure levels for the four states being [𝑔𝐿 𝑔𝐿 𝑔𝐻 𝑔𝐻 ] where 𝑔𝐿 < 𝑔𝐻 .
We start with 𝑏0 = 1 and 𝑠0 = 1.

g_ex3 = [g_L, g_L, g_H, g_H]

P_ex3 = np.array([[1-λ, λ, 0, 0],
[0, 1-ϕ, ϕ, 0],
[0, 0, 1-ψ, ψ],
[θ, 0, 0, 1-θ ]])
b0_ex3 = 1
states_ex3 = ['peace1', 'peace2', 'war1', 'war2']

ts_ex3 = TaxSmoothingExample(g_ex3, P_ex3, b0_ex3, states_ex3, random_state=1)

ts_ex3.display()

7.4. More Finite Markov Chain Tax-Smoothing Examples 129

Advanced Quantitative Economics with Python

130 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.4. More Finite Markov Chain Tax-Smoothing Examples 131

Advanced Quantitative Economics with Python

P
[[0.9 0.1 0. 0. ]
[0. 0.9 0.1 0. ]
[0. 0. 0.9 0.1]
[0.1 0. 0. 0.9]]
Q
[[0.864 0.096 0. 0. ]
[0. 0.864 0.096 0. ]
[0. 0. 0.864 0.096]
[0.096 0. 0. 0.864]]
Govt expenditures in peace1, peace2, war1, war2 = [0.5 0.5 1.2 1.2]
Constant tax collections = 0.6927944572748268
Govt debt in 4 states = [-1. -3.42494226 -6.86027714 -4.43533487]

Government tax collections minus debt levels in peace1, peace2, war1, war2
T+b in peace1 = 1.6927944572748268
T+b in peace2 = 4.117736720554273
T+b in war1 = 7.553071593533488
T+b in war2 = 5.128129330254041

Total government spending in peace1, peace2, war1, war2

peace1 = 1.6927944572748268
peace2 = 4.117736720554273
war1 = 7.553071593533487
war2 = 5.128129330254041

Let's see ex-post and ex-ante returns on Arrow securities

(continues on next page)

132 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

Ex-post returns to purchase of Arrow securities:

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets

[[0.83836741 2.87135998 0. 0. ]
[0. 0.94670854 1.89628977 0. ]
[0. 0. 1.07983627 0.69814023]
[0.2545741 0. 0. 1.1291214 ]]

The cumulative return earned from holding 1 unit market portfolio of government␣
↪bonds

0.02371440178864222

7.4.5 Example 4

Here the Markov chain is:

1−𝜆 𝜆 0 0 0
⎡ 0 1−𝜙 𝜙 0 0⎤
⎢ ⎥
𝑃 =⎢ 0 0 1−𝜓 𝜓 0⎥
⎢ 0 0 0 1−𝜃 𝜃⎥
⎣ 0 0 0 0 1⎦

with government expenditure levels for the five states being [𝑔𝐿 𝑔𝐿 𝑔𝐻 𝑔𝐻 𝑔𝐿 ] where 𝑔𝐿 < 𝑔𝐻 .
We ssume that 𝑏0 = 1 and 𝑠0 = 1.

g_ex4 = [g_L, g_L, g_H, g_H, g_L]

P_ex4 = np.array([[1-λ, λ, 0, 0, 0],
[0, 1-ϕ, ϕ, 0, 0],
[0, 0, 1-ψ, ψ, 0],
[0, 0, 0, 1-θ, θ],
[0, 0, 0, 0, 1]])
b0_ex4 = 1
states_ex4 = ['peace1', 'peace2', 'war1', 'war2', 'permanent peace']

ts_ex4 = TaxSmoothingExample(g_ex4, P_ex4, b0_ex4, states_ex4, random_state=1)

ts_ex4.display()

7.4. More Finite Markov Chain Tax-Smoothing Examples 133

Advanced Quantitative Economics with Python

134 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.4. More Finite Markov Chain Tax-Smoothing Examples 135

Advanced Quantitative Economics with Python

P
[[0.9 0.1 0. 0. 0. ]
[0. 0.9 0.1 0. 0. ]
[0. 0. 0.9 0.1 0. ]
[0. 0. 0. 0.9 0.1]
[0. 0. 0. 0. 1. ]]
Q
[[0.864 0.096 0. 0. 0. ]
[0. 0.864 0.096 0. 0. ]
[0. 0. 0.864 0.096 0. ]
[0. 0. 0. 0.864 0.096]
[0. 0. 0. 0. 0.96 ]]
Govt expenditures in peace1, peace2, war1, war2, permanent peace = [0.5 0.5 1.2 1.
↪2 0.5]

Constant tax collections = 0.6349979047185738

Govt debt in 5 states = [-1. -2.82289484 -5.4053292 -1.77211121 3.
↪37494762]

Government tax collections minus debt levels in peace1, peace2, war1, war2,␣
↪permanent peace

T+b in peace1 = 1.6349979047185736

T+b in peace2 = 3.4578927455370505
T+b in war1 = 6.040327103363229
T+b in war2 = 2.4071091102836433
T+b in permanent peace = -2.7399497132457697

Total government spending in peace1, peace2, war1, war2, permanent peace

(continues on next page)

136 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

peace1 = 1.6349979047185736
peace2 = 3.457892745537051
war1 = 6.040327103363228
war2 = 2.407109110283643
permanent peace = -2.7399497132457697

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets

[[ 0.8810589 2.48713661 0. 0. 0. ]
[ 0. 0.95436011 1.82742569 0. 0. ]
[ 0. 0. 1.11672808 0.36611394 0. ]
[ 0. 0. 0. 1.46806216 -2.79589276]
[ 0. 0. 0. 0. 1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government␣
↪bonds

-11.132109773063616

7.4.6 Example 5

The example captures a case when the system follows a deterministic path from peace to war, and back to peace again.
Since there is no randomness, the outcomes in complete markets setting should be the same as in incomplete markets
setting.
The Markov chain is:
0 1 0 0 0 0 0
⎡0 0 1 0 0 0 0⎤
⎢ ⎥
⎢0 0 0 1 0 0 0⎥
𝑃 = ⎢0 0 0 0 1 0 0⎥
⎢ ⎥
⎢0 0 0 0 0 1 0⎥
⎢0 0 0 0 0 0 1⎥
⎣0 0 0 0 0 0 1⎦

with government expenditure levels for the seven states being [𝑔𝐿 𝑔𝐿 𝑔𝐻 𝑔𝐻 𝑔𝐻 𝑔𝐻 𝑔𝐿 ] where 𝑔𝐿 < 𝑔𝐻 .
Assume 𝑏0 = 1 and 𝑠0 = 1.

g_ex5 = [g_L, g_L, g_H, g_H, g_H, g_H, g_L]

P_ex5 = np.array([[0, 1, 0, 0, 0, 0, 0],
(continues on next page)

7.4. More Finite Markov Chain Tax-Smoothing Examples 137

Advanced Quantitative Economics with Python

(continued from previous page)

[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 1]])
b0_ex5 = 1
states_ex5 = ['peace1', 'peace2', 'war1', 'war2', 'war3', 'permanent peace']

ts_ex5 = TaxSmoothingExample(g_ex5, P_ex5, b0_ex5, states_ex5, N_simul=7, random_

↪state=1)

ts_ex5.display()

138 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

7.4. More Finite Markov Chain Tax-Smoothing Examples 139

Advanced Quantitative Economics with Python

P
[[0 1 0 0 0 0 0]
[0 0 1 0 0 0 0]
[0 0 0 1 0 0 0]
[0 0 0 0 1 0 0]
[0 0 0 0 0 1 0]
[0 0 0 0 0 0 1]
[0 0 0 0 0 0 1]]
Q
[[0. 0.96 0. 0. 0. 0. 0. ]
[0. 0. 0.96 0. 0. 0. 0. ]
[0. 0. 0. 0.96 0. 0. 0. ]
[0. 0. 0. 0. 0.96 0. 0. ]
[0. 0. 0. 0. 0. 0.96 0. ]
[0. 0. 0. 0. 0. 0. 0.96]
[0. 0. 0. 0. 0. 0. 0.96]]
Govt expenditures in peace1, peace2, war1, war2, war3, permanent peace = [0.5 0.5␣
↪1.2 1.2 1.2 1.2 0.5]

Constant tax collections = 0.5571895472128001

Govt debt in 6 states = [-1. -1.10123911 -1.20669652 -0.58738132 0.
↪05773868 0.72973868
1.42973868]

Government tax collections minus debt levels in peace1, peace2, war1, war2, war3,␣
↪permanent peace

T+b in peace1 = 1.5571895472128001

T+b in peace2 = 1.6584286588928001
(continues on next page)

140 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

Advanced Quantitative Economics with Python

(continued from previous page)

T+b in war1 = 1.7638860668928005
T+b in war2 = 1.1445708668928003
T+b in war3 = 0.4994508668928004
T+b in permanent peace = -0.17254913310719955

Total government spending in peace1, peace2, war1, war2, war3, permanent peace
peace1 = 1.5571895472128
peace2 = 1.6584286588928003
war1 = 1.7638860668928
war2 = 1.1445708668928003
war3 = 0.49945086689280027
permanent peace = -0.17254913310719933

Let's see ex-post and ex-ante returns on Arrow securities

Ex-post returns to purchase of Arrow securities:

Ex-ante returns to purchase of Arrow securities = 1.0416666666666667

The Ex-post one-period gross return on the portfolio of government assets

[[0. 1.04166667 0. 0. 0. 0.
0. ]
[0. 0. 1.04166667 0. 0. 0.
0. ]
[0. 0. 0. 1.04166667 0. 0.
0. ]
[0. 0. 0. 0. 1.04166667 0.
0. ]
[0. 0. 0. 0. 0. 1.04166667
0. ]
[0. 0. 0. 0. 0. 0.
1.04166667]
[0. 0. 0. 0. 0. 0.
1.04166667]]

The cumulative return earned from holding 1 unit market portfolio of government␣
↪bonds

1.2775343959060068

7.4.7 Continuous-State Gaussian Model

To construct a tax-smoothing version of the complete markets consumption-smoothing model with a continuous state
space that we presented in the lecture consumption smoothing with complete and incomplete markets, we simply relabel
variables.
Thus, a government faces a sequence of budget constraints

𝑇𝑡 + 𝑏𝑡 = 𝑔𝑡 + 𝛽𝔼𝑡 𝑏𝑡+1 , 𝑡≥0

7.4. More Finite Markov Chain Tax-Smoothing Examples 141

Advanced Quantitative Economics with Python

where 𝑇𝑡 is tax revenues, 𝑏𝑡 are receipts at 𝑡 from contingent claims that the government had purchased at time 𝑡 − 1, and

𝛽𝔼𝑡 𝑏𝑡+1 ≡ ∫ 𝑞𝑡+1 (𝑥𝑡+1 |𝑥𝑡 )𝑏𝑡+1 (𝑥𝑡+1 )𝑑𝑥𝑡+1

is the value of time 𝑡 + 1 state-contingent claims purchased by the government at time 𝑡.

As above with the consumption-smoothing model, we can solve the time 𝑡 budget constraint forward to obtain
∞
𝑏𝑡 = 𝔼𝑡 ∑ 𝛽 𝑗 (𝑔𝑡+𝑗 − 𝑇𝑡+𝑗 )
𝑗=0

which can be rearranged to become

∞ ∞
𝔼𝑡 ∑ 𝛽 𝑗 𝑔𝑡+𝑗 = 𝑏𝑡 + 𝔼𝑡 ∑ 𝛽 𝑗 𝑇𝑡+𝑗
𝑗=0 𝑗=0

which states that the present value of government purchases equals the value of government assets at 𝑡 plus the present
value of tax receipts.
With these relabelings, examples presented in consumption smoothing with complete and incomplete markets can be inter-
preted as tax-smoothing models.
Returns: In the continuous state version of our incomplete markets model, the ex post one-period gross rate of return
on the government portfolio equals

𝑏(𝑥𝑡+1 )
𝑅(𝑥𝑡+1 |𝑥𝑡 ) =
𝛽𝐸𝑏(𝑥𝑡+1 )|𝑥𝑡

Related Lectures

Throughout this lecture, we have taken one-period interest rates and Arrow security prices as exogenous objects deter-
mined outside the model and specified them in ways designed to align our models closely with the consumption smoothing
model of Barro [Barro, 1979].
Other lectures make these objects endogenous and describe how a government optimally manipulates prices of govern-
ment debt, albeit indirectly via effects distorting taxes have on equilibrium prices and allocations.
In optimal taxation in an LQ economy and recursive optimal taxation, we study complete-markets models in which the
government recognizes that it can manipulate Arrow securities prices.
Linear-quadratic versions of the Lucas-Stokey tax-smoothing model are described in Optimal Taxation in an LQ Economy.
That lecture is a warm-up for the non-linear-quadratic model of tax smoothing described in Optimal Taxation with State-
Contingent Debt.
In both Optimal Taxation in an LQ Economy and Optimal Taxation with State-Contingent Debt, the government recognizes
that its decisions affect prices.
In optimal taxation with incomplete markets, we study an incomplete-markets model in which the government also
manipulates prices of government debt.

142 Chapter 7. Tax Smoothing with Complete and Incomplete Markets

CHAPTER

EIGHT

MARKOV JUMP LINEAR QUADRATIC DYNAMIC PROGRAMMING

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

8.1 Overview

This lecture describes Markov jump linear quadratic dynamic programming, an extension of the method described
in the first LQ control lecture.
Markov jump linear quadratic dynamic programming is described and analyzed in [Do Val et al., 1999] and the references
cited there.
The method has been applied to problems in macroeconomics and monetary economics by [Svensson et al., 2008] and
[Svensson and Williams, 2009].
The periodic models of seasonality described in chapter 14 of [Hansen and Sargent, 2013] are a special case of Markov
jump linear quadratic problems.
Markov jump linear quadratic dynamic programming combines advantages of
• the computational simplicity of linear quadratic dynamic programming, with
• the ability of finite state Markov chains to represent interesting patterns of random variation.
The idea is to replace the constant matrices that define a linear quadratic dynamic programming problem with 𝑁 sets
of matrices that are fixed functions of the state of an 𝑁 state Markov chain.
The state of the Markov chain together with the continuous 𝑛 × 1 state vector 𝑥𝑡 form the state of the system.
For the class of infinite horizon problems being studied in this lecture, we obtain 𝑁 interrelated matrix Riccati equations
that determine 𝑁 optimal value functions and 𝑁 linear decision rules.
One of these value functions and one of these decision rules apply in each of the 𝑁 Markov states.
That is, when the Markov state is in state 𝑗, the value function and the decision rule for state 𝑗 prevails.

143
Advanced Quantitative Economics with Python

8.2 Review of useful LQ dynamic programming formulas

To begin, it is handy to have the following reminder in mind.

A linear quadratic dynamic programming problem consists of a scalar discount factor 𝛽 ∈ (0, 1), an 𝑛 × 1 state
vector 𝑥𝑡 , an initial condition for 𝑥0 , a 𝑘 × 1 control vector 𝑢𝑡 , a 𝑝 × 1 random shock vector 𝑤𝑡+1 and the following two
triples of matrices:
• A triple of matrices (𝑅, 𝑄, 𝑊 ) defining a loss function
𝑟(𝑥𝑡 , 𝑢𝑡 ) = 𝑥′𝑡 𝑅𝑥𝑡 + 𝑢′𝑡 𝑄𝑢𝑡 + 2𝑢′𝑡 𝑊 𝑥𝑡
• a triple of matrices (𝐴, 𝐵, 𝐶) defining a state-transition law
𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1
The problem is
∞
−𝑥′0 𝑃 𝑥0 − 𝜌 = min
∞
𝐸 ∑ 𝛽 𝑡 𝑟(𝑥𝑡 , 𝑢𝑡 )
{𝑢𝑡 }𝑡=0
𝑡=0

subject to the transition law for the state.

The optimal decision rule has the form

𝑢𝑡 = −𝐹 𝑥𝑡

and the optimal value function is of the form

− (𝑥′𝑡 𝑃 𝑥𝑡 + 𝜌)

where 𝑃 solves the algebraic matrix Riccati equation

𝑃 = 𝑅 + 𝛽𝐴′ 𝑃 𝐴 − (𝛽𝐵′ 𝑃 𝐴 + 𝑊 )′ (𝑄 + 𝛽𝐵𝑃 𝐵)−1 (𝛽𝐵𝑃 𝐴 + 𝑊 )

and the constant 𝜌 satisfies

𝜌 = 𝛽 (𝜌 + trace(𝑃 𝐶𝐶 ′ ))

and the matrix 𝐹 in the decision rule for 𝑢𝑡 satisfies

𝐹 = (𝑄 + 𝛽𝐵′ 𝑃 𝐵)−1 (𝛽(𝐵′ 𝑃 𝐴) + 𝑊 )

With the preceding formulas in mind, we are ready to approach Markov Jump linear quadratic dynamic programming.

8.3 Linked Riccati equations for Markov LQ dynamic programming

The key idea is to make the matrices 𝐴, 𝐵, 𝐶, 𝑅, 𝑄, 𝑊 fixed functions of a finite state 𝑠 that is governed by an 𝑁 state
Markov chain.
This makes decision rules depend on the Markov state, and so fluctuate through time in limited ways.
In particular, we use the following extension of a discrete-time linear quadratic dynamic programming problem.
We let 𝑠𝑡 ∈ [1, 2, … , 𝑁 ] be a time 𝑡 realization of an 𝑁 -state Markov chain with transition matrix Π having typical
element Π𝑖𝑗 .

144 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

Here 𝑖 denotes today and 𝑗 denotes tomorrow and

Π𝑖𝑗 = Prob(𝑠𝑡+1 = 𝑗|𝑠𝑡 = 𝑖)

We’ll switch between labeling today’s state as 𝑠𝑡 and 𝑖 and between labeling tomorrow’s state as 𝑠𝑡+1 or 𝑗.
The decision-maker solves the minimization problem:
∞
min
∞
𝐸 ∑ 𝛽 𝑡 𝑟(𝑥𝑡 , 𝑠𝑡 , 𝑢𝑡 )
{𝑢𝑡 }𝑡=0
𝑡=0

with

𝑟(𝑥𝑡 , 𝑠𝑡 , 𝑢𝑡 ) = 𝑥′𝑡 𝑅𝑠𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑠𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑠𝑡 𝑥𝑡

subject to linear laws of motion with matrices (𝐴, 𝐵, 𝐶) each possibly dependent on the Markov-state-𝑠𝑡 :

𝑥𝑡+1 = 𝐴𝑠𝑡 𝑥𝑡 + 𝐵𝑠𝑡 𝑢𝑡 + 𝐶𝑠𝑡 𝑤𝑡+1

where {𝑤𝑡+1 } is an i.i.d. stochastic process with 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼).

The optimal decision rule for this problem has the form

𝑢𝑡 = −𝐹𝑠𝑡 𝑥𝑡

and the optimal value functions are of the form

− (𝑥′𝑡 𝑃𝑠𝑡 𝑥𝑡 + 𝜌𝑠𝑡 )

or equivalently

−𝑥′𝑡 𝑃𝑖 𝑥𝑡 − 𝜌𝑖

The optimal value functions −𝑥′ 𝑃𝑖 𝑥 − 𝜌𝑖 for 𝑖 = 1, … , 𝑛 satisfy the 𝑁 interrelated Bellman equations

−𝑥′ 𝑃𝑖 𝑥 − 𝜌𝑖 = max −𝑥′ 𝑅𝑖 𝑥 +𝑢′ 𝑄𝑖 𝑢 + 2𝑢′ 𝑊𝑖 𝑥 + 𝛽 ∑ Π𝑖𝑗 𝐸((𝐴𝑖 𝑥 + 𝐵𝑖 𝑢 + 𝐶𝑖 𝑤)′ 𝑃𝑗 (𝐴𝑖 𝑥 + 𝐵𝑖 𝑢 + 𝐶𝑖 𝑤)𝑥 + 𝜌𝑗 )
𝑢
𝑗

The matrices 𝑃𝑠𝑡 = 𝑃𝑖 and the scalars 𝜌𝑠𝑡 = 𝜌𝑖 , 𝑖 = 1, …, n satisfy the following stacked system of algebraic matrix
Riccati equations:

𝑃𝑖 = 𝑅𝑖 + 𝛽 ∑ 𝐴′𝑖 𝑃𝑗 𝐴𝑖 Π𝑖𝑗 − ∑ Π𝑖𝑗 [(𝛽𝐵𝑖′ 𝑃𝑗 𝐴𝑖 + 𝑊𝑖 )′ (𝑄 + 𝛽𝐵𝑖′ 𝑃𝑗 𝐵𝑖 )−1 (𝛽𝐵𝑖′ 𝑃𝑗 𝐴𝑖 + 𝑊𝑖 )]

𝑗 𝑗

𝜌𝑖 = 𝛽 ∑ Π𝑖𝑗 (𝜌𝑗 + trace(𝑃𝑗 𝐶𝑖 𝐶𝑖′ ))

𝑗

and the 𝐹𝑖 matrices in the optimal decision rules are

𝐹𝑖 = (𝑄𝑖 + 𝛽 ∑ Π𝑖𝑗 𝐵𝑖′ 𝑃𝑗 𝐵𝑖 )−1 (𝛽 ∑ Π𝑖𝑗 (𝐵𝑖′ 𝑃𝑗 𝐴𝑖 ) + 𝑊𝑖 )

𝑗 𝑗

8.4 Applications

We now describe Python code and some examples.

To begin, we import these Python modules

8.4. Applications 145

Advanced Quantitative Economics with Python

import numpy as np
import quantecon as qe
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Set discount factor

β = 0.95

8.5 Example 1

This example is a version of a classic problem of optimally adjusting a variable 𝑘𝑡 to a target level in the face of costly
adjustment.
This provides a model of gradual adjustment.
Given 𝑘0 , the objective function is
∞
max
∞
𝐸0 ∑ 𝛽 𝑡 𝑟 (𝑠𝑡 , 𝑘𝑡 )
{𝑘𝑡 }𝑡=1 𝑡=0

where the one-period payoff function is

𝑟(𝑠𝑡 , 𝑘𝑡 ) = 𝑓1,𝑠𝑡 𝑘𝑡 − 𝑓2,𝑠𝑡 𝑘𝑡2 − 𝑑𝑠𝑡 (𝑘𝑡+1 − 𝑘𝑡 )2 ,

𝐸0 is a mathematical expectation conditioned on time 0 information 𝑥0 , 𝑠0 and the transition law for continuous state
variable 𝑘𝑡 is

𝑘𝑡+1 − 𝑘𝑡 = 𝑢𝑡

We can think of 𝑘𝑡 as the decision-maker’s capital and 𝑢𝑡 as costs of adjusting the level of capital.
We assume that 𝑓1 (𝑠𝑡 ) > 0, 𝑓2 (𝑠𝑡 ) > 0, and 𝑑 (𝑠𝑡 ) > 0.
Denote the state transition matrix for Markov state 𝑠𝑡 ∈ {1, 2} as Π:

Pr (𝑠𝑡+1 = 𝑗 ∣ 𝑠𝑡 = 𝑖) = Π𝑖𝑗

𝑘
Let 𝑥𝑡 = [ 𝑡 ]
1
We can represent the one-period payoff function 𝑟 (𝑠𝑡 , 𝑘𝑡 ) as

𝑟 (𝑠𝑡 , 𝑘𝑡 ) = 𝑓1,𝑠𝑡 𝑘𝑡 − 𝑓2,𝑠𝑡 𝑘𝑡2 − 𝑑𝑠𝑡 𝑢𝑡 2

𝑓
𝑓 𝑡 − 1,𝑠𝑡
= −𝑥′𝑡 [ 2,𝑠
𝑓1,𝑠𝑡
2 ]𝑥 + 𝑑
𝑡 ⏟ 𝑠𝑡 𝑢 𝑡
2
−
⏟⏟⏟⏟⏟⏟2 0
⏟⏟⏟ ≡𝑄(𝑠𝑡 )
≡𝑅(𝑠𝑡 )

and the state-transition law as

𝑘𝑡+1 1
𝑥𝑡+1 = [ ]= 𝐼⏟2 𝑥𝑡 + [ ] 𝑢𝑡
1 ⏟0
≡𝐴(𝑠 )
𝑡
≡𝐵(𝑠𝑡 )

146 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

def construct_arrays1(f1_vals=[1. ,1.],

f2_vals=[1., 1.],
d_vals=[1., 1.]):
"""
Construct matrices that map the problem described in example 1
into a Markov jump linear quadratic dynamic programming problem
"""

# Number of Markov states

m = len(f1_vals)
# Number of state and control variables
n, k = 2, 1

# Construct sets of matrices for each state

As = [np.eye(n) for i in range(m)]
Bs = [np.array([[1, 0]]).T for i in range(m)]

Rs = np.zeros((m, n, n))
Qs = np.zeros((m, k, k))

for i in range(m):
Rs[i, 0, 0] = f2_vals[i]
Rs[i, 1, 0] = - f1_vals[i] / 2
Rs[i, 0, 1] = - f1_vals[i] / 2

Qs[i, 0, 0] = d_vals[i]

Cs, Ns = None, None

# Compute the optimal k level of the payoff function in each state

k_star = np.empty(m)
for i in range(m):
k_star[i] = f1_vals[i] / (2 * f2_vals[i])

return Qs, Rs, Ns, As, Bs, Cs, k_star

The continuous part of the state 𝑥𝑡 consists of two variables, namely, 𝑘𝑡 and a constant term.

state_vec1 = ["k", "constant term"]

We start with a Markov transition matrix that makes the Markov state be strictly periodic:
0 1
Π1 = [ ],
1 0
We set 𝑓1,𝑠𝑡 and 𝑓2,𝑠𝑡 to be independent of the Markov state 𝑠𝑡

𝑓1,1 = 𝑓1,2 = 1,

𝑓2,1 = 𝑓2,2 = 1
In contrast to 𝑓1,𝑠𝑡 and 𝑓2,𝑠𝑡 , we make the adjustment cost 𝑑𝑠𝑡 vary across Markov states 𝑠𝑡 .
We set the adjustment cost to be lower in Markov state 2

𝑑1 = 1, 𝑑2 = 0.5

The following code forms a Markov switching LQ problem and computes the optimal value functions and optimal decision
rules for each Markov state

8.5. Example 1 147

Advanced Quantitative Economics with Python

# Construct Markov transition matrix

Π1 = np.array([[0., 1.],
[1., 0.]])

# Construct matrices
Qs, Rs, Ns, As, Bs, Cs, k_star = construct_arrays1(d_vals=[1., 0.5])

# Construct a Markov Jump LQ problem

ex1_a = qe.LQMarkov(Π1, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)
# Solve for optimal value functions and decision rules
ex1_a.stationary_values();

Let’s look at the value function matrices and the decision rules for each Markov state

# P(s)
ex1_a.Ps

array([[[ 1.56626026, -0.78313013],

[-0.78313013, -4.60843493]],

[[ 1.37424214, -0.68712107],
[-0.68712107, -4.65643947]]])

# d(s) = 0, since there is no randomness

ex1_a.ds

array([0., 0.])

# F(s)
ex1_a.Fs

array([[[ 0.56626026, -0.28313013]],

[[ 0.74848427, -0.37424214]]])

Now we’ll plot the decision rules and see if they make sense

# Plot the optimal decision rules

k_grid = np.linspace(0., 1., 100)
# Optimal choice in state s1
u1_star = - ex1_a.Fs[0, 0, 1] - ex1_a.Fs[0, 0, 0] * k_grid
# Optimal choice in state s2
u2_star = - ex1_a.Fs[1, 0, 1] - ex1_a.Fs[1, 0, 0] * k_grid

fig, ax = plt.subplots()
ax.plot(k_grid, k_grid + u1_star, label=r"$\overline{s}_1$ (high)")
ax.plot(k_grid, k_grid + u2_star, label=r"$\overline{s}_2$ (low)")

# The optimal k*
ax.scatter([0.5, 0.5], [0.5, 0.5], marker="*")
ax.plot([k_star[0], k_star[0]], [0., 1.0], '--')
(continues on next page)

148 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

(continued from previous page)

# 45 degree line
ax.plot([0., 1.], [0., 1.], '--', label="45 degree line")

ax.set_xlabel("$k_t$")
ax.set_ylabel("$k_{t+1}$")
ax.legend()
plt.show()

The above graph plots 𝑘𝑡+1 = 𝑘𝑡 + 𝑢𝑡 = 𝑘𝑡 − 𝐹 𝑥𝑡 as an affine (i.e., linear in 𝑘𝑡 plus a constant) function of 𝑘𝑡 for both
Markov states 𝑠𝑡 .
It also plots the 45 degree line.
Notice that the two 𝑠𝑡 -dependent closed loop functions that determine 𝑘𝑡+1 as functions of 𝑘𝑡 share the same rest point
(also called a fixed point) at 𝑘𝑡 = 0.5.
Evidently, the optimal decision rule in Markov state 2, in which the adjustment cost is lower, makes 𝑘𝑡+1 a flatter function
of 𝑘𝑡 in Markov state 2.
This happens because when 𝑘𝑡 is not at its fixed point, |𝑢𝑡,2 | > |𝑢𝑡,2 |, so that the decision-maker adjusts toward the fixed
point faster when the Markov state 𝑠𝑡 takes a value that makes it cheaper.

# Compute time series

T = 20
x0 = np.array([[0., 1.]]).T
x_path = ex1_a.compute_sequence(x0, ts_length=T)[0]

(continues on next page)

8.5. Example 1 149

Advanced Quantitative Economics with Python

(continued from previous page)

fig, ax = plt.subplots()
ax.plot(range(T), x_path[0, :-1])
ax.set_xlabel("$t$")
ax.set_ylabel("$k_t$")
ax.set_title("Optimal path of $k_t$")
plt.show()

Now we’ll depart from the preceding transition matrix that made the Markov state be strictly periodic.
We’ll begin with symmetric transition matrices of the form

1−𝜆 𝜆
Π2 = [ ].
𝜆 1−𝜆

λ = 0.8 # high λ
Π2 = np.array([[1-λ, λ],
[λ, 1-λ]])

ex1_b = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)

ex1_b.stationary_values();
ex1_b.Fs

array([[[ 0.57291724, -0.28645862]],

[[ 0.74434525, -0.37217263]]])

150 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

λ = 0.2 # low λ
Π2 = np.array([[1-λ, λ],
[λ, 1-λ]])

ex1_b = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)

ex1_b.stationary_values();
ex1_b.Fs

array([[[ 0.59533259, -0.2976663 ]],

[[ 0.72818728, -0.36409364]]])

We can plot optimal decision rules associated with different 𝜆 values.

λ_vals = np.linspace(0., 1., 10)

F1 = np.empty((λ_vals.size, 2))
F2 = np.empty((λ_vals.size, 2))

for i, λ in enumerate(λ_vals):
Π2 = np.array([[1-λ, λ],
[λ, 1-λ]])

ex1_b = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)

ex1_b.stationary_values();
F1[i, :] = ex1_b.Fs[0, 0, :]
F2[i, :] = ex1_b.Fs[1, 0, :]

for i, state_var in enumerate(state_vec1):

fig, ax = plt.subplots()
ax.plot(λ_vals, F1[:, i], label=r"$\overline{s}_1$", color="b")
ax.plot(λ_vals, F2[:, i], label=r"$\overline{s}_2$", color="r")

ax.set_xlabel(r"$\lambda$")
ax.set_ylabel("$F_{s_t}$")
ax.set_title(f"Coefficient on {state_var}")
ax.legend()
plt.show()

8.5. Example 1 151

Advanced Quantitative Economics with Python

152 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

Notice how the decision rules’ constants and slopes behave as functions of 𝜆.
Evidently, as the Markov chain becomes more nearly periodic (i.e., as 𝜆 → 1), the dynamic program adjusts capital faster
in the low adjustment cost Markov state to take advantage of what is only temporarily a more favorable time to invest.
Now let’s study situations in which the Markov transition matrix Π is asymmetric
1−𝜆 𝜆
Π3 = [ ].
𝛿 1−𝛿

λ, δ = 0.8, 0.2
Π3 = np.array([[1-λ, λ],
[δ, 1-δ]])

ex1_b = qe.LQMarkov(Π3, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)

ex1_b.stationary_values();
ex1_b.Fs

array([[[ 0.57169781, -0.2858489 ]],

[[ 0.72749075, -0.36374537]]])

We can plot optimal decision rules for different 𝜆 and 𝛿 values.

λ_vals = np.linspace(0., 1., 10)

δ_vals = np.linspace(0., 1., 10)

(continues on next page)

8.5. Example 1 153

Advanced Quantitative Economics with Python

(continued from previous page)

λ_grid = np.empty((λ_vals.size, δ_vals.size))
δ_grid = np.empty((λ_vals.size, δ_vals.size))
F1_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec1)))
F2_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec1)))

for i, λ in enumerate(λ_vals):
λ_grid[i, :] = λ
δ_grid[i, :] = δ_vals
for j, δ in enumerate(δ_vals):
Π3 = np.array([[1-λ, λ],
[δ, 1-δ]])

ex1_b = qe.LQMarkov(Π3, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)

ex1_b.stationary_values();
F1_grid[i, j, :] = ex1_b.Fs[0, 0, :]
F2_grid[i, j, :] = ex1_b.Fs[1, 0, :]

for i, state_var in enumerate(state_vec1):

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# high adjustment cost, blue surface
ax.plot_surface(λ_grid, δ_grid, F1_grid[:, :, i], color="b")
# low adjustment cost, red surface
ax.plot_surface(λ_grid, δ_grid, F2_grid[:, :, i], color="r")
ax.set_xlabel(r"$\lambda$")
ax.set_ylabel(r"$\delta$")
ax.set_zlabel("$F_{s_t}$")
ax.set_title(f"coefficient on {state_var}")
plt.show()

154 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

The following code defines a wrapper function that computes optimal decision rules for cases with different Markov
transition matrices

8.5. Example 1 155

Advanced Quantitative Economics with Python

def run(construct_func, vals_dict, state_vec):

"""
A Wrapper function that repeats the computation above
for different cases
"""

Qs, Rs, Ns, As, Bs, Cs, k_star = construct_func(**vals_dict)

# Symmetric Π
# Notice that pure periodic transition is a special case
# when λ=1
print("symmetric Π case:\n")
λ_vals = np.linspace(0., 1., 10)
F1 = np.empty((λ_vals.size, len(state_vec)))
F2 = np.empty((λ_vals.size, len(state_vec)))

for i, λ in enumerate(λ_vals):
Π2 = np.array([[1-λ, λ],
[λ, 1-λ]])

mplq = qe.LQMarkov(Π2, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)

mplq.stationary_values();
F1[i, :] = mplq.Fs[0, 0, :]
F2[i, :] = mplq.Fs[1, 0, :]

for i, state_var in enumerate(state_vec):

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(λ_vals, F1[:, i], label=r"$\overline{s}_1$", color="b")
ax.plot(λ_vals, F2[:, i], label=r"$\overline{s}_2$", color="r")

ax.set_xlabel(r"$\lambda$")
ax.set_ylabel(r"$F(\overline{s}_t)$")
ax.set_title(f"coefficient on {state_var}")
ax.legend()
plt.show()

# Plot optimal k*_{s_t} and k that optimal policies are targeting

# only for example 1
if state_vec == ["k", "constant term"]:
fig = plt.figure()
ax = fig.add_subplot(111)
for i in range(2):
F = [F1, F2][i]
c = ["b", "r"][i]
ax.plot([0, 1], [k_star[i], k_star[i]], "--",
color=c, label=r"$k^*(\overline{s}_"+str(i+1)+")$")
ax.plot(λ_vals, - F[:, 1] / F[:, 0], color=c,
label=r"$k^{target}(\overline{s}_"+str(i+1)+")$")

# Plot a vertical line at λ=0.5

ax.plot([0.5, 0.5], [min(k_star), max(k_star)], "-.")

ax.set_xlabel(r"$\lambda$")
ax.set_ylabel("$k$")
ax.set_title("Optimal k levels and k targets")
ax.text(0.5, min(k_star)+(max(k_star)-min(k_star))/20, r"$\lambda=0.5$")
(continues on next page)

156 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

(continued from previous page)

ax.legend(bbox_to_anchor=(1., 1.))
plt.show()

# Asymmetric Π
print("asymmetric Π case:\n")
δ_vals = np.linspace(0., 1., 10)

λ_grid = np.empty((λ_vals.size, δ_vals.size))

δ_grid = np.empty((λ_vals.size, δ_vals.size))
F1_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec)))
F2_grid = np.empty((λ_vals.size, δ_vals.size, len(state_vec)))

for i, λ in enumerate(λ_vals):
λ_grid[i, :] = λ
δ_grid[i, :] = δ_vals
for j, δ in enumerate(δ_vals):
Π3 = np.array([[1-λ, λ],
[δ, 1-δ]])

mplq = qe.LQMarkov(Π3, Qs, Rs, As, Bs, Cs=Cs, Ns=Ns, beta=β)

mplq.stationary_values();
F1_grid[i, j, :] = mplq.Fs[0, 0, :]
F2_grid[i, j, :] = mplq.Fs[1, 0, :]

for i, state_var in enumerate(state_vec):

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(λ_grid, δ_grid, F1_grid[:, :, i], color="b")
ax.plot_surface(λ_grid, δ_grid, F2_grid[:, :, i], color="r")
ax.set_xlabel(r"$\lambda$")
ax.set_ylabel(r"$\delta$")
ax.set_zlabel(r"$F(\overline{s}_t)$")
ax.set_title(f"coefficient on {state_var}")
plt.show()

To illustrate the code with another example, we shall set 𝑓2,𝑠𝑡 and 𝑑𝑠𝑡 as constant functions and

𝑓1,1 = 0.5, 𝑓1,2 = 1

Thus, the sole role of the Markov jump state 𝑠𝑡 is to identify times in which capital is very productive and other times in
which it is less productive.
The example below reveals much about the structure of the optimum problem and optimal policies.
Only 𝑓1,𝑠𝑡 varies with 𝑠𝑡 .
𝑓1,𝑠𝑡
So there are different 𝑠𝑡 -dependent optimal static 𝑘 level in different states 𝑘𝑠∗𝑡 = 2𝑓2,𝑠𝑡 , values of 𝑘 that maximize
one-period payoff functions in each state.
We denote a target 𝑘 level as 𝑘𝑠𝑡𝑎𝑟𝑔𝑒𝑡
𝑡
, the fixed point of the optimal policies in each state, given the value of 𝜆.
We call 𝑘𝑠𝑡𝑎𝑟𝑔𝑒𝑡
𝑡
a “target” because in each Markov state 𝑠𝑡 , optimal policies are contraction mappings and will push 𝑘𝑡
towards a fixed point 𝑘𝑠𝑡𝑎𝑟𝑔𝑒𝑡
𝑡
.
When 𝜆 → 0, each Markov state becomes close to absorbing state and consequently 𝑘𝑠𝑡𝑎𝑟𝑔𝑒𝑡
𝑡
→ 𝑘𝑠∗𝑡 .
But when 𝜆 → 1, the Markov transition matrix becomes more nearly periodic, so the optimum decision rules target more
at the optimal 𝑘 level in the other state in order to enjoy higher expected payoff in the next period.

8.5. Example 1 157

Advanced Quantitative Economics with Python

The switch happens at 𝜆 = 0.5 when both states are equally likely to be reached.
Below we plot an additional figure that shows optimal 𝑘 levels in the two states Markov jump state and also how the
targeted 𝑘 levels change as 𝜆 changes.

run(construct_arrays1, {"f1_vals":[0.5, 1.]}, state_vec1)

symmetric Π case:

158 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.5. Example 1 159

Advanced Quantitative Economics with Python

asymmetric Π case:

160 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

Set 𝑓1,𝑠𝑡 and 𝑑𝑠𝑡 as constant functions and

𝑓2,1 = 0.5, 𝑓2,2 = 1

run(construct_arrays1, {"f2_vals":[0.5, 1.]}, state_vec1)

symmetric Π case:

8.5. Example 1 161

Advanced Quantitative Economics with Python

162 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

asymmetric Π case:

8.5. Example 1 163

Advanced Quantitative Economics with Python

8.6 Example 2

We now add to the example 1 setup another state variable 𝑤𝑡 that follows the evolution law

𝑤𝑡+1 = 𝛼0 (𝑠𝑡 ) + 𝜌 (𝑠𝑡 ) 𝑤𝑡 + 𝜎 (𝑠𝑡 ) 𝜖𝑡+1 , 𝜖𝑡+1 ∼ 𝑁 (0, 1)

We think of 𝑤𝑡 as a rental rate or tax rate that the decision maker pays each period for 𝑘𝑡 .
To capture this idea, we add to the decision-maker’s one-period payoff function the product of 𝑤𝑡 and 𝑘𝑡

𝑟(𝑠𝑡 , 𝑘𝑡 , 𝑤𝑡 ) = 𝑓1,𝑠𝑡 𝑘𝑡 − 𝑓2,𝑠𝑡 𝑘𝑡2 − 𝑑𝑠𝑡 (𝑘𝑡+1 − 𝑘𝑡 )2 − 𝑤𝑡 𝑘𝑡 ,

𝑘𝑡
We now let the continuous part of the state at time 𝑡 be 𝑥𝑡 = ⎡ ⎤
⎢ 1 ⎥ and continue to set the control 𝑢𝑡 = 𝑘𝑡+1 − 𝑘𝑡 .
⎣𝑤𝑡 ⎦
We can write the one-period payoff function 𝑟 (𝑠𝑡 , 𝑘𝑡 , 𝑤𝑡 ) as
2
𝑟 (𝑠𝑡 , 𝑘𝑡 , 𝑤𝑡 ) = 𝑓1 (𝑠𝑡 ) 𝑘𝑡 − 𝑓2 (𝑠𝑡 ) 𝑘𝑡2 − 𝑑 (𝑠𝑡 ) (𝑘𝑡+1 − 𝑘𝑡 ) − 𝑤𝑡 𝑘𝑡

⎛ ⎞
⎜
⎜ 𝑓2 (𝑠𝑡 ) − 𝑓1 (𝑠 2
𝑡) 1
2
⎟
⎟
⎜ ′ ⎡ 𝑓 (𝑠 ) ⎤ 2⎟
= −⎜
⎜𝑥 𝑡 ⎢− 1
2
𝑡
0 0 ⎥𝑥𝑡 + 𝑑
⏟ (𝑠 𝑡 ) 𝑢𝑡 ⎟
⎟ ,
⎜
⎜ ⏟⏟ 1
0 0 ≡𝑄(𝑠𝑡 ) ⎟
⎟
⎣ ⏟ 2⏟⏟⏟⏟⏟⏟⏟⏟ ⎦
⎝ ≡𝑅(𝑠𝑡 ) ⎠

and the state-transition law as

𝑘𝑡+1 1 0 0 1 0
𝑥𝑡+1 = ⎡ ⎤ ⎡
⎢ 1 ⎥ = ⎢0 1 0 ⎤ ⎥𝑥𝑡 +
⎡0⎤ 𝑢 +
⎢ ⎥ 𝑡
⎡ 0 ⎤𝜖
⎢ ⎥ 𝑡+1
⎣𝑤𝑡+1 ⎦ ⏟⏟⏟⏟⏟⏟⏟⏟⏟
⎣0 𝛼0 (𝑠𝑡 ) 𝜌 (𝑠𝑡 )⎦ ⎣0⎦
⏟ 𝜎
⏟⎦
⎣ (𝑠 𝑡 )
≡𝐴(𝑠𝑡 ) ≡𝐵(𝑠𝑡 ) ≡𝐶(𝑠𝑡 )

def construct_arrays2(f1_vals=[1. ,1.],

f2_vals=[1., 1.],
d_vals=[1., 1.],
α0_vals=[1., 1.],
ρ_vals=[0.9, 0.9],
σ_vals=[1., 1.]):
"""
Construct matrices that maps the problem described in example 2
into a Markov jump linear quadratic dynamic programming problem.
"""

m = len(f1_vals)
n, k, j = 3, 1, 1

Rs = np.zeros((m, n, n))
Qs = np.zeros((m, k, k))
As = np.zeros((m, n, n))
Bs = np.zeros((m, n, k))
Cs = np.zeros((m, n, j))

for i in range(m):
Rs[i, 0, 0] = f2_vals[i]
Rs[i, 1, 0] = - f1_vals[i] / 2
(continues on next page)

164 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

(continued from previous page)

Rs[i, 0, 1] = - f1_vals[i] / 2
Rs[i, 0, 2] = 1/2
Rs[i, 2, 0] = 1/2

Qs[i, 0, 0] = d_vals[i]

As[i, 0, 0] = 1
As[i, 1, 1] = 1
As[i, 2, 1] = α0_vals[i]
As[i, 2, 2] = ρ_vals[i]

Bs[i, :, :] = np.array([[1, 0, 0]]).T

Cs[i, :, :] = np.array([[0, 0, σ_vals[i]]]).T

Ns = None
k_star = None

return Qs, Rs, Ns, As, Bs, Cs, k_star

state_vec2 = ["k", "constant term", "w"]

Only 𝑑𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"d_vals":[1., 0.5]}, state_vec2)

symmetric Π case:

8.6. Example 2 165

Advanced Quantitative Economics with Python

166 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 167

Advanced Quantitative Economics with Python

asymmetric Π case:

168 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 169

Advanced Quantitative Economics with Python

Only 𝑓1,𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"f1_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:

170 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 171

Advanced Quantitative Economics with Python

172 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

asymmetric Π case:

8.6. Example 2 173

Advanced Quantitative Economics with Python

174 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

Only 𝑓2,𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"f2_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:

8.6. Example 2 175

Advanced Quantitative Economics with Python

176 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 177

Advanced Quantitative Economics with Python

asymmetric Π case:

178 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 179

Advanced Quantitative Economics with Python

Only 𝛼0 (𝑠𝑡 ) depends on 𝑠𝑡 .

run(construct_arrays2, {"α0_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:

180 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 181

Advanced Quantitative Economics with Python

182 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

asymmetric Π case:

8.6. Example 2 183

Advanced Quantitative Economics with Python

184 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

Only 𝜌𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"ρ_vals":[0.5, 0.9]}, state_vec2)

symmetric Π case:

8.6. Example 2 185

Advanced Quantitative Economics with Python

186 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 187

Advanced Quantitative Economics with Python

asymmetric Π case:

188 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 189

Advanced Quantitative Economics with Python

Only 𝜎𝑠𝑡 depends on 𝑠𝑡 .

run(construct_arrays2, {"σ_vals":[0.5, 1.]}, state_vec2)

symmetric Π case:

190 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.6. Example 2 191

Advanced Quantitative Economics with Python

192 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

asymmetric Π case:

8.6. Example 2 193

Advanced Quantitative Economics with Python

194 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

Advanced Quantitative Economics with Python

8.7 More examples

The following lectures describe how Markov jump linear quadratic dynamic programming can be used to extend the
[Barro, 1979] model of optimal tax-smoothing and government debt in several interesting directions
1. How to Pay for a War: Part 1
2. How to Pay for a War: Part 2
3. How to Pay for a War: Part 3

8.7. More examples 195

Advanced Quantitative Economics with Python

196 Chapter 8. Markov Jump Linear Quadratic Dynamic Programming

CHAPTER

NINE

HOW TO PAY FOR A WAR: PART 1

9.1 Overview

This lecture uses the method of Markov jump linear quadratic dynamic programming that is described in lecture
Markov Jump LQ dynamic programming to extend the [Barro, 1979] model of optimal tax-smoothing and government
debt in a particular direction.
This lecture has two sequels that offer further extensions of the Barro model
1. How to Pay for a War: Part 2
2. How to Pay for a War: Part 3
The extensions are modified versions of his 1979 model suggested by [Barro, 1999] and [Barro and McCleary, 2003]).
[Barro, 1979] m is about a government that borrows and lends in order to minimize an intertemporal measure of distortions
caused by taxes.
Technical tractability induced [Barro, 1979] to assume that
• the government trades only one-period risk-free debt, and
• the one-period risk-free interest rate is constant
By using Markov jump linear quadratic dynamic programming we can allow interest rates to move over time in empirically
interesting ways.
Also, by expanding the dimension of the state, we can add a maturity composition decision to the government’s problem.
By doing these two things we extend [Barro, 1979] along lines he suggested in [Barro, 1999] and [Barro and McCleary,
2003]).
[Barro, 1979] assumed
• that a government faces an exogenous sequence of expenditures that it must finance by a tax collection sequence
whose expected present value equals the initial debt it owes plus the expected present value of those expenditures.
∞
• that the government wants to minimize a measure of tax distortions that is proportional to 𝐸0 ∑𝑡=0 𝛽 𝑡 𝑇𝑡2 , where
𝑇𝑡 are total tax collections and 𝐸0 is a mathematical expectation conditioned on time 0 information.
• that the government trades only one asset, a risk-free one-period bond.
• that the gross interest rate on the one-period bond is constant and equal to 𝛽 −1 , the reciprocal of the factor 𝛽 at
which the government discounts future tax distortions.
Barro’s model can be mapped into a discounted linear quadratic dynamic programming problem.
Partly inspired by [Barro, 1999] and [Barro and McCleary, 2003], our generalizations of [Barro, 1979], assume
• that the government borrows or saves in the form of risk-free bonds of maturities 1, 2, … , 𝐻.

197
Advanced Quantitative Economics with Python

• that interest rates on those bonds are time-varying and in particular, governed by a jointly stationary stochastic
process.
Our generalizations are designed to fit within a generalization of an ordinary linear quadratic dynamic programming
problem in which matrices that define the quadratic objective function and the state transition function are time-varying
and stochastic.
This generalization, known as a Markov jump linear quadratic dynamic program, combines
• the computational simplicity of linear quadratic dynamic programming, and
• the ability of finite state Markov chains to represent interesting patterns of random variation.
We want the stochastic time variation in the matrices defining the dynamic programming problem to represent variation
over time in
• interest rates
• default rates
• roll over risks
As described in Markov Jump LQ dynamic programming, the idea underlying Markov jump linear quadratic dynamic
programming is to replace the constant matrices defining a linear quadratic dynamic programming problem with
matrices that are fixed functions of an 𝑁 state Markov chain.
For infinite horizon problems, this leads to 𝑁 interrelated matrix Riccati equations that pin down 𝑁 value functions and
𝑁 linear decision rules, applying to the 𝑁 Markov states.

9.2 Public Finance Questions

[Barro, 1979] is designed to answer questions such as

• Should a government finance an exogenous surge in government expenditures by raising taxes or borrowing?
• How does the answer to that first question depend on the exogenous stochastic process for government expenditures,
for example, on whether the surge in government expenditures can be expected to be temporary or permanent?
[Barro, 1999] and [Barro and McCleary, 2003] are designed to answer more fine-grained questions such as
• What determines whether a government wants to issue short-term or long-term debt?
• How do roll-over risks affect that decision?
• How does the government’s long-short portfolio management decision depend on features of the exogenous stochas-
tic process for government expenditures?
Thus, both the simple and the more fine-grained versions of Barro’s models are ways of precisely formulating the classic
issue of How to pay for a war.
This lecture describes:
• An application of Markov jump LQ dynamic programming to a model in which a government faces exogenous
time-varying interest rates for issuing one-period risk-free debt.
A sequel to this lecture describes applies Markov LQ control to settings in which a government issues risk-free debt of
different maturities.

!pip install --upgrade quantecon

Let’s start with some standard imports:

198 Chapter 9. How to Pay for a War: Part 1

Advanced Quantitative Economics with Python

import quantecon as qe
import numpy as np
import matplotlib.pyplot as plt

9.3 Barro (1979) Model

We begin by solving a version of [Barro, 1979] by mapping it into the original LQ framework.
As mentioned in this lecture, the Barro model is mathematically isomorphic with the LQ permanent income model.
Let
• 𝑇𝑡 denote tax collections
• 𝛽 be a discount factor
• 𝑏𝑡,𝑡+1 be time 𝑡 + 1 goods that at 𝑡 the government promises to deliver to time 𝑡 buyers of one-period government
bonds
• 𝐺𝑡 be government purchases
• 𝑝𝑡,𝑡+1 the number of time 𝑡 goods received per time 𝑡 + 1 goods promised to one-period bond purchasers.
Evidently, 𝑝𝑡,𝑡+1 is inversely related to appropriate corresponding gross interest rates on government debt.
In the spirit of [Barro, 1979], the stochastic process of government expenditures is exogenous.
The government’s problem is to choose a plan for taxation and borrowing {𝑏𝑡+1 , 𝑇𝑡 }∞
𝑡=0 to minimize
∞
𝐸0 ∑ 𝛽 𝑡 𝑇𝑡2
𝑡=0

subject to the constraints

𝑇𝑡 + 𝑝𝑡,𝑡+1 𝑏𝑡,𝑡+1 = 𝐺𝑡 + 𝑏𝑡−1,𝑡

𝐺𝑡 = 𝑈𝑔 𝑧𝑡
𝑧𝑡+1 = 𝐴22 𝑧𝑡 + 𝐶2 𝑤𝑡+1
where 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼)
The variables 𝑇𝑡 , 𝑏𝑡,𝑡+1 are control variables chosen at 𝑡, while 𝑏𝑡−1,𝑡 is an endogenous state variable inherited from the
past at time 𝑡 and 𝑝𝑡,𝑡+1 is an exogenous state variable at time 𝑡.
To begin, we assume that 𝑝𝑡,𝑡+1 is constant (and equal to 𝛽)
• later we will extend the model to allow 𝑝𝑡,𝑡+1 to vary over time
𝑏𝑡−1,𝑡
To map into the LQ framework, we use 𝑥𝑡 = [ ] as the state vector, and 𝑢𝑡 = 𝑏𝑡,𝑡+1 as the control variable.
𝑧𝑡
Therefore, the (𝐴, 𝐵, 𝐶) matrices are defined by the state-transition law:
0 0 1 0
𝑥𝑡+1 = [ ] 𝑥 + [ ] 𝑢𝑡 + [ ] 𝑤𝑡+1
0 𝐴22 𝑡 0 𝐶2

To find the appropriate (𝑅, 𝑄, 𝑊 ) matrices, we note that 𝐺𝑡 and 𝑏𝑡−1,𝑡 can be written as appropriately defined functions
of the current state:

𝐺𝑡 = 𝑆𝐺 𝑥𝑡 , 𝑏𝑡−1,𝑡 = 𝑆1 𝑥𝑡

9.3. Barro (1979) Model 199

Advanced Quantitative Economics with Python

If we define 𝑀𝑡 = −𝑝𝑡,𝑡+1 , and let 𝑆 = 𝑆𝐺 + 𝑆1 , then we can write taxation as a function of the states and control using
the government’s budget constraint:

𝑇𝑡 = 𝑆𝑥𝑡 + 𝑀𝑡 𝑢𝑡

It follows that the (𝑅, 𝑄, 𝑊 ) matrices are implicitly defined by:

𝑇𝑡2 = 𝑥′𝑡 𝑆 ′ 𝑆𝑥𝑡 + 𝑢′𝑡 𝑀𝑡′ 𝑀𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑀𝑡′ 𝑆𝑥𝑡

If we assume that 𝑝𝑡,𝑡+1 = 𝛽, then 𝑀𝑡 ≡ 𝑀 = −𝛽.

In this case, none of the LQ matrices are time varying, and we can use the original LQ framework.
We will implement this constant interest-rate version first, assuming that 𝐺𝑡 follows an AR(1) process:

𝐺𝑡+1 = 𝐺 ̄ + 𝜌𝐺𝑡 + 𝜎𝑤𝑡+1

1
To do this, we set 𝑧𝑡 = [ ], and consequently:
𝐺𝑡

1 0 0
𝐴22 = [ ̄ ] , 𝐶2 = [ ]
𝐺 𝜌 𝜎

# Model parameters
β, Gbar, ρ, σ = 0.95, 5, 0.8, 1

# Basic model matrices

A22 = np.array([[1, 0],
[Gbar, ρ],])

C2 = np.array([[0],
[σ]])

Ug = np.array([[0, 1]])

# LQ framework matrices
A_t = np.zeros((1, 3))
A_b = np.hstack((np.zeros((2, 1)), A22))
A = np.vstack((A_t, A_b))

B = np.zeros((3, 1))
B[0, 0] = 1

C = np.vstack((np.zeros((1, 1)), C2))

Sg = np.hstack((np.zeros((1, 1)), Ug))

S1 = np.zeros((1, 3))
S1[0, 0] = 1
S = S1 + Sg

M = np.array([[-β]])

R = S.T @ S
Q = M.T @ M
W = M.T @ S

# Small penalty on the debt required to implement the no-Ponzi scheme

R[0, 0] = R[0, 0] + 1e-9

200 Chapter 9. How to Pay for a War: Part 1

Advanced Quantitative Economics with Python

We can now create an instance of LQ:

LQBarro = qe.LQ(Q, R, A, B, C=C, N=W, beta=β)

P, F, d = LQBarro.stationary_values()
x0 = np.array([[100, 1, 25]])

We can see the isomorphism by noting that consumption is a martingale in the permanent income model and that taxation
is a martingale in Barro’s model.
We can check this using the 𝐹 matrix of the LQ model.
Because 𝑢𝑡 = −𝐹 𝑥𝑡 , we have

𝑇𝑡 = 𝑆𝑥𝑡 + 𝑀 𝑢𝑡 = (𝑆 − 𝑀 𝐹 )𝑥𝑡

and

𝑇𝑡+1 = (𝑆 − 𝑀 𝐹 )𝑥𝑡+1 = (𝑆 − 𝑀 𝐹 )(𝐴𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑤𝑡+1 ) = (𝑆 − 𝑀 𝐹 )((𝐴 − 𝐵𝐹 )𝑥𝑡 + 𝐶𝑤𝑡+1 )

Therefore, the mathematical expectation of 𝑇𝑡+1 conditional on time 𝑡 information is

𝐸𝑡 𝑇𝑡+1 = (𝑆 − 𝑀 𝐹 )(𝐴 − 𝐵𝐹 )𝑥𝑡

Consequently, taxation is a martingale (𝐸𝑡 𝑇𝑡+1 = 𝑇𝑡 ) if

(𝑆 − 𝑀 𝐹 )(𝐴 − 𝐵𝐹 ) = (𝑆 − 𝑀 𝐹 ),

which holds in this case:

S - M @ F, (S - M @ F) @ (A - B @ F)

(array([[ 0.05000002, 19.79166502, 0.2083334 ]]),

array([[ 0.05000002, 19.79166504, 0.2083334 ]]))

This explains the fanning out of the conditional empirical distribution of taxation across time, computed by simulating
the Barro model many times and averaging over simulated paths:

T = 500
for i in range(250):
x, u, w = LQBarro.compute_sequence(x0, ts_length=T)
plt.plot(list(range(T+1)), ((S - M @ F) @ x)[0, :])
plt.xlabel('Time')
plt.ylabel('Taxation')
plt.show()

9.3. Barro (1979) Model 201

Advanced Quantitative Economics with Python

We can see a similar, but a smoother pattern, if we plot government debt over time.

T = 500
for i in range(250):
x, u, w = LQBarro.compute_sequence(x0, ts_length=T)
plt.plot(list(range(T+1)), x[0, :])
plt.xlabel('Time')
plt.ylabel('Govt Debt')
plt.show()

202 Chapter 9. How to Pay for a War: Part 1

Advanced Quantitative Economics with Python

9.4 Python Class to Solve Markov Jump Linear Quadratic Control

Problems

To implement the extension to the Barro model in which 𝑝𝑡,𝑡+1 varies over time, we must allow the M matrix to be
time-varying.
Our 𝑄 and 𝑊 matrices must also vary over time.
We can solve such a model using the LQMarkov class that solves Markov jump linear quandratic control problems as
described above.
The code for the class can be viewed here.
The class takes lists of matrices that corresponds to 𝑁 Markov states.
The value and policy functions are then found by iterating on a coupled system of matrix Riccati difference equations.
Optimal 𝑃𝑠 , 𝐹𝑠 , 𝑑𝑠 are stored as attributes.
The class also contains a method that simulates a model.

9.4. Python Class to Solve Markov Jump Linear Quadratic Control Problems 203
Advanced Quantitative Economics with Python

9.5 Barro Model with a Time-varying Interest Rate

We can use the above class to implement a version of the Barro model with a time-varying interest rate.
A simple way to extend the model is to allow the interest rate to take two possible values.
We set:
1
𝑝𝑡,𝑡+1 = 𝛽 + 0.02 = 0.97

2
𝑝𝑡,𝑡+1 = 𝛽 − 0.017 = 0.933
Thus, the first Markov state has a low interest rate and the second Markov state has a high interest rate.
We must also specify a transition matrix for the Markov state.
We use:
0.8 0.2
Π=[ ]
0.2 0.8

Here, each Markov state is persistent, and there is are equal chances of moving from one state to the other.
The choice of parameters means that the unconditional expectation of 𝑝𝑡,𝑡+1 is 0.9515, higher than 𝛽(= 0.95).
If we were to set 𝑝𝑡,𝑡+1 = 0.9515 in the version of the model with a constant interest rate, government debt would
explode.

# Create list of matrices that corresponds to each Markov state

Π = np.array([[0.8, 0.2],
[0.2, 0.8]])

As = [A, A]
Bs = [B, B]
Cs = [C, C]
Rs = [R, R]

M1 = np.array([[-β - 0.02]])
M2 = np.array([[-β + 0.017]])

Q1 = M1.T @ M1
Q2 = M2.T @ M2
Qs = [Q1, Q2]
W1 = M1.T @ S
W2 = M2.T @ S
Ws = [W1, W2]

# create Markov Jump LQ DP problem instance

lqm = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm.stationary_values();

The decision rules are now dependent on the Markov state:

lqm.Fs[0]

array([[-0.98437712, 19.20516427, -0.8314215 ]])

204 Chapter 9. How to Pay for a War: Part 1

Advanced Quantitative Economics with Python

lqm.Fs[1]

array([[-1.01434301, 21.5847983 , -0.83851116]])

Simulating a large number of such economies over time reveals interesting dynamics.
Debt tends to stay low and stable but recurrently surges.

T = 2000
x0 = np.array([[1000, 1, 25]])
for i in range(250):
x, u, w, s = lqm.compute_sequence(x0, ts_length=T)
plt.plot(list(range(T+1)), x[0, :])
plt.xlabel('Time')
plt.ylabel('Govt Debt')
plt.show()

9.5. Barro Model with a Time-varying Interest Rate 205

Advanced Quantitative Economics with Python

206 Chapter 9. How to Pay for a War: Part 1

CHAPTER

TEN

HOW TO PAY FOR A WAR: PART 2

10.1 Overview

This lecture presents another application of Markov jump linear quadratic dynamic programming and constitutes a sequel
to an earlier lecture.
We use a method introduced in lecture Markov Jump LQ dynamic programming toimplement suggestions by [Barro, 1999]
and [Barro and McCleary, 2003]) for extending his classic 1979 model of tax smoothing.
[Barro, 1979] model is about a government that borrows and lends in order to help it minimize an intertemporal measure
of distortions caused by taxes.
Technically, [Barro, 1979] model looks a lot like a consumption-smoothing model.
Our generalizations of [Barro, 1979] will also look like souped-up consumption-smoothing models.
Wanting tractability induced [Barro, 1979] to assume that
• the government trades only one-period risk-free debt, and
• the one-period risk-free interest rate is constant
In our earlier lecture, we relaxed the second of these assumptions but not the first.
In particular, we used Markov jump linear quadratic dynamic programming to allow the exogenous interest rate to vary
over time.
In this lecture, we add a maturity composition decision to the government’s problem by expanding the dimension of the
state.
We assume
• that the government borrows or saves in the form of risk-free bonds of maturities 1, 2, … , 𝐻.
• that interest rates on those bonds are time-varying and in particular are governed by a jointly stationary stochastic
process.
In addition to what’s in Anaconda, this lecture deploys the quantecon library:

!pip install --upgrade quantecon

Let’s start with some standard imports:

import quantecon as qe
import numpy as np
import matplotlib.pyplot as plt

207
Advanced Quantitative Economics with Python

10.2 Two example specifications

We’ll describe two possible specifications

• In one, each period the government issues zero-coupon bonds of one- and two-period maturities and redeems them
only when they mature – in this version, the maturity structure of government debt at each date is partly inherited
from the past.
• In the second, the government redesigns the maturity structure of the debt each period.

10.3 One- and Two-period Bonds but No Restructuring

Let
• 𝑇𝑡 denote tax collections
• 𝛽 be a discount factor
• 𝑏𝑡,𝑡+1 be time 𝑡 + 1 goods that the government promises to pay at 𝑡
• 𝑏𝑡,𝑡+2 betime 𝑡 + 2 goods that the government promises to pay at time 𝑡
• 𝐺𝑡 be government purchases
• 𝑝𝑡,𝑡+1 be the number of time 𝑡 goods received per time 𝑡 + 1 goods promised
• 𝑝𝑡,𝑡+2 be the number of time 𝑡 goods received per time 𝑡 + 2 goods promised.
Evidently, 𝑝𝑡,𝑡+1 , 𝑝𝑡,𝑡+2 are inversely related to appropriate corresponding gross interest rates on government debt.
In the spirit of [Barro, 1979], government expenditures are governed by an exogenous stochastic process.
Given initial conditions 𝑏−2,0 , 𝑏−1,0 , 𝑧0 , 𝑖0 , where 𝑖0 is the initial Markov state, the government chooses a contingency
plan for {𝑏𝑡,𝑡+1 , 𝑏𝑡,𝑡+2 , 𝑇𝑡 }∞
𝑡=0 to maximize.

∞
−𝐸0 ∑ 𝛽 𝑡 [𝑇𝑡2 + 𝑐1 (𝑏𝑡,𝑡+1 − 𝑏𝑡,𝑡+2 )2 ]
𝑡=0

subject to the constraints

𝑇𝑡 = 𝐺𝑡 + 𝑏𝑡−2,𝑡 + 𝑏𝑡−1,𝑡 − 𝑝𝑡,𝑡+2 𝑏𝑡,𝑡+2 − 𝑝𝑡,𝑡+1 𝑏𝑡,𝑡+1

𝐺𝑡 = 𝑈𝑔,𝑠𝑡 𝑧𝑡
𝑧𝑡+1 = 𝐴22,𝑠𝑡 𝑧𝑡 + 𝐶2,𝑠𝑡 𝑤𝑡+1
𝑝𝑡,𝑡+1
⎡𝑝 ⎤
⎢ 𝑡,𝑡+2 ⎥
⎢ 𝑈𝑔,𝑠𝑡 ⎥ ∼ functions of Markov state with transition matrix Π
⎢𝐴22,𝑠𝑡 ⎥
⎣ 𝐶2,𝑠𝑡 ⎦

Here
• 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼) and Π𝑖𝑗 is the probability that the Markov state moves from state 𝑖 to state 𝑗 in one period
• 𝑇𝑡 , 𝑏𝑡,𝑡+1 , 𝑏𝑡,𝑡+2 are control variables chosen at time 𝑡
• variables 𝑏𝑡−1,𝑡 , 𝑏𝑡−2,𝑡 are endogenous state variables inherited from the past at time 𝑡
• 𝑝𝑡,𝑡+1 , 𝑝𝑡,𝑡+2 are exogenous state variables at time 𝑡

208 Chapter 10. How to Pay for a War: Part 2

Advanced Quantitative Economics with Python

The parameter 𝑐1 imposes a penalty on the government’s issuing different quantities of one and two-period debt.
This penalty deters the government from taking large “long-short” positions in debt of different maturities.
An example below will show the penalty in action.
As well as extending the model to allow for a maturity decision for government debt, we can also in principle allow the
matrices 𝑈𝑔,𝑠𝑡 , 𝐴22,𝑠𝑡 , 𝐶2,𝑠𝑡 to depend on the Markov state 𝑠𝑡 .
Below, we will often adopt the convention that for matrices appearing in a linear state space, 𝐴𝑡 ≡ 𝐴𝑠𝑡 , 𝐶𝑡 ≡ 𝐶𝑠𝑡 and
so on, so that dependence on 𝑡 is always intermediated through the Markov state 𝑠𝑡 .

10.4 Mapping into an LQ Markov Jump Problem

First, define

𝑏̂𝑡 = 𝑏𝑡−1,𝑡 + 𝑏𝑡−2,𝑡 ,

which is debt due at time 𝑡.

Then define the endogenous part of the state:

𝑏̂𝑡
𝑏̄𝑡 = [ ]
𝑏𝑡−1,𝑡+1

and the complete state vector

𝑏̄
𝑥𝑡 = [ 𝑡 ]
𝑧𝑡

and the control vector

𝑏𝑡,𝑡+1
𝑢𝑡 = [ ]
𝑏𝑡,𝑡+2

The endogenous part of state vector follows the law of motion:

𝑏̂ 0 1 𝑏̂𝑡 1 0 𝑏𝑡,𝑡+1
[ 𝑡+1 ] = [ ][ ]+[ ][ ]
𝑏𝑡,𝑡+2 0 0 𝑏𝑡−1,𝑡+1 0 1 𝑏𝑡,𝑡+2
or

𝑏̄𝑡+1 = 𝐴11 𝑏̄𝑡 + 𝐵1 𝑢𝑡

Define the following functions of the state

𝐺𝑡 = 𝑆𝐺,𝑡 𝑥𝑡 , 𝑏̂𝑡 = 𝑆1 𝑥𝑡

and

𝑀𝑡 = [−𝑝𝑡,𝑡+1 −𝑝𝑡,𝑡+2 ]

where 𝑝𝑡,𝑡+1 is the discount on one period loans in the discrete Markov state at time 𝑡 and 𝑝𝑡,𝑡+2 is the discount on
two-period loans in the discrete Markov state.
Define

𝑆𝑡 = 𝑆𝐺,𝑡 + 𝑆1

10.4. Mapping into an LQ Markov Jump Problem 209

Advanced Quantitative Economics with Python

Note that in discrete Markov state 𝑖

𝑇𝑡 = 𝑀𝑡 𝑢𝑡 + 𝑆𝑡 𝑥𝑡

It follows that

𝑇𝑡2 = 𝑥′𝑡 𝑆𝑡′ 𝑆𝑡 𝑥𝑡 + 𝑢′𝑡 𝑀𝑡′ 𝑀𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑀𝑡′ 𝑆𝑡 𝑥𝑡

𝑇𝑡2 = 𝑥′𝑡 𝑅𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑡 𝑥𝑡

where

𝑅𝑡 = 𝑆𝑡′ 𝑆𝑡 , 𝑄𝑡 = 𝑀𝑡′ 𝑀𝑡 , 𝑊𝑡 = 𝑀𝑡′ 𝑆𝑡

Because the payoff function also includes the penalty parameter on issuing debt of different maturities, we have:

𝑇𝑡2 + 𝑐1 (𝑏𝑡,𝑡+1 − 𝑏𝑡,𝑡+2 )2 = 𝑥′𝑡 𝑅𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑡 𝑥𝑡 + 𝑐1 𝑢′𝑡 𝑄𝑐 𝑢𝑡

1 −1
where 𝑄𝑐 = [ ].
−1 1
Therefore, the appropriate 𝑄 matrix in the Markov jump LQ problem is:

𝑄𝑐𝑡 = 𝑄𝑡 + 𝑐1 𝑄𝑐

The law of motion of the state in all discrete Markov states 𝑖 is

𝑥𝑡+1 = 𝐴𝑡 𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑡 𝑤𝑡+1

where
𝐴11 0 𝐵1 0
𝐴𝑡 = [ ], 𝐵=[ ], 𝐶𝑡 = [ ]
0 𝐴22,𝑡 0 𝐶2,𝑡

Thus, in this problem all the matrices apart from 𝐵 may depend on the Markov state at time 𝑡.
As shown in the previous lecture, when provided with appropriate 𝐴, 𝐵, 𝐶, 𝑅, 𝑄, 𝑊 matrices for each Markov state the
LQMarkov class can solve Markov jump LQ problems.
The function below maps the primitive matrices and parameters from the above two-period model into the matrices that
the LQMarkov class requires:

def LQ_markov_mapping(A22, C2, Ug, p1, p2, c1=0):

"""
Function which takes A22, C2, Ug, p_{t, t+1}, p_{t, t+2} and penalty
parameter c1, and returns the required matrices for the LQMarkov
model: A, B, C, R, Q, W.
This version uses the condensed version of the endogenous state.
"""

# Make sure all matrices can be treated as 2D arrays

A22 = np.atleast_2d(A22)
C2 = np.atleast_2d(C2)
Ug = np.atleast_2d(Ug)
p1 = np.atleast_2d(p1)
(continues on next page)

210 Chapter 10. How to Pay for a War: Part 2

Advanced Quantitative Economics with Python

(continued from previous page)

p2 = np.atleast_2d(p2)

# Find the number of states (z) and shocks (w)

nz, nw = C2.shape

# Create A11, B1, S1, S2, Sg, S matrices

A11 = np.zeros((2, 2))
A11[0, 1] = 1

B1 = np.eye(2)

S1 = np.hstack((np.eye(1), np.zeros((1, nz+1))))

Sg = np.hstack((np.zeros((1, 2)), Ug))
S = S1 + Sg

# Create M matrix
M = np.hstack((-p1, -p2))

# Create A, B, C matrices
A_T = np.hstack((A11, np.zeros((2, nz))))
A_B = np.hstack((np.zeros((nz, 2)), A22))
A = np.vstack((A_T, A_B))

B = np.vstack((B1, np.zeros((nz, 2))))

C = np.vstack((np.zeros((2, nw)), C2))

# Create Q^c matrix

Qc = np.array([[1, -1], [-1, 1]])

# Create R, Q, W matrices

R = S.T @ S
Q = M.T @ M + c1 * Qc
W = M.T @ S

return A, B, C, R, Q, W

With the above function, we can proceed to solve the model in two steps:
1. Use LQ_markov_mapping to map 𝑈𝑔,𝑡 , 𝐴22,𝑡 , 𝐶2,𝑡 , 𝑝𝑡,𝑡+1 , 𝑝𝑡,𝑡+2 into the 𝐴, 𝐵, 𝐶, 𝑅, 𝑄, 𝑊 matrices for each
of the 𝑛 Markov states.
2. Use the LQMarkov class to solve the resulting n-state Markov jump LQ problem.

10.5 Penalty on Different Issues Across Maturities

To implement a simple example of the two-period model, we assume that 𝐺𝑡 follows an AR(1) process:

𝐺𝑡+1 = 𝐺 ̄ + 𝜌𝐺𝑡 + 𝜎𝑤𝑡+1

1
To do this, we set 𝑧𝑡 = [ ], and consequently:
𝐺𝑡

1 0 0
𝐴22 = [ ̄ ] , 𝐶2 = [ ] , 𝑈𝑔 = [0 1]
𝐺 𝜌 𝜎

10.5. Penalty on Different Issues Across Maturities 211

Advanced Quantitative Economics with Python

Therefore, in this example, 𝐴22 , 𝐶2 and 𝑈𝑔 are not time-varying.

We will assume that there are two Markov states, one with a flatter yield curve, and one with a steeper yield curve.
In state 1, prices are:
1 1
𝑝𝑡,𝑡+1 = 𝛽 , 𝑝𝑡,𝑡+2 = 𝛽 2 − 0.02

and in state 2, prices are:

2 2
𝑝𝑡,𝑡+1 = 𝛽 , 𝑝𝑡,𝑡+2 = 𝛽 2 + 0.02

We first solve the model with no penalty parameter on different issuance across maturities, i.e. 𝑐1 = 0.
We specify that the transition matrix for the Markov state is

0.9 0.1
Π=[ ]
0.1 0.9

Thus, each Markov state is persistent, and there is an equal chance of moving from one to the other.

# Model parameters
β, Gbar, ρ, σ, c1 = 0.95, 5, 0.8, 1, 0
p1, p2, p3, p4 = β, β**2 - 0.02, β, β**2 + 0.02

# Basic model matrices

A22 = np.array([[1, 0], [Gbar, ρ] ,])
C_2 = np.array([[0], [σ]])
Ug = np.array([[0, 1]])

A1, B1, C1, R1, Q1, W1 = LQ_markov_mapping(A22, C_2, Ug, p1, p2, c1)
A2, B2, C2, R2, Q2, W2 = LQ_markov_mapping(A22, C_2, Ug, p3, p4, c1)

# Small penalties on debt required to implement no-Ponzi scheme

R1[0, 0] = R1[0, 0] + 1e-9
R2[0, 0] = R2[0, 0] + 1e-9

# Construct lists of matrices correspond to each state

As = [A1, A2]
Bs = [B1, B2]
Cs = [C1, C2]
Rs = [R1, R2]
Qs = [Q1, Q2]
Ws = [W1, W2]

Π = np.array([[0.9, 0.1],
[0.1, 0.9]])

# Construct and solve the model using the LQMarkov class

lqm = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm.stationary_values()

# Simulate the model

x0 = np.array([[100, 50, 1, 10]])
x, u, w, t = lqm.compute_sequence(x0, ts_length=300)

# Plot of one and two-period debt issuance

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(u[0, :])
(continues on next page)

212 Chapter 10. How to Pay for a War: Part 2

Advanced Quantitative Economics with Python

(continued from previous page)

ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(u[1, :])
ax2.set_title('Two-period debt issuance')
ax2.set_xlabel('Time')
plt.show()

The above simulations show that when no penalty is imposed on different issuances across maturities, the government has
an incentive to take large “long-short” positions in debt of different maturities.
To prevent such outcomes, we set 𝑐1 = 0.01.
This penalty is big enough to motivate the government to issue positive quantities of both one- and two-period debt:

# Put small penalty on different issuance across maturities

c1 = 0.01

A1, B1, C1, R1, Q1, W1 = LQ_markov_mapping(A22, C_2, Ug, p1, p2, c1)
A2, B2, C2, R2, Q2, W2 = LQ_markov_mapping(A22, C_2, Ug, p3, p4, c1)

# Small penalties on debt required to implement no-Ponzi scheme

R1[0, 0] = R1[0, 0] + 1e-9
R2[0, 0] = R2[0, 0] + 1e-9

# Construct lists of matrices

As = [A1, A2]
Bs = [B1, B2]
Cs = [C1, C2]
Rs = [R1, R2]
Qs = [Q1, Q2]
Ws = [W1, W2]

# Construct and solve the model using the LQMarkov class

lqm2 = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm2.stationary_values()

# Simulate the model

x, u, w, t = lqm2.compute_sequence(x0, ts_length=300)

# Plot of one and two-period debt issuance

(continues on next page)

10.5. Penalty on Different Issues Across Maturities 213

Advanced Quantitative Economics with Python

(continued from previous page)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(u[0, :])
ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(u[1, :])
ax2.set_title('Two-period debt issuance')
ax2.set_xlabel('Time')
plt.show()

10.6 A Model with Restructuring

We now alter two features of the previous model:

1. The maximum horizon of government debt is now extended to a general H periods.
2. The government is able to redesign the maturity structure of debt every period.
We impose a cost on adjusting issuance of each maturity by amending the payoff function to become:
𝐻−1
𝑡−1
𝑇𝑡2 + ∑ 𝑐2 (𝑏𝑡+𝑗 𝑡
− 𝑏𝑡+𝑗+1 )2
𝑗=0

The government’s budget constraint is now:

𝐻 𝐻−1
𝑡
𝑇𝑡 + ∑ 𝑝𝑡,𝑡+𝑗 𝑏𝑡+𝑗 𝑡−1
= 𝑏𝑡𝑡−1 + ∑ 𝑝𝑡,𝑡+𝑗 𝑏𝑡+𝑗 + 𝐺𝑡
𝑗=1 𝑗=1

To map this into the Markov Jump LQ framework, we define state and control variables.
Let:
𝑏𝑡𝑡−1 𝑡
𝑏𝑡+1
⎡ 𝑏𝑡−1 ⎤ ⎡ 𝑏𝑡 ⎤
𝑏̄𝑡 = ⎢ 𝑡+1 ⎥ , 𝑢𝑡 = ⎢ 𝑡+2 ⎥
⎢ ⋮ ⎥ ⎢ ⋮ ⎥
𝑡−1 𝑡
𝑏
⎣ 𝑡+𝐻−1 ⎦ ⎣𝑏𝑡+𝐻 ⎦

Thus, 𝑏̄𝑡 is the endogenous state (debt issued last period) and 𝑢𝑡 is the control (debt issued today).
As before, we will also have the exogenous state 𝑧𝑡 , which determines government spending.

214 Chapter 10. How to Pay for a War: Part 2

Advanced Quantitative Economics with Python

Therefore, the full state is:

𝑏̄
𝑥𝑡 = [ 𝑡 ]
𝑧𝑡

We also define a vector 𝑝𝑡 that contains the time 𝑡 price of goods in period 𝑡 + 𝑗:

𝑝𝑡,𝑡+1
⎡𝑝 ⎤
𝑝𝑡 = ⎢ 𝑡,𝑡+2 ⎥
⎢ ⋮ ⎥
⎣𝑝𝑡,𝑡+𝐻 ⎦

Finally, we define three useful matrices 𝑆𝑠 , 𝑆𝑥 , 𝑆𝑥̃ :

𝑝𝑡,𝑡+1 1 0 0 ⋯ 0
⎡ 𝑝 ⎤ ⎡0 1 0 ⋯ 0⎤
⎢ 𝑡,𝑡+2 ⎥ = 𝑆𝑠 𝑝𝑡 where 𝑆𝑠 = ⎢ ⎥
⎢ ⋮ ⎥ ⎢⋮ ⋱ ⎥
⎣𝑝𝑡,𝑡+𝐻−1 ⎦ ⎣0 0 ⋯ 1 0⎦
𝑡−1
𝑏𝑡+1 0 1 0 ⋯ 0
⎡ 𝑏𝑡−1 ⎤ ⎡0 0 1 ⋯ 0⎤
⎢ 𝑡+2 ̄
⎥ = 𝑆𝑥 𝑏𝑡 where 𝑆𝑥 = ⎢ ⎥
⎢ ⋮ ⎥ ⎢⋮ ⋱ ⎥
𝑡−1
⎣𝑏𝑡+𝑇 −1 ⎦ ⎣0 0 ⋯ 0 1⎦

𝑏𝑡𝑡−1 = 𝑆𝑥̃ 𝑏̄𝑡 where 𝑆𝑥̃ = [1 0 0 ⋯ 0]

In terms of dimensions, the first two matrices defined above are (𝐻 − 1) × 𝐻.
The last is 1 × 𝐻
We can now write the government’s budget constraint in matrix notation.
We can rearrange the government budget constraint to become
𝐻−1 𝐻
𝑡
𝑇𝑡 = 𝑏𝑡𝑡−1 + ∑ 𝑝𝑡+𝑗 𝑡−1
𝑏𝑡+𝑗 𝑡
+ 𝐺𝑡 − ∑ 𝑝𝑡+𝑗 𝑡
𝑏𝑡+𝑗
𝑗=1 𝑗=1

𝑇𝑡 = 𝑆𝑥̃ 𝑏̄𝑡 + (𝑆𝑠 𝑝𝑡 ) ⋅ (𝑆𝑥 𝑏̄𝑡 ) + 𝑈𝑔 𝑧𝑡 − 𝑝𝑡 ⋅ 𝑢𝑡

To express 𝑇𝑡 as a function of the full state, let

𝑇𝑡 = [(𝑆𝑥̃ + 𝑝𝑡′ 𝑆𝑠′ 𝑆𝑥 ) 𝑈 𝑔] 𝑥𝑡 − 𝑝𝑡′ 𝑢𝑡

To simplify the notation, let 𝑆𝑡 = [(𝑆𝑥̃ + 𝑝𝑡 ’𝑆𝑠 ’𝑆𝑥 ) 𝑈 𝑔].

Then

𝑇𝑡 = 𝑆𝑡 𝑥𝑡 − 𝑝𝑡′ 𝑢𝑡

Therefore

𝑇𝑡2 = 𝑥′𝑡 𝑅𝑡 𝑥𝑡 + 𝑢′𝑡 𝑄𝑡 𝑢𝑡 + 2𝑢′𝑡 𝑊𝑡 𝑥𝑡

where

𝑅𝑡 = 𝑆𝑡′ 𝑆𝑡 , 𝑄𝑡 = 𝑝𝑡 𝑝𝑡′ , 𝑊𝑡 = −𝑝𝑡 𝑆𝑡

10.6. A Model with Restructuring 215

Advanced Quantitative Economics with Python

where to economize on notation we adopt the convention that for the linear state matrices 𝑅𝑡 ≡ 𝑅𝑠𝑡 , 𝑄𝑡 ≡ 𝑊𝑠𝑡 and so
on.
We’ll use this convention for the linear state matrices 𝐴, 𝐵, 𝑊 and so on below.
Because the payoff function also includes the penalty parameter for rescheduling, we have:
𝐻−1
𝑇𝑡2 + ∑ 𝑐2 (𝑏𝑡+𝑗
𝑡−1 𝑡
− 𝑏𝑡+𝑗+1 )2 = 𝑇𝑡2 + 𝑐2 (𝑏̄𝑡 − 𝑢𝑡 )′ (𝑏̄𝑡 − 𝑢𝑡 )
𝑗=0

Because the complete state is 𝑥𝑡 and not 𝑏̄𝑡 , we rewrite this as:

𝑇𝑡2 + 𝑐2 (𝑆𝑐 𝑥𝑡 − 𝑢𝑡 )′ (𝑆𝑐 𝑥𝑡 − 𝑢𝑡 )

where 𝑆𝑐 = [𝐼 0]
Multiplying this out gives:

𝑇𝑡2 + 𝑐2 𝑥′𝑡 𝑆𝑐′ 𝑆𝑐 𝑥𝑡 − 2𝑐2 𝑢′𝑡 𝑆𝑐 𝑥𝑡 + 𝑐2 𝑢′𝑡 𝑢𝑡

Therefore, with the cost term, we must amend our 𝑅, 𝑄, 𝑊 matrices as follows:

𝑅𝑡𝑐 = 𝑅𝑡 + 𝑐2 𝑆𝑐′ 𝑆𝑐

𝑄𝑐𝑡 = 𝑄𝑡 + 𝑐2 𝐼

𝑊𝑡𝑐 = 𝑊𝑡 − 𝑐2 𝑆𝑐
To finish mapping into the Markov jump LQ setup, we need to construct the law of motion for the full state.
This is simpler than in the previous setup, as we now have 𝑏̄𝑡+1 = 𝑢𝑡 .
Therefore:
𝑏̄𝑡+1
𝑥𝑡+1 ≡ [ ] = 𝐴𝑡 𝑥𝑡 + 𝐵𝑢𝑡 + 𝐶𝑡 𝑤𝑡+1
𝑧𝑡+1

where
0 0 𝐼 0
𝐴𝑡 = [ ], 𝐵 = [ ], 𝐶=[ ]
0 𝐴22,𝑡 0 𝐶2,𝑡

This completes the mapping into a Markov jump LQ problem.

10.7 Restructuring as a Markov Jump Linear Quadratic Control Prob-

lem

We can define a function that maps the primitives of the model with restructuring into the matrices required by the
LQMarkov class:

def LQ_markov_mapping_restruct(A22, C2, Ug, T, p_t, c=0):

"""
Function which takes A22, C2, T, p_t, c and returns the
required matrices for the LQMarkov model: A, B, C, R, Q, W
Note, p_t should be a T by 1 matrix
(continues on next page)

216 Chapter 10. How to Pay for a War: Part 2

Advanced Quantitative Economics with Python

(continued from previous page)

c is the rescheduling cost (a scalar)
This version uses the condensed version of the endogenous state
"""

# Make sure all matrices can be treated as 2D arrays

A22 = np.atleast_2d(A22)
C2 = np.atleast_2d(C2)
Ug = np.atleast_2d(Ug)
p_t = np.atleast_2d(p_t)

# Find the number of states (z) and shocks (w)

nz, nw = C2.shape

# Create Sx, tSx, Ss, S_t matrices (tSx stands for \tilde S_x)
Ss = np.hstack((np.eye(T-1), np.zeros((T-1, 1))))
Sx = np.hstack((np.zeros((T-1, 1)), np.eye(T-1)))
tSx = np.zeros((1, T))
tSx[0, 0] = 1

S_t = np.hstack((tSx + p_t.T @ Ss.T @ Sx, Ug))

# Create A, B, C matrices
A_T = np.hstack((np.zeros((T, T)), np.zeros((T, nz))))
A_B = np.hstack((np.zeros((nz, T)), A22))
A = np.vstack((A_T, A_B))

B = np.vstack((np.eye(T), np.zeros((nz, T))))

C = np.vstack((np.zeros((T, nw)), C2))

# Create cost matrix Sc

Sc = np.hstack((np.eye(T), np.zeros((T, nz))))

# Create R_t, Q_t, W_t matrices

R_c = S_t.T @ S_t + c * Sc.T @ Sc

Q_c = p_t @ p_t.T + c * np.eye(T)
W_c = -p_t @ S_t - c * Sc

return A, B, C, R_c, Q_c, W_c

10.7.1 Example with Restructuring

As an example let 𝐻 = 3.
Assume that there are two Markov states, one with a flatter yield curve, the other with a steeper yield curve.
In state 1, prices are:
1 1 1
𝑝𝑡,𝑡+1 = 0.9695 , 𝑝𝑡,𝑡+2 = 0.902 , 𝑝𝑡,𝑡+3 = 0.8369

and in state 2, prices are:

2 2 2
𝑝𝑡,𝑡+1 = 0.9295 , 𝑝𝑡,𝑡+2 = 0.902 , 𝑝𝑡,𝑡+3 = 0.8769

We specify the same transition matrix and 𝐺𝑡 process that we used earlier.

10.7. Restructuring as a Markov Jump Linear Quadratic Control Problem 217

Advanced Quantitative Economics with Python

# New model parameters

H = 3
p1 = np.array([[0.9695], [0.902], [0.8369]])
p2 = np.array([[0.9295], [0.902], [0.8769]])
Pi = np.array([[0.9, 0.1], [0.1, 0.9]])

# Put penalty on different issuance across maturities

c2 = 0.5

A1, B1, C1, R1, Q1, W1 = LQ_markov_mapping_restruct(A22, C_2, Ug, H, p1, c2)
A2, B2, C2, R2, Q2, W2 = LQ_markov_mapping_restruct(A22, C_2, Ug, H, p2, c2)

# Small penalties on debt required to implement no-Ponzi scheme

R1[0, 0] = R1[0, 0] + 1e-9
R1[1, 1] = R1[1, 1] + 1e-9
R1[2, 2] = R1[2, 2] + 1e-9
R2[0, 0] = R2[0, 0] + 1e-9
R2[1, 1] = R2[1, 1] + 1e-9
R2[2, 2] = R2[2, 2] + 1e-9

# Construct lists of matrices

As = [A1, A2]
Bs = [B1, B2]
Cs = [C1, C2]
Rs = [R1, R2]
Qs = [Q1, Q2]
Ws = [W1, W2]

# Construct and solve the model using the LQMarkov class

lqm3 = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)
lqm3.stationary_values()

x0 = np.array([[5000, 5000, 5000, 1, 10]])

x, u, w, t = lqm3.compute_sequence(x0, ts_length=300)

# Plots of different maturities debt issuance

fig, (ax1, ax2, ax3, ax4) = plt.subplots(1, 4, figsize=(11, 3))

ax1.plot(u[0, :])
ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(u[1, :])
ax2.set_title('Two-period debt issuance')
ax2.set_xlabel('Time')
ax3.plot(u[2, :])
ax3.set_title('Three-period debt issuance')
ax3.set_xlabel('Time')
ax4.plot(u[0, :] + u[1, :] + u[2, :])
ax4.set_title('Total debt issuance')
ax4.set_xlabel('Time')
plt.tight_layout()
plt.show()

218 Chapter 10. How to Pay for a War: Part 2

Advanced Quantitative Economics with Python

# Plot share of debt issuance that is short-term

fig, ax = plt.subplots()
ax.plot((u[0, :] / (u[0, :] + u[1, :] + u[2, :])))
ax.set_title('One-period debt issuance share')
ax.set_xlabel('Time')
plt.show()

10.7. Restructuring as a Markov Jump Linear Quadratic Control Problem 219

Advanced Quantitative Economics with Python

220 Chapter 10. How to Pay for a War: Part 2

CHAPTER

ELEVEN

HOW TO PAY FOR A WAR: PART 3

11.1 Overview

This lecture presents another application of Markov jump linear quadratic dynamic programming and constitutes a sequel
to an earlier lecture.
We again use a method introduced in lecture Markov Jump LQ dynamic programming to implement some ideas of [Barro,
1999] and [Barro and McCleary, 2003]) that extend the classic [Barro, 1979] model of tax smoothing.
[Barro, 1979] is about a government that borrows and lends in order to help it minimize an intertemporal measure of
distortions caused by taxes.
Technically, [Barro, 1979] looks a lot like a consumption-smoothing model.
Our generalization will also look like a souped-up consumption-smoothing model.
In this lecture, we describe a tax-smoothing problem of a government that faces roll-over risk.
In addition to what’s in Anaconda, this lecture deploys the quantecon library:

!pip install --upgrade quantecon

Let’s start with some standard imports:

import quantecon as qe
import numpy as np
import matplotlib.pyplot as plt

11.2 Roll-Over Risk

Let 𝑇𝑡 denote tax collections, 𝛽 a discount factor, 𝑏𝑡,𝑡+1 time 𝑡 + 1 goods that the government promises to pay at 𝑡, 𝐺𝑡
𝑡
government purchases, 𝑝𝑡+1 the number of time 𝑡 goods received per time 𝑡 + 1 goods promised.
The stochastic process of government expenditures is exogenous.
The government’s problem is to choose a plan for borrowing and tax collections {𝑏𝑡+1 , 𝑇𝑡 }∞
𝑡=0 to minimize

∞
𝐸0 ∑ 𝛽 𝑡 𝑇𝑡2
𝑡=0

subject to the constraints

𝑡
𝑇𝑡 + 𝑝𝑡+1 𝑏𝑡,𝑡+1 = 𝐺𝑡 + 𝑏𝑡−1,𝑡

221
Advanced Quantitative Economics with Python

𝐺𝑡 = 𝑈𝑔,𝑡 𝑧𝑡

𝑧𝑡+1 = 𝐴22,𝑡 𝑧𝑡 + 𝐶2,𝑡 𝑤𝑡+1

where 𝑤𝑡+1 ∼ 𝑁 (0, 𝐼).
Let
• 𝑇𝑡 , 𝑏𝑡,𝑡+1 be controls chosen at 𝑡
• 𝑏𝑡−1,𝑡 be an endogenous state variable inherited from the past at time 𝑡
𝑡
• 𝑝𝑡+1 be an exogenous price at time 𝑡.
This is the same set-up as used in this lecture.
We will consider a situation in which the government faces “roll-over risk”.
Specifically, we shut down the government’s ability to borrow in one of the Markov states.

11.3 A Dead End

𝑡
A first thought for how to implement this might be to allow 𝑝𝑡+1 to vary over time with:
𝑡
𝑝𝑡+1 =𝛽

in Markov state 1 and

𝑡
𝑝𝑡+1 =0

in Markov state 2.
Consequently, in the second Markov state, the government is unable to borrow, and the budget constraint becomes 𝑇𝑡 =
𝐺𝑡 + 𝑏𝑡−1,𝑡 .
However, if this is the only adjustment we make in our linear-quadratic model, the government will not set 𝑏𝑡,𝑡+1 = 0,
which is the outcome we want to express roll-over risk in period 𝑡.
Instead, the government would have an incentive to set 𝑏𝑡,𝑡+1 to a large negative number in state 2 – it would accumulate
large amounts of assets to bring into period 𝑡 + 1 because that is cheap
• Riccati equations will tell us this
Thus, we must represent “roll-over risk” some other way.

11.4 Better Representation of Roll-Over Risk

To force the government to set 𝑏𝑡,𝑡+1 = 0, we can instead extend the model to have four Markov states:
1. Good today, good yesterday
2. Good today, bad yesterday
3. Bad today, good yesterday
4. Bad today, bad yesterday

222 Chapter 11. How to Pay for a War: Part 3

Advanced Quantitative Economics with Python

where good is a state in which effectively the government can issue debt and bad is a state in which effectively the
government can’t issue debt.
We’ll explain what effectively means shortly.
We now set
𝑡
𝑝𝑡+1 =𝛽

in all states.
In addition – and this is important because it defines what we mean by effectively – we put a large penalty on the 𝑏𝑡−1,𝑡
element of the state vector in states 2 and 4.
This will prevent the government from wishing to issue any debt in states 3 or 4 because it would experience a large
penalty from doing so in the next period.
The transition matrix for this formulation is:
0.95 0 0.05 0
⎡0.95 0 0.05 0 ⎤
Π=⎢ ⎥
⎢ 0 0.9 0 0.1⎥
⎣ 0 0.9 0 0.1⎦
This transition matrix ensures that the Markov state cannot move, for example, from state 3 to state 1.
Because state 3 is “bad today”, the next period cannot have “good yesterday”.

# Model parameters
β, Gbar, ρ, σ = 0.95, 5, 0.8, 1

# Basic model matrices

A22 = np.array([[1, 0], [Gbar, ρ], ])
C2 = np.array([[0], [σ]])
Ug = np.array([[0, 1]])

# LQ framework matrices
A_t = np.zeros((1, 3))
A_b = np.hstack((np.zeros((2, 1)), A22))
A = np.vstack((A_t, A_b))

B = np.zeros((3, 1))
B[0, 0] = 1

C = np.vstack((np.zeros((1, 1)), C2))

Sg = np.hstack((np.zeros((1, 1)), Ug))

S1 = np.zeros((1, 3))
S1[0, 0] = 1
S = S1 + Sg

R = S.T @ S

# Large penalty on debt in R2 to prevent borrowing in a bad state

R1 = np.copy(R)
R2 = np.copy(R)
R1[0, 0] = R[0, 0] + 1e-9
R2[0, 0] = R[0, 0] + 1e12

M = np.array([[-β]])
(continues on next page)

11.4. Better Representation of Roll-Over Risk 223

Advanced Quantitative Economics with Python

(continued from previous page)

Q = M.T @ M
W = M.T @ S

Π = np.array([[0.95, 0, 0.05, 0],

[0.95, 0, 0.05, 0],
[0, 0.9, 0, 0.1],
[0, 0.9, 0, 0.1]])

# Construct lists of matrices that correspond to each state

As = [A, A, A, A]
Bs = [B, B, B, B]
Cs = [C, C, C, C]
Rs = [R1, R2, R1, R2]
Qs = [Q, Q, Q, Q]
Ws = [W, W, W, W]

lqm = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)

lqm.stationary_values();

Using the same process for 𝐺𝑡 as in this lecture, we shall simulate our model with roll-over risk.
𝑡
When 𝑝𝑡+1 = 𝛽 government debt fluctuates around zero.
The spikes in the tax collection series indicate periods when the government is unable to access financial markets:
• positive spikes occur when debt is positive and the government must urgently raise tax revenues now
Negative spikes occur when the government has positive asset holdings.
An inability to use financial markets in the next period means that the government uses those assets to lower taxation
today.

x0 = np.array([[0, 1, 25]])
T = 300
x, u, w, state = lqm.compute_sequence(x0, ts_length=T)

# Calculate taxation each period from the budget constraint and the Markov state
tax = np.zeros([T, 1])
for i in range(T):
tax[i, :] = S @ x[:, i] + M @ u[:, i]

# Plot of debt issuance and taxation

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 3))
ax1.plot(x[0, :])
ax1.set_title('One-period debt issuance')
ax1.set_xlabel('Time')
ax2.plot(tax)
ax2.set_title('Taxation')
ax2.set_xlabel('Time')
plt.show()

224 Chapter 11. How to Pay for a War: Part 3

Advanced Quantitative Economics with Python

We can adjust parameters so that, rather than debt fluctuating around zero, the government is a debtor in every period
that it can borrow.
𝑡
To accomplish this, we simply raise 𝑝𝑡+1 to 𝛽 + 0.02 = 0.97.

M = np.array([[-β - 0.02]])

Q = M.T @ M
W = M.T @ S

# Construct lists of matrices

As = [A, A, A, A]
Bs = [B, B, B, B]
Cs = [C, C, C, C]
Rs = [R1, R2, R1, R2]
Qs = [Q, Q, Q, Q]
Ws = [W, W, W, W]

lqm2 = qe.LQMarkov(Π, Qs, Rs, As, Bs, Cs=Cs, Ns=Ws, beta=β)

x, u, w, state = lqm2.compute_sequence(x0, ts_length=T)

# Calculate taxation each period from the budget constraint and the
# Markov state
tax = np.zeros([T, 1])
for i in range(T):
tax[i, :] = S @ x[:, i] + M @ u[:, i]

# Plot of debt issuance and taxation

11.4. Better Representation of Roll-Over Risk 225

Advanced Quantitative Economics with Python

With a lower interest rate, the government has an incentive to increase debt over time.
However, with “roll-over risk”, debt is recurrently reset to zero and tax collections spike up.
In this model, high costs of a “sudden stop” make the government wary about letting its debt get too high.

226 Chapter 11. How to Pay for a War: Part 3

CHAPTER

TWELVE

OPTIMAL TAXATION IN AN LQ ECONOMY

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

12.1 Overview

In this lecture, we study optimal fiscal policy in a linear quadratic setting.

We modify a model of Robert Lucas and Nancy Stokey [Lucas and Stokey, 1983] so that convenient formulas for solving
linear-quadratic models can be applied.
The economy consists of a representative household and a benevolent government.
The government finances an exogenous stream of government purchases with state-contingent loans and a linear tax on
labor income.
A linear tax is sometimes called a flat-rate tax.
The household maximizes utility by choosing paths for consumption and labor, taking prices and the government’s tax
rate and borrowing plans as given.
Maximum attainable utility for the household depends on the government’s tax and borrowing plans.
The Ramsey problem [Ramsey, 1927] is to choose tax and borrowing plans that maximize the household’s welfare, taking
the household’s optimizing behavior as given.
There is a large number of competitive equilibria indexed by different government fiscal policies.
The Ramsey planner chooses the best competitive equilibrium.
We want to study the dynamics of tax rates, tax revenues, government debt under a Ramsey plan.
Because the Lucas and Stokey model features state-contingent government debt, the government debt dynamics differ
substantially from those in a model of Robert Barro [Barro, 1979].
The treatment given here closely follows this manuscript, prepared by Thomas J. Sargent and Francois R. Velde.
We cover only the key features of the problem in this lecture, leaving you to refer to that source for additional results and
intuition.
We’ll need the following imports:

import sys
import numpy as np
import matplotlib.pyplot as plt
(continues on next page)

227
Advanced Quantitative Economics with Python

(continued from previous page)

from numpy import sqrt, eye, zeros, cumsum
from numpy.random import randn
import scipy.linalg
from collections import namedtuple
from quantecon import nullspace, mc_sample_path, var_quadratic_sum

12.1.1 Model Features

• Linear quadratic (LQ) model

• Representative household
• Stochastic dynamic programming over an infinite horizon
• Distortionary taxation

12.2 The Ramsey Problem

We begin by outlining the key assumptions regarding technology, households and the government sector.

12.2.1 Technology

Labor can be converted one-for-one into a single, non-storable consumption good.

In the usual spirit of the LQ model, the amount of labor supplied in each period is unrestricted.
This is unrealistic, but helpful when it comes to solving the model.
Realistic labor supply can be induced by suitable parameter values.

12.2.2 Households

Consider a representative household who chooses a path {ℓ𝑡 , 𝑐𝑡 } for labor and consumption to maximize

1 ∞
−𝔼 ∑ 𝛽 𝑡 [(𝑐𝑡 − 𝑏𝑡 )2 + ℓ𝑡2 ] (12.1)
2 𝑡=0

subject to the budget constraint

∞
𝔼 ∑ 𝛽 𝑡 𝑝𝑡0 [𝑑𝑡 + (1 − 𝜏𝑡 )ℓ𝑡 + 𝑠𝑡 − 𝑐𝑡 ] = 0 (12.2)
𝑡=0

Here
• 𝛽 is a discount factor in (0, 1).
• 𝑝𝑡0 is a scaled Arrow-Debreu price at time 0 of history contingent goods at time 𝑡 + 𝑗.
• 𝑏𝑡 is a stochastic preference parameter.
• 𝑑𝑡 is an endowment process.
• 𝜏𝑡 is a flat tax rate on labor income.

228 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

• 𝑠𝑡 is a promised time-𝑡 coupon payment on debt issued by the government.

The scaled Arrow-Debreu price 𝑝𝑡0 is related to the unscaled Arrow-Debreu price as follows.
If we let 𝜋𝑡0 (𝑥𝑡 ) denote the probability (density) of a history 𝑥𝑡 = [𝑥𝑡 , 𝑥𝑡−1 , … , 𝑥0 ] of the state 𝑥𝑡 , then the Arrow-Debreu
time 0 price of a claim on one unit of consumption at date 𝑡, history 𝑥𝑡 would be

𝛽 𝑡 𝑝𝑡0
𝜋𝑡0 (𝑥𝑡 )

Thus, our scaled Arrow-Debreu price is the ordinary Arrow-Debreu price multiplied by the discount factor 𝛽 𝑡 and divided
by an appropriate probability.
The budget constraint (12.2) requires that the present value of consumption be restricted to equal the present value of
endowments, labor income and coupon payments on bond holdings.

12.2.3 Government

The government imposes a linear tax on labor income, fully committing to a stochastic path of tax rates at time zero.
The government also issues state-contingent debt.
Given government tax and borrowing plans, we can construct a competitive equilibrium with distorting government taxes.
Among all such competitive equilibria, the Ramsey plan is the one that maximizes the welfare of the representative
consumer.

12.2.4 Exogenous Variables

Endowments, government expenditure, the preference shock process 𝑏𝑡 , and promised coupon payments on initial gov-
ernment debt 𝑠𝑡 are all exogenous, and given by
• 𝑑𝑡 = 𝑆𝑑 𝑥𝑡
• 𝑔𝑡 = 𝑆𝑔 𝑥𝑡
• 𝑏𝑡 = 𝑆𝑏 𝑥𝑡
• 𝑠𝑡 = 𝑆𝑠 𝑥𝑡
The matrices 𝑆𝑑 , 𝑆𝑔 , 𝑆𝑏 , 𝑆𝑠 are primitives and {𝑥𝑡 } is an exogenous stochastic process taking values in ℝ𝑘 .
We consider two specifications for {𝑥𝑡 }.
1. Discrete case: {𝑥𝑡 } is a discrete state Markov chain with transition matrix 𝑃 .
2. VAR case: {𝑥𝑡 } obeys 𝑥𝑡+1 = 𝐴𝑥𝑡 + 𝐶𝑤𝑡+1 where {𝑤𝑡 } is independent zero-mean Gaussian with identify
covariance matrix.

12.2.5 Feasibility

The period-by-period feasibility restriction for this economy is

𝑐𝑡 + 𝑔𝑡 = 𝑑𝑡 + ℓ𝑡 (12.3)

A labor-consumption process {ℓ𝑡 , 𝑐𝑡 } is called feasible if (12.3) holds for all 𝑡.

12.2. The Ramsey Problem 229

Advanced Quantitative Economics with Python

12.2.6 Government Budget Constraint

Where 𝑝𝑡0 is again a scaled Arrow-Debreu price, the time zero government budget constraint is
∞
𝔼 ∑ 𝛽 𝑡 𝑝𝑡0 (𝑠𝑡 + 𝑔𝑡 − 𝜏𝑡 ℓ𝑡 ) = 0 (12.4)
𝑡=0

12.2.7 Equilibrium

An equilibrium is a feasible allocation {ℓ𝑡 , 𝑐𝑡 }, a sequence of prices {𝑝𝑡0 }, and a tax system {𝜏𝑡 } such that
1. The allocation {ℓ𝑡 , 𝑐𝑡 } is optimal for the household given {𝑝𝑡0 } and {𝜏𝑡 }.
2. The government’s budget constraint (12.4) is satisfied.
The Ramsey problem is to choose the equilibrium {ℓ𝑡 , 𝑐𝑡 , 𝜏𝑡 , 𝑝𝑡0 } that maximizes the household’s welfare.
If {ℓ𝑡 , 𝑐𝑡 , 𝜏𝑡 , 𝑝𝑡0 } solves the Ramsey problem, then {𝜏𝑡 } is called the Ramsey plan.
The solution procedure we adopt is
1. Use the first-order conditions from the household problem to pin down prices and allocations given {𝜏𝑡 }.
2. Use these expressions to rewrite the government budget constraint (12.4) in terms of exogenous variables and
allocations.
3. Maximize the household’s objective function (12.1) subject to the constraint constructed in step 2 and the feasibility
constraint (12.3).
The solution to this maximization problem pins down all quantities of interest.

12.2.8 Solution

Step one is to obtain the first-conditions for the household’s problem, taking taxes and prices as given.
Letting 𝜇 be the Lagrange multiplier on (12.2), the first-order conditions are 𝑝𝑡0 = (𝑐𝑡 − 𝑏𝑡 )/𝜇 and ℓ𝑡 = (𝑐𝑡 − 𝑏𝑡 )(1 − 𝜏𝑡 ).
Rearranging and normalizing at 𝜇 = 𝑏0 − 𝑐0 , we can write these conditions as

𝑏𝑡 − 𝑐 𝑡 ℓ𝑡
𝑝𝑡0 = and 𝜏𝑡 = 1 − (12.5)
𝑏0 − 𝑐 0 𝑏𝑡 − 𝑐 𝑡

Substituting (12.5) into the government’s budget constraint (12.4) yields

∞
𝔼 ∑ 𝛽 𝑡 [(𝑏𝑡 − 𝑐𝑡 )(𝑠𝑡 + 𝑔𝑡 − ℓ𝑡 ) + ℓ𝑡2 ] = 0 (12.6)
𝑡=0

The Ramsey problem now amounts to maximizing (12.1) subject to (12.6) and (12.3).
The associated Lagrangian is
∞
1
ℒ = 𝔼 ∑ 𝛽 𝑡 {− [(𝑐𝑡 − 𝑏𝑡 )2 + ℓ𝑡2 ] + 𝜆 [(𝑏𝑡 − 𝑐𝑡 )(ℓ𝑡 − 𝑠𝑡 − 𝑔𝑡 ) − ℓ𝑡2 ] + 𝜇𝑡 [𝑑𝑡 + ℓ𝑡 − 𝑐𝑡 − 𝑔𝑡 ]} (12.7)
𝑡=0
2

The first-order conditions associated with 𝑐𝑡 and ℓ𝑡 are

−(𝑐𝑡 − 𝑏𝑡 ) + 𝜆[−ℓ𝑡 + (𝑔𝑡 + 𝑠𝑡 )] = 𝜇𝑡

230 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

and

ℓ𝑡 − 𝜆[(𝑏𝑡 − 𝑐𝑡 ) − 2ℓ𝑡 ] = 𝜇𝑡

Combining these last two equalities with (12.3) and working through the algebra, one can show that

ℓ𝑡 = ℓ𝑡̄ − 𝜈𝑚𝑡 and 𝑐𝑡 = 𝑐𝑡̄ − 𝜈𝑚𝑡 (12.8)

where
• 𝜈 ∶= 𝜆/(1 + 2𝜆)
• ℓ𝑡̄ ∶= (𝑏𝑡 − 𝑑𝑡 + 𝑔𝑡 )/2
• 𝑐𝑡̄ ∶= (𝑏𝑡 + 𝑑𝑡 − 𝑔𝑡 )/2
• 𝑚𝑡 ∶= (𝑏𝑡 − 𝑑𝑡 − 𝑠𝑡 )/2
Apart from 𝜈, all of these quantities are expressed in terms of exogenous variables.
To solve for 𝜈, we can use the government’s budget constraint again.
The term inside the brackets in (12.6) is (𝑏𝑡 − 𝑐𝑡 )(𝑠𝑡 + 𝑔𝑡 ) − (𝑏𝑡 − 𝑐𝑡 )ℓ𝑡 + ℓ𝑡2 .
Using (12.8), the definitions above and the fact that ℓ ̄ = 𝑏 − 𝑐,̄ this term can be rewritten as

(𝑏𝑡 − 𝑐𝑡̄ )(𝑔𝑡 + 𝑠𝑡 ) + 2𝑚2𝑡 (𝜈 2 − 𝜈)

Reinserting into (12.6), we get

∞ ∞
𝔼 {∑ 𝛽 𝑡 (𝑏𝑡 − 𝑐𝑡̄ )(𝑔𝑡 + 𝑠𝑡 )} + (𝜈 2 − 𝜈)𝔼 {∑ 𝛽 𝑡 2𝑚2𝑡 } = 0 (12.9)
𝑡=0 𝑡=0

Although it might not be clear yet, we are nearly there because:

• The two expectations terms in (12.9) can be solved for in terms of model primitives.
• This in turn allows us to solve for the Lagrange multiplier 𝜈.
• With 𝜈 in hand, we can go back and solve for the allocations via (12.8).
• Once we have the allocations, prices and the tax system can be derived from (12.5).

12.2.9 Computing the Quadratic Term

Let’s consider how to obtain the term 𝜈 in (12.9).

If we can compute the two expected geometric sums
∞ ∞
𝑏0 ∶= 𝔼 {∑ 𝛽 𝑡 (𝑏𝑡 − 𝑐𝑡̄ )(𝑔𝑡 + 𝑠𝑡 )} and 𝑎0 ∶= 𝔼 {∑ 𝛽 𝑡 2𝑚2𝑡 } (12.10)
𝑡=0 𝑡=0

then the problem reduces to solving

𝑏0 + 𝑎0 (𝜈 2 − 𝜈) = 0

for 𝜈.
Provided that 4𝑏0 < 𝑎0 , there is a unique solution 𝜈 ∈ (0, 1/2), and a unique corresponding 𝜆 > 0.
Let’s work out how to compute mathematical expectations in (12.10).

12.2. The Ramsey Problem 231

Advanced Quantitative Economics with Python

For the first one, the random variable (𝑏𝑡 − 𝑐𝑡̄ )(𝑔𝑡 + 𝑠𝑡 ) inside the summation can be expressed as

1 ′
𝑥 (𝑆 − 𝑆𝑑 + 𝑆𝑔 )′ (𝑆𝑔 + 𝑆𝑠 )𝑥𝑡
2 𝑡 𝑏
For the second expectation in (12.10), the random variable 2𝑚2𝑡 can be written as

1 ′
𝑥 (𝑆 − 𝑆𝑑 − 𝑆𝑠 )′ (𝑆𝑏 − 𝑆𝑑 − 𝑆𝑠 )𝑥𝑡
2 𝑡 𝑏
It follows that both objects of interest are special cases of the expression
∞
𝑞(𝑥0 ) = 𝔼 ∑ 𝛽 𝑡 𝑥′𝑡 𝐻𝑥𝑡 (12.11)
𝑡=0

where 𝐻 is a matrix conformable to 𝑥𝑡 and 𝑥′𝑡 is the transpose of column vector 𝑥𝑡 .

Suppose first that {𝑥𝑡 } is the Gaussian VAR described above.
In this case, the formula for computing 𝑞(𝑥0 ) is known to be 𝑞(𝑥0 ) = 𝑥′0 𝑄𝑥0 + 𝑣, where
• 𝑄 is the solution to 𝑄 = 𝐻 + 𝛽𝐴′ 𝑄𝐴, and
• 𝑣 = trace (𝐶 ′ 𝑄𝐶)𝛽/(1 − 𝛽)
The first equation is known as a discrete Lyapunov equation and can be solved using this function.

12.2.10 Finite State Markov Case

Next, suppose that {𝑥𝑡 } is the discrete Markov process described above.
Suppose further that each 𝑥𝑡 takes values in the state space {𝑥1 , … , 𝑥𝑁 } ⊂ ℝ𝑘 .
Let ℎ ∶ ℝ𝑘 → ℝ be a given function, and suppose that we wish to evaluate
∞
𝑞(𝑥0 ) = 𝔼 ∑ 𝛽 𝑡 ℎ(𝑥𝑡 ) given 𝑥0 = 𝑥𝑗
𝑡=0

For example, in the discussion above, ℎ(𝑥𝑡 ) = 𝑥′𝑡 𝐻𝑥𝑡 .

It is legitimate to pass the expectation through the sum, leading to
∞
𝑞(𝑥0 ) = ∑ 𝛽 𝑡 (𝑃 𝑡 ℎ)[𝑗] (12.12)
𝑡=0

Here
• 𝑃 𝑡 is the 𝑡-th power of the transition matrix 𝑃 .
• ℎ is, with some abuse of notation, the vector (ℎ(𝑥1 ), … , ℎ(𝑥𝑁 )).
• (𝑃 𝑡 ℎ)[𝑗] indicates the 𝑗-th element of 𝑃 𝑡 ℎ.
It can be shown that (12.12) is in fact equal to the 𝑗-th element of the vector (𝐼 − 𝛽𝑃 )−1 ℎ.
This last fact is applied in the calculations below.

232 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

12.2.11 Other Variables

We are interested in tracking several other variables besides the ones described above.
To prepare the way for this, we define

𝑡
𝑏𝑡+𝑗 − 𝑐𝑡+𝑗
𝑝𝑡+𝑗 =
𝑏𝑡 − 𝑐 𝑡
as the scaled Arrow-Debreu time 𝑡 price of a history contingent claim on one unit of consumption at time 𝑡 + 𝑗.
These are prices that would prevail at time 𝑡 if markets were reopened at time 𝑡.
These prices are constituents of the present value of government obligations outstanding at time 𝑡, which can be expressed
as
∞
𝐵𝑡 ∶= 𝔼𝑡 ∑ 𝛽 𝑗 𝑝𝑡+𝑗
𝑡
(𝜏𝑡+𝑗 ℓ𝑡+𝑗 − 𝑔𝑡+𝑗 ) (12.13)
𝑗=0

Using our expression for prices and the Ramsey plan, we can also write 𝐵𝑡 as
∞ 2
(𝑏𝑡+𝑗 − 𝑐𝑡+𝑗 )(ℓ𝑡+𝑗 − 𝑔𝑡+𝑗 ) − ℓ𝑡+𝑗
𝐵𝑡 = 𝔼𝑡 ∑ 𝛽 𝑗
𝑗=0
𝑏𝑡 − 𝑐 𝑡

This version is more convenient for computation.

Using the equation
𝑡 𝑡 𝑡+1
𝑝𝑡+𝑗 = 𝑝𝑡+1 𝑝𝑡+𝑗

it is possible to verify that (12.13) implies that

∞
𝑡
𝐵𝑡 = (𝜏𝑡 ℓ𝑡 − 𝑔𝑡 ) + 𝐸𝑡 ∑ 𝑝𝑡+𝑗 (𝜏𝑡+𝑗 ℓ𝑡+𝑗 − 𝑔𝑡+𝑗 )
𝑗=1

and
𝑡
𝐵𝑡 = (𝜏𝑡 ℓ𝑡 − 𝑔𝑡 ) + 𝛽𝐸𝑡 𝑝𝑡+1 𝐵𝑡+1 (12.14)

Define

𝑅𝑡−1 ∶= 𝔼𝑡 𝛽 𝑗 𝑝𝑡+1
𝑡
(12.15)

𝑅𝑡 is the gross 1-period risk-free rate for loans between 𝑡 and 𝑡 + 1.

12.2.12 A Martingale

We now want to study the following two objects, namely,

𝜋𝑡+1 ∶= 𝐵𝑡+1 − 𝑅𝑡 [𝐵𝑡 − (𝜏𝑡 ℓ𝑡 − 𝑔𝑡 )]

and the cumulation of 𝜋𝑡

𝑡
Π𝑡 ∶= ∑ 𝜋𝑡
𝑠=0

The term 𝜋𝑡+1 is the difference between two quantities:

12.2. The Ramsey Problem 233

Advanced Quantitative Economics with Python

• 𝐵𝑡+1 , the value of government debt at the start of period 𝑡 + 1.

• 𝑅𝑡 [𝐵𝑡 + 𝑔𝑡 − 𝜏𝑡 ], which is what the government would have owed at the beginning of period 𝑡 + 1 if it had simply
borrowed at the one-period risk-free rate rather than selling state-contingent securities.
Thus, 𝜋𝑡+1 is the excess payout on the actual portfolio of state-contingent government debt relative to an alternative
portfolio sufficient to finance 𝐵𝑡 + 𝑔𝑡 − 𝜏𝑡 ℓ𝑡 and consisting entirely of risk-free one-period bonds.
Use expressions (12.14) and (12.15) to obtain
1 𝑡
𝜋𝑡+1 = 𝐵𝑡+1 − 𝑡 [𝛽𝐸𝑡 𝑝𝑡+1 𝐵𝑡+1 ]
𝛽𝐸𝑡 𝑝𝑡+1
or

𝜋𝑡+1 = 𝐵𝑡+1 − 𝐸𝑡̃ 𝐵𝑡+1 (12.16)

where 𝐸𝑡̃ is the conditional mathematical expectation taken with respect to a one-step transition density that has been
formed by multiplying the original transition density with the likelihood ratio
𝑡
𝑝𝑡+1
𝑚𝑡𝑡+1 = 𝑡
𝐸𝑡 𝑝𝑡+1

It follows from equation (12.16) that

𝐸𝑡̃ 𝜋𝑡+1 = 𝐸𝑡̃ 𝐵𝑡+1 − 𝐸𝑡̃ 𝐵𝑡+1 = 0

which asserts that {𝜋𝑡+1 } is a martingale difference sequence under the distorted probability measure, and that {Π𝑡 } is
a martingale under the distorted probability measure.
In the tax-smoothing model of Robert Barro [Barro, 1979], government debt is a random walk.
In the current model, government debt {𝐵𝑡 } is not a random walk, but the excess payoff {Π𝑡 } on it is.

12.3 Implementation

The following code provides functions for

1. Solving for the Ramsey plan given a specification of the economy.
2. Simulating the dynamics of the major variables.
Description and clarifications are given below

# Set up a namedtuple to store data on the model economy

Economy = namedtuple('economy',
('β', # Discount factor
'Sg', # Govt spending selector matrix
'Sd', # Exogenous endowment selector matrix
'Sb', # Utility parameter selector matrix
'Ss', # Coupon payments selector matrix
'discrete', # Discrete or continuous -- boolean
'proc')) # Stochastic process parameters

# Set up a namedtuple to store return values for compute_paths()

Path = namedtuple('path',
('g', # Govt spending
'd', # Endowment
(continues on next page)

234 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

(continued from previous page)

'b', # Utility shift parameter
's', # Coupon payment on existing debt
'c', # Consumption
'l', # Labor
'p', # Price
'τ', # Tax rate
'rvn', # Revenue
'B', # Govt debt
'R', # Risk-free gross return
'π', # One-period risk-free interest rate
'Π', # Cumulative rate of return, adjusted
'ξ')) # Adjustment factor for Π

def compute_paths(T, econ):

"""
Compute simulated time paths for exogenous and endogenous variables.

Parameters
===========
T: int
Length of the simulation

econ: a namedtuple of type 'Economy', containing

β - Discount factor
Sg - Govt spending selector matrix
Sd - Exogenous endowment selector matrix
Sb - Utility parameter selector matrix
Ss - Coupon payments selector matrix
discrete - Discrete exogenous process (True or False)
proc - Stochastic process parameters

Returns
========
path: a namedtuple of type 'Path', containing
g - Govt spending
d - Endowment
b - Utility shift parameter
s - Coupon payment on existing debt
c - Consumption
l - Labor
p - Price
τ - Tax rate
rvn - Revenue
B - Govt debt
R - Risk-free gross return
π - One-period risk-free interest rate
Π - Cumulative rate of return, adjusted
ξ - Adjustment factor for Π

The corresponding values are flat numpy ndarrays.

"""

# Simplify names
β, Sg, Sd, Sb, Ss = econ.β, econ.Sg, econ.Sd, econ.Sb, econ.Ss

(continues on next page)

12.3. Implementation 235

Advanced Quantitative Economics with Python

(continued from previous page)

if econ.discrete:
P, x_vals = econ.proc
else:
A, C = econ.proc

# Simulate the exogenous process x

if econ.discrete:
state = mc_sample_path(P, init=0, sample_size=T)
x = x_vals[:, state]
else:
# Generate an initial condition x0 satisfying x0 = A x0
nx, nx = A.shape
x0 = nullspace((eye(nx) - A))
x0 = -x0 if (x0[nx-1] < 0) else x0
x0 = x0 / x0[nx-1]

# Generate a time series x of length T starting from x0

nx, nw = C.shape
x = zeros((nx, T))
w = randn(nw, T)
x[:, 0] = x0.T
for t in range(1, T):
x[:, t] = A @ x[:, t-1] + C @ w[:, t]

# Compute exogenous variable sequences

g, d, b, s = ((S @ x).flatten() for S in (Sg, Sd, Sb, Ss))

# Solve for Lagrange multiplier in the govt budget constraint

# In fact we solve for ν = lambda / (1 + 2*lambda). Here ν is the
# solution to a quadratic equation a(ν**2 - ν) + b = 0 where
# a and b are expected discounted sums of quadratic forms of the state.
Sm = Sb - Sd - Ss
# Compute a and b
if econ.discrete:
ns = P.shape[0]
F = scipy.linalg.inv(eye(ns) - β * P)
a0 = 0.5 * (F @ (x_vals.T @ Sm.T)**2)[0]
H = ((Sb - Sd + Sg) @ x_vals) * ((Sg - Ss) @ x_vals)
b0 = 0.5 * (F @ H.T)[0]
a0, b0 = float(a0), float(b0)
else:
H = Sm.T @ Sm
a0 = 0.5 * var_quadratic_sum(A, C, H, β, x0)
H = (Sb - Sd + Sg).T @ (Sg + Ss)
b0 = 0.5 * var_quadratic_sum(A, C, H, β, x0)

# Test that ν has a real solution before assigning

warning_msg = """
Hint: you probably set government spending too {}. Elect a {}
Congress and start over.
"""
disc = a0**2 - 4 * a0 * b0
if disc >= 0:
ν = 0.5 * (a0 - sqrt(disc)) / a0
else:

(continues on next page)

236 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

(continued from previous page)

print("There is no Ramsey equilibrium for these parameters.")
print(warning_msg.format('high', 'Republican'))
sys.exit(0)

# Test that the Lagrange multiplier has the right sign

if ν * (0.5 - ν) < 0:
print("Negative multiplier on the government budget constraint.")
print(warning_msg.format('low', 'Democratic'))
sys.exit(0)

# Solve for the allocation given ν and x

Sc = 0.5 * (Sb + Sd - Sg - ν * Sm)
Sl = 0.5 * (Sb - Sd + Sg - ν * Sm)
c = (Sc @ x).flatten()
l = (Sl @ x).flatten()
p = ((Sb - Sc) @ x).flatten() # Price without normalization
τ = 1 - l / (b - c)
rvn = l * τ

# Compute remaining variables

if econ.discrete:
H = ((Sb - Sc) @ x_vals) * ((Sl - Sg) @ x_vals) - (Sl @ x_vals)**2
temp = (F @ H.T).flatten()
B = temp[state] / p
H = (P[state, :] @ x_vals.T @ (Sb - Sc).T).flatten()
R = p / (β * H)
temp = ((P[state, :] @ x_vals.T @ (Sb - Sc).T)).flatten()
ξ = p[1:] / temp[:T-1]
else:
H = Sl.T @ Sl - (Sb - Sc).T @ (Sl - Sg)
L = np.empty(T)
for t in range(T):
L[t] = var_quadratic_sum(A, C, H, β, x[:, t])
B = L / p
Rinv = (β * ((Sb - Sc) @ A @ x)).flatten() / p
R = 1 / Rinv
AF1 = (Sb - Sc) @ x[:, 1:]
AF2 = (Sb - Sc) @ A @ x[:, :T-1]
ξ = AF1 / AF2
ξ = ξ.flatten()

π = B[1:] - R[:T-1] * B[:T-1] - rvn[:T-1] + g[:T-1]

Π = cumsum(π * ξ)

# Prepare return values

path = Path(g=g, d=d, b=b, s=s, c=c, l=l, p=p,
τ=τ, rvn=rvn, B=B, R=R, π=π, Π=Π, ξ=ξ)

return path

def gen_fig_1(path):
"""
The parameter is the path namedtuple returned by compute_paths(). See
the docstring of that function for details.
"""

(continues on next page)

12.3. Implementation 237

Advanced Quantitative Economics with Python

(continued from previous page)

T = len(path.c)

# Prepare axes
num_rows, num_cols = 2, 2
fig, axes = plt.subplots(num_rows, num_cols, figsize=(14, 10))
plt.subplots_adjust(hspace=0.4)
for i in range(num_rows):
for j in range(num_cols):
axes[i, j].grid()
axes[i, j].set_xlabel('Time')
bbox = (0., 1.02, 1., .102)
legend_args = {'bbox_to_anchor': bbox, 'loc': 3, 'mode': 'expand'}
p_args = {'lw': 2, 'alpha': 0.7}

# Plot consumption, govt expenditure and revenue

ax = axes[0, 0]
ax.plot(path.rvn, label=r'$\tau_t \ell_t$', **p_args)
ax.plot(path.g, label='$g_t$', **p_args)
ax.plot(path.c, label='$c_t$', **p_args)
ax.legend(ncol=3, **legend_args)

# Plot govt expenditure and debt

ax = axes[0, 1]
ax.plot(list(range(1, T+1)), path.rvn, label=r'$\tau_t \ell_t$', **p_args)
ax.plot(list(range(1, T+1)), path.g, label='$g_t$', **p_args)
ax.plot(list(range(1, T)), path.B[1:T], label='$B_{t+1}$', **p_args)
ax.legend(ncol=3, **legend_args)

# Plot risk-free return

ax = axes[1, 0]
ax.plot(list(range(1, T+1)), path.R - 1, label='$R_t - 1$', **p_args)
ax.legend(ncol=1, **legend_args)

# Plot revenue, expenditure and risk free rate

ax = axes[1, 1]
ax.plot(list(range(1, T+1)), path.rvn, label=r'$\tau_t \ell_t$', **p_args)
ax.plot(list(range(1, T+1)), path.g, label='$g_t$', **p_args)
axes[1, 1].plot(list(range(1, T)), path.π, label=r'$\pi_{t+1}$', **p_args)
ax.legend(ncol=3, **legend_args)

plt.show()

def gen_fig_2(path):
"""
The parameter is the path namedtuple returned by compute_paths(). See
the docstring of that function for details.
"""

T = len(path.c)

# Prepare axes
num_rows, num_cols = 2, 1
fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 10))
plt.subplots_adjust(hspace=0.5)

(continues on next page)

238 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

(continued from previous page)

bbox = (0., 1.02, 1., .102)
bbox = (0., 1.02, 1., .102)
legend_args = {'bbox_to_anchor': bbox, 'loc': 3, 'mode': 'expand'}
p_args = {'lw': 2, 'alpha': 0.7}

# Plot adjustment factor

ax = axes[0]
ax.plot(list(range(2, T+1)), path.ξ, label=r'$\xi_t$', **p_args)
ax.grid()
ax.set_xlabel('Time')
ax.legend(ncol=1, **legend_args)

# Plot adjusted cumulative return

ax = axes[1]
ax.plot(list(range(2, T+1)), path.Π, label=r'$\Pi_t$', **p_args)
ax.grid()
ax.set_xlabel('Time')
ax.legend(ncol=1, **legend_args)

plt.show()

12.3.1 Comments on the Code

The function var_quadratic_sum imported from quadsums is for computing the value of (12.11) when the ex-
ogenous process {𝑥𝑡 } is of the VAR type described above.
Below the definition of the function, you will see definitions of two namedtuple objects, Economy and Path.
The first is used to collect all the parameters and primitives of a given LQ economy, while the second collects output of
the computations.
In Python, a namedtuple is a popular data type from the collections module of the standard library that replicates
the functionality of a tuple, but also allows you to assign a name to each tuple element.
These elements can then be references via dotted attribute notation — see for example the use of path in the functions
gen_fig_1() and gen_fig_2().
The benefits of using namedtuples:
• Keeps content organized by meaning.
• Helps reduce the number of global variables.
Other than that, our code is long but relatively straightforward.

12.4 Examples

Let’s look at two examples of usage.

12.4. Examples 239

Advanced Quantitative Economics with Python

12.4.1 The Continuous Case

Our first example adopts the VAR specification described above.

Regarding the primitives, we set
• 𝛽 = 1/1.05
• 𝑏𝑡 = 2.135 and 𝑠𝑡 = 𝑑𝑡 = 0 for all 𝑡
Government spending evolves according to

𝑔𝑡+1 − 𝜇𝑔 = 𝜌(𝑔𝑡 − 𝜇𝑔 ) + 𝐶𝑔 𝑤𝑔,𝑡+1

with 𝜌 = 0.7, 𝜇𝑔 = 0.35 and 𝐶𝑔 = 𝜇𝑔 √1 − 𝜌2 /10.

Here’s the code

# == Parameters == #
β = 1 / 1.05
ρ, mg = .7, .35
A = eye(2)
A[0, :] = ρ, mg * (1-ρ)
C = np.zeros((2, 1))
C[0, 0] = np.sqrt(1 - ρ**2) * mg / 10
Sg = np.array((1, 0)).reshape(1, 2)
Sd = np.array((0, 0)).reshape(1, 2)
Sb = np.array((0, 2.135)).reshape(1, 2)
Ss = np.array((0, 0)).reshape(1, 2)

economy = Economy(β=β, Sg=Sg, Sd=Sd, Sb=Sb, Ss=Ss,

discrete=False, proc=(A, C))

T = 50
path = compute_paths(T, economy)
gen_fig_1(path)

240 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

The legends on the figures indicate the variables being tracked.

Most obvious from the figure is tax smoothing in the sense that tax revenue is much less variable than government
expenditure.

gen_fig_2(path)

12.4. Examples 241

Advanced Quantitative Economics with Python

See the original manuscript for comments and interpretation.

12.4.2 The Discrete Case

Our second example adopts a discrete Markov specification for the exogenous process

# == Parameters == #
β = 1 / 1.05
P = np.array([[0.8, 0.2, 0.0],
[0.0, 0.5, 0.5],
[0.0, 0.0, 1.0]])

# Possible states of the world

(continues on next page)

242 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

(continued from previous page)

# Each column is a state of the world. The rows are [g d b s 1]
x_vals = np.array([[0.5, 0.5, 0.25],
[0.0, 0.0, 0.0],
[2.2, 2.2, 2.2],
[0.0, 0.0, 0.0],
[1.0, 1.0, 1.0]])

Sg = np.array((1, 0, 0, 0, 0)).reshape(1, 5)
Sd = np.array((0, 1, 0, 0, 0)).reshape(1, 5)
Sb = np.array((0, 0, 1, 0, 0)).reshape(1, 5)
Ss = np.array((0, 0, 0, 1, 0)).reshape(1, 5)

economy = Economy(β=β, Sg=Sg, Sd=Sd, Sb=Sb, Ss=Ss,

discrete=True, proc=(P, x_vals))

T = 15
path = compute_paths(T, economy)
gen_fig_1(path)

/tmp/ipykernel_8650/2748685684.py:111: DeprecationWarning: Conversion of an array␣

↪with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you␣

↪extract a single element from your array before performing this operation.␣

↪(Deprecated NumPy 1.25.)

a0, b0 = float(a0), float(b0)

12.4. Examples 243

Advanced Quantitative Economics with Python

The call gen_fig_2(path) generates

gen_fig_2(path)

See the original manuscript for comments and interpretation.

244 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

12.5 Exercises

Exercise 12.5.1
Modify the VAR example given above, setting

𝑔𝑡+1 − 𝜇𝑔 = 𝜌(𝑔𝑡−3 − 𝜇𝑔 ) + 𝐶𝑔 𝑤𝑔,𝑡+1

with 𝜌 = 0.95 and 𝐶𝑔 = 0.7√1 − 𝜌2 .

Produce the corresponding figures.

Solution to Exercise 12.5.1

# == Parameters == #
β = 1 / 1.05
ρ, mg = .95, .35
A = np.array([[0, 0, 0, ρ, mg*(1-ρ)],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 0, 1]])
C = np.zeros((5, 1))
C[0, 0] = np.sqrt(1 - ρ**2) * mg / 8
Sg = np.array((1, 0, 0, 0, 0)).reshape(1, 5)
Sd = np.array((0, 0, 0, 0, 0)).reshape(1, 5)
# Chosen st. (Sc + Sg) * x0 = 1
Sb = np.array((0, 0, 0, 0, 2.135)).reshape(1, 5)
Ss = np.array((0, 0, 0, 0, 0)).reshape(1, 5)

economy = Economy(β=β, Sg=Sg, Sd=Sd, Sb=Sb,

Ss=Ss, discrete=False, proc=(A, C))

T = 50
path = compute_paths(T, economy)

gen_fig_1(path)

12.5. Exercises 245

Advanced Quantitative Economics with Python

gen_fig_2(path)

246 Chapter 12. Optimal Taxation in an LQ Economy

Advanced Quantitative Economics with Python

12.5. Exercises 247

Advanced Quantitative Economics with Python

248 Chapter 12. Optimal Taxation in an LQ Economy

Part III

Multiple Agent Models

249
CHAPTER

THIRTEEN

DEFAULT RISK AND INCOME FLUCTUATIONS

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install --upgrade quantecon

13.1 Overview

This lecture computes versions of Arellano’s [Arellano, 2008] model of sovereign default.
The model describes interactions among default risk, output, and an equilibrium interest rate that includes a premium for
endogenous default risk.
The decision maker is a government of a small open economy that borrows from risk-neutral foreign creditors.
The foreign lenders must be compensated for default risk.
The government borrows and lends abroad in order to smooth the consumption of its citizens.
The government repays its debt only if it wants to, but declining to pay has adverse consequences.
The interest rate on government debt adjusts in response to the state-dependent default probability chosen by government.
The model yields outcomes that help interpret sovereign default experiences, including
• countercyclical interest rates on sovereign debt
• countercyclical trade balances
• high volatility of consumption relative to output
Notably, long recessions caused by bad draws in the income process increase the government’s incentive to default.
This can lead to
• spikes in interest rates
• temporary losses of access to international credit markets
• large drops in output, consumption, and welfare
• large capital outflows during recessions
Such dynamics are consistent with experiences of many countries.
Let’s start with some imports:

251
Advanced Quantitative Economics with Python

import matplotlib.pyplot as plt

import numpy as np
import quantecon as qe
from numba import njit, prange

13.2 Structure

In this section we describe the main features of the model.

13.2.1 Output, Consumption and Debt

A small open economy is endowed with an exogenous stochastically fluctuating potential output stream {𝑦𝑡 }.
Potential output is realized only in periods in which the government honors its sovereign debt.
The output good can be traded or consumed.
The sequence {𝑦𝑡 } is described by a Markov process with stochastic density kernel 𝑝(𝑦, 𝑦′ ).
Households within the country are identical and rank stochastic consumption streams according to
∞
𝔼 ∑ 𝛽 𝑡 𝑢(𝑐𝑡 ) (13.1)
𝑡=0

Here
• 0 < 𝛽 < 1 is a time discount factor
• 𝑢 is an increasing and strictly concave utility function
Consumption sequences enjoyed by households are affected by the government’s decision to borrow or lend internationally.
The government is benevolent in the sense that its aim is to maximize (13.1).
The government is the only domestic actor with access to foreign credit.
Because household are averse to consumption fluctuations, the government will try to smooth consumption by borrowing
from (and lending to) foreign creditors.

13.2.2 Asset Markets

The only credit instrument available to the government is a one-period bond traded in international credit markets.
The bond market has the following features
• The bond matures in one period and is not state contingent.
• A purchase of a bond with face value 𝐵′ is a claim to 𝐵′ units of the consumption good next period.
• To purchase 𝐵′ next period costs 𝑞𝐵′ now, or, what is equivalent.
• For selling −𝐵′ units of next period goods the seller earns −𝑞𝐵′ of today’s goods.
– If 𝐵′ < 0, then −𝑞𝐵′ units of the good are received in the current period, for a promise to repay −𝐵′ units
next period.
– There is an equilibrium price function 𝑞(𝐵′ , 𝑦) that makes 𝑞 depend on both 𝐵′ and 𝑦.

252 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

Earnings on the government portfolio are distributed (or, if negative, taxed) lump sum to households.
When the government is not excluded from financial markets, the one-period national budget constraint is

𝑐 = 𝑦 + 𝐵 − 𝑞(𝐵′ , 𝑦)𝐵′ (13.2)

Here and below, a prime denotes a next period value or a claim maturing next period.
To rule out Ponzi schemes, we also require that 𝐵 ≥ −𝑍 in every period.
• 𝑍 is chosen to be sufficiently large that the constraint never binds in equilibrium.

13.2.3 Financial Markets

Foreign creditors
• are risk neutral
• know the domestic output stochastic process {𝑦𝑡 } and observe 𝑦𝑡 , 𝑦𝑡−1 , … , at time 𝑡
• can borrow or lend without limit in an international credit market at a constant international interest rate 𝑟
• receive full payment if the government chooses to pay
• receive zero if the government defaults on its one-period debt due
When a government is expected to default next period with probability 𝛿, the expected value of a promise to pay one unit
of consumption next period is 1 − 𝛿.
Therefore, the discounted expected value of a promise to pay 𝐵 next period is

1−𝛿
𝑞= (13.3)
1+𝑟
Next we turn to how the government in effect chooses the default probability 𝛿.

13.2.4 Government’s Decisions

At each point in time 𝑡, the government chooses between

1. defaulting
2. meeting its current obligations and purchasing or selling an optimal quantity of one-period sovereign debt
Defaulting means declining to repay all of its current obligations.
If the government defaults in the current period, then consumption equals current output.
But a sovereign default has two consequences:
1. Output immediately falls from 𝑦 to ℎ(𝑦), where 0 ≤ ℎ(𝑦) ≤ 𝑦.
• It returns to 𝑦 only after the country regains access to international credit markets.
2. The country loses access to foreign credit markets.

13.2. Structure 253

Advanced Quantitative Economics with Python

13.2.5 Reentering International Credit Market

While in a state of default, the economy regains access to foreign credit in each subsequent period with probability 𝜃.

13.3 Equilibrium

Informally, an equilibrium is a sequence of interest rates on its sovereign debt, a stochastic sequence of government default
decisions and an implied flow of household consumption such that
1. Consumption and assets satisfy the national budget constraint.
2. The government maximizes household utility taking into account
• the resource constraint
• the effect of its choices on the price of bonds
• consequences of defaulting now for future net output and future borrowing and lending opportunities
3. The interest rate on the government’s debt includes a risk-premium sufficient to make foreign creditors expect on
average to earn the constant risk-free international interest rate.
To express these ideas more precisely, consider first the choices of the government, which
1. enters a period with initial assets 𝐵, or what is the same thing, initial debt to be repaid now of −𝐵
2. observes current output 𝑦, and
3. chooses either
1. to default, or
2. to pay −𝐵 and set next period’s debt due to −𝐵′
In a recursive formulation,
• state variables for the government comprise the pair (𝐵, 𝑦)
• 𝑣(𝐵, 𝑦) is the optimum value of the government’s problem when at the beginning of a period it faces the choice of
whether to honor or default
• 𝑣𝑐 (𝐵, 𝑦) is the value of choosing to pay obligations falling due
• 𝑣𝑑 (𝑦) is the value of choosing to default
𝑣𝑑 (𝑦) does not depend on 𝐵 because, when access to credit is eventually regained, net foreign assets equal 0.
Expressed recursively, the value of defaulting is

𝑣𝑑 (𝑦) = 𝑢(ℎ(𝑦)) + 𝛽 ∫ {𝜃𝑣(0, 𝑦′ ) + (1 − 𝜃)𝑣𝑑 (𝑦′ )} 𝑝(𝑦, 𝑦′ )𝑑𝑦′

The value of paying is

𝑣𝑐 (𝐵, 𝑦) = max
′
{𝑢(𝑦 − 𝑞(𝐵′ , 𝑦)𝐵′ + 𝐵) + 𝛽 ∫ 𝑣(𝐵′ , 𝑦′ )𝑝(𝑦, 𝑦′ )𝑑𝑦′ }
𝐵 ≥−𝑍

The three value functions are linked by

𝑣(𝐵, 𝑦) = max{𝑣𝑐 (𝐵, 𝑦), 𝑣𝑑 (𝑦)}

The government chooses to default when

𝑣𝑐 (𝐵, 𝑦) < 𝑣𝑑 (𝑦)

254 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

and hence given 𝐵′ the probability of default next period is

𝛿(𝐵′ , 𝑦) ∶= ∫ 𝟙{𝑣𝑐 (𝐵′ , 𝑦′ ) < 𝑣𝑑 (𝑦′ )}𝑝(𝑦, 𝑦′ )𝑑𝑦′ (13.4)

Given zero profits for foreign creditors in equilibrium, we can combine (13.3) and (13.4) to pin down the bond price
function:
1 − 𝛿(𝐵′ , 𝑦)
𝑞(𝐵′ , 𝑦) = (13.5)
1+𝑟

13.3.1 Definition of Equilibrium

An equilibrium is
• a pricing function 𝑞(𝐵′ , 𝑦),
• a triple of value functions (𝑣𝑐 (𝐵, 𝑦), 𝑣𝑑 (𝑦), 𝑣(𝐵, 𝑦)),
• a decision rule telling the government when to default and when to pay as a function of the state (𝐵, 𝑦), and
• an asset accumulation rule that, conditional on choosing not to default, maps (𝐵, 𝑦) into 𝐵′
such that
• The three Bellman equations for (𝑣𝑐 (𝐵, 𝑦), 𝑣𝑑 (𝑦), 𝑣(𝐵, 𝑦)) are satisfied
• Given the price function 𝑞(𝐵′ , 𝑦), the default decision rule and the asset accumulation decision rule attain the
optimal value function 𝑣(𝐵, 𝑦), and
• The price function 𝑞(𝐵′ , 𝑦) satisfies equation (13.5)

13.4 Computation

Let’s now compute an equilibrium of Arellano’s model.

The equilibrium objects are the value function 𝑣(𝐵, 𝑦), the associated default decision rule, and the pricing function
𝑞(𝐵′ , 𝑦).
We’ll use our code to replicate Arellano’s results.
After that we’ll perform some additional simulations.
We use a slightly modified version of the algorithm recommended by Arellano.
• The appendix to [Arellano, 2008] recommends value function iteration until convergence, updating the price, and
then repeating.
• Instead, we update the bond price at every value function iteration step.
The second approach is faster and the two different procedures deliver very similar results.
Here is a more detailed description of our algorithm:
1. Guess a pair of non-default and default value functions 𝑣𝑐 and 𝑣𝑑 .
2. Using these functions, calculate the value function 𝑣, the corresponding default probabilities and the price function
𝑞.
3. At each pair (𝐵, 𝑦),
1. update the value of defaulting 𝑣𝑑 (𝑦).

13.4. Computation 255

Advanced Quantitative Economics with Python

2. update the value of remaining 𝑣𝑐 (𝐵, 𝑦).

4. Check for convergence. If converged, stop – if not, go to step 2.
We use simple discretization on a grid of asset holdings and income levels.
The output process is discretized using a quadrature method due to Tauchen.
As we have in other places, we accelerate our code using Numba.
We define a class that will store parameters, grids and transition probabilities.

class Arellano_Economy:
" Stores data and creates primitives for the Arellano economy. "

def __init__(self,
B_grid_size= 251, # Grid size for bonds
B_grid_min=-0.45, # Smallest B value
B_grid_max=0.45, # Largest B value
y_grid_size=51, # Grid size for income
β=0.953, # Time discount parameter
γ=2.0, # Utility parameter
r=0.017, # Lending rate
ρ=0.945, # Persistence in the income process
η=0.025, # Standard deviation of the income process
θ=0.282, # Prob of re-entering financial markets
def_y_param=0.969): # Parameter governing income in default

# Save parameters
self.β, self.γ, self.r, = β, γ, r
self.ρ, self.η, self.θ = ρ, η, θ

self.y_grid_size = y_grid_size
self.B_grid_size = B_grid_size
self.B_grid = np.linspace(B_grid_min, B_grid_max, B_grid_size)
mc = qe.markov.tauchen(y_grid_size, ρ, η, 0, 3)
self.y_grid, self.P = np.exp(mc.state_values), mc.P

# The index at which B_grid is (close to) zero

self.B0_idx = np.searchsorted(self.B_grid, 1e-10)

# Output recieved while in default, with same shape as y_grid

self.def_y = np.minimum(def_y_param * np.mean(self.y_grid), self.y_grid)

def params(self):
return self.β, self.γ, self.r, self.ρ, self.η, self.θ

def arrays(self):
return self.P, self.y_grid, self.B_grid, self.def_y, self.B0_idx

Notice how the class returns the data it stores as simple numerical values and arrays via the methods params and
arrays.
We will use this data in the Numba-jitted functions defined below.
Jitted functions prefer simple arguments, since type inference is easier.
Here is the utility function.

256 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

@njit
def u(c, γ):
return c**(1-γ)/(1-γ)

Here is a function to compute the bond price at each state, given 𝑣𝑐 and 𝑣𝑑 .

@njit
def compute_q(v_c, v_d, q, params, arrays):
"""
Compute the bond price function q(b, y) at each (b, y) pair.

This function writes to the array q that is passed in as an argument.

"""

# Unpack
β, γ, r, ρ, η, θ = params
P, y_grid, B_grid, def_y, B0_idx = arrays

for B_idx in range(len(B_grid)):

for y_idx in range(len(y_grid)):
# Compute default probability and corresponding bond price
delta = P[y_idx, v_c[B_idx, :] < v_d].sum()
q[B_idx, y_idx] = (1 - delta ) / (1 + r)

Next we introduce Bellman operators that updated 𝑣𝑑 and 𝑣𝑐 .

@njit
def T_d(y_idx, v_c, v_d, params, arrays):
"""
The RHS of the Bellman equation when income is at index y_idx and
the country has chosen to default. Returns an update of v_d.
"""
# Unpack
β, γ, r, ρ, η, θ = params
P, y_grid, B_grid, def_y, B0_idx = arrays

current_utility = u(def_y[y_idx], γ)
v = np.maximum(v_c[B0_idx, :], v_d)
cont_value = np.sum((θ * v + (1 - θ) * v_d) * P[y_idx, :])

return current_utility + β * cont_value

@njit
def T_c(B_idx, y_idx, v_c, v_d, q, params, arrays):
"""
The RHS of the Bellman equation when the country is not in a
defaulted state on their debt. Returns a value that corresponds to
v_c[B_idx, y_idx], as well as the optimal level of bond sales B'.
"""
# Unpack
β, γ, r, ρ, η, θ = params
P, y_grid, B_grid, def_y, B0_idx = arrays
B = B_grid[B_idx]
y = y_grid[y_idx]

(continues on next page)

13.4. Computation 257

Advanced Quantitative Economics with Python

(continued from previous page)

# Compute the RHS of Bellman equation
current_max = -1e10
# Step through choices of next period B'
for Bp_idx, Bp in enumerate(B_grid):
c = y + B - q[Bp_idx, y_idx] * Bp
if c > 0:
v = np.maximum(v_c[Bp_idx, :], v_d)
val = u(c, γ) + β * np.sum(v * P[y_idx, :])
if val > current_max:
current_max = val
Bp_star_idx = Bp_idx
return current_max, Bp_star_idx

Here is a fast function that calls these operators in the right sequence.

@njit(parallel=True)
def update_values_and_prices(v_c, v_d, # Current guess of value functions
B_star, q, # Arrays to be written to
params, arrays):

# Unpack
β, γ, r, ρ, η, θ = params
P, y_grid, B_grid, def_y, B0_idx = arrays
y_grid_size = len(y_grid)
B_grid_size = len(B_grid)

# Compute bond prices and write them to q

compute_q(v_c, v_d, q, params, arrays)

# Allocate memory
new_v_c = np.empty_like(v_c)
new_v_d = np.empty_like(v_d)

# Calculate and return new guesses for v_c and v_d

for y_idx in prange(y_grid_size):
new_v_d[y_idx] = T_d(y_idx, v_c, v_d, params, arrays)
for B_idx in range(B_grid_size):
new_v_c[B_idx, y_idx], Bp_idx = T_c(B_idx, y_idx,
v_c, v_d, q, params, arrays)
B_star[B_idx, y_idx] = Bp_idx

return new_v_c, new_v_d

We can now write a function that will use the Arellano_Economy class and the functions defined above to compute
the solution to our model.
We do not need to JIT compile this function since it only consists of outer loops (and JIT compiling makes almost zero
difference).
In fact, one of the jobs of this function is to take an instance of Arellano_Economy, which is hard for the JIT
compiler to handle, and strip it down to more basic objects, which are then passed out to jitted functions.

def solve(model, tol=1e-8, max_iter=10_000):

"""
Given an instance of Arellano_Economy, this function computes the optimal
policy and value functions.
(continues on next page)

258 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

(continued from previous page)

"""
# Unpack
params = model.params()
arrays = model.arrays()
y_grid_size, B_grid_size = model.y_grid_size, model.B_grid_size

# Initial conditions for v_c and v_d

v_c = np.zeros((B_grid_size, y_grid_size))
v_d = np.zeros(y_grid_size)

# Allocate memory
q = np.empty_like(v_c)
B_star = np.empty_like(v_c, dtype=int)

current_iter = 0
dist = np.inf
while (current_iter < max_iter) and (dist > tol):

if current_iter % 100 == 0:
print(f"Entering iteration {current_iter}.")

new_v_c, new_v_d = update_values_and_prices(v_c, v_d, B_star, q, params,␣

↪ arrays)
# Check tolerance and update
dist = np.max(np.abs(new_v_c - v_c)) + np.max(np.abs(new_v_d - v_d))
v_c = new_v_c
v_d = new_v_d
current_iter += 1

print(f"Terminating at iteration {current_iter}.")

return v_c, v_d, q, B_star

Finally, we write a function that will allow us to simulate the economy once we have the policy functions

def simulate(model, T, v_c, v_d, q, B_star, y_idx=None, B_idx=None):

"""
Simulates the Arellano 2008 model of sovereign debt

Here `model` is an instance of `Arellano_Economy` and `T` is the length of

the simulation. Endogenous objects `v_c`, `v_d`, `q` and `B_star` are
assumed to come from a solution to `model`.

"""
# Unpack elements of the model
B0_idx = model.B0_idx
y_grid = model.y_grid
B_grid, y_grid, P = model.B_grid, model.y_grid, model.P

# Set initial conditions to middle of grids

if y_idx == None:
y_idx = np.searchsorted(y_grid, y_grid.mean())
if B_idx == None:
B_idx = B0_idx
in_default = False

# Create Markov chain and simulate income process

(continues on next page)

13.4. Computation 259

Advanced Quantitative Economics with Python

(continued from previous page)

mc = qe.MarkovChain(P, y_grid)
y_sim_indices = mc.simulate_indices(T+1, init=y_idx)

# Allocate memory for outputs

y_sim = np.empty(T)
y_a_sim = np.empty(T)
B_sim = np.empty(T)
q_sim = np.empty(T)
d_sim = np.empty(T, dtype=int)

# Perform simulation
t = 0
while t < T:

# Store the value of y_t and B_t

y_sim[t] = y_grid[y_idx]
B_sim[t] = B_grid[B_idx]

# if in default:
if v_c[B_idx, y_idx] < v_d[y_idx] or in_default:
y_a_sim[t] = model.def_y[y_idx]
d_sim[t] = 1
Bp_idx = B0_idx
# Re-enter financial markets next period with prob θ
in_default = False if np.random.rand() < model.θ else True
else:
y_a_sim[t] = y_sim[t]
d_sim[t] = 0
Bp_idx = B_star[B_idx, y_idx]

q_sim[t] = q[Bp_idx, y_idx]

# Update time and indices

t += 1
y_idx = y_sim_indices[t]
B_idx = Bp_idx

return y_sim, y_a_sim, B_sim, q_sim, d_sim

13.5 Results

Let’s start by trying to replicate the results obtained in [Arellano, 2008].

In what follows, all results are computed using Arellano’s parameter values.
The values can be seen in the __init__ method of the Arellano_Economy shown above.
For example, r=0.017 matches the average quarterly rate on a 5 year US treasury over the period 1983–2001.
Details on how to compute the figures are reported as solutions to the exercises.
The first figure shows the bond price schedule and replicates Figure 3 of Arellano, where 𝑦𝐿 and 𝑌𝐻 are particular below
average and above average values of output 𝑦.
• 𝑦𝐿 is 5% below the mean of the 𝑦 grid values
• 𝑦𝐻 is 5% above the mean of the 𝑦 grid values

260 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

The grid used to compute this figure was relatively fine (y_grid_size, B_grid_size = 51, 251), which
explains the minor differences between this and Arrelano’s figure.
The figure shows that
• Higher levels of debt (larger −𝐵′ ) induce larger discounts on the face value, which correspond to higher interest
rates.
• Lower income also causes more discounting, as foreign creditors anticipate greater likelihood of default.
The next figure plots value functions and replicates the right hand panel of Figure 4 of [Arellano, 2008].
We can use the results of the computation to study the default probability 𝛿(𝐵′ , 𝑦) defined in (13.4).
The next plot shows these default probabilities over (𝐵′ , 𝑦) as a heat map.
As anticipated, the probability that the government chooses to default in the following period increases with indebtedness
and falls with income.
Next let’s run a time series simulation of {𝑦𝑡 }, {𝐵𝑡 } and 𝑞(𝐵𝑡+1 , 𝑦𝑡 ).
The grey vertical bars correspond to periods when the economy is excluded from financial markets because of a past
default.
One notable feature of the simulated data is the nonlinear response of interest rates.
Periods of relative stability are followed by sharp spikes in the discount rate on government debt.

13.5. Results 261

Advanced Quantitative Economics with Python

262 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

13.5. Results 263

Advanced Quantitative Economics with Python

13.6 Exercises

Exercise 13.6.1
To the extent that you can, replicate the figures shown above
• Use the parameter values listed as defaults in Arellano_Economy.
• The time series will of course vary depending on the shock draws.

Solution to Exercise 13.6.1

Compute the value function, policy and equilibrium prices

ae = Arellano_Economy()

v_c, v_d, q, B_star = solve(ae)

Entering iteration 0.

Entering iteration 100.

Entering iteration 200.

Entering iteration 300.

Terminating at iteration 399.

Compute the bond price schedule as seen in figure 3 of Arellano (2008)

# Unpack some useful names

B_grid, y_grid, P = ae.B_grid, ae.y_grid, ae.P
B_grid_size, y_grid_size = len(B_grid), len(y_grid)
r = ae.r

# Create "Y High" and "Y Low" values as 5% devs from mean
high, low = np.mean(y_grid) * 1.05, np.mean(y_grid) * .95
iy_high, iy_low = (np.searchsorted(y_grid, x) for x in (high, low))

fig, ax = plt.subplots(figsize=(10, 6.5))

ax.set_title("Bond price schedule $q(y, B')$")

# Extract a suitable plot grid

x = []
q_low = []
q_high = []
for i, B in enumerate(B_grid):
if -0.35 <= B <= 0: # To match fig 3 of Arellano
x.append(B)
q_low.append(q[i, iy_low])
(continues on next page)

264 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

(continued from previous page)

q_high.append(q[i, iy_high])
ax.plot(x, q_high, label="$y_H$", lw=2, alpha=0.7)
ax.plot(x, q_low, label="$y_L$", lw=2, alpha=0.7)
ax.set_xlabel("$B'$")
ax.legend(loc='upper left', frameon=False)
plt.show()

Draw a plot of the value functions

v = np.maximum(v_c, np.reshape(v_d, (1, y_grid_size)))

fig, ax = plt.subplots(figsize=(10, 6.5))

ax.set_title("Value Functions")
ax.plot(B_grid, v[:, iy_high], label="$y_H$", lw=2, alpha=0.7)
ax.plot(B_grid, v[:, iy_low], label="$y_L$", lw=2, alpha=0.7)
ax.legend(loc='upper left')
ax.set(xlabel="$B$", ylabel="$v(y, B)$")
ax.set_xlim(min(B_grid), max(B_grid))
plt.show()

13.6. Exercises 265

Advanced Quantitative Economics with Python

Draw a heat map for default probability

xx, yy = B_grid, y_grid

zz = np.empty_like(v_c)

for B_idx in range(B_grid_size):

for y_idx in range(y_grid_size):
zz[B_idx, y_idx] = P[y_idx, v_c[B_idx, :] < v_d].sum()

# Create figure
fig, ax = plt.subplots(figsize=(10, 6.5))
hm = ax.pcolormesh(xx, yy, zz.T)
cax = fig.add_axes([.92, .1, .02, .8])
fig.colorbar(hm, cax=cax)
ax.axis([xx.min(), 0.05, yy.min(), yy.max()])
ax.set(xlabel="$B'$", ylabel="$y$", title="Probability of Default")
plt.show()

266 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

Plot a time series of major variables simulated from the model

T = 250
np.random.seed(42)
y_sim, y_a_sim, B_sim, q_sim, d_sim = simulate(ae, T, v_c, v_d, q, B_star)

# Pick up default start and end dates

start_end_pairs = []
i = 0
while i < len(d_sim):
if d_sim[i] == 0:
i += 1
else:
# If we get to here we're in default
start_default = i
while i < len(d_sim) and d_sim[i] == 1:
i += 1
end_default = i - 1
start_end_pairs.append((start_default, end_default))

plot_series = (y_sim, B_sim, q_sim)

titles = 'output', 'foreign assets', 'bond price'

fig, axes = plt.subplots(len(plot_series), 1, figsize=(10, 12))

fig.subplots_adjust(hspace=0.3)

for ax, series, title in zip(axes, plot_series, titles):

# Determine suitable y limits
s_max, s_min = max(series), min(series)
s_range = s_max - s_min
y_max = s_max + s_range * 0.1
(continues on next page)

13.6. Exercises 267

Advanced Quantitative Economics with Python

(continued from previous page)

y_min = s_min - s_range * 0.1
ax.set_ylim(y_min, y_max)
for pair in start_end_pairs:
ax.fill_between(pair, (y_min, y_min), (y_max, y_max),
color='k', alpha=0.3)
ax.grid()
ax.plot(range(T), series, lw=2, alpha=0.7)
ax.set(title=title, xlabel="time")

plt.show()

268 Chapter 13. Default Risk and Income Fluctuations

Advanced Quantitative Economics with Python

13.6. Exercises 269

Advanced Quantitative Economics with Python

270 Chapter 13. Default Risk and Income Fluctuations

CHAPTER

FOURTEEN

GLOBALIZATION AND CYCLES

14.1 Overview

In this lecture, we review the paper Globalization and Synchronization of Innovation Cycles by Kiminori Matsuyama,
Laura Gardini and Iryna Sushko.
This model helps us understand several interesting stylized facts about the world economy.
One of these is synchronized business cycles across different countries.
Most existing models that generate synchronized business cycles do so by assumption, since they tie output in each country
to a common shock.
They also fail to explain certain features of the data, such as the fact that the degree of synchronization tends to increase
with trade ties.
By contrast, in the model we consider in this lecture, synchronization is both endogenous and increasing with the extent
of trade integration.
In particular, as trade costs fall and international competition increases, innovation incentives become aligned and coun-
tries synchronize their innovation cycles.
Let’s start with some imports:

import numpy as np
import matplotlib.pyplot as plt
from numba import jit
from ipywidgets import interact

14.1.1 Background

The model builds on work by Judd [Judd, 1985], Deneckner and Judd [Deneckere and Judd, 1992] and Helpman and
Krugman [Helpman and Krugman, 1985] by developing a two-country model with trade and innovation.
On the technical side, the paper introduces the concept of coupled oscillators to economic modeling.
As we will see, coupled oscillators arise endogenously within the model.
Below we review the model and replicate some of the results on synchronization of innovation across countries.

271
Advanced Quantitative Economics with Python

14.2 Key Ideas

It is helpful to begin with an overview of the mechanism.

14.2.1 Innovation Cycles

As discussed above, two countries produce and trade with each other.
In each country, firms innovate, producing new varieties of goods and, in doing so, receiving temporary monopoly power.
Imitators follow and, after one period of monopoly, what had previously been new varieties now enter competitive pro-
duction.
Firms have incentives to innovate and produce new goods when the mass of varieties of goods currently in production is
relatively low.
In addition, there are strategic complementarities in the timing of innovation.
Firms have incentives to innovate in the same period, so as to avoid competing with substitutes that are competitively
produced.
This leads to temporal clustering in innovations in each country.
After a burst of innovation, the mass of goods currently in production increases.
However, goods also become obsolete, so that not all survive from period to period.
This mechanism generates a cycle, where the mass of varieties increases through simultaneous innovation and then falls
through obsolescence.

14.2.2 Synchronization

In the absence of trade, the timing of innovation cycles in each country is decoupled.
This will be the case when trade costs are prohibitively high.
If trade costs fall, then goods produced in each country penetrate each other’s markets.
As illustrated below, this leads to synchronization of business cycles across the two countries.

14.3 Model

Let’s write down the model more formally.

(The treatment is relatively terse since full details can be found in the original paper)
Time is discrete with 𝑡 = 0, 1, ….
There are two countries indexed by 𝑗 or 𝑘.
In each country, a representative household inelastically supplies 𝐿𝑗 units of labor at wage rate 𝑤𝑗,𝑡 .
Without loss of generality, it is assumed that 𝐿1 ≥ 𝐿2 .
Households consume a single nontradeable final good which is produced competitively.
Its production involves combining two types of tradeable intermediate inputs via
𝑜 1−𝛼 𝛼
𝑋𝑘,𝑡 𝑋𝑘,𝑡
𝑌𝑘,𝑡 = 𝐶𝑘,𝑡 = ( ) ( )
1−𝛼 𝛼

272 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

𝑜
Here 𝑋𝑘,𝑡 is a homogeneous input which can be produced from labor using a linear, one-for-one technology.
It is freely tradeable, competitively supplied, and homogeneous across countries.
By choosing the price of this good as numeraire and assuming both countries find it optimal to always produce the
homogeneous good, we can set 𝑤1,𝑡 = 𝑤2,𝑡 = 1.
The good 𝑋𝑘,𝑡 is a composite, built from many differentiated goods via
1
1− 1 1− 𝜎
𝑋𝑘,𝑡 𝜎 = ∫ [𝑥𝑘,𝑡 (𝜈)] 𝑑𝜈
Ω𝑡

Here 𝑥𝑘,𝑡 (𝜈) is the total amount of a differentiated good 𝜈 ∈ Ω𝑡 that is produced.
The parameter 𝜎 > 1 is the direct partial elasticity of substitution between a pair of varieties and Ω𝑡 is the set of varieties
available in period 𝑡.
We can split the varieties into those which are supplied competitively and those supplied monopolistically; that is, Ω𝑡 =
Ω𝑐𝑡 + Ω𝑚
𝑡 .

14.3.1 Prices

Demand for differentiated inputs is

−𝜎
𝑝𝑘,𝑡 (𝜈) 𝛼𝐿𝑘
𝑥𝑘,𝑡 (𝜈) = ( )
𝑃𝑘,𝑡 𝑃𝑘,𝑡
Here
• 𝑝𝑘,𝑡 (𝜈) is the price of the variety 𝜈 and
• 𝑃𝑘,𝑡 is the price index for differentiated inputs in 𝑘, defined by
1−𝜎
[𝑃𝑘,𝑡 ] = ∫ [𝑝𝑘,𝑡 (𝜈)]1−𝜎 𝑑𝜈
Ω𝑡

The price of a variety also depends on the origin, 𝑗, and destination, 𝑘, of the goods because shipping varieties between
countries incurs an iceberg trade cost 𝜏𝑗,𝑘 .
Thus the effective price in country 𝑘 of a variety 𝜈 produced in country 𝑗 becomes 𝑝𝑘,𝑡 (𝜈) = 𝜏𝑗,𝑘 𝑝𝑗,𝑡 (𝜈).
Using these expressions, we can derive the total demand for each variety, which is

𝐷𝑗,𝑡 (𝜈) = ∑ 𝜏𝑗,𝑘 𝑥𝑘,𝑡 (𝜈) = 𝛼𝐴𝑗,𝑡 (𝑝𝑗,𝑡 (𝜈))−𝜎

𝑘

where
𝜌𝑗,𝑘 𝐿𝑘
𝐴𝑗,𝑡 ∶= ∑ and 𝜌𝑗,𝑘 = (𝜏𝑗,𝑘 )1−𝜎 ≤ 1
𝑘
(𝑃𝑘,𝑡 )1−𝜎

It is assumed that 𝜏1,1 = 𝜏2,2 = 1 and 𝜏1,2 = 𝜏2,1 = 𝜏 for some 𝜏 > 1, so that

𝜌1,2 = 𝜌2,1 = 𝜌 ∶= 𝜏 1−𝜎 < 1

The value 𝜌 ∈ [0, 1) is a proxy for the degree of globalization.

Producing one unit of each differentiated variety requires 𝜓 units of labor, so the marginal cost is equal to 𝜓 for 𝜈 ∈ Ω𝑗,𝑡 .
Additionally, all competitive varieties will have the same price (because of equal marginal cost), which means that, for
all 𝜈 ∈ Ω𝑐 ,
𝑐 𝑐 𝑐 −𝜎
𝑝𝑗,𝑡 (𝜈) = 𝑝𝑗,𝑡 ∶= 𝜓 and 𝐷𝑗,𝑡 = 𝑦𝑗,𝑡 ∶= 𝛼𝐴𝑗,𝑡 (𝑝𝑗,𝑡 )

14.3. Model 273

Advanced Quantitative Economics with Python

Monopolists will have the same marked-up price, so, for all 𝜈 ∈ Ω𝑚 ,

𝑚 𝜓 𝑚 𝑚 −𝜎
𝑝𝑗,𝑡 (𝜈) = 𝑝𝑗,𝑡 ∶= 1 and 𝐷𝑗,𝑡 = 𝑦𝑗,𝑡 ∶= 𝛼𝐴𝑗,𝑡 (𝑝𝑗,𝑡 )
1− 𝜎

Define
𝑐
𝑝𝑗,𝑡 𝑐
𝑦𝑗,𝑡 1 1−𝜎
𝜃 ∶= 𝑚 𝑚 = (1 − )
𝑝𝑗,𝑡 𝑦𝑗,𝑡 𝜎

Using the preceding definitions and some algebra, the price indices can now be rewritten as
1−𝜎 𝑚
𝑃𝑘,𝑡 𝑐
𝑁𝑗,𝑡
( ) = 𝑀𝑘,𝑡 + 𝜌𝑀𝑗,𝑡 where 𝑀𝑗,𝑡 ∶= 𝑁𝑗,𝑡 +
𝜓 𝜃
𝑐 𝑚
The symbols 𝑁𝑗,𝑡 and 𝑁𝑗,𝑡 will denote the measures of Ω𝑐 and Ω𝑚 respectively.

14.3.2 New Varieties

To introduce a new variety, a firm must hire 𝑓 units of labor per variety in each country.
Monopolist profits must be less than or equal to zero in expectation, so
𝑚 𝑚 𝑚 𝑚 𝑚 𝑚
𝑁𝑗,𝑡 ≥ 0, 𝜋𝑗,𝑡 ∶= (𝑝𝑗,𝑡 − 𝜓)𝑦𝑗,𝑡 −𝑓 ≤0 and 𝜋𝑗,𝑡 𝑁𝑗,𝑡 =0

With further manipulations, this becomes

𝑚 𝑐 1 𝛼𝐿𝑗 𝛼𝐿𝑘
𝑁𝑗,𝑡 = 𝜃(𝑀𝑗,𝑡 − 𝑁𝑗,𝑡 ) ≥ 0, [ + ]≤𝑓
𝜎 𝜃(𝑀𝑗,𝑡 + 𝜌𝑀𝑘,𝑡 ) 𝜃(𝑀𝑗,𝑡 + 𝑀𝑘,𝑡 /𝜌)

14.3.3 Law of Motion

With 𝛿 as the exogenous probability of a variety becoming obsolete, the dynamic equation for the measure of firms
becomes
𝑐 𝑐 𝑚 𝑐 𝑐
𝑁𝑗,𝑡+1 = 𝛿(𝑁𝑗,𝑡 + 𝑁𝑗,𝑡 ) = 𝛿(𝑁𝑗,𝑡 + 𝜃(𝑀𝑗,𝑡 − 𝑁𝑗,𝑡 ))

We will work with a normalized measure of varieties

𝑐 𝑚
𝜃𝜎𝑓𝑁𝑗,𝑡 𝜃𝜎𝑓𝑁𝑗,𝑡 𝜃𝜎𝑓𝑀𝑗,𝑡 𝑖𝑗,𝑡
𝑛𝑗,𝑡 ∶= , 𝑖𝑗,𝑡 ∶= , 𝑚𝑗,𝑡 ∶= = 𝑛𝑗,𝑡 +
𝛼(𝐿1 + 𝐿2 ) 𝛼(𝐿1 + 𝐿2 ) 𝛼(𝐿1 + 𝐿2 ) 𝜃
𝐿𝑗
We also use 𝑠𝑗 ∶= 𝐿1 +𝐿2 to be the share of labor employed in country 𝑗.
We can use these definitions and the preceding expressions to obtain a law of motion for 𝑛𝑡 ∶= (𝑛1,𝑡 , 𝑛2,𝑡 ).
In particular, given an initial condition, 𝑛0 = (𝑛1,0 , 𝑛2,0 ) ∈ ℝ2+ , the equilibrium trajectory, {𝑛𝑡 }∞ ∞
𝑡=0 = {(𝑛1,𝑡 , 𝑛2,𝑡 )}𝑡=0 ,
2 2
is obtained by iterating on 𝑛𝑡+1 = 𝐹 (𝑛𝑡 ) where 𝐹 ∶ ℝ+ → ℝ+ is given by

⎧(𝛿(𝜃𝑠1 (𝜌) + (1 − 𝜃)𝑛1,𝑡 ), 𝛿(𝜃𝑠2 (𝜌) + (1 − 𝜃)𝑛2,𝑡 )) for 𝑛𝑡 ∈ 𝐷𝐿𝐿

{
{(𝛿𝑛1,𝑡 , 𝛿𝑛2,𝑡 ) for 𝑛𝑡 ∈ 𝐷𝐻𝐻
𝐹 (𝑛𝑡 ) = ⎨
{(𝛿𝑛1,𝑡 , 𝛿(𝜃ℎ2 (𝑛1,𝑡 ) + (1 − 𝜃)𝑛2,𝑡 )) for 𝑛𝑡 ∈ 𝐷𝐻𝐿
{(𝛿(𝜃ℎ (𝑛 ) + (1 − 𝜃)𝑛 , 𝛿𝑛 )) for 𝑛𝑡 ∈ 𝐷𝐿𝐻
⎩ 1 2,𝑡 1,𝑡 2,𝑡

274 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

Here
𝐷𝐿𝐿 ∶= {(𝑛1 , 𝑛2 ) ∈ ℝ2+ |𝑛𝑗 ≤ 𝑠𝑗 (𝜌)}
𝐷𝐻𝐻 ∶= {(𝑛1 , 𝑛2 ) ∈ ℝ2+ |𝑛𝑗 ≥ ℎ𝑗 (𝑛𝑘 )}
𝐷𝐻𝐿 ∶= {(𝑛1 , 𝑛2 ) ∈ ℝ2+ |𝑛1 ≥ 𝑠1 (𝜌) and 𝑛2 ≤ ℎ2 (𝑛1 )}
𝐷𝐿𝐻 ∶= {(𝑛1 , 𝑛2 ) ∈ ℝ2+ |𝑛1 ≤ ℎ1 (𝑛2 ) and 𝑛2 ≥ 𝑠2 (𝜌)}

while
𝑠1 − 𝜌𝑠2
𝑠1 (𝜌) = 1 − 𝑠2 (𝜌) = min { , 1}
1−𝜌

and ℎ𝑗 (𝑛𝑘 ) is defined implicitly by the equation

𝑠𝑗 𝑠𝑘
1= +
ℎ𝑗 (𝑛𝑘 ) + 𝜌𝑛𝑘 ℎ𝑗 (𝑛𝑘 ) + 𝑛𝑘 /𝜌

Rewriting the equation above gives us a quadratic equation in terms of ℎ𝑗 (𝑛𝑘 ).

Since we know ℎ𝑗 (𝑛𝑘 ) > 0 then we can just solve the quadratic equation and return the positive root.
This gives us
1 𝑠𝑗 𝑛𝑘
ℎ𝑗 (𝑛𝑘 )2 + ((𝜌 + )𝑛𝑘 − 𝑠𝑗 − 𝑠𝑘 ) ℎ𝑗 (𝑛𝑘 ) + (𝑛2𝑘 − − 𝑠𝑘 𝑛𝑘 𝜌) = 0
𝜌 𝜌

14.4 Simulation

Let’s try simulating some of these trajectories.

We will focus in particular on whether or not innovation cycles synchronize across the two countries.
As we will see, this depends on initial conditions.
For some parameterizations, synchronization will occur for “most” initial conditions, while for others synchronization will
be rare.
The computational burden of testing synchronization across many initial conditions is not trivial.
In order to make our code fast, we will use just in time compiled functions that will get called and handled by our class.
These are the @jit statements that you see below (review this lecture if you don’t recall how to use JIT compilation).
Here’s the main body of code

@jit(nopython=True)
def _hj(j, nk, s1, s2, θ, δ, ρ):
"""
If we expand the implicit function for h_j(n_k) then we find that
it is quadratic. We know that h_j(n_k) > 0 so we can get its
value by using the quadratic form
"""
# Find out who's h we are evaluating
if j == 1:
sj = s1
sk = s2
else:
sj = s2
(continues on next page)

14.4. Simulation 275

Advanced Quantitative Economics with Python

(continued from previous page)

sk = s1

# Coefficients on the quadratic a x^2 + b x + c = 0

a = 1.0
b = ((ρ + 1 / ρ) * nk - sj - sk)
c = (nk * nk - (sj * nk) / ρ - sk * ρ * nk)

# Positive solution of quadratic form

root = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)

return root

@jit(nopython=True)
def DLL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DLL"
return (n1 <= s1_ρ) and (n2 <= s2_ρ)

@jit(nopython=True)
def DHH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DHH"
return (n1 >= _hj(1, n2, s1, s2, θ, δ, ρ)) and \
(n2 >= _hj(2, n1, s1, s2, θ, δ, ρ))

@jit(nopython=True)
def DHL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DHL"
return (n1 >= s1_ρ) and (n2 <= _hj(2, n1, s1, s2, θ, δ, ρ))

@jit(nopython=True)
def DLH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"Determine whether (n1, n2) is in the set DLH"
return (n1 <= _hj(1, n2, s1, s2, θ, δ, ρ)) and (n2 >= s2_ρ)

@jit(nopython=True)
def one_step(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"""
Takes a current value for (n_{1, t}, n_{2, t}) and returns the
values (n_{1, t+1}, n_{2, t+1}) according to the law of motion.
"""
# Depending on where we are, evaluate the right branch
if DLL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * (θ * s1_ρ + (1 - θ) * n1)
n2_tp1 = δ * (θ * s2_ρ + (1 - θ) * n2)
elif DHH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * n1
n2_tp1 = δ * n2
elif DHL(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * n1
n2_tp1 = δ * (θ * _hj(2, n1, s1, s2, θ, δ, ρ) + (1 - θ) * n2)
elif DLH(n1, n2, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
n1_tp1 = δ * (θ * _hj(1, n2, s1, s2, θ, δ, ρ) + (1 - θ) * n1)
n2_tp1 = δ * n2

return n1_tp1, n2_tp1

@jit(nopython=True)

(continues on next page)

276 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

(continued from previous page)

def n_generator(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ):
"""
Given an initial condition, continues to yield new values of
n1 and n2
"""
n1_t, n2_t = n1_0, n2_0
while True:
n1_tp1, n2_tp1 = one_step(n1_t, n2_t, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ)
yield (n1_tp1, n2_tp1)
n1_t, n2_t = n1_tp1, n2_tp1

@jit(nopython=True)
def _pers_till_sync(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ, maxiter, npers):
"""
Takes initial values and iterates forward to see whether
the histories eventually end up in sync.

If countries are symmetric then as soon as the two countries have the
same measure of firms then they will be synchronized -- However, if
they are not symmetric then it is possible they have the same measure
of firms but are not yet synchronized. To address this, we check whether
firms stay synchronized for `npers` periods with Euclidean norm

Parameters
----------
n1_0 : scalar(Float)
Initial normalized measure of firms in country one
n2_0 : scalar(Float)
Initial normalized measure of firms in country two
maxiter : scalar(Int)
Maximum number of periods to simulate
npers : scalar(Int)
Number of periods we would like the countries to have the
same measure for

Returns
-------
synchronized : scalar(Bool)
Did the two economies end up synchronized
pers_2_sync : scalar(Int)
The number of periods required until they synchronized
"""
# Initialize the status of synchronization
synchronized = False
pers_2_sync = maxiter
iters = 0

# Initialize generator
n_gen = n_generator(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ)

# Will use a counter to determine how many times in a row

# the firm measures are the same
nsync = 0

while (not synchronized) and (iters < maxiter):

# Increment the number of iterations and get next values

(continues on next page)

14.4. Simulation 277

Advanced Quantitative Economics with Python

(continued from previous page)

iters += 1
n1_t, n2_t = next(n_gen)

# Check whether same in this period

if abs(n1_t - n2_t) < 1e-8:
nsync += 1
# If not, then reset the nsync counter
else:
nsync = 0

# If we have been in sync for npers then stop and countries

# became synchronized nsync periods ago
if nsync > npers:
synchronized = True
pers_2_sync = iters - nsync

return synchronized, pers_2_sync

@jit(nopython=True)
def _create_attraction_basis(s1_ρ, s2_ρ, s1, s2, θ, δ, ρ,
maxiter, npers, npts):
# Create unit range with npts
synchronized, pers_2_sync = False, 0
unit_range = np.linspace(0.0, 1.0, npts)

# Allocate space to store time to sync

time_2_sync = np.empty((npts, npts))
# Iterate over initial conditions
for (i, n1_0) in enumerate(unit_range):
for (j, n2_0) in enumerate(unit_range):
synchronized, pers_2_sync = _pers_till_sync(n1_0, n2_0, s1_ρ,
s2_ρ, s1, s2, θ, δ,
ρ, maxiter, npers)
time_2_sync[i, j] = pers_2_sync

return time_2_sync

# == Now we define a class for the model == #

class MSGSync:
"""
The paper "Globalization and Synchronization of Innovation Cycles" presents
a two-country model with endogenous innovation cycles. Combines elements
from Deneckere Judd (1985) and Helpman Krugman (1985) to allow for a
model with trade that has firms who can introduce new varieties into
the economy.

We focus on being able to determine whether the two countries eventually

synchronize their innovation cycles. To do this, we only need a few
of the many parameters. In particular, we need the parameters listed
below

Parameters
----------
s1 : scalar(Float)

(continues on next page)

278 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

(continued from previous page)

Amount of total labor in country 1 relative to total worldwide labor
θ : scalar(Float)
A measure of how much more of the competitive variety is used in
production of final goods
δ : scalar(Float)
Percentage of firms that are not exogenously destroyed every period
ρ : scalar(Float)
Measure of how expensive it is to trade between countries
"""
def __init__(self, s1=0.5, θ=2.5, δ=0.7, ρ=0.2):
# Store model parameters
self.s1, self.θ, self.δ, self.ρ = s1, θ, δ, ρ

# Store other cutoffs and parameters we use

self.s2 = 1 - s1
self.s1_ρ = self._calc_s1_ρ()
self.s2_ρ = 1 - self.s1_ρ

def _unpack_params(self):
return self.s1, self.s2, self.θ, self.δ, self.ρ

def _calc_s1_ρ(self):
# Unpack params
s1, s2, θ, δ, ρ = self._unpack_params()

# s_1(ρ) = min(val, 1)
val = (s1 - ρ * s2) / (1 - ρ)
return min(val, 1)

def simulate_n(self, n1_0, n2_0, T):

"""
Simulates the values of (n1, n2) for T periods

Parameters
----------
n1_0 : scalar(Float)
Initial normalized measure of firms in country one
n2_0 : scalar(Float)
Initial normalized measure of firms in country two
T : scalar(Int)
Number of periods to simulate

Returns
-------
n1 : Array(Float64, ndim=1)
A history of normalized measures of firms in country one
n2 : Array(Float64, ndim=1)
A history of normalized measures of firms in country two
"""
# Unpack parameters
s1, s2, θ, δ, ρ = self._unpack_params()
s1_ρ, s2_ρ = self.s1_ρ, self.s2_ρ

# Allocate space
n1 = np.empty(T)
n2 = np.empty(T)

(continues on next page)

14.4. Simulation 279

Advanced Quantitative Economics with Python

(continued from previous page)

# Create the generator

n1[0], n2[0] = n1_0, n2_0
n_gen = n_generator(n1_0, n2_0, s1_ρ, s2_ρ, s1, s2, θ, δ, ρ)

# Simulate for T periods

for t in range(1, T):
# Get next values
n1_tp1, n2_tp1 = next(n_gen)

# Store in arrays
n1[t] = n1_tp1
n2[t] = n2_tp1

return n1, n2

def pers_till_sync(self, n1_0, n2_0, maxiter=500, npers=3):

"""
Takes initial values and iterates forward to see whether
the histories eventually end up in sync.

Returns
-------
synchronized : scalar(Bool)
Did the two economies end up synchronized
pers_2_sync : scalar(Int)
The number of periods required until they synchronized
"""
# Unpack parameters
s1, s2, θ, δ, ρ = self._unpack_params()
s1_ρ, s2_ρ = self.s1_ρ, self.s2_ρ

return _pers_till_sync(n1_0, n2_0, s1_ρ, s2_ρ,

s1, s2, θ, δ, ρ, maxiter, npers)

def create_attraction_basis(self, maxiter=250, npers=3, npts=50):

"""
Creates an attraction basis for values of n on [0, 1] X [0, 1]

(continues on next page)

280 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

(continued from previous page)

with npts in each dimension
"""
# Unpack parameters
s1, s2, θ, δ, ρ = self._unpack_params()
s1_ρ, s2_ρ = self.s1_ρ, self.s2_ρ

ab = _create_attraction_basis(s1_ρ, s2_ρ, s1, s2, θ, δ,

ρ, maxiter, npers, npts)

return ab

14.4.1 Time Series of Firm Measures

We write a short function below that exploits the preceding code and plots two time series.
Each time series gives the dynamics for the two countries.
The time series share parameters but differ in their initial condition.
Here’s the function

def plot_timeseries(n1_0, n2_0, s1=0.5, θ=2.5,

δ=0.7, ρ=0.2, ax=None, title=''):
"""
Plot a single time series with initial conditions
"""
if ax is None:
fig, ax = plt.subplots()

# Create the MSG Model and simulate with initial conditions

model = MSGSync(s1, θ, δ, ρ)
n1, n2 = model.simulate_n(n1_0, n2_0, 25)

ax.plot(np.arange(25), n1, label="$n_1$", lw=2)

ax.plot(np.arange(25), n2, label="$n_2$", lw=2)

ax.legend()
ax.set(title=title, ylim=(0.15, 0.8))

return ax

# Create figure
fig, ax = plt.subplots(2, 1, figsize=(10, 8))

plot_timeseries(0.15, 0.35, ax=ax[0], title='Not Synchronized')

plot_timeseries(0.4, 0.3, ax=ax[1], title='Synchronized')

fig.tight_layout()

plt.show()

14.4. Simulation 281

Advanced Quantitative Economics with Python

In the first case, innovation in the two countries does not synchronize.
In the second case, different initial conditions are chosen, and the cycles become synchronized.

14.4.2 Basin of Attraction

Next, let’s study the initial conditions that lead to synchronized cycles more systematically.
We generate time series from a large collection of different initial conditions and mark those conditions with different
colors according to whether synchronization occurs or not.
The next display shows exactly this for four different parameterizations (one for each subfigure).
Dark colors indicate synchronization, while light colors indicate failure to synchronize.
As you can see, larger values of 𝜌 translate to more synchronization.
You are asked to replicate this figure in the exercises.
In the solution to the exercises, you’ll also find a figure with sliders, allowing you to experiment with different parameters.
Here’s one snapshot from the interactive figure

282 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

14.4. Simulation 283

Advanced Quantitative Economics with Python

284 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

14.5 Exercises

Exercise 14.5.1
Replicate the figure shown above by coloring initial conditions according to whether or not synchronization occurs from
those conditions.

Solution to Exercise 14.5.1

def plot_attraction_basis(s1=0.5, θ=2.5, δ=0.7, ρ=0.2, npts=250, ax=None):

if ax is None:
fig, ax = plt.subplots()

# Create attraction basis

unitrange = np.linspace(0, 1, npts)
model = MSGSync(s1, θ, δ, ρ)
ab = model.create_attraction_basis(npts=npts)
cf = ax.pcolormesh(unitrange, unitrange, ab, cmap="viridis")

return ab, cf

fig = plt.figure(figsize=(14, 12))

# Left - Bottom - Width - Height

ax0 = fig.add_axes((0.05, 0.475, 0.38, 0.35), label="axes0")
ax1 = fig.add_axes((0.5, 0.475, 0.38, 0.35), label="axes1")
ax2 = fig.add_axes((0.05, 0.05, 0.38, 0.35), label="axes2")
ax3 = fig.add_axes((0.5, 0.05, 0.38, 0.35), label="axes3")

params = [[0.5, 2.5, 0.7, 0.2],

[0.5, 2.5, 0.7, 0.4],
[0.5, 2.5, 0.7, 0.6],
[0.5, 2.5, 0.7, 0.8]]

ab0, cf0 = plot_attraction_basis(*params[0], npts=500, ax=ax0)

ab1, cf1 = plot_attraction_basis(*params[1], npts=500, ax=ax1)
ab2, cf2 = plot_attraction_basis(*params[2], npts=500, ax=ax2)
ab3, cf3 = plot_attraction_basis(*params[3], npts=500, ax=ax3)

cbar_ax = fig.add_axes([0.9, 0.075, 0.03, 0.725])

plt.colorbar(cf0, cax=cbar_ax)

ax0.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.2$",

fontsize=22)
ax1.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.4$",
fontsize=22)
ax2.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.6$",
fontsize=22)
ax3.set_title(r"$s_1=0.5$, $\theta=2.5$, $\delta=0.7$, $\rho=0.8$",
fontsize=22)

fig.suptitle("Synchronized versus Asynchronized 2-cycles",

x=0.475, y=0.915, size=26)
plt.show()

14.5. Exercises 285

Advanced Quantitative Economics with Python

Additionally, instead of just seeing 4 plots at once, we might want to manually be able to change 𝜌 and see how it affects
the plot in real-time. Below we use an interactive plot to do this.
Note, interactive plotting requires the ipywidgets module to be installed and enabled.

Note: This interactive plot is disabled on this static webpage. In order to use this, we recommend to run this notebook
locally.

def interact_attraction_basis(ρ=0.2, maxiter=250, npts=250):

# Create the figure and axis that we will plot on
fig, ax = plt.subplots(figsize=(12, 10))
# Create model and attraction basis
s1, θ, δ = 0.5, 2.5, 0.75
model = MSGSync(s1, θ, δ, ρ)
ab = model.create_attraction_basis(maxiter=maxiter, npts=npts)
# Color map with colormesh
unitrange = np.linspace(0, 1, npts)
cf = ax.pcolormesh(unitrange, unitrange, ab, cmap="viridis")
cbar_ax = fig.add_axes([0.95, 0.15, 0.05, 0.7])
plt.colorbar(cf, cax=cbar_ax)
plt.show()
return None

286 Chapter 14. Globalization and Cycles

Advanced Quantitative Economics with Python

fig = interact(interact_attraction_basis,
ρ=(0.0, 1.0, 0.05),
maxiter=(50, 5000, 50),
npts=(25, 750, 25))

14.5. Exercises 287

Advanced Quantitative Economics with Python

288 Chapter 14. Globalization and Cycles

CHAPTER

FIFTEEN

COASE’S THEORY OF THE FIRM

15.1 Overview

In 1937, Ronald Coase wrote a brilliant essay on the nature of the firm [Coase, 1937].
Coase was writing at a time when the Soviet Union was rising to become a significant industrial power.
At the same time, many free-market economies were afflicted by a severe and painful depression.
This contrast led to an intensive debate on the relative merits of decentralized, price-based allocation versus top-down
planning.
In the midst of this debate, Coase made an important observation: even in free-market economies, a great deal of top-
down planning does in fact take place.
This is because firms form an integral part of free-market economies and, within firms, allocation is by planning.
In other words, free-market economies blend both planning (within firms) and decentralized production coordinated by
prices.
The question Coase asked is this: if prices and free markets are so efficient, then why do firms even exist?
Couldn’t the associated within-firm planning be done more efficiently by the market?
We’ll use the following imports:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fminbound

15.1.1 Why Firms Exist

On top of asking a deep and fascinating question, Coase also supplied an illuminating answer: firms exist because of
transaction costs.
Here’s one example of a transaction cost:
Suppose agent A is considering setting up a small business and needs a web developer to construct and help run an online
store.
She can use the labor of agent B, a web developer, by writing up a freelance contract for these tasks and agreeing on a
suitable price.
But contracts like this can be time-consuming and difficult to verify
• How will agent A be able to specify exactly what she wants, to the finest detail, when she herself isn’t sure how the
business will evolve?

289
Advanced Quantitative Economics with Python

• And what if she isn’t familiar with web technology? How can she specify all the relevant details?
• And, if things go badly, will failure to comply with the contract be verifiable in court?
In this situation, perhaps it will be easier to employ agent B under a simple labor contract.
The cost of this contract is far smaller because such contracts are simpler and more standard.
The basic agreement in a labor contract is: B will do what A asks him to do for the term of the contract, in return for a
given salary.
Making this agreement is much easier than trying to map every task out in advance in a contract that will hold up in a
court of law.
So agent A decides to hire agent B and a firm of nontrivial size appears, due to transaction costs.

15.1.2 A Trade-Off

Actually, we haven’t yet come to the heart of Coase’s investigation.

The issue of why firms exist is a binary question: should firms have positive size or zero size?
A better and more general question is: what determines the size of firms?
The answer Coase came up with was that “a firm will tend to expand until the costs of organizing an extra transaction
within the firm become equal to the costs of carrying out the same transaction by means of an exchange on the open
market…” ([Coase, 1937], p. 395).
But what are these internal and external costs?
In short, Coase envisaged a trade-off between
• transaction costs, which add to the expense of operating between firms, and
• diminishing returns to management, which adds to the expense of operating within firms
We discussed an example of transaction costs above (contracts).
The other cost, diminishing returns to management, is a catch-all for the idea that big operations are increasingly costly
to manage.
For example, you could think of management as a pyramid, so hiring more workers to implement more tasks requires
expansion of the pyramid, and hence labor costs grow at a rate more than proportional to the range of tasks.
Diminishing returns to management makes in-house production expensive, favoring small firms.

15.1.3 Summary

Here’s a summary of our discussion:

• Firms grow because transaction costs encourage them to take some operations in house.
• But as they get large, in-house operations become costly due to diminishing returns to management.
• The size of firms is determined by balancing these effects, thereby equalizing the marginal costs of each form of
operation.

290 Chapter 15. Coase’s Theory of the Firm

Advanced Quantitative Economics with Python

15.1.4 A Quantitative Interpretation

Coases ideas were expressed verbally, without any mathematics.

In fact, his essay is a wonderful example of how far you can get with clear thinking and plain English.
However, plain English is not good for quantitative analysis, so let’s bring some mathematical and computation tools to
bear.
In doing so we’ll add a bit more structure than Coase did, but this price will be worth paying.
Our exposition is based on [Kikuchi et al., 2018].

15.2 The Model

The model we study involves production of a single unit of a final good.

Production requires a linearly ordered chain, requiring sequential completion of a large number of processing stages.
The stages are indexed by 𝑡 ∈ [0, 1], with 𝑡 = 0 indicating that no tasks have been undertaken and 𝑡 = 1 indicating that
the good is complete.

15.2.1 Subcontracting

The subcontracting scheme by which tasks are allocated across firms is illustrated in the figure below

In this example,
• Firm 1 receives a contract to sell one unit of the completed good to a final buyer.
• Firm 1 then forms a contract with firm 2 to purchase the partially completed good at stage 𝑡1 , with the intention of
implementing the remaining 1 − 𝑡1 tasks in-house (i.e., processing from stage 𝑡1 to stage 1).
• Firm 2 repeats this procedure, forming a contract with firm 3 to purchase the good at stage 𝑡2 .
• Firm 3 decides to complete the chain, selecting 𝑡3 = 0.
At this point, production unfolds in the opposite direction (i.e., from upstream to downstream).

15.2. The Model 291

Advanced Quantitative Economics with Python

• Firm 3 completes processing stages from 𝑡3 = 0 up to 𝑡2 and transfers the good to firm 2.
• Firm 2 then processes from 𝑡2 up to 𝑡1 and transfers the good to firm 1,
• Firm 1 processes from 𝑡1 to 1 and delivers the completed good to the final buyer.
The length of the interval of stages (range of tasks) carried out by firm 𝑖 is denoted by ℓ𝑖 .

Each firm chooses only its upstream boundary, treating its downstream boundary as given.
The benefit of this formulation is that it implies a recursive structure for the decision problem for each firm.
In choosing how many processing stages to subcontract, each successive firm faces essentially the same decision problem
as the firm above it in the chain, with the only difference being that the decision space is a subinterval of the decision
space for the firm above.
We will exploit this recursive structure in our study of equilibrium.

15.2.2 Costs

Recall that we are considering a trade-off between two types of costs.

Let’s discuss these costs and how we represent them mathematically.
Diminishing returns to management means rising costs per task when a firm expands the range of productive activities
coordinated by its managers.
We represent these ideas by taking the cost of carrying out ℓ tasks in-house to be 𝑐(ℓ), where 𝑐 is increasing and strictly
convex.
Thus, the average cost per task rises with the range of tasks performed in-house.
We also assume that 𝑐 is continuously differentiable, with 𝑐(0) = 0 and 𝑐′ (0) > 0.
Transaction costs are represented as a wedge between the buyer’s and seller’s prices.
It matters little for us whether the transaction cost is borne by the buyer or the seller.
Here we assume that the cost is borne only by the buyer.
In particular, when two firms agree to a trade at face value 𝑣, the buyer’s total outlay is 𝛿𝑣, where 𝛿 > 1.
The seller receives only 𝑣, and the difference is paid to agents outside the model.

292 Chapter 15. Coase’s Theory of the Firm

Advanced Quantitative Economics with Python

15.3 Equilibrium

We assume that all firms are ex-ante identical and act as price takers.
As price takers, they face a price function 𝑝, which is a map from [0, 1] to ℝ+ , with 𝑝(𝑡) interpreted as the price of the
good at processing stage 𝑡.
There is a countable infinity of firms indexed by 𝑖 and no barriers to entry.
The cost of supplying the initial input (the good processed up to stage zero) is set to zero for simplicity.
Free entry and the infinite fringe of competitors rule out positive profits for incumbents, since any incumbent could be
replaced by a member of the competitive fringe filling the same role in the production chain.
Profits are never negative in equilibrium because firms can freely exit.

15.3.1 Informal Definition of Equilibrium

An equilibrium in this setting is an allocation of firms and a price function such that
1. all active firms in the chain make zero profits, including suppliers of raw materials
2. no firm in the production chain has an incentive to deviate, and
3. no inactive firms can enter and extract positive profits

15.3.2 Formal Definition of Equilibrium

Let’s make this definition more formal.

(You might like to skip this section on first reading)
An allocation of firms is a nonnegative sequence {ℓ𝑖 }𝑖∈ℕ such that ℓ𝑖 = 0 for all sufficiently large 𝑖.
Recalling the figures above,
• ℓ𝑖 represents the range of tasks implemented by the 𝑖-th firm
As a labeling convention, we assume that firms enter in order, with firm 1 being the furthest downstream.
An allocation {ℓ𝑖 } is called feasible if ∑ 𝑖≥1 ℓ𝑖 = 1.
In a feasible allocation, the entire production process is completed by finitely many firms.
Given a feasible allocation, {ℓ𝑖 }, let {𝑡𝑖 } represent the corresponding transaction stages, defined by

𝑡0 = 𝑠 and 𝑡𝑖 = 𝑡𝑖−1 − ℓ𝑖 (15.1)

In particular, 𝑡𝑖−1 is the downstream boundary of firm 𝑖 and 𝑡𝑖 is its upstream boundary.
As transaction costs are incurred only by the buyer, its profits are

𝜋𝑖 = 𝑝(𝑡𝑖−1 ) − 𝑐(ℓ𝑖 ) − 𝛿𝑝(𝑡𝑖 ) (15.2)

Given a price function 𝑝 and a feasible allocation {ℓ𝑖 }, let

• {𝑡𝑖 } be the corresponding firm boundaries.
• {𝜋𝑖 } be corresponding profits, as defined in (15.2).
This price-allocation pair is called an equilibrium for the production chain if

15.3. Equilibrium 293

Advanced Quantitative Economics with Python

1. 𝑝(0) = 0,
2. 𝜋𝑖 = 0 for all 𝑖, and
3. 𝑝(𝑠) − 𝑐(𝑠 − 𝑡) − 𝛿𝑝(𝑡) ≤ 0 for any pair 𝑠, 𝑡 with 0 ≤ 𝑠 ≤ 𝑡 ≤ 1.
The rationale behind these conditions was given in our informal definition of equilibrium above.

15.4 Existence, Uniqueness and Computation of Equilibria

We have defined an equilibrium but does one exist? Is it unique? And, if so, how can we compute it?

15.4.1 A Fixed Point Method

To address these questions, we introduce the operator 𝑇 mapping a nonnegative function 𝑝 on [0, 1] to 𝑇 𝑝 via
𝑇 𝑝(𝑠) = min {𝑐(𝑠 − 𝑡) + 𝛿𝑝(𝑡)} for all 𝑠 ∈ [0, 1]. (15.3)
𝑡≤𝑠

Here and below, the restriction 0 ≤ 𝑡 in the minimum is understood.

The operator 𝑇 is similar to a Bellman operator.
Under this analogy, 𝑝 corresponds to a value function and 𝛿 to a discount factor.
But 𝛿 > 1, so 𝑇 is not a contraction in any obvious metric, and in fact, 𝑇 𝑛 𝑝 diverges for many choices of 𝑝.
Nevertheless, there exists a domain on which 𝑇 is well-behaved: the set of convex increasing continuous functions
𝑝 ∶ [0, 1] → ℝ such that 𝑐′ (0)𝑠 ≤ 𝑝(𝑠) ≤ 𝑐(𝑠) for all 0 ≤ 𝑠 ≤ 1.
We denote this set of functions by 𝒫.
In [Kikuchi et al., 2018] it is shown that the following statements are true:
1. 𝑇 maps 𝒫 into itself.
2. 𝑇 has a unique fixed point in 𝒫, denoted below by 𝑝∗ .
3. For all 𝑝 ∈ 𝒫 we have 𝑇 𝑘 𝑝 → 𝑝∗ uniformly as 𝑘 → ∞.
Now consider the choice function
𝑡∗ (𝑠) ∶= the solution to min{𝑐(𝑠 − 𝑡) + 𝛿𝑝∗ (𝑡)} (15.4)
𝑡≤𝑠

By definition, 𝑡∗ (𝑠) is the cost-minimizing upstream boundary for a firm that is contracted to deliver the good at stage 𝑠
and faces the price function 𝑝∗ .
Since 𝑝∗ lies in 𝒫 and since 𝑐 is strictly convex, it follows that the right-hand side of (15.4) is continuous and strictly
convex in 𝑡.
Hence the minimizer 𝑡∗ (𝑠) exists and is uniquely defined.
We can use 𝑡∗ to construct an equilibrium allocation as follows:
Recall that firm 1 sells the completed good at stage 𝑠 = 1, its optimal upstream boundary is 𝑡∗ (1).
Hence firm 2’s optimal upstream boundary is 𝑡∗ (𝑡∗ (1)).
Continuing in this way produces the sequence {𝑡∗𝑖 } defined by

𝑡∗0 = 1 and 𝑡∗𝑖 = 𝑡∗ (𝑡𝑖−1 ) (15.5)

The sequence ends when a firm chooses to complete all remaining tasks.

294 Chapter 15. Coase’s Theory of the Firm

Advanced Quantitative Economics with Python

We label this firm (and hence the number of firms in the chain) as

𝑛∗ ∶= inf{𝑖 ∈ ℕ ∶ 𝑡∗𝑖 = 0} (15.6)

The task allocation corresponding to (15.5) is given by ℓ𝑖∗ ∶= 𝑡∗𝑖−1 − 𝑡∗𝑖 for all 𝑖.
In [Kikuchi et al., 2018] it is shown that
1. The value 𝑛∗ in (15.6) is well-defined and finite,
2. the allocation {ℓ𝑖∗ } is feasible, and
3. the price function 𝑝∗ and this allocation together forms an equilibrium for the production chain.
While the proofs are too long to repeat here, much of the insight can be obtained by observing that, as a fixed point of
𝑇 , the equilibrium price function must satisfy

𝑝∗ (𝑠) = min {𝑐(𝑠 − 𝑡) + 𝛿𝑝∗ (𝑡)} for all 𝑠 ∈ [0, 1] (15.7)

𝑡≤𝑠

From this equation, it is clear that so profits are zero for all incumbent firms.

15.4.2 Marginal Conditions

We can develop some additional insights on the behavior of firms by examining marginal conditions associated with the
equilibrium.
As a first step, let ℓ∗ (𝑠) ∶= 𝑠 − 𝑡∗ (𝑠).
This is the cost-minimizing range of in-house tasks for a firm with downstream boundary 𝑠.
In [Kikuchi et al., 2018] it is shown that 𝑡∗ and ℓ∗ are increasing and continuous, while 𝑝∗ is continuously differentiable
at all 𝑠 ∈ (0, 1) with

(𝑝∗ )′ (𝑠) = 𝑐′ (ℓ∗ (𝑠)) (15.8)

Equation (15.8) follows from 𝑝∗ (𝑠) = min𝑡≤𝑠 {𝑐(𝑠 − 𝑡) + 𝛿𝑝∗ (𝑡)} and the envelope theorem for derivatives.
A related equation is the first order condition for 𝑝∗ (𝑠) = min𝑡≤𝑠 {𝑐(𝑠 − 𝑡) + 𝛿𝑝∗ (𝑡)}, the minimization problem for a
firm with upstream boundary 𝑠, which is

𝛿(𝑝∗ )′ (𝑡∗ (𝑠)) = 𝑐′ (𝑠 − 𝑡∗ (𝑠)) (15.9)

This condition matches the marginal condition expressed verbally by Coase that we stated above:
“A firm will tend to expand until the costs of organizing an extra transaction within the firm become equal
to the costs of carrying out the same transaction by means of an exchange on the open market…”
Combining (15.8) and (15.9) and evaluating at 𝑠 = 𝑡𝑖 , we see that active firms that are adjacent satisfy

𝛿 𝑐′ (ℓ𝑖+1
∗
) = 𝑐′ (ℓ𝑖∗ ) (15.10)

In other words, the marginal in-house cost per task at a given firm is equal to that of its upstream partner multiplied by
gross transaction cost.
This expression can be thought of as a Coase–Euler equation, which determines inter-firm efficiency by indicating how
two costly forms of coordination (markets and management) are jointly minimized in equilibrium.

15.4. Existence, Uniqueness and Computation of Equilibria 295

Advanced Quantitative Economics with Python

15.5 Implementation

For most specifications of primitives, there is no closed-form solution for the equilibrium as far as we are aware.
However, we know that we can compute the equilibrium corresponding to a given transaction cost parameter 𝛿 and a cost
function 𝑐 by applying the results stated above.
In particular, we can
1. fix initial condition 𝑝 ∈ 𝒫,
2. iterate with 𝑇 until 𝑇 𝑛 𝑝 has converged to 𝑝∗ , and
3. recover firm choices via the choice function (15.3)
At each iterate, we will use continuous piecewise linear interpolation of functions.
To begin, here’s a class to store primitives and a grid:

class ProductionChain:

def __init__(self,
n=1000,
delta=1.05,
c=lambda t: np.exp(10 * t) - 1):

self.n, self.delta, self.c = n, delta, c

self.grid = np.linspace(1e-04, 1, n)

Now let’s implement and iterate with 𝑇 until convergence.

Recalling that our initial condition must lie in 𝒫, we set 𝑝0 = 𝑐

def compute_prices(pc, tol=1e-5, max_iter=5000):

"""
Compute prices by iterating with T

* pc is an instance of ProductionChain
* The initial condition is p = c

"""
delta, c, n, grid = pc.delta, pc.c, pc.n, pc.grid
p = c(grid) # Initial condition is c(s), as an array
new_p = np.empty_like(p)
error = tol + 1
i = 0

while error > tol and i < max_iter:

for j, s in enumerate(grid):
Tp = lambda t: delta * np.interp(t, grid, p) + c(s - t)
new_p[j] = Tp(fminbound(Tp, 0, s))
error = np.max(np.abs(p - new_p))
p = new_p
i = i + 1

if i < max_iter:
print(f"Iteration converged in {i} steps")
else:
print(f"Warning: iteration hit upper bound {max_iter}")
(continues on next page)

296 Chapter 15. Coase’s Theory of the Firm

Advanced Quantitative Economics with Python

(continued from previous page)

p_func = lambda x: np.interp(x, grid, p)

return p_func

The next function computes optimal choice of upstream boundary and range of task implemented for a firm face price
function p_function and with downstream boundary 𝑠.

def optimal_choices(pc, p_function, s):

"""
Takes p_func as the true function, minimizes on [0,s]

Returns optimal upstream boundary t_star and optimal size of

firm ell_star

In fact, the algorithm minimizes on [-1,s] and then takes the

max of the minimizer and zero. This results in better results
close to zero

"""
delta, c = pc.delta, pc.c
f = lambda t: delta * p_function(t) + c(s - t)
t_star = max(fminbound(f, -1, s), 0)
ell_star = s - t_star
return t_star, ell_star

The allocation of firms can be computed by recursively stepping through firms’ choices of their respective upstream
boundary, treating the previous firm’s upstream boundary as their own downstream boundary.
In doing so, we start with firm 1, who has downstream boundary 𝑠 = 1.

def compute_stages(pc, p_function):

s = 1.0
transaction_stages = [s]
while s > 0:
s, ell = optimal_choices(pc, p_function, s)
transaction_stages.append(s)
return np.array(transaction_stages)

Let’s try this at the default parameters.

The next figure shows the equilibrium price function, as well as the boundaries of firms as vertical lines

pc = ProductionChain()
p_star = compute_prices(pc)

transaction_stages = compute_stages(pc, p_star)

fig, ax = plt.subplots()

ax.plot(pc.grid, p_star(pc.grid))
ax.set_xlim(0.0, 1.0)
ax.set_ylim(0.0)
for s in transaction_stages:
ax.axvline(x=s, c="0.5")
plt.show()

15.5. Implementation 297

Advanced Quantitative Economics with Python

Iteration converged in 2 steps

Here’s the function ℓ∗ , which shows how large a firm with downstream boundary 𝑠 chooses to be

ell_star = np.empty(pc.n)
for i, s in enumerate(pc.grid):
t, e = optimal_choices(pc, p_star, s)
ell_star[i] = e

fig, ax = plt.subplots()
ax.plot(pc.grid, ell_star, label=r"$\ell^*$")
ax.legend(fontsize=14)
plt.show()

298 Chapter 15. Coase’s Theory of the Firm

Advanced Quantitative Economics with Python

Note that downstream firms choose to be larger, a point we return to below.

15.6 Exercises

Exercise 15.6.1
The number of firms is endogenously determined by the primitives.
What do you think will happen in terms of the number of firms as 𝛿 increases? Why?
Check your intuition by computing the number of firms at delta in (1.01, 1.05, 1.1).

Solution to Exercise 15.6.1

Here is one solution

for delta in (1.01, 1.05, 1.1):

pc = ProductionChain(delta=delta)
p_star = compute_prices(pc)
transaction_stages = compute_stages(pc, p_star)
num_firms = len(transaction_stages)
print(f"When delta={delta} there are {num_firms} firms")

Iteration converged in 2 steps

When delta=1.01 there are 64 firms

15.6. Exercises 299

Advanced Quantitative Economics with Python

Iteration converged in 2 steps

When delta=1.05 there are 41 firms

Iteration converged in 2 steps

When delta=1.1 there are 35 firms

Exercise 15.6.2
The value added of firm 𝑖 is 𝑣𝑖 ∶= 𝑝∗ (𝑡𝑖−1 ) − 𝑝∗ (𝑡𝑖 ).
One of the interesting predictions of the model is that value added is increasing with downstreamness, as are several other
measures of firm size.
Can you give any intution?
Try to verify this phenomenon (value added increasing with downstreamness) using the code above.

Solution to Exercise 15.6.2

Firm size increases with downstreamness because 𝑝∗ , the equilibrium price function, is increasing and strictly convex.
This means that, for a given producer, the marginal cost of the input purchased from the producer just upstream from
itself in the chain increases as we go further downstream.
Hence downstream firms choose to do more in house than upstream firms — and are therefore larger.
The equilibrium price function is strictly convex due to both transaction costs and diminishing returns to management.
One way to put this is that firms are prevented from completely mitigating the costs associated with diminishing returns
to management — which induce convexity — by transaction costs. This is because transaction costs force firms to have
nontrivial size.
Here’s one way to compute and graph value added across firms

pc = ProductionChain()
p_star = compute_prices(pc)
stages = compute_stages(pc, p_star)

va = []

for i in range(len(stages) - 1):

va.append(p_star(stages[i]) - p_star(stages[i+1]))

fig, ax = plt.subplots()
ax.plot(va, label="value added by firm")
ax.set_xticks((5, 25))
ax.set_xticklabels(("downstream firms", "upstream firms"))
plt.show()

Iteration converged in 2 steps

300 Chapter 15. Coase’s Theory of the Firm

Advanced Quantitative Economics with Python

15.6. Exercises 301

Advanced Quantitative Economics with Python

302 Chapter 15. Coase’s Theory of the Firm

CHAPTER

SIXTEEN

COMPOSITE SORTING

16.1 Overview

Optimal transport theory is studies how one (marginal) probabilty measure can be related to another (marginal) probability
measure in an ideal way.
The output of such a theory is a coupling of the two probability measures, i.e., a joint probabilty measure having those
two marginal probability measures.
This lecture describes how Job Boerma, Aleh Tsyvinski, Ruodo Wang, and Zhenyuan Zhang [Boerma et al., 2024] used
optimal transport theory to formulate and solve an equilibrium of a model in which wages and allocations of workers
across jobs adjust to match measures of different types with measures of different types of occupations.
Production technologies allow firms to affect shape costs of mismatch with the consequence that costs of mismatch can
be concave.
That means that it is possible that equilibrium there is neither positive assortive nor negative assorting matching, an
outcome that [Boerma et al., 2024] call composite assortive matching.
For example, in an equilibrium with composite matching, identical workers can sort into different occupations, some
positively and some negatively.
[Boerma et al., 2024] show how this can generate distinct distributions of labor earnings within and across occupations.
This lecture describes the [Boerma et al., 2024] model and presents Python code for computing equilibria.
The lecture applies the code to the [Boerma et al., 2024] model of labor markets.
As with an earlier QuantEcon lecture on optimal transport, a key tool will be linear programming.

16.2 Setup

𝑋 and 𝑌 are finite sets that represent two distinct types of people to be matched.
For each 𝑥 ∈ 𝑋, let a positive integer 𝑛𝑥 be the number of agents of type 𝑥.
Similarly, let a positive integer 𝑚𝑦 be the agents of agents of type 𝑦 ∈ 𝑌 .
We refer to these two measures as marginals.
We assume that

∑ 𝑛𝑥 = ∑ 𝑚𝑦 =∶ 𝑁
𝑥∈𝑋 𝑦∈𝑌

so that the matching problem is balanced.

303
Advanced Quantitative Economics with Python

Given a cost function 𝑐 ∶ 𝑋 × 𝑌 → ℝ, the (discrete) optimal transport problem is

min ∑ 𝜇𝑥𝑦 𝑐𝑥𝑦

𝜇≥0
(𝑥,𝑦)∈𝑋×𝑌

s.t. ∑ 𝜇𝑥𝑦 = 𝑛𝑥
𝑥∈𝑋

∑ 𝜇𝑥𝑦 = 𝑚𝑦
𝑦∈𝑌

Given our discreteness assumptions about 𝑋 and 𝑌 , the problem admits an integer solution 𝜇 ∈ ℤ𝑋×𝑌
+ , i.e., 𝜇𝑥𝑦 is a
non-negative integer for each 𝑥 ∈ 𝑋, 𝑦 ∈ 𝑌 .
We will study integer solutions.
Two points about restricting ourselves to integer solutions are worth mentioning:
• it is without loss of generality for computational purposes, since every problem with float marginals can be trans-
formed into an equivalent problem with integer marginals;
• although the mathematical structure that we present actually works for arbitrary real marginals, some of our Python
implementations would fail to work with float arithmetic.
We focus on a specific instance of an optimal transport problem:
We assume that 𝑋 and 𝑌 are finite subsets of ℝ and that the cost function satisfies 𝑐𝑥𝑦 = ℎ(|𝑥 − 𝑦|) for all 𝑥, 𝑦 ∈ ℝ, for
an ℎ ∶ ℝ+ → ℝ+ that is strictly concave and strictly increasing and grounded (i.e., ℎ(0) = 0).
Such an ℎ satisfies the following
Lemma. If ℎ ∶ ℝ+ → ℝ+ is strictly concave and grounded, then ℎ is strictly subadditive, i.e. for all 𝑥, 𝑦 ∈ ℝ+ , 0 < 𝑥 < 𝑦,
we have

ℎ(𝑥 + 𝑦) < ℎ(𝑥) + ℎ(𝑦)

Proof. For 𝛼 ∈ (0, 1) and 𝑥 > 0 we have, by strict concavity and groundedness, ℎ(𝛼𝑥) > 𝛼ℎ(𝑥)+(1−𝛼)ℎ(0) = 𝛼ℎ(𝑥).
𝑥
Now fix 𝑥, 𝑦 ∈ ℝ+ , 0 < 𝑥 < 𝑦, and let 𝛼 = 𝑥+𝑦 ; the previous observation gives ℎ(𝑥) = ℎ(𝛼(𝑥 + 𝑦)) > 𝛼ℎ(𝑥 + 𝑦) and
ℎ(𝑦) = ℎ((1 − 𝛼)(𝑥 + 𝑦)) > (1 − 𝛼)ℎ(𝑥 + 𝑦); summing these inequality delivers the result. □
In the following implementation we assume that the cost function is 𝑐𝑥𝑦 = |𝑥 − 𝑦|1/𝜁 for 𝜁 > 1, i.e. ℎ(𝑧) = 𝑧 1/𝜁 for
𝑧 ∈ ℝ+ .
Hence, our problem is

min ∑ 𝜇𝑥𝑦 |𝑥 − 𝑦|1/𝜁

𝜇∈ℤ𝑋×𝑌
+ (𝑥,𝑦)∈𝑋×𝑌

s.t. ∑ 𝜇𝑥𝑦 = 𝑛𝑥
𝑥∈𝑋

∑ 𝜇𝑥𝑦 = 𝑚𝑦
𝑦∈𝑌

Let’s start setting up some Python code.

We use the following imports:

import numpy as np
from scipy.optimize import linprog
from itertools import chain
import pandas as pd
from collections import namedtuple
(continues on next page)

304 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

(continued from previous page)

import matplotlib.pyplot as plt

import matplotlib.patches as patches
from matplotlib.ticker import MaxNLocator
from matplotlib import cm
from matplotlib.colors import Normalize

The following Python class takes as inputs sets of types 𝑋, 𝑌 ⊂ ℝ, marginals 𝑛, 𝑚 with positive integer entries such that
∑𝑥∈𝑋 𝑛𝑥 = ∑𝑦∈𝑌 𝑚𝑦 and cost parameter 𝜁 > 1.

The cost function is stored as an |𝑋| × |𝑌 | matrix with (𝑥, 𝑦)-entry equal to |𝑥 − 𝑦|1/𝜁 , i.e., the cost of matching an agent
of type 𝑥 ∈ 𝑋 with an agent of type 𝑦 ∈ 𝑌 .

class ConcaveCostOT():
def __init__(self, X_types=None, Y_types=None, n_x =None, m_y=None, ζ=2):

# Sets of types
self.X_types, self.Y_types = X_types, Y_types

# Marginals
if X_types is not None and Y_types is not None:
non_empty_types = True
self.n_x = np.ones(len(X_types), dtype=int) if n_x is None else n_x
self.m_y = np.ones(len(Y_types), dtype=int) if m_y is None else m_y
else:
non_empty_types = False
self.n_x, self.m_y = n_x, m_y

# Cost function: |X|x|Y| matrix

self.ζ = ζ
if non_empty_types:
self.cost_x_y = np.abs(X_types[:, None] - Y_types[None, :]) \
** (1 / ζ)
else:
self.cost_x_y = None

Let’s consider a random instance with given numbers of types |𝑋| and |𝑌 | and a given number of agents.
First, we generate random types 𝑋 and 𝑌 .
Then we generate random quantities for each type so that there are 𝑁 agents for each side.

number_of_x_types = 20
number_of_y_types = 20
N_agents_per_side = 60

np.random.seed(1)

## Genetate random types

# generate random support for distributions of types
support_size = 50
random_support = np.unique(np.random.uniform(0,200, size=support_size))

# generate types
X_types_example = np.random.choice(random_support,
(continues on next page)

16.2. Setup 305

Advanced Quantitative Economics with Python

(continued from previous page)

size=number_of_x_types, replace=False)
Y_types_example = np.random.choice(random_support,
size=number_of_y_types, replace=False)

## Generate random integer types quantities summing to N_agents_per_side

# generate integer vectors of lenght n_types summing to n_agents

def random_marginal(n_types, n_agents):
cuts = np.sort(np.random.choice(np.arange(1,n_agents),
size= n_types-1, replace=False))
segments = np.diff(np.concatenate(([0], cuts, [n_agents])))
return segments

# Create a method to assign random marginals to our class

def assign_random_marginals(self,random_seed):
np.random.seed(random_seed)
self.n_x = random_marginal(len(self.X_types), N_agents_per_side)
self.m_y = random_marginal(len(self.Y_types), N_agents_per_side)

ConcaveCostOT.assign_random_marginals = assign_random_marginals

# Create an instance of our class and generate random marginals

example_pb = ConcaveCostOT(X_types_example, Y_types_example, ζ=2)
example_pb.assign_random_marginals(random_seed=1)

We use 𝐹 (resp. 𝐺) to denote the cumulative distribution function associated to the measure 𝑛 (resp. 𝑚)
Thus, 𝐹 (𝑧) = ∑𝑥≤𝑧∶𝑛 𝑛𝑥 and 𝐺(𝑧) = ∑𝑦≤𝑧∶𝑚 𝑚𝑦 for 𝑧 ∈ ℝ.
𝑥 >0 𝑦 >0

Notice that we not normalizing the measures so 𝐹 (∞) = 𝐺(∞) = 𝑁 .

The following method plots the marginals on the real line
• blue for 𝑋 types,
• red for 𝑌 types.
Note that there are possible overlaps between 𝑋 and 𝑌 .

def plot_marginals(self, figsize=(15, 8), title='Distributions of types'):

plt.figure(figsize=figsize)

# Scatter plot n_x

plt.scatter(self.X_types, self.n_x, color='blue', label='n_x')
plt.vlines(self.X_types, ymin=0, ymax= self.n_x,
color='blue', linestyles='dashed')

# Scatter plot m_y

plt.scatter(self.Y_types, - self.m_y, color='red', label='m_y')
plt.vlines(self.Y_types, ymin=0, ymax=- self.m_y,
color='red', linestyles='dashed')

# Add grid and y=0 axis

plt.grid(True)
plt.axhline(0, color='black', linewidth=1)
plt.gca().spines['bottom'].set_position(('data', 0))

(continues on next page)

306 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

(continued from previous page)

# Labeling the axes and the title
plt.ylabel('frequency')
plt.title(title)
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.legend()
plt.show()

ConcaveCostOT.plot_marginals = plot_marginals

example_pb.plot_marginals()

16.3 Characterization of primal solution

16.3.1 Three properties of an optimal solution

We now indicate important properties that are satisfied by an optimal solution.

1. Maximal number of perfect pairs
2. No intersecting pairs
3. Layering
(Maximal number of perfect pairs)
If (𝑧, 𝑧) ∈ 𝑋 × 𝑌 for some 𝑧 ∈ ℝ then in each optimal solution there are min{𝑛𝑧 , 𝑚𝑧 } matches between type 𝑧 ∈ 𝑋
and 𝑧 ∈ 𝑌 .
Indeed, assume by contradiction that at an optimal solution we have (𝑧, 𝑦) and (𝑥, 𝑧) matched in positive amounts for
𝑦, 𝑥 ≠ 𝑧.

16.3. Characterization of primal solution 307

Advanced Quantitative Economics with Python

We can verify that reassigning the minimum of such quantities to the pairs (𝑧, 𝑧) and (𝑥, 𝑦) improves upon the current
matching since

ℎ(|𝑥 − 𝑦|) ≤ ℎ(|𝑥 − 𝑧| + |𝑧 − 𝑦|) < ℎ(|𝑥 − 𝑧|) + ℎ(|𝑧 − 𝑦|)

where the first inequality follows from triangle inequality and the fact that ℎ is increasing and the strict inequality from
strict subadditivity.
We can then repeat the operation for any other analogous pair of matches involving 𝑧, while improving the value, until
we have mass min{𝑛𝑧 , 𝑚𝑧 } on match (𝑧, 𝑧).
Viewing the matching 𝜇 as a measure on 𝑋 × 𝑌 with marginals 𝑛 and 𝑚, this property says that in any optimal 𝜇 we
have 𝜇𝑧𝑧 = 𝑛𝑧 ∧ 𝑚𝑧 for (𝑧, 𝑧) in the diagonal {(𝑥, 𝑦) ∈ 𝑋 × 𝑌 ∶ 𝑥 = 𝑦} of ℝ × ℝ.
The following method finds perfect pairs and returns the on-diagonal matchings as well as the residual off-diagonal
marginals.

def match_perfect_pairs(self):

# Find pairs on diagonal and related mass

perfect_pairs_x, perfect_pairs_y = np.where(
self.X_types[:,None] == self.Y_types[None,:])
Δ_q = np.minimum(self.n_x[perfect_pairs_x] ,self.m_y[perfect_pairs_y])

# Compute off-diagonal residual masses for each side

n_x_off_diag = self.n_x.copy()
n_x_off_diag[perfect_pairs_x]-= Δ_q

m_y_off_diag = self.m_y.copy()
m_y_off_diag[perfect_pairs_y] -= Δ_q

# Compute on-diagonal matching

matching_diag = np.zeros((len(self.X_types), len(self.Y_types)), dtype=int)
matching_diag[perfect_pairs_x, perfect_pairs_y] = Δ_q

return n_x_off_diag, m_y_off_diag , matching_diag

ConcaveCostOT.match_perfect_pairs = match_perfect_pairs

n_x_off_diag, m_y_off_diag , matching_diag = example_pb.match_perfect_pairs()

print(f"On-diagonal matches: {matching_diag.sum()}")
print(f"Residual types in X: {len(n_x_off_diag[n_x_off_diag >0])}")
print(f"Residual types in Y: {len(m_y_off_diag[m_y_off_diag >0])}")

On-diagonal matches: 15
Residual types in X: 14
Residual types in Y: 16

We can therefore create a new instance with the residual marginals that will feature no perfect pairs.
Later we shall add the on-diagonal matching to the solution of this new instance.
We refer to this instance as “off-diagonal” since the product measure of the residual marginals 𝑛 ⊗ 𝑚 feature zeros mass
on the diagonal of ℝ × ℝ.
In the rest of this section, we will focus on this instance.
We create a subclass to study the residual off-diagonal problem.

308 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

The subclass inherits the attributes and the modules from the original class.
We let 𝑍 ∶= 𝑋 ⊔ 𝑌 , where ⊔ denotes the union of disjoint sets. We will
• index types 𝑋 as {0, … , |𝑋| − 1} and types 𝑌 as {|𝑋|, … , |𝑋| + |𝑌 | − 1};
• store the cost function as a |𝑍| × |𝑍| matrix with entry (𝑧, 𝑧 ′ ) equal to 𝑐𝑥𝑦 if 𝑧 = 𝑥 ∈ 𝑋 and 𝑧 ′ = 𝑦 ∈ 𝑌 or
𝑧 = 𝑦 ∈ 𝑌 and 𝑧 ′ = 𝑥 ∈ 𝑋 or equal to +∞ if 𝑧 and 𝑧 ′ belong to the same side
– (the latter is just customary, since these “infinitely penalized” entries are actually never accessed in the im-
plementation);
• let 𝑞 be a vector of size |𝑍| whose 𝑧-th entry equals 𝑛𝑥 if type 𝑥 is the 𝑧-th smallest type in 𝑍 and −𝑚𝑦 if type 𝑦
is the 𝑧-th smallest type in 𝑍; hence 𝑞 encodes capacities of both sides on the (ascending) sorted set of types.
Finally, we add a method to flexibly add a pair (𝑖, 𝑗) with 𝑖 ∈ {0, … , |𝑋| − 1}, 𝑗 ∈ {|𝑋|, … , |𝑋| + |𝑌 | − 1} or
𝑗 ∈ {0, … , |𝑋| − 1}, 𝑖 ∈ {|𝑋|, … , |𝑋| + |𝑌 | − 1} to a matching matrix of size |𝑋| × |𝑌 |.

class OffDiagonal(ConcaveCostOT):
def __init__(self, X_types, Y_types, n_x, m_y, ζ):
super().__init__(X_types, Y_types, n_x, m_y, ζ)

# Types (unsorted)
self.types_list = np.concatenate((X_types,Y_types))

# Cost function: |Z|x|Z| matrix

self.cost_z_z = np.ones((len(self.types_list),
len(self.types_list))) * np.inf

# upper-right block
self.cost_z_z[:len(self.X_types), len(self.X_types):] = self.cost_x_y

# lower-left block
self.cost_z_z[len(self.X_types):, :len(self.X_types)] = self.cost_x_y.T

## Distributions of types
# sorted types and index identifier for each z in support
self.type_z = np.argsort(self.types_list)
self.support_z = self.types_list[self.type_z]

# signed quantity for each type z

self.q_z = np.concatenate([n_x, - m_y])[self.type_z]

# Mathod that adds to matching matrix a pair (i,j)

def add_pair_to_matching(self, pair_ids, matching):
if pair_ids[0] < pair_ids[1]:
# the pair of indices correspond to a pair (x,y)
matching[pair_ids[0], pair_ids[1]-len(self.X_types)] = 1
else:
# the pair of indices correspond to a pair (y,x)
matching[pair_ids[1], pair_ids[0]-len(self.X_types)] = 1

We add a function that returns an instance of the off-diagonal subclass as well as the on-diagonal matching and the indices
of the residual off-diagonal types.
These indices will come handy for adding the off-diagonal matching matrix to the diagonal matching matrix we just found,
since the former will have a smaller size if there are perfect pairs in the original problem.

16.3. Characterization of primal solution 309

Advanced Quantitative Economics with Python

def generate_offD_onD_matching(self):
# Match perfect pairs and compute on-diagonal matching
n_x_off_diag, m_y_off_diag , matching_diag = self.match_perfect_pairs()

# Find indices of residual non-zero quantities for each side

nonzero_id_x = np.flatnonzero(n_x_off_diag)
nonzero_id_y = np.flatnonzero(m_y_off_diag)

# Create new instance with off-diagonal types

off_diagonal = OffDiagonal(self.X_types[nonzero_id_x],
self.Y_types[nonzero_id_y],
n_x_off_diag[nonzero_id_x],
m_y_off_diag[nonzero_id_y],
self.ζ)

return off_diagonal, (nonzero_id_x, nonzero_id_y, matching_diag)

ConcaveCostOT.generate_offD_onD_matching = generate_offD_onD_matching

We apply it to our example:

example_off_diag, _ = example_pb.generate_offD_onD_matching()

Let’s plot the residual marginals to verify visually that there are no overlappings between types from distinct sides in the
off-diagonal instance.

example_off_diag.plot_marginals(title='Distributions of types: off-diagonal')

(No intersecting pairs) This property summarizes the following fact:

• represent both types on the real line and draw a semicirle joining (𝑥, 𝑦) for all pairs (𝑥, 𝑦) ∈ 𝑋 × 𝑌 that are
matched in a solution
• these semicirles do not intersect (unless they share one of the endpoints).

310 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

A proof proceeds by contradiction.

Let’s consider types 𝑥, 𝑥′ ∈ 𝑋 and 𝑦, 𝑦′ ∈ 𝑌 .
Matched pairs cain “intersect” (or be tangent).
We will show that in both cases the partial matching among types 𝑥, 𝑥′ , 𝑦, 𝑦′ can be improved by uncrossing, i.e. reas-
signing the quantities while improving on the solution and reducing the number of intersecting pairs.
The first case of intersecting pairs is

𝑥 < 𝑦 < 𝑦 ′ < 𝑥′

with pairs (𝑥, 𝑦′ ) and (𝑥′ , 𝑦) matched in positive quantities.

Then it follows from strict monotonicity of ℎ that ℎ(|𝑥 − 𝑦|) < ℎ(|𝑥 − 𝑦′ |) and ℎ(|𝑥′ − 𝑦′ |) < ℎ(|𝑥′ − 𝑦|), hence
ℎ(|𝑥 − 𝑦|) + ℎ(|𝑥′ − 𝑦′ |) < ℎ(|𝑥 − 𝑦′ |) + ℎ(|𝑥′ − 𝑦|).
Therefore, we can take the minimum of the masses of the matched pairs (𝑥, 𝑦′ ) and (𝑥′ , 𝑦) and reallocate it to the pairs
(𝑥, 𝑦) and (𝑥′ , 𝑦′ ), therby strictly improving the cost among 𝑥, 𝑦, 𝑥′ , 𝑦′ .
The second case of intersecting pairs is

𝑥 < 𝑥′ < 𝑦 ′ < 𝑦

with pairs (𝑥, 𝑦′ ) and (𝑥′ , 𝑦) matched.

In this case we have

|𝑥 − 𝑦′ | + |𝑥′ − 𝑦| = |𝑥 − 𝑦| + |𝑥′ − 𝑦′ |

|𝑥−𝑦|+|𝑥′ −𝑦|
Letting 𝛼 ∶= |𝑥−𝑦′ |−|𝑥′ −𝑦| ∈ (0, 1), we have |𝑥−𝑦| = 𝛼|𝑥−𝑦′ |+(1−𝛼)|𝑥′ −𝑦| and |𝑥′ −𝑦′ | = (1−𝛼)|𝑥−𝑦′ |+𝛼|𝑥′ −𝑦|.
Hence, by strict concavity of ℎ,

ℎ(|𝑥 − 𝑦|) + ℎ(|𝑥′ − 𝑦′ |) < 𝛼ℎ(|𝑥 − 𝑦′ |) + (1 − 𝛼)ℎ(|𝑥′ − 𝑦|) + (1 − 𝛼)ℎ(|𝑥 − 𝑦′ |) + 𝛼ℎ(|𝑥′ − 𝑦|) = ℎ(|𝑥 − 𝑦′ |) + ℎ(|𝑥′ − 𝑦|).

Therefore, as in the first case, we can strictly improve the cost among 𝑥, 𝑦, 𝑥′ , 𝑦′ by uncrossing the pairs.
Finally, it remains to argue that in both cases uncrossing operations do not increase the number of intersections with other
matched pairs.
It can indeed be shown on a case-by-case basis that, in both of the above cases, for any other matched pair (𝑥″ , 𝑦″ ) the
number of intersections between pairs (𝑥, 𝑦), (𝑥′ , 𝑦′ ) and the pair (𝑥″ , 𝑦″ ) (i.e., after uncrossing) is not larger than the
number of intersections between pairs (𝑥, 𝑦′ ), (𝑥′ , 𝑦) and the pair (𝑥″ , 𝑦″ ) (i.e., before uncrossing), hence the uncrossing
operations above reduce the number of intersections.
We conclude that if a matching features intersecting pairs, it can be modified via a sequence of uncrossing operations
into a matching without intersecting pairs while improving on the value.
(Layering) Recall that there are 2𝑁 individual agents, each agent 𝑖 having type 𝑧𝑖 ∈ 𝑋 ⊔ 𝑌 .
When we introduce the off diagonal matching, to stress that the types sets are disjoint now.
To simplify our explanation of this property, assume for now that each agent has its own distinct type (i.e., |𝑋| = |𝑌 | = 𝑁
and 𝑛 = 𝑚 = 1𝑁 ), in which case the optimal transport problem is also referred to as assignment problem.
Let’s index agents according to their types:

𝑧1 < 𝑧2 ⋯ < 𝑧2𝑁−1 < 𝑧2𝑁 .

Suppose that agents 𝑖 of type 𝑧𝑖 and 𝑗 of type 𝑧𝑗 , with 𝑧𝑖 < 𝑧𝑗 , are matched in a particular optimal solution.

16.3. Characterization of primal solution 311

Advanced Quantitative Economics with Python

Then there is an equal number of agents from each side in {𝑖 + 1, … , 𝑗 − 1}, if this set is not empty.
Indeed, if this were not the case, then some agent 𝑘 ∈ {𝑖 + 1, 𝑗 − 1} would be matched with some agent ℓ with
ℓ ∉ {𝑖, … , 𝑗}, i.e., there would be types

𝑧𝑖 < 𝑧𝑘 < 𝑧𝑗 < 𝑧ℓ

with matches (𝑧𝑖 , 𝑧𝑗 ) and (𝑧𝑘 , 𝑧ℓ ), violating the no intersecting pairs property.
We conclude that we can define a binary relation on [𝑁 ] such that 𝑖 ∼ 𝑗 if there is an equal number of agents of each
side in {𝑖, 𝑖 + 1, … , 𝑗} (or if this set is empty).
This is an equivalence relation, so we can find associated equivalence classes that we call layers.
By the reasoning above, in an optimal solution all pairs 𝑖, 𝑗 (of opposite sides) which are matched belong to the same
layer, hence we can solve the assignment problem associated to each layer and then add up the solutions.
In terms of distributions, 𝑖 and 𝑗, of types 𝑥 ∈ 𝑋 and 𝑦 ∈ 𝑌 respectively, belong to the same layer (i.e., 𝑥 ∼ 𝑦) if and
only if 𝐹 (𝑦−) − 𝐹 (𝑥) = 𝐺(𝑦−) − 𝐺(𝑥).
If 𝐹 and 𝐺 were continuous, then 𝐹 (𝑦) − 𝐹 (𝑥) = 𝐺(𝑦) − 𝐺(𝑥) ⟺ 𝐹 (𝑥) − 𝐺(𝑥) = 𝐹 (𝑦) − 𝐺(𝑦).
This suggests that the following quantity plays an important role:

𝐻(𝑧) ∶= 𝐹 (𝑧) − 𝐺(𝑧), for 𝑧 ∈ ℝ.

Returning to our general (integer) discrete setting, let’s plot 𝐻.

Notice that 𝐻 is right-continuous (being the difference of right-continuous functions) and that upward (resp. downward)
jumps correspond to point masses of agents with types from 𝑋 (resp. 𝑌 ).

def plot_H_z(self, figsize=(15, 8), range_x_axis=None, scatter=True):

# Determine H(z) = F(z) - G(z)
H_z = np.cumsum(self.q_z)

# Plot H(z)
plt.figure(figsize=figsize)
plt.axhline(0, color='black', linewidth=1)

# determine the step points for horizontal lines

step = np.concatenate(([self.support_z.min() - .05 * self.support_z.ptp()],
self.support_z,
[self.support_z.max() + .05 * self.support_z.ptp()]))
height = np.concatenate(([0], H_z, [0]))

# plot the horizontal lines of the step function

for i in range(len(step) - 1):
plt.plot([step[i], step[i+1]], [height[i], height[i]], color='black')

# draw dashed vertical lines for the step function

for i in range(1, len(step) - 1):
plt.plot([step[i], step[i]], [height[i-1], height[i]],
color='black', linestyle='--')

# plot discontinuities points of H(z)

if scatter:
plt.scatter(np.sort(self.X_types), H_z[self.q_z > 0], color='blue')
plt.scatter(np.sort(self.Y_types), H_z[self.q_z < 0], color='red')

if range_x_axis is not None:

(continues on next page)

312 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

(continued from previous page)

plt.xlim(range_x_axis)

# Add labels and title

plt.title('Underqualification Measure (Off-Diagonal)')
plt.xlabel('$z$')
plt.ylabel('$H(z)$')
plt.grid(False)
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()

OffDiagonal.plot_H_z = plot_H_z

example_off_diag.plot_H_z()

The layering property extends to the general discrete setting.

There are |𝐻(ℝ)| − 1 layers in total.
Enumerating the range of 𝐻 as 𝐻(ℝ) = {ℎ1 , ℎ2 , … , ℎ|𝐻(ℝ)| } with ℎ1 < ℎ2 < ⋯ < ℎ|𝐻(ℝ)| , we can define layer 𝐿ℓ , for
ℓ ∈ {1, … , |𝐻(ℝ)| − 1} as the collection of types 𝑧 ∈ 𝑍 such that

𝐻(𝑧−) ≤ ℎℓ−1 < ℎℓ ≤ 𝐻(𝑧),

(which are types in 𝑋), or

𝐻(𝑧) ≤ ℎℓ−1 < ℎℓ ≤ 𝐻(𝑧−),

which are types in 𝑌 .

The mass associated with layer 𝐿ℓ is 𝑀ℓ = ℎℓ+1 − ℎℓ .
Intuitively, a layer 𝐿ℓ consists of some mass 𝑀ℓ , of multiple types in 𝑍, i.e. the problem within the layer is unitary.
A unitary problem is essentially an assignment problem up to a constant: we can solve the problem with unit mass and
then rescale a solution by 𝑀ℓ .

16.3. Characterization of primal solution 313

Advanced Quantitative Economics with Python

Moreover, each layer 𝐿ℓ contains an even number of types 𝑁ℓ ∈ 2ℕ, which are alternating, i.e., ordering them as
𝑧1 < 𝑧2 ⋯ < 𝑧𝑁ℓ −1 < 𝑧𝑁ℓ all odd (or even, respectively) indexed types belong to the same side.
The following method finds the layers associated with distributions 𝐹 and 𝐺.
Again, types in 𝑋 are indexed with {0, … , |𝑋| − 1} and types in 𝑌 with {|𝑋|, … , |𝑋| + |𝑌 | − 1}.
Using these indices (instead of the types themselves) to represent the layers allows keeping track of sides types in each
layer, without adding an additional bit of information that would identify the side of the first type in the layer, which,
because a layer is alternating, would then allow identifying sides of all types in the layer.
In addition, using indices will let us extract the cost function within a layer from the cost function 𝑐𝑧𝑧′ computed offline.

def find_layers(self):
# Compute H(z) on the joint support
H_z = np.concatenate([[0], np.cumsum(self.q_z)])

# Compute the range of H, i.e. H(R), stored in ascending order

layers_height = np.unique(H_z)

# Compute the mass of each layer

layers_mass = np.diff(layers_height)

# Compute layers
# the following |H(R)|x|Z| matrix has entry (z,l) equal to 1 iff type z belongs␣
↪to layer l

layers_01 = ((H_z[None, :-1] <= layers_height[:-1, None])

* (layers_height[1:, None] <= H_z[None, 1:]) |
(H_z[None, 1:] <= layers_height[:-1, None])
* (layers_height[1:, None] <= H_z[None, :-1]))

# each layer is reshaped as a list of indices correponding to types

layers = [self.type_z[layers_01[ell]]
for ell in range(len(layers_height)-1)]

return layers, layers_mass, layers_height, H_z

OffDiagonal.find_layers = find_layers

layers_list_example, layers_mass_example, _, _ = example_off_diag.find_layers()

print(layers_list_example)

[array([23, 10]), array([27, 3, 23, 10]), array([16, 2, 21, 3, 25, 8, 23, 12]),
↪ array([16, 2, 21, 3, 25, 12]), array([22, 0, 16, 2, 21, 3, 18, 12]),␣
↪array([15, 0, 16, 2, 14, 5, 21, 3, 18, 9]), array([20, 0, 16, 2, 14, 5,␣
↪21, 3, 19, 11, 24, 1, 18, 9]), array([ 2, 26, 5, 21, 3, 19, 4, 18]),␣
↪array([ 2, 26, 7, 21, 3, 19, 4, 17, 6, 18]), array([13, 26, 7, 21, 3, 19, ␣
↪6, 18]), array([ 6, 18]), array([ 6, 28]), array([ 6, 29])]

The following method gives a graphical representation of the layers.

From the picture it is easy to spot two key features described above:
• types are alternating
• the layer problem is unitary

314 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

def plot_layers(self, figsize=(15, 8)):

# Find layers
layers, layers_mass , layers_height, H_z = self.find_layers()

plt.figure(figsize=figsize)

# Plot H(z)
step = np.concatenate(([self.support_z.min() - .05 * self.support_z.ptp()],
self.support_z,
[self.support_z.max() + .05 * self.support_z.ptp()]))
height = np.concatenate((H_z, [0]))
plt.step(step, height, where='post', color='black', label='CDF', zorder=1)

# Plot layers
colors = cm.viridis(np.linspace(0, 1, len(layers)))
for ell, layer in enumerate(layers):
plt.vlines(self.types_list[layer], layers_height[ell] ,
layers_height[ell] + layers_mass[ell],
color=colors[ell], linewidth=2)
plt.scatter(self.types_list[layer],
np.ones(len(layer)) * layers_height[ell]
+.5 * layers_mass[ell],
color=colors[ell], s=50)

plt.axhline(layers_height[ell], color=colors[ell],
linestyle=':', linewidth=1.5, zorder=0)

# Add labels and title

plt.xlabel('$z$')
plt.title('Layers')
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.show()

OffDiagonal.plot_layers = plot_layers

example_off_diag.plot_layers()

16.3. Characterization of primal solution 315

Advanced Quantitative Economics with Python

16.3.2 Solving a layer

Recall that layer 𝐿ℓ consists of a list of distinct types from 𝑌 ⊔ 𝑋

𝑧1 < 𝑧2 ⋯ < 𝑧𝑁ℓ −1 < 𝑧𝑁ℓ ,

which is alternating.
The problem within a layer is unitary.
Hence we can solve the problem with unit masses and later rescale the solution by the layer’s mass 𝑀ℓ .
Let us select a layer from the example above (we pick the one with maximum number of types) and plot the types on the
real line

# Pick layer with maximum number of types

layer_id_example = max(enumerate(layers_list_example),
key = lambda x: len(x[1]))[0]
layer_example = layers_list_example[layer_id_example]

# Plot layer types

def plot_layer_types(self, layer, mass, figsize=(15, 3)):

plt.figure(figsize=figsize)

# Scatter plot n_x

x_layer = layer[layer < len(self.X_types)]
y_layer = layer[layer >= len(self.X_types)] - len(self.X_types)
M_ell = np.ones(len(x_layer))* mass

plt.scatter(self.X_types[x_layer], M_ell, color='blue', label='X types')

plt.vlines(self.X_types[x_layer], ymin=0, ymax= M_ell,
(continues on next page)

316 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

(continued from previous page)

color='blue', linestyles='dashed')

# Scatter plot m_y

plt.scatter(self.Y_types[y_layer], - M_ell, color='red', label='Y types')
plt.vlines(self.Y_types[y_layer], ymin=0, ymax=- M_ell,
color='red', linestyles='dashed')

# Add grid and y=0 axis

# plt.grid(True)
plt.axhline(0, color='black', linewidth=1)
plt.gca().spines['bottom'].set_position(('data', 0))

# Labeling the axes and the title

plt.ylabel('mass')
plt.title('Distributions of types in the layer')
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
plt.gca().spines['top'].set_visible(False)
plt.gca().spines['right'].set_visible(False)
plt.legend()
plt.show()

ConcaveCostOT.plot_layer_types = plot_layer_types

example_off_diag.plot_layer_types(layer_example,
layers_mass_example[layer_id_example])

Given the structure of a layer and the no intersecting pairs property, the optimal matching and value of the layer can be
found recursively.
Indeed, if in certain optimal matching 1 and 𝑗 ∈ [𝑁ℓ ], 𝑗 − 1 odd, are paired, then there is no matching between agents
in [2, 𝑗 − 1] and those in [𝑗 + 1, 𝑁ℓ ] (if both are non empty, i.e., 𝑗 is not 2 or 𝑁ℓ ).
Hence in such optimal solution agents in [2, 𝑗 − 1] are matched among themselves.
Since [𝑧2 , 𝑧𝑗−1 ] (as well as [𝑧𝑗+1 , 𝑧𝑁ℓ ]) is alternating, we can reason recursively.
Let 𝑉𝑖𝑗 be the optimal value of matching agents in [𝑖, 𝑗] with 𝑖, 𝑗 ∈ [𝑁ℓ ], 𝑗 − 𝑖 ∈ {1, 3, … , 𝑁ℓ − 1}.
Suppose that we computed the value 𝑉𝑖𝑗 for all 𝑖, 𝑗 ∈ [𝑁ℓ ] with 𝑖 − 𝑗 ∈ {1, 3, … , 𝑡 − 2} for some odd natural number 𝑡.
Then, for 𝑖, 𝑗 ∈ [𝑁ℓ ] with 𝑖 − 𝑗 = 𝑡 we have

𝑉𝑖𝑗 = min {𝑐𝑖𝑘 + 𝑉𝑖+1,𝑘−1 + 𝑉𝑘+1,𝑗 }

𝑘∈{𝑖+1,𝑖+3,…,𝑗}

with the RHS depending only on previously computed values.

We set the boundary conditions at 𝑡 = −1: 𝑉𝑖+1,𝑖 = 0 for each 𝑖 ∈ [𝑁ℓ ], so that we can apply the same Bellman equation
at 𝑡 = 1.

16.3. Characterization of primal solution 317

Advanced Quantitative Economics with Python

The following method takes as input the layer types indices and computes the value function as a matrix
[𝑉𝑖𝑗 ]𝑖∈[𝑁ℓ +1],𝑗∈[𝑁ℓ ] .
In order to distinguish entries that are relevant for our computations from those that are never accessed, we initialize this
matrix as full of NaN values.

def solve_bellman_eqs(self,layer):
# Recover cost function within the layer
cost_i_j = self.cost_z_z[layer[:,None],layer[None,:]]

# Initialize value function

V_i_j = np.full((len(layer)+1,len(layer)), np.nan)

# Add boundary conditions

i_bdry = np.arange(len(layer))
V_i_j[i_bdry+1, i_bdry] = 0

t = 1
while t < len(layer):
# Select agents i in [n_L-t] (with potential partners j's in [t,n_L])
i_t = np.arange(len(layer)-t)

# For each i, select each k with |k-i| <= t

# (potential partners of i within segment)
index_ik = i_t[:,None] + np.arange(1, t+1, 2)[None,:]

# Compute optimal value for pairs with |i-j| = t

V_i_j[i_t, i_t + t] = (cost_i_j[i_t[:,None], index_ik] +
V_i_j[i_t[:,None] + 1, index_ik - 1] +
V_i_j[index_ik + 1, i_t[:,None] + t]).min(1)
# Go to next odd integer
t += 2

return V_i_j

OffDiagonal.solve_bellman_eqs = solve_bellman_eqs

Let’s compute values for the layer from our example.

Only non-NaN entries are actually used in the computations.

# Compute layer value function

V_i_j = example_off_diag.solve_bellman_eqs(layer_example)

print(f"Type indices in the layer: {layer_example}")

print('##########################')
print("Section of the Value function of the layer:")
print(V_i_j.round(2)[:min(10, V_i_j.shape[0]),
:min(10, V_i_j.shape[1])])

Type indices in the layer: [20 0 16 2 14 5 21 3 19 11 24 1 18 9]

##########################
Section of the Value function of the layer:
[[ nan 4.29 nan 5.73 nan 9.82 nan 13.9 nan 14.52]
[ 0. nan 2.75 nan 6.17 nan 8.44 nan 10.56 nan]
[ nan 0. nan 1.44 nan 5.52 nan 9.6 nan 10.22]
[ nan nan 0. nan 3.58 nan 5.84 nan 7.96 nan]
(continues on next page)

318 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

(continued from previous page)

[ nan nan nan 0. nan 4.08 nan 8.16 nan 8.78]
[ nan nan nan nan 0. nan 2.26 nan 4.38 nan]
[ nan nan nan nan nan 0. nan 4.08 nan 4.7 ]
[ nan nan nan nan nan nan 0. nan 2.12 nan]
[ nan nan nan nan nan nan nan 0. nan 0.62]
[ nan nan nan nan nan nan nan nan 0. nan]]

Having computed the value function, we can proceed to compute the optimal matching as the policy that attains the value
function that solves the Bellman equation (policy evaluation).
We start from agent 1 and match it with the 𝑘 that achieves the minimum in the equation associated with 𝑉1,2𝑁ℓ .
Then we store segments [2, 𝑘 − 1] and [𝑘 + 1, 2𝑁ℓ ] (if not empty).
In general, given a segment [𝑖, 𝑗], we match 𝑖 with 𝑘 that achieves the minimum in the equation associated with 𝑉𝑖𝑗 and
store the segments [𝑖, 𝑘 − 1] and [𝑘 + 1, 𝑗] (if not empty).
The algorithm proceeds until there are no segments left.

def find_layer_matching(self, V_i_j, layer):

# Initialize
segments_to_process = [np.arange(len(layer))]
matching = np.zeros((len(self.X_types),len(self.Y_types)), bool)

while segments_to_process:
# Pick i, first agent of the segment
# and potential partners i+1,i+3,..., in the segment
segment = segments_to_process[0]
i_0 = segment[0]
potential_matches = np.arange(i_0, segment[-1], 2) + 1

# Compute optimal partner j_i

obj = (self.cost_z_z[layer[i_0],layer[potential_matches]] +
V_i_j[i_0 +1, potential_matches -1] +
V_i_j[potential_matches +1,segment[-1]])

j_i_0 = np.argmin(obj)*2 + (i_0 + 1)

# Add matched pair (i,j_i)

self.add_pair_to_matching(layer[[i_0,j_i_0]], matching)

# Update segments to process:

# remove current segment
segments_to_process = segments_to_process[1:]

# add [i+1,j-1] and [j+1,last agent of the segment]

if j_i_0 > i_0 + 1:
segments_to_process.append(np.arange(i_0 + 1, j_i_0))
if j_i_0 < segment[-1]:
segments_to_process.append(np.arange(j_i_0 + 1, segment[-1] + 1))

return matching

OffDiagonal.find_layer_matching = find_layer_matching

Lets apply this method our example to find the matching within the layer and then rescale it by 𝑀ℓ .
Note that the unscaled value equals 𝑉1,𝑁ℓ .

16.3. Characterization of primal solution 319

Advanced Quantitative Economics with Python

matching_layer = example_off_diag.find_layer_matching(V_i_j,layer_example)
print(f"Value of the layer (unscaled): {(matching_layer * example_off_diag.cost_x_y).
↪sum()}")

print(f"Value of the layer (scaled by the mass = {layers_mass_example[layer_id_

↪example]}): "

f"{layers_mass_example[layer_id_example] * (matching_layer * example_off_diag.

↪cost_x_y).sum()}")

Value of the layer (unscaled): 24.764959193288938

Value of the layer (scaled by the mass = 1): 24.764959193288938

The following method plots the matching within a layer.

We apply it to the layer from our example.

def plot_layer_matching(self, layer, matching_layer):

# Create the figure and axis
fig, ax = plt.subplots(figsize=(15, 15))

# Plot the points on the x-axis

X_types_layer = self.X_types[layer[layer < len(self.X_types)]]
Y_types_layer = self.Y_types[layer[layer >= len(self.X_types)]
- len(self.X_types)]
ax.scatter(X_types_layer, np.zeros_like(X_types_layer), color='blue',
s = 20, zorder=5)
ax.scatter(Y_types_layer, np.zeros_like(Y_types_layer), color='red',
s = 20, zorder=5)

# Draw semicircles for each row in matchings

matched_types = np.where(matching_layer >0)
matched_types_x = self.X_types[matched_types[0]]
matched_types_y = self.Y_types[matched_types[1]]

for iter in range(len(matched_types_x)):

width = abs(matched_types_x[iter] - matched_types_y[iter])
center = (matched_types_x[iter] + matched_types_y[iter]) / 2
height = width
semicircle = patches.Arc((center, 0), width, height, theta1=0,
theta2=180, lw=3)
ax.add_patch(semicircle)

# Add title and layout settings

plt.title('Optimal Layer Matching' )
ax.set_aspect('equal')
plt.gca().spines['bottom'].set_position(('data', 0))
ax.spines['left'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['right'].set_color('none')
ax.yaxis.set_ticks([])
ax.set_ylim(bottom= -self.support_z.ptp() / 100)

plt.show()

ConcaveCostOT.plot_layer_matching = plot_layer_matching

320 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

example_off_diag.plot_layer_matching(layer_example, matching_layer)

Solving a layer in a smarter way

We now present two key results in the context of OT with concave type costs.
We refer [Boerma et al., 2024] and [Delon et al., 2011] for proofs.
Consider the problem faced within a layer, i.e., types from 𝑌 ⊔ 𝑋

𝑧1 < 𝑧2 ⋯ < 𝑧𝑁ℓ −1 < 𝑧𝑁ℓ , 𝑁ℓ ∈ 2ℕ

are alternating and the problem is unitary.

Given a matching on [1, 𝑘], 𝑘 ∈ [𝑁ℓ ], 𝑘 even, we say that a matched pair (𝑖, 𝑗) within this matching is hidden if there is
a matched pair (𝑖′ , 𝑗′ ) with 𝑖′ < 𝑖 < 𝑗 < 𝑗′ .
Visually, the arc joining (𝑖′ , 𝑗′ ) surmounts the arc joining (𝑖, 𝑗).
Theorem (DSS) Given an optimal matching on [1, 𝑘], if (𝑖, 𝑗) is hidden in this matching, then the pair (𝑖, 𝑗) belongs to
every optimal matching on [1, 2𝑁ℓ ] and is hidden in this matching too.
As a consequence, there exists a more efficient way to compute the value function within a layer.
It can be shown that the solving the following second-order difference equations delivers the same result as the Bellman
equations above:

𝑉𝑖𝑗 = min{𝑐𝑖𝑗 + 𝑉𝑖+1,𝑗−1 , 𝑉𝑖+2,𝑗 + 𝑉𝑖,𝑗−2 − 𝑉𝑖+2,𝑗−2 }

for 𝑖, 𝑗 ∈ [𝑁ℓ ], 𝑗 − 𝑖 odd, with boundary conditions 𝑉𝑖+1,𝑖 = 0 for 𝑖 ∈ [0, 𝑁ℓ ] and 𝑉𝑖+2,𝑖−1 = −𝑐𝑖,𝑖+1 for 𝑖 ∈ [𝑁ℓ − 1] .
The following method uses these equations to compute the value function that is stored as a matrix [𝑉𝑖𝑗 ]𝑖∈[𝑁ℓ +1],𝑗∈[𝑁ℓ +1] .

def solve_bellman_eqs_DSS(self,layer):
# Recover cost function within the layer
cost_i_j = self.cost_z_z[layer[:,None],layer[None,:]]

# Initialize value function

V_i_j = np.full((len(layer)+1,len(layer)+1), np.nan)

# Add boundary conditions

V_i_j[np.arange(len(layer)+1), np.arange(len(layer)+1)] = 0
i_bdry = np.arange(len(layer)-1)
V_i_j[i_bdry+2,i_bdry] = - cost_i_j[i_bdry, i_bdry+1]

t = 1
while t < len(layer):
# Select agents i in [n_l-t] and potential partner j=i+t for each i
i_t = np.arange(len(layer)-t)
j_t = i_t + t +1

(continues on next page)

16.3. Characterization of primal solution 321

Advanced Quantitative Economics with Python

(continued from previous page)

# Compute optimal values for ij with j-i = t
V_i_j[i_t, j_t] = np.minimum(cost_i_j[i_t, j_t-1]
+ V_i_j[i_t + 1, j_t - 1],
V_i_j[i_t, j_t - 2] + V_i_j[i_t + 2, j_t]
- V_i_j[i_t + 2, j_t - 2])

## Go to next odd integer

t += 2

return V_i_j

OffDiagonal.solve_bellman_eqs_DSS = solve_bellman_eqs_DSS

Let’s apply the algorithm to our example and compare outcomes with those attained with the Bellman equations above.

V_i_j_DSS = example_off_diag.solve_bellman_eqs_DSS(layer_example)

print(f"Type indices of the layer: {layer_example}")

print('##########################')

print("Section of Value function of the layer:")

print(V_i_j_DSS.round(2)[:min(10, V_i_j_DSS.shape[0]),
:min(10, V_i_j_DSS.shape[1])])

print('##########################')
print(f"Difference with previous Bellman equations: \
{(V_i_j_DSS[:,1:] - V_i_j)[V_i_j >= 0].sum()}")

Type indices of the layer: [20 0 16 2 14 5 21 3 19 11 24 1 18 9]

##########################
Section of Value function of the layer:
[[ 0. nan 4.29 nan 5.73 nan 9.82 nan 13.9 nan]
[ nan 0. nan 2.75 nan 6.17 nan 8.44 nan 10.56]
[-4.29 nan 0. nan 1.44 nan 5.52 nan 9.6 nan]
[ nan -2.75 nan 0. nan 3.58 nan 5.84 nan 7.96]
[ nan nan -1.44 nan 0. nan 4.08 nan 8.16 nan]
[ nan nan nan -3.58 nan 0. nan 2.26 nan 4.38]
[ nan nan nan nan -4.08 nan 0. nan 4.08 nan]
[ nan nan nan nan nan -2.26 nan 0. nan 2.12]
[ nan nan nan nan nan nan -4.08 nan 0. nan]
[ nan nan nan nan nan nan nan -2.12 nan 0. ]]
##########################
Difference with previous Bellman equations: 4.440892098500626e-14

We can actually compute the optimal matching within the layer simultaneously with computing the value function, rather
than sequentially.
The key idea is that, if at some step of the computation of the values the left branch of the minimum above achieves the
minimum, say 𝑉𝑖𝑗 = 𝑐𝑖𝑗 + 𝑉𝑖+1,𝑗−1 , then (𝑖, 𝑗) are optimally matched on [𝑖, 𝑗] and by the theorem above we get that a
matching on [𝑖 + 1, 𝑗 − 1] which achieves 𝑉𝑖+1,𝑗−1 belongs to an optimal matching on the whole layer (since it is covered
by the arc (𝑖, 𝑗) in [𝑖, 𝑗]).
We can therefore proceed as follows
We initialize an empty matching and a list with all the agents in the layer (representing the agents which are not matched
yet).

322 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

Then whenever the left branch of the minimum is achieved for some (𝑖, 𝑗) in the computation of 𝑉 , we take the collections
of agents 𝑘1 , … , 𝑘𝑀 in [𝑖 + 1, 𝑗 − 1] (in ascending order, i.e. with 𝑧𝑘𝑝 < 𝑧𝑘𝑝+1 ) that are not matched yet (if any) and
add to the matching the pairs (𝑘1 , 𝑘2 ), (𝑘3 , 𝑘4 ), … , (𝑘𝑀−1 , 𝑘𝑀 ).
Thus, we match each unmatched agent 𝑘𝑝 in [𝑖 + 1, 𝑗 − 1] with the closest unmatched right neighbour 𝑘𝑝+1 (starting from
𝑘1 ).
Intuitively, if 𝑘𝑝 were optimally matched with some 𝑘𝑞 in [𝑖 + 1, 𝑗 − 1] and not with 𝑘𝑝+1 , then 𝑘𝑝+1 would have already
been hidden by the match (𝑘𝑝 , 𝑘𝑞 ) from some previous computation (because |𝑘𝑝 − 𝑘𝑞 | < |𝑖 − 𝑗|) and it would therefore
be matched.
Finally, if the process above leaves some umatched agents, we proceed by matching each of these agent with the closest
unmatched right neighbour, starting again from the leftmost of these collection.
To gain understanding, note that this situation happens when the left branch is achieved only for pairs 𝑖, 𝑗 with |𝑖 − 𝑗| = 1,
which leads to the optimal matching (1, 2), (2, 3), … , (𝑛ℓ − 1, 𝑛ℓ ).

def find_layer_matching_DSS(self,layer):
# Recover cost function within the layer
cost_i_j = self.cost_z_z[layer[:,None],layer[None,:]]

# Add boundary conditions

V_i_j = np.zeros((len(layer)+1,len(layer)+1))
i_bdry = np.arange(len(layer)-1)
V_i_j[i_bdry+2,i_bdry] = - cost_i_j[i_bdry, i_bdry+1]

# Initialize matching and list of to-match agents

unmatched = np.ones(len(layer), dtype = bool)
matching = np.zeros((len(self.X_types),len(self.Y_types)), bool)

t = 1
while t < len(layer):
# Compute optimal value for pairs with |i-j| = t
i_t = np.arange(len(layer)-t)
j_t = i_t + t + 1

left_branch = cost_i_j[i_t, j_t-1] + V_i_j[i_t + 1, j_t - 1]

V_i_j[i_t, j_t] = np.minimum(left_branch, V_i_j[i_t, j_t - 2]
+ V_i_j[i_t + 2, j_t] - V_i_j[i_t + 2, j_t - 2])

# Select each i for which left branch achieves minimum in the V_{i,i+t}
left_branch_achieved = i_t[left_branch == V_i_j[i_t, j_t]]

# Update matching
for i in left_branch_achieved:
# for each agent k in [i+1,i+t-1]
for k in np.arange(i+1,i+t)[unmatched[range(i+1,i+t)]]:
# if k is unmatched
if unmatched[k] == True:
# find unmatched right neighbour
j_k = np.arange(k+1,len(layer))[unmatched[k+1:]][0]
# add pair to matching
self.add_pair_to_matching(layer[[k, j_k]], matching)
# remove pair from unmatched agents list
unmatched[[k, j_k]] = False

# go to next odd integer

t += 2
(continues on next page)

16.3. Characterization of primal solution 323

Advanced Quantitative Economics with Python

(continued from previous page)

# Each umatched agent is matched with next unmatched agent

for i in np.arange(len(layer))[unmatched]:
# if i is unmatched
if unmatched[i] == True:
# find unmatched right neighbour
j_i = np.arange(i+1,len(layer))[unmatched[i+1:]][0]
# add pair to matching
self.add_pair_to_matching(layer[[i, j_i]], matching)
# remove pair from unmatched agents list
unmatched[[i, j_i]] = False

return matching

OffDiagonal.find_layer_matching_DSS = find_layer_matching_DSS

matching_layer_DSS = example_off_diag.find_layer_matching_DSS(layer_example)
print(f" Value of layer with DSS recursive equations \
{(matching_layer_DSS * example_off_diag.cost_x_y).sum()}")
print(f" Value of layer with Bellman equations \
{(matching_layer * example_off_diag.cost_x_y).sum()}")

Value of layer with DSS recursive equations 24.764959193288938

Value of layer with Bellman equations 24.764959193288938

example_off_diag.plot_layer_matching(layer_example, matching_layer_DSS)

16.4 Solving primal problem

The following method assembles our components in order to solve the primal problem.
First, if matches are perfect pairs, we store the on-diagonal matching and create an off-diagonal instance with the residual
marginals.
Then we compute the set of layers of the residual distributions.
Finally, we solve each layer and put together matchings within each layer with the on-diagonal matchings.
We then return the full matching, the off-diagonal matching, and the off-diagonal instance.

def solve_primal_pb(self):
# Compute on-diagonal matching, create new instance with resitual types
off_diagoff_diagonal, match_tuple = self.generate_offD_onD_matching()
nonzero_id_x, nonzero_id_y, matching_diag = match_tuple

# Compute layers
(continues on next page)

324 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

(continued from previous page)

layers_list, layers_mass, _, _ = off_diagoff_diagonal.find_layers()

# Solve layers to compute off-diagonal matching

matching_off_diag = np.zeros_like(off_diagoff_diagonal.cost_x_y, dtype=int)

for ell, layer in enumerate(layers_list):

V_i_j = off_diagoff_diagonal.solve_bellman_eqs(layer)
matching_off_diag += layers_mass[ell] \
* off_diagoff_diagonal.find_layer_matching(V_i_j, layer)

# Add together on- and off-diagonal matchings

matching = matching_diag.copy()
matching[np.ix_(nonzero_id_x, nonzero_id_y)] += matching_off_diag

return matching, matching_off_diag, off_diagoff_diagonal

ConcaveCostOT.solve_primal_pb = solve_primal_pb

matching, matching_off_diag, off_diagoff_diagonal = example_pb.solve_primal_pb()

We implement a similar method that adopts the DSS algorithm

def solve_primal_DSS(self):
# Compute on-diagonal matching, create new instance with resitual types
off_diagoff_diagonal, match_tuple = self.generate_offD_onD_matching()
nonzero_id_x, nonzero_id_y, matching_diag = match_tuple

# Find layers
layers, layers_mass, _, _ = off_diagoff_diagonal.find_layers()

# Solve layers to compute off-diagonal matching

matching_off_diag = np.zeros_like(off_diagoff_diagonal.cost_x_y, dtype=int)

for ell, layer in enumerate(layers):

matching_off_diag += layers_mass[ell] \
* off_diagoff_diagonal.find_layer_matching_DSS(layer)

# Add together on- and off-diagonal matchings

matching = matching_diag.copy()
matching[np.ix_(nonzero_id_x, nonzero_id_y)] += matching_off_diag

return matching, matching_off_diag, off_diagoff_diagonal

ConcaveCostOT.solve_primal_DSS = solve_primal_DSS

DSS_tuple = example_pb.solve_primal_DSS()
matching_DSS, matching_off_diag_DSS, off_diagoff_diagonal_DSS = DSS_tuple

By drawing semicircles joining the matched agents (with distinct types), we can visualize the off-diagonal matching.
In the following figure, widths and colors of semicirles indicate relative numbers of agents that are “transported” along
an arc.

def plot_matching(self, matching_off_diag, title, figsize=(15, 15),

(continues on next page)

16.4. Solving primal problem 325

Advanced Quantitative Economics with Python

(continued from previous page)

add_labels=False, plot_H_z=False, scatter=True):

# Create the figure and axis

fig, ax = plt.subplots(figsize=figsize)

# Plot types on the real line

if scatter:
ax.scatter(self.X_types, np.zeros_like(self.X_types), color='blue',
s=20, zorder=5)
ax.scatter(self.Y_types, np.zeros_like(self.Y_types), color='red',
s=20, zorder=5)

# Add labels for X_types and Y_types if add_labels is True

if add_labels:
# Remove x-axis ticks
ax.set_xticks([])

# Add labels
for i, x in enumerate(self.X_types):
ax.annotate(f'$x_{{{i }}}$', (x, 0), textcoords="offset points",
xytext=(0, -15), ha='center', color='blue', fontsize=12)
for j, y in enumerate(self.Y_types):
ax.annotate(f'$y_{{{j }}}$', (y, 0), textcoords="offset points",
xytext=(0, -15), ha='center', color='red', fontsize=12)

# Draw semicircles for each pair of matched types

matched_types = np.where(matching_off_diag > 0)
matched_types_x = self.X_types[matched_types[0]]
matched_types_y = self.Y_types[matched_types[1]]

count = matching_off_diag[matched_types]
colors = plt.cm.Greys(np.linspace(0.5, 1.5, count.max() + 1))
max_height = 0
for iter in range(len(count)):
width = abs(matched_types_x[iter] - matched_types_y[iter])
center = (matched_types_x[iter] + matched_types_y[iter]) / 2
height = width
max_height = max(max_height, height)
semicircle = patches.Arc((center, 0), width, height,
theta1=0, theta2=180,
color=colors[count[iter]],
lw=count[iter] * (2.2 / count.max()))
ax.add_patch(semicircle)

# Title and layout settings for the main plot

plt.title(title)
ax.set_aspect('equal')
plt.axhline(0, color='black', linewidth=1)
ax.spines['bottom'].set_position(('data', 0))
ax.spines['left'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['right'].set_color('none')
ax.yaxis.set_ticks([])
ax.set_ylim(- self.X_types.ptp() / 10,
(max_height / 2) + self.X_types.ptp()*.01)

(continues on next page)

326 Chapter 16. Composite Sorting

Advanced Quantitative Economics with Python

(continued from previous page)

# Plot H_z on the main axis if enabled
if plot_H_z:
H_z = np.cumsum(self.q_z)

step = np.concatenate(([self.support_z.min()
- .02 * self.support_z.ptp()],
self.support_z,
[self.support_z.max()
+ .02 * self.support_z.ptp()]))

H_z = H_z/H_z.ptp() * self.support_z.ptp() /2

height = np.concatenate(([0], H_z, [0]))

# Plot the compressed H_z on the same main x-axis

ax.step(step, height, color='green', lw=2,
label='$H_z$', where='post')

# Set the y-limit to keep H_z and maximum circle size in the plot
ax.set_ylim(np.min(H_z) - H_z.ptp() *.01,
np.maximum(np.max(H_z), max_height / 2) + H_z.ptp() *.01)

# Add label and legend for H_z

ax.legend(loc="upper right")

plt.show()

ConcaveCostOT.plot_matching = plot_matching

off_diagoff_diagonal.plot_matching(matching_off_diag,
tit