
QuantEcon.lectures-python3 PDF
Release 2018-Sep-29

Thomas J. Sargent and John Stachurski

Sep 29, 2018


PREFACE

This PDF presents a series of lectures on quantitative economic modeling, designed and written
by Thomas J. Sargent and John Stachurski. The primary programming languages are Python and
Julia. You can send feedback to the authors via [email protected].

Note: You are currently viewing an automatically generated PDF version of our online
lectures, which are located at

https://lectures.quantecon.org

Please visit the website for more information on the aims and scope of the lectures and the two
language options (Julia or Python).

Due to automatic generation of this PDF, presentation quality is likely to be lower than that
of the website.

CONTENTS

1 Introduction to Python
1.1 About Python
1.2 Setting up Your Python Environment
1.3 An Introductory Example
1.4 Python Essentials
1.5 OOP I: Introduction to Object Oriented Programming

2 The Scientific Libraries
2.1 NumPy
2.2 Matplotlib
2.3 SciPy
2.4 Numba
2.5 Other Scientific Libraries

3 Advanced Python Programming
3.1 Writing Good Code
3.2 OOP II: Building Classes
3.3 OOP III: The Samuelson Accelerator
3.4 More Language Features
3.5 Debugging

4 Data and Empirics
4.1 Pandas
4.2 Pandas for Panel Data
4.3 Linear Regression in Python
4.4 Maximum Likelihood Estimation

5 Tools and Techniques
5.1 Linear Algebra
5.2 Orthogonal Projections and Their Applications
5.3 LLN and CLT
5.4 Linear State Space Models
5.5 Finite Markov Chains
5.6 Continuous State Markov Chains
5.7 A First Look at the Kalman Filter

6 Dynamic Programming
6.1 Shortest Paths
6.2 Job Search I: The McCall Search Model
6.3 Job Search II: Search and Separation
6.4 A Problem that Stumped Milton Friedman
6.5 Job Search III: Search with Learning
6.6 Job Search IV: Modeling Career Choice
6.7 Job Search V: On-the-Job Search
6.8 Optimal Growth I: The Stochastic Optimal Growth Model
6.9 Optimal Growth II: Time Iteration
6.10 Optimal Growth III: The Endogenous Grid Method
6.11 LQ Dynamic Programming Problems
6.12 Optimal Savings I: The Permanent Income Model
6.13 Optimal Savings II: LQ Techniques
6.14 Consumption and Tax Smoothing with Complete and Incomplete Markets
6.15 Optimal Savings III: Occasionally Binding Constraints
6.16 Robustness
6.17 Discrete State Dynamic Programming

7 Multiple Agent Models
7.1 Schelling's Segregation Model
7.2 A Lake Model of Employment and Unemployment
7.3 Rational Expectations Equilibrium
7.4 Markov Perfect Equilibrium
7.5 Robust Markov Perfect Equilibrium
7.6 Asset Pricing I: Finite State Models
7.7 Asset Pricing II: The Lucas Asset Pricing Model
7.8 Asset Pricing III: Incomplete Markets
7.9 Uncertainty Traps
7.10 The Aiyagari Model
7.11 Default Risk and Income Fluctuations
7.12 Globalization and Cycles

8 Time Series Models
8.1 Covariance Stationary Processes
8.2 Estimation of Spectra
8.3 Additive Functionals
8.4 Multiplicative Functionals
8.5 Classical Control with Linear Algebra
8.6 Classical Filtering With Linear Algebra

9 Dynamic Programming Squared
9.1 Dynamic Stackelberg Problems
9.2 Ramsey Plans, Time Inconsistency, Sustainable Plans
9.3 Optimal Taxation in an LQ Economy
9.4 Optimal Taxation with State-Contingent Debt
9.5 Optimal Taxation without State-Contingent Debt
9.6 Fluctuating Interest Rates Deliver Fiscal Insurance
9.7 Fiscal Risk and Government Debt
9.8 Competitive Equilibria of Chang Model
9.9 Credible Government Policies in Chang Model

10 Quantitative Economics
11 Tools and Techniques
12 Dynamic Programming
13 Multiple Agent Models
14 Quantitative Economics
15 Tools and Techniques
16 Dynamic Programming
17 Multiple Agent Models
18 Time Series Models

Bibliography

CHAPTER

ONE

INTRODUCTION TO PYTHON

This first part of the course provides a relatively fast-paced introduction to the Python programming language

1.1 About Python

Contents

• About Python
– Overview
– What's Python?
– Scientific Programming
– Learn More

1.1.1 Overview

In this lecture we will


• Outline what Python is
• Showcase some of its abilities
• Compare it to some other languages
At this stage it's not our intention that you try to replicate all you see
We will work through what follows at a slow pace later in the lecture series
Our only objective for this lecture is to give you some feel for what Python is, and what it can do

1.1.2 What's Python?

Python is a general purpose programming language conceived in 1989 by Dutch programmer Guido van
Rossum


Python is free and open source, with development coordinated through the Python Software Foundation
Python has experienced rapid adoption in the last decade, and is now one of the most popular programming
languages

Common Uses

Python is a general purpose language used in almost all application domains


• communications
• web development
• CGI and graphical user interfaces
• games
• multimedia, data processing, security, etc., etc., etc.
Used extensively by Internet service and high tech companies such as
• Google
• Dropbox
• Reddit
• YouTube
• Walt Disney Animation, etc., etc.
Often used to teach computer science and programming
For reasons we will discuss, Python is particularly popular within the scientific community
• academia, NASA, CERN, Wall St., etc., etc.

Relative Popularity

The following chart, produced using Stack Overflow Trends, shows one measure of the relative popularity
of Python


The figure indicates not only that Python is widely used but also that adoption of Python has accelerated
significantly since 2012
We suspect this is driven at least in part by uptake in the scientific domain, particularly in rapidly growing
fields like data science
For example, the popularity of pandas, a library for data analysis with Python, has exploded, as seen here
(The corresponding time path for MATLAB is shown for comparison)


Note that pandas takes off in 2012, which is the same year that we see Python's popularity begin to spike in
the first figure
Overall, it's clear that
• Python is one of the most popular programming languages worldwide
• Python is a major tool for scientific computing, accounting for a rapidly rising share of scientific work
around the globe

Features

Python is a high level language suitable for rapid development


It has a relatively small core language supported by many libraries
Other features:
• A multiparadigm language, in that multiple programming styles are supported (procedural, object-
oriented, functional, etc.)
• Interpreted rather than compiled

Syntax and Design

One nice feature of Python is its elegant syntax; we'll see many examples later on


Elegant code might sound superfluous but in fact it's highly beneficial because it makes the syntax easy to
read and easy to remember
Remembering how to read from files, sort dictionaries and other such routine tasks means that you don't need
to break your flow in order to hunt down correct syntax
Closely related to elegant syntax is elegant design
Features like iterators, generators, decorators, list comprehensions, etc. make Python highly expressive,
allowing you to get more done with less code
Namespaces improve productivity by cutting down on bugs and syntax errors
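
For example (a small illustration of our own, not from the original text), the same name can live in two namespaces without clashing

import math
import numpy as np

math.sqrt(4)   # The standard library version
np.sqrt(4)     # The NumPy version, which also accepts arrays

Each sqrt is looked up in its own namespace, so neither definition can accidentally shadow the other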

1.1.3 Scientific Programming

Python has become one of the core languages of scientific computing


It's either the dominant player or a major player in
• Machine learning and data science
• Astronomy
• Artificial intelligence
• Chemistry
• Computational biology
• Meteorology
• etc., etc.
Its popularity in economics is also beginning to rise
This section briefly showcases some examples of Python for scientific programming
• All of these topics will be covered in detail later on

Numerical programming

Fundamental matrix and array processing capabilities are provided by the excellent NumPy library
NumPy provides the basic array data type plus some simple processing operations
For example, let's build some arrays

import numpy as np                    # Load the library

a = np.linspace(-np.pi, np.pi, 100)   # Create even grid from -π to π
b = np.cos(a)                         # Apply cosine to each element of a
c = np.sin(a)                         # Apply sin to each element of a

Now let's take the inner product:


b @ c

1.5265566588595902e-16

The number you see here might vary slightly but it's essentially zero
(For older versions of Python and NumPy you need to use the np.dot function)
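
For instance, the following equivalent call (added here for illustration) works on any version

np.dot(b, c)   # Same inner product as b @ c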
The SciPy library is built on top of NumPy and provides additional functionality
For example, let's calculate ∫_{-2}^{2} ϕ(z) dz where ϕ is the standard normal density

from scipy.stats import norm
from scipy.integrate import quad

ϕ = norm()
value, error = quad(ϕ.pdf, -2, 2)   # Integrate using Gaussian quadrature
value

0.9544997361036417

SciPy includes many of the standard routines used in


• linear algebra
• integration
• interpolation
• optimization
• distributions and random number generation
• signal processing
• etc., etc.

Graphics

The most popular and comprehensive Python library for creating figures and graphs is Matplotlib
• Plots, histograms, contour images, 3D, bar charts, etc., etc.
• Output in many formats (PDF, PNG, EPS, etc.)
• LaTeX integration
Example 2D plot with embedded LaTeX annotations


Example contour plot


Example 3D plot

More examples can be found in the Matplotlib thumbnail gallery


Other graphics libraries include


• Plotly
• Bokeh
• VPython: 3D graphics and animations

Symbolic Algebra

It's useful to be able to manipulate symbolic expressions, as in Mathematica or Maple


The SymPy library provides this functionality from within the Python shell

from sympy import Symbol

x, y = Symbol('x'), Symbol('y') # Treat 'x' and 'y' as algebraic symbols


x + x + x + y

3*x + y

We can manipulate expressions

expression = (x + y)**2
expression.expand()

x**2 + 2*x*y + y**2

solve polynomials

from sympy import solve

solve(x**2 + x + 2)

[-1/2 - sqrt(7)*I/2, -1/2 + sqrt(7)*I/2]

and calculate limits, derivatives and integrals

from sympy import limit, sin, diff

limit(1 / x, x, 0)

oo

limit(sin(x) / x, x, 0)

diff(sin(x), x)


cos(x)
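
Integration works in the same way; here is a small example of our own, not from the original text

from sympy import integrate

integrate(sin(x), x)

-cos(x)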

The beauty of importing this functionality into Python is that we are working within a fully fledged programming language
We can easily create tables of derivatives, generate LaTeX output, add it to figures, etc., etc.

Statistics

Python's data manipulation and statistics libraries have improved rapidly over the last few years

Pandas

One of the most popular libraries for working with data is pandas
Pandas is fast, efficient, flexible and well designed
Here's a simple example, using some fake data

import pandas as pd
np.random.seed(1234)

data = np.random.randn(5, 2)   # 5x2 matrix of N(0, 1) random draws
dates = pd.date_range('28/12/2010', periods=5)

df = pd.DataFrame(data, columns=('price', 'weight'), index=dates)
print(df)

price weight
2010-12-28 0.471435 -1.190976
2010-12-29 1.432707 -0.312652
2010-12-30 -0.720589 0.887163
2010-12-31 0.859588 -0.636524
2011-01-01 0.015696 -2.242685

df.mean()

price 0.411768
weight -0.699135
dtype: float64

Other Useful Statistics Libraries

• statsmodels: various statistical routines
• scikit-learn: machine learning in Python (sponsored by Google, among others)
• pyMC: Bayesian data analysis


• pystan: Bayesian analysis based on stan

Networks and Graphs

Python has many libraries for studying graphs


One well-known example is NetworkX
• Standard graph algorithms for analyzing network structure, etc.
• Plotting routines
• etc., etc.
Here's some example code that generates and plots a random graph, with node color determined by shortest
path length from a central node

import networkx as nx
import matplotlib.pyplot as plt
np.random.seed(1234)

# Generate random graph
p = dict((i, (np.random.uniform(0, 1), np.random.uniform(0, 1))) for i in range(200))
G = nx.random_geometric_graph(200, 0.12, pos=p)
pos = nx.get_node_attributes(G, 'pos')

# Find node nearest the center point (0.5, 0.5)
dists = [(x - 0.5)**2 + (y - 0.5)**2 for x, y in list(pos.values())]
ncenter = np.argmin(dists)

# Plot graph, coloring by path length from central node
p = nx.single_source_shortest_path_length(G, ncenter)
plt.figure()
nx.draw_networkx_edges(G, pos, alpha=0.4)
nx.draw_networkx_nodes(G,
                       pos,
                       nodelist=list(p.keys()),
                       node_size=120, alpha=0.5,
                       node_color=list(p.values()),
                       cmap=plt.cm.jet_r)
plt.show()


Cloud Computing

Running your Python code on massive servers in the cloud is becoming easier and easier
A nice example is Anaconda Enterprise
See also
• Amazon Elastic Compute Cloud
• The Google App Engine (Python, Java, PHP or Go)
• Pythonanywhere
• Sagemath Cloud

Parallel Processing

Apart from the cloud computing options listed above, you might like to consider
• Parallel computing through IPython clusters
• The Starcluster interface to Amazon's EC2
• GPU programming through PyCuda, PyOpenCL, Theano or similar


Other Developments

There are many other interesting developments with scientific programming in Python
Some representative examples include
• Jupyter: Python in your browser with code cells, embedded images, etc.
• Numba: make Python run at the same speed as native machine code!
• Blaze: a generalization of NumPy
• PyTables: manage large data sets
• CVXPY: convex optimization in Python

1.1.4 Learn More

• Browse some Python projects on GitHub


• Have a look at some of the Jupyter notebooks people have shared on various scientific topics
• Visit the Python Package Index
• View some of the questions people are asking about Python on Stackoverflow
• Keep up to date on what's happening in the Python community with the Python subreddit

1.2 Setting up Your Python Environment

Contents

• Setting up Your Python Environment


– Overview
– Anaconda
– Jupyter Notebooks
– QuantEcon.py
– Keeping Software up to Date
– Working with Files
– Editors and IDEs
– Exercises


1.2.1 Overview

In this lecture you will learn how to


1. get a Python environment up and running with all the necessary tools
2. execute simple Python commands
3. run a sample program
4. install the code libraries that underpin these lectures

1.2.2 Anaconda

The core Python package is easy to install but not what you should choose for these lectures
These lectures require the entire scientific programming ecosystem, which
• the core installation doesn't provide
• is painful to install one piece at a time
Hence the best approach for our purposes is to install a free Python distribution that contains
1. the core Python language and
2. the most popular scientific libraries
The best such distribution is Anaconda
Anaconda is
• very popular
• cross platform
• comprehensive
• completely unrelated to the Nicki Minaj song of the same name
Anaconda also comes with a great package management system to organize your code libraries
All of what follows assumes that you adopt this recommendation!

Installing Anaconda

Installing Anaconda is straightforward: download the binary and follow the instructions
Important points:
• Install the latest version
• If you are asked during the installation process whether you'd like to make Anaconda your default
Python installation, say yes
• Otherwise you can accept all of the defaults


Get a Modern Browser

We'll be using your browser to interact with Python, so now might be a good time to
1. update your browser, or
2. install a free modern browser such as Chrome or Firefox

1.2.3 Jupyter Notebooks

Jupyter notebooks are one of the many possible ways to interact with Python and the scientific libraries
They use a browser-based interface to Python with
• The ability to write and execute Python commands
• Formatted output in the browser, including tables, figures, animation, etc.
• The option to mix in formatted text and mathematical expressions
Because of these possibilities, Jupyter is fast turning into a major player in the scientific computing ecosystem
Here's an image showing the execution of some code (borrowed from here) in a Jupyter notebook


You can find a nice example of the kinds of things you can do in a Jupyter notebook (such as include maths
and text) here
Further examples can be found at QuantEcon's notebook archive or the NB viewer site
While Jupyter isn't the only way to code in Python, it's great for when you wish to


• start coding in Python


• test new ideas or interact with small pieces of code
• share or collaborate on scientific ideas with students or colleagues
These lectures are designed for executing in Jupyter notebooks

Starting the Jupyter Notebook

Once you have installed Anaconda, you can start the Jupyter notebook
Either
• search for Jupyter in your applications menu, or
• open up a terminal and type jupyter notebook
– Windows users should substitute Anaconda command prompt for terminal in the previous line
If you use the second option, you will see something like this (click to enlarge)

The output tells us the notebook is running at http://localhost:8888/


• localhost is the name of the local machine
• 8888 refers to port number 8888 on your computer
Thus, the Jupyter kernel is listening for Python commands on port 8888 of our local machine


Hopefully your default browser has also opened up with a web page that looks something like this (click to
enlarge)

What you see here is called the Jupyter dashboard


If you look at the URL at the top, it should be localhost:8888 or similar, matching the message above
Assuming all this has worked OK, you can now click on New at top right and select Python 3 or similar
Here's what shows up on our machine:


The notebook displays an active cell, into which you can type Python commands

Notebook Basics

Let's start with how to edit code and run simple programs

Running Cells

Notice that in the previous figure the cell is surrounded by a green border
This means that the cell is in edit mode


As a result, you can type in Python code and it will appear in the cell
When you're ready to execute the code in a cell, hit Shift-Enter instead of the usual Enter

(Note: There are also menu and button options for running code in a cell that you can find by exploring)

Modal Editing

The next thing to understand about the Jupyter notebook is that it uses a modal editing system
This means that the effect of typing at the keyboard depends on which mode you are in
The two modes are
1. Edit mode


• Indicated by a green border around one cell


• Whatever you type appears as is in that cell
2. Command mode
• The green border is replaced by a grey border
• Keystrokes are interpreted as commands; for example, typing b adds a new cell below the current
one
To switch to
• command mode from edit mode, hit the Esc key or Ctrl-M
• edit mode from command mode, hit Enter or click in a cell
The modal behavior of the Jupyter notebook is a little tricky at first but very efficient when you get used to
it

User Interface Tour

At this stage we recommend you take your time to


• look at the various options in the menus and see what they do
• take the user interface tour, which can be accessed through the help menu

Inserting unicode (e.g., Greek letters)

Python 3 introduced support for unicode characters, allowing the use of characters such as α and β in your
code
Unicode characters can be typed quickly in Jupyter using the tab key
Try creating a new code cell and typing \alpha, then hitting the tab key on your keyboard
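
Once entered, such a character behaves like any other identifier; for example, this illustrative snippet (our addition) is valid Python 3

α = 0.9    # Typed as \alpha followed by Tab
α ** 2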

A Test Program

Lets run a test program


Here's an arbitrary program we can use: http://matplotlib.org/1.4.1/examples/pie_and_polar_charts/polar_bar_demo.html
On that page you'll see the following code
import numpy as np
import matplotlib.pyplot as plt

N = 20
θ = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
radii = 10 * np.random.rand(N)
width = np.pi / 4 * np.random.rand(N)

ax = plt.subplot(111, polar=True)
bars = ax.bar(θ, radii, width=width, bottom=0.0)

# Use custom colors and opacity
for r, bar in zip(radii, bars):
    bar.set_facecolor(plt.cm.jet(r / 10.))
    bar.set_alpha(0.5)

plt.show()

Don't worry about the details for now; let's just run it and see what happens
The easiest way to run this code is to copy and paste into a cell in the notebook
You should see something like this


(In older versions of Jupyter you might need to add the command %matplotlib inline before you
generate the figure)

Working with the Notebook

Here are a few more tips on working with Jupyter notebooks

Tab Completion

In the previous program we executed the line import numpy as np


• NumPy is a numerical library we'll work with in depth


After this import command, functions in NumPy can be accessed with np.<function_name> type
syntax
• For example, try np.random.randn(3)
We can explore the attributes of np using the Tab key
For example, here we type np.ran and hit Tab (click to enlarge)

Jupyter offers up the two possible completions, random and rank


In this way, the Tab key helps remind you of what's available, and also saves you typing


On-Line Help

To get help on np.rank, say, we can execute np.rank?


Documentation appears in a split window of the browser, like so

Clicking in the top right of the lower split closes the on-line help

Other Content

In addition to executing code, the Jupyter notebook allows you to embed text, equations, figures and even
videos in the page


For example, here we enter a mixture of plain text and LaTeX instead of code

Next we hit Esc to enter command mode and then type m to indicate that we are writing Markdown, a mark-up
language similar to (but simpler than) LaTeX
(You can also use your mouse to select Markdown from the Code drop-down box just below the list of
menu items)
Now we hit Shift+Enter to produce this


Sharing Notebooks

Notebook files are just text files structured in JSON and typically ending with .ipynb
You can share them in the usual way that you share files or by using web services such as nbviewer
The notebooks you see on that site are static html representations
To run one, download it as an ipynb file by clicking on the download icon at the top right
Save it somewhere, navigate to it from the Jupyter dashboard and then run as discussed above


1.2.4 QuantEcon.py

In these lectures we'll make extensive use of code from the QuantEcon organization
On the Python side we'll be using the QuantEcon.py version
This code has been organized into a Python package
• A Python package is a software library that has been bundled for distribution
• Hosted Python packages can be found through channels like Anaconda and PyPi
You can install QuantEcon.py by starting Jupyter and typing
!pip install quantecon
into a cell
Alternatively, you can type the following into a terminal
pip install quantecon
More instructions can be found on the library page

Note: The QuantEcon.py package can also be installed using conda by:

conda config --add channels conda-forge
conda install quantecon

1.2.5 Keeping Software up to Date

For these lectures to run without error you need to keep your software up to date

Updating Anaconda

Anaconda supplies a tool called conda to manage and upgrade your Anaconda packages
One conda command you should execute regularly is the one that updates the whole Anaconda distribution
As a practice run, please execute the following
1. Open up a terminal
2. Type conda update anaconda
For more information on conda, type conda help in a terminal

Updating QuantEcon.py

Open up a terminal and type


pip install --upgrade quantecon


Or open up Jupyter and type the same thing in a notebook cell with ! in front of it

1.2.6 Working with Files

How does one run a locally saved Python file?


There are a number of ways to do this but let's focus on methods using Jupyter notebooks

Option 1: Copy and Paste

The steps are:


1. Navigate to your file with your mouse / trackpad using a file browser
2. Click on your file to open it with a text editor
3. Copy and paste into a cell and Shift-Enter

Option 2: Run

Using the run command is often easier than copy and paste
• For example, %run test.py will run the file test.py
(You might find that the % is unnecessary; use %automagic to toggle the need for %)
Note that Jupyter only looks for test.py in the present working directory (PWD)
If test.py isn't in that directory, you will get an error
Let's look at a successful example, where we run a file test.py with contents:

for i in range(5):
    print('foobar')

Here's the notebook (click to enlarge)


Here
• pwd asks Jupyter to show the PWD (or %pwd; see the comment about automagic above)
– This is where Jupyter is going to look for files to run
– Your output will look a bit different depending on your OS
• ls asks Jupyter to list files in the PWD (or %ls)
– Note that test.py is there (on our computer, because we saved it there earlier)
• cat test.py asks Jupyter to print the contents of test.py (or !type test.py on Windows)
• run test.py runs the file and prints any output


But file X isn't in my PWD!

If you're trying to run a file not in the present working directory, you'll get an error
To fix this error you need to either
1. Shift the file into the PWD, or
2. Change the PWD to where the file lives
One way to achieve the first option is to use the Upload button
• The button is on the top level dashboard, where Jupyter first opened to
• Look where the pointer is in this picture

The second option can be achieved using the cd command


• On Windows it might look like this: cd C:/Python27/Scripts/dir
• On Linux / OSX it might look like this: cd /home/user/scripts/dir
Note: You can type the first letter or two of each directory name and then use the tab key to expand

Loading Files

It's often convenient to be able to see your code before you run it


In the following example we execute load white_noise_plot.py, where white_noise_plot.py is in the PWD
(Use %load if automagic is off)
Now the code from the file appears in a cell ready to execute

Saving Files

To save the contents of a cell as file foo.py


• put %%file foo.py as the first line of the cell
• Shift+Enter


Here %%file is an example of a cell magic
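
For instance, a cell like the following (an illustrative sketch reusing foo.py from above) writes its remaining lines to disk

%%file foo.py
for i in range(5):
    print('foobar')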

1.2.7 Editors and IDEs

The preceding discussion covers most of what you need to know to interact with this website
However, as you start to write longer programs, you might want to experiment with your workflow
There are many different options and we mention them only in passing

JupyterLab

JupyterLab is an integrated development environment centered around Jupyter notebooks


It is available through Anaconda and will soon be made the default environment for Jupyter notebooks
Reading the docs or searching for a recent YouTube video will give you more information

Text Editors

A text editor is an application that is specifically designed to work with text files such as Python programs
Nothing beats the power and efficiency of a good text editor for working with program text
A good text editor will provide
• efficient text editing commands (e.g., copy, paste, search and replace)
• syntax highlighting, etc.
Among the most popular are Sublime Text and Atom
For a top quality open source text editor with a steeper learning curve, try Emacs
If you want an outstanding free text editor and don't mind a seemingly vertical learning curve plus long days
of pain and suffering while all your neural pathways are rewired, try Vim

Text Editors Plus IPython Shell

A text editor is for writing programs


To run them you can continue to use Jupyter as described above
Another option is to use the excellent IPython shell
To use an IPython shell, open up a terminal and type ipython
You should see something like this


The IPython shell has many of the features of the notebook: tab completion, color syntax, etc.
It also has command history through the arrow keys
The up arrow key brings previously typed commands to the prompt
This saves a lot of typing
Here's one setup, on a Linux box, with
• a file being edited in Vim
• An IPython shell next to it, to run the file


IDEs

IDEs are Integrated Development Environments, which allow you to edit, execute and interact with code
from an integrated environment
One of the most popular in recent times is VS Code, which is now available via Anaconda
We hear good things about VS Code; please tell us about your experiences on the forum

1.2.8 Exercises

Exercise 1

If Jupyter is still running, quit by using Ctrl-C at the terminal where you started it
Now launch again, but this time using jupyter notebook --no-browser
This should start the kernel without launching the browser
Note also the startup message: It should give you a URL such as http://localhost:8888 where the
notebook is running


Now
1. Start your browser or open a new tab if it's already running
2. Enter the URL from above (e.g. http://localhost:8888) in the address bar at the top
You should now be able to run a standard Jupyter notebook session
This is an alternative way to start the notebook that can also be handy

Exercise 2

This exercise will familiarize you with git and GitHub


Git is a version control system: a piece of software used to manage digital projects such as code libraries
In many cases the associated collections of files, called repositories, are stored on GitHub
GitHub is a wonderland of collaborative coding projects
For example, it hosts many of the scientific libraries we'll be using later on, such as this one
Git is the underlying software used to manage these projects
Git is an extremely powerful tool for distributed collaboration; for example, we use it to share and synchronize all the source files for these lectures
There are two main flavors of Git
1. the plain vanilla command line Git version
2. the various point-and-click GUI versions
• See, for example, the GitHub version
As an exercise, try
1. Installing Git
2. Getting a copy of QuantEcon.py using Git
For example, if youve installed the command line version, open up a terminal and enter
git clone https://github.com/QuantEcon/QuantEcon.py
(This is just git clone in front of the URL for the repository)
Even better,
1. Sign up to GitHub
2. Look into forking GitHub repositories (forking means making your own copy of a GitHub repository,
stored on GitHub)
3. Fork QuantEcon.py
4. Clone your fork to some local directory, make edits, commit them, and push them back up to your
forked GitHub repo
5. If you made a valuable improvement, send us a pull request!


For reading on these and other topics, try


• The official Git documentation
• Reading through the docs on GitHub
• Pro Git Book by Scott Chacon and Ben Straub
• One of the thousands of Git tutorials on the Net

1.3 An Introductory Example

Contents

• An Introductory Example
– Overview
– The Task: Plotting a White Noise Process
– Version 1
– Alternative Versions
– Exercises
– Solutions

We're now ready to start learning the Python language itself


The level of this and the next few lectures will suit those with some basic knowledge of programming
But don't give up if you have none; you are not excluded
You just need to cover a few of the fundamentals of programming before returning here
Good references for first time programmers include:
• The first 5 or 6 chapters of How to Think Like a Computer Scientist
• Automate the Boring Stuff with Python
• The start of Dive into Python 3
Note: These references offer help on installing Python but you should probably stick with the method on
our set up page
You'll then have an outstanding scientific computing environment (Anaconda) and be ready to move on to
the rest of our course

1.3.1 Overview

In this lecture we will write and then pick apart small Python programs


The objective is to introduce you to basic Python syntax and data structures
Deeper concepts will be covered in later lectures

Prerequisites

The lecture on getting started with Python

1.3.2 The Task: Plotting a White Noise Process

Suppose we want to simulate and plot the white noise process ϵ_0, ϵ_1, ..., ϵ_T, where each draw ϵ_t is
independent standard normal
In other words, we want to generate figures that look something like this:

We'll do this several different ways


1.3.3 Version 1

Here are a few lines of code that perform the task we set

import numpy as np
import matplotlib.pyplot as plt

x = np.random.randn(100)
plt.plot(x)
plt.show()

Let's break this program down and see how it works

Import Statements

The first two lines of the program import functionality


The first line imports NumPy, a favorite Python package for tasks like
• working with arrays (vectors and matrices)
• common mathematical functions like cos and sqrt


• generating random numbers


• linear algebra, etc.
After import numpy as np we have access to these attributes via the syntax np.
Here's another example

import numpy as np

np.sqrt(4)

2.0

We could also just write

import numpy

numpy.sqrt(4)

2.0

But the former method is convenient and more standard

Why all the imports?

Remember that Python is a general purpose language


The core language is quite small so it's easy to learn and maintain
When you want to do something interesting with Python, you almost always need to import additional
functionality
Scientific work in Python is no exception
Most of our programs start off with lines similar to the import statements seen above

Packages

As stated above, NumPy is a Python package


Packages are used by developers to organize a code library
In fact a package is just a directory containing
1. files with Python code, called modules in Python speak
2. possibly some compiled code that can be accessed by Python (e.g., functions compiled from C or
FORTRAN code)
3. a file called __init__.py that specifies what will be executed when we type import package_name
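
For example, a minimal (hypothetical) package might be laid out on disk as follows

my_package/
    __init__.py    # Executed when we type import my_package
    module1.py     # A module containing Python code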


In fact you can find and explore the directory for NumPy on your computer easily enough if you look around
On this machine it's located in

anaconda3/lib/python3.6/site-packages/numpy

Subpackages

Consider the line x = np.random.randn(100)


Here np refers to the package NumPy, while random is a subpackage of NumPy
You can see the contents here
Subpackages are just packages that are subdirectories of another package
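
For example, the subpackage can also be imported directly; this equivalent form (added for illustration) behaves the same way

from numpy import random

random.randn(3)   # Same function as np.random.randn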

Importing Names Directly

Recall this code that we saw above

import numpy as np

np.sqrt(4)

2.0

Here's another way to access NumPy's square root function

from numpy import sqrt

sqrt(4)

2.0

This is also fine


The advantage is less typing if we use sqrt often in our code
The disadvantage is that, in a long program, these two lines might be separated by many other lines
Then it's harder for readers to know where sqrt came from, should they wish to

1.3.4 Alternative Versions

Let's try writing some alternative versions of our first program


Our aim in doing this is to illustrate some more Python syntax and semantics
The programs below are less efficient but
• help us understand basic constructs like loops


• illustrate common data types like lists

A Version with a For Loop

Here's a version that illustrates loops and Python lists

ts_length = 100
ϵ_values = []   # Empty list

for i in range(ts_length):
    e = np.random.randn()
    ϵ_values.append(e)

plt.plot(ϵ_values)
plt.show()

In brief,
• The first pair of lines import functionality as before
• The next line sets the desired length of the time series


• The next line creates an empty list called ϵ_values that will store the ϵ_t values as we generate them
• The next three lines are the for loop, which repeatedly draws a new random number ϵ_t and appends it
to the end of the list ϵ_values
• The last two lines generate the plot and display it to the user
Let's study some parts of this program in more detail

Lists

Consider the statement ϵ_values = [], which creates an empty list


Lists are a native Python data structure used to group a collection of objects
For example, try

x = [10, 'foo', False] # We can include heterogeneous data inside a list


type(x)

list

The first element of x is an integer, the next is a string and the third is a Boolean value
When adding a value to a list, we can use the syntax list_name.append(some_value)

[10, 'foo', False]

x.append(2.5)
x

[10, 'foo', False, 2.5]

Here append() is what's called a method, which is a function attached to an object; in this case, the list x
We'll learn all about methods later on, but just to give you some idea,
• Python objects such as lists, strings, etc. all have methods that are used to manipulate the data contained in the object
• String objects have string methods, list objects have list methods, etc.
Another useful list method is pop()

[10, 'foo', False, 2.5]

x.pop()


2.5

[10, 'foo', False]

The full set of list methods can be found here


Following C, C++, Java, etc., lists in Python are zero based
x

[10, 'foo', False]

x[0]

10

x[1]

'foo'

The For Loop

Now let's consider the for loop from the program above, which was

for i in range(ts_length):
    e = np.random.randn()
    ϵ_values.append(e)

Python executes the two indented lines ts_length times before moving on
These two lines are called a code block, since they comprise the block of code that we are looping over
Unlike most other languages, Python knows the extent of the code block only from indentation
In our program, indentation decreases after line ϵ_values.append(e), telling Python that this line marks
the lower limit of the code block
More on indentation below; for now let's look at another example of a for loop

animals = ['dog', 'cat', 'bird']
for animal in animals:
    print("The plural of " + animal + " is " + animal + "s")

If you put this in a text file or Jupyter cell and run it you will see
The plural of dog is dogs
The plural of cat is cats
The plural of bird is birds


This example helps to clarify how the for loop works: When we execute a loop of the form

for variable_name in sequence:
    <code block>

The Python interpreter performs the following:


• For each element of sequence, it binds the name variable_name to that element and then
executes the code block
The sequence object can in fact be a very general object, as we'll see soon enough
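
For example, a string is also a sequence, so this small illustrative loop (our addition) is perfectly valid

for letter in 'abc':
    print(letter)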

Code Blocks and Indentation

In discussing the for loop, we explained that the code blocks being looped over are delimited by indentation
In fact, in Python all code blocks (i.e., those occurring inside loops, if clauses, function definitions, etc.) are
delimited by indentation
Thus, unlike most other languages, whitespace in Python code affects the output of the program
Once you get used to it, this is a good thing: It
• forces clean, consistent indentation, improving readability
• removes clutter, such as the brackets or end statements used in other languages
On the other hand, it takes a bit of care to get right, so please remember:
• The line before the start of a code block always ends in a colon
– for i in range(10):
– if x > y:
– while x < 100:
– etc., etc.
• All lines in a code block must have the same amount of indentation
• The Python standard is 4 spaces, and that's what you should use

Tabs vs Spaces

One small gotcha here is the mixing of tabs and spaces, which often leads to errors
(Important: Within text files, the internal representation of tabs and spaces is not the same)
You can use your Tab key to insert 4 spaces, but you need to make sure it's configured to do so
If you are using a Jupyter notebook you will have no problems here
Also, good text editors will allow you to configure the Tab key to insert spaces instead of tabs; try searching online


While Loops

The for loop is the most common technique for iteration in Python
But, for the purpose of illustration, let's modify the program above to use a while loop instead

ts_length = 100
ϵ_values = []
i = 0
while i < ts_length:
    e = np.random.randn()
    ϵ_values.append(e)
    i = i + 1
plt.plot(ϵ_values)
plt.show()

Note that
• the code block for the while loop is again delimited only by indentation
• the statement i = i + 1 can be replaced by i += 1


User-Defined Functions

Now let's go back to the for loop, but restructure our program to make the logic clearer
To this end, we will break our program into two parts:
1. A user-defined function that generates a list of random variables
2. The main part of the program that
(a) calls this function to get data
(b) plots the data
This is accomplished in the next program

def generate_data(n):
    ϵ_values = []
    for i in range(n):
        e = np.random.randn()
        ϵ_values.append(e)
    return ϵ_values

data = generate_data(100)
plt.plot(data)
plt.show()


Let's go over this carefully, in case you're not familiar with functions and how they work
We have defined a function called generate_data() as follows
• def is a Python keyword used to start function definitions
• def generate_data(n): indicates that the function is called generate_data, and that it
has a single argument n
• The indented code is a code block called the function body; in this case it creates an iid list of random
draws using the same logic as before
• The return keyword indicates that ϵ_values is the object that should be returned to the calling
code
This whole function definition is read by the Python interpreter and stored in memory
When the interpreter gets to the expression generate_data(100), it executes the function body with n
set equal to 100
The net result is that the name data is bound to the list ϵ_values returned by the function


Conditions

Our function generate_data() is rather limited


Let's make it slightly more useful by giving it the ability to return either standard normals or uniform random
variables on (0, 1) as required
This is achieved in the next piece of code

def generate_data(n, generator_type):
    ϵ_values = []
    for i in range(n):
        if generator_type == 'U':
            e = np.random.uniform(0, 1)
        else:
            e = np.random.randn()
        ϵ_values.append(e)
    return ϵ_values

data = generate_data(100, 'U')
plt.plot(data)
plt.show()


Hopefully the syntax of the if/else clause is self-explanatory, with indentation again delimiting the extent of
the code blocks
Notes
• We are passing the argument U as a string, which is why we write it as 'U'
• Notice that equality is tested with the == syntax, not =
– For example, the statement a = 10 assigns the name a to the value 10
– The expression a == 10 evaluates to either True or False, depending on the value of a
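
Putting the two side by side (a small illustration of our own)

a = 10        # Assignment: binds the name a to the value 10
a == 10       # Comparison: evaluates to True here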
Now, there are several ways that we can simplify the code above
For example, we can get rid of the conditionals altogether by just passing the desired generator type as a
function
To understand this, consider the following version

def generate_data(n, generator_type):
    ϵ_values = []
    for i in range(n):
        e = generator_type()
        ϵ_values.append(e)
    return ϵ_values

data = generate_data(100, np.random.uniform)
plt.plot(data)
plt.show()


Now, when we call the function generate_data(), we pass np.random.uniform as the second
argument
This object is a function
When the function call generate_data(100, np.random.uniform) is executed, Python runs the
function code block with n equal to 100 and the name generator_type bound to the function
np.random.uniform
• While these lines are executed, the names generator_type and np.random.uniform are
synonyms, and can be used in identical ways
This principle works more generally; for example, consider the following piece of code

max(7, 2, 4) # max() is a built-in Python function

m = max
m(7, 2, 4)


Here we created another name for the built-in function max(), which could then be used in identical ways
In the context of our program, the ability to bind new names to functions means that there is no problem
passing a function as an argument to another function, as we did above

List Comprehensions

We can also simplify the code for generating the list of random draws considerably by using something
called a list comprehension
List comprehensions are an elegant Python tool for creating lists
Consider the following example, where the list comprehension is on the right-hand side of the second line
animals = ['dog', 'cat', 'bird']
plurals = [animal + 's' for animal in animals]
plurals

['dogs', 'cats', 'birds']

Here's another example


list(range(8))

[0, 1, 2, 3, 4, 5, 6, 7]

doubles = [2 * x for x in range(8)]


doubles

[0, 2, 4, 6, 8, 10, 12, 14]

With the list comprehension syntax, we can simplify the lines


ϵ_values = []
for i in range(n):
    e = generator_type()
    ϵ_values.append(e)

into

ϵ_values = [generator_type() for i in range(n)]

1.3.5 Exercises

Exercise 1

Recall that n! is read as n factorial and defined as n! = n × (n − 1) × · · · × 2 × 1


There are functions to compute this in various modules, but let's write our own version as an exercise
In particular, write a function factorial such that factorial(n) returns n! for any positive integer n

Exercise 2

The binomial random variable Y ∼ Bin(n, p) represents the number of successes in n binary trials, where
each trial succeeds with probability p
Without any import besides from numpy.random import uniform, write a function
binomial_rv such that binomial_rv(n, p) generates one draw of Y
Hint: If U is uniform on (0, 1) and p ∈ (0, 1), then the expression U < p evaluates to True with probability
p

Exercise 3

Compute an approximation to π using Monte Carlo. Use no imports besides


import numpy as np

Your hints are as follows:


• If U is a bivariate uniform random variable on the unit square (0, 1)², then the probability that U lies
in a subset B of (0, 1)² is equal to the area of B
• If U_1, ..., U_n are iid copies of U, then, as n gets large, the fraction that falls in B converges to the
probability of landing in B
• For a circle, area = pi * radius^2

Exercise 4

Write a program that prints one realization of the following random device:
• Flip an unbiased coin 10 times
• If 3 consecutive heads occur one or more times within this sequence, pay one dollar
• If not, pay nothing
Use no import besides from numpy.random import uniform

Exercise 5

Your next task is to simulate and plot the correlated time series
$x_{t+1} = \alpha x_t + \epsilon_{t+1}$ where $x_0 = 0$ and $t = 0, \ldots, T$
The sequence of shocks $\{\epsilon_t\}$ is assumed to be iid and standard normal
In your solution, restrict your import statements to


import numpy as np
import matplotlib.pyplot as plt

Set T = 200 and α = 0.9

Exercise 6

To do the next exercise, you will need to know how to produce a plot legend
The following example should be sufficient to convey the idea

import numpy as np
import matplotlib.pyplot as plt

x = [np.random.randn() for i in range(100)]


plt.plot(x, label="white noise")
plt.legend()
plt.show()

Running it produces a figure like so

Now, starting with your solution to exercise 5, plot three simulated time series, one for each of the cases
α = 0, α = 0.8 and α = 0.98
In particular, you should produce (modulo randomness) a figure that looks as follows


(The figure nicely illustrates how time series with the same one-step-ahead conditional volatilities, as these
three processes have, can have very different unconditional volatilities.)
Use a for loop to step through the α values
Important hints:
• If you call the plot() function multiple times before calling show(), all of the lines you produce
will end up on the same figure
– And if you omit the argument 'b-' to the plot function, Matplotlib will automatically select
different colors for each line
• The expression 'foo' + str(42) evaluates to 'foo42'

1.3.6 Solutions

Exercise 1

def factorial(n):
    k = 1
    for i in range(n):
        k = k * (i + 1)
    return k


factorial(4)

24

Exercise 2

from numpy.random import uniform

def binomial_rv(n, p):
    count = 0
    for i in range(n):
        U = uniform()
        if U < p:
            count = count + 1  # Or count += 1
    return count

binomial_rv(10, 0.5)

Exercise 3

Consider the circle of diameter 1 embedded in the unit square


Let $A$ be its area and let $r = 1/2$ be its radius
If we know $\pi$ then we can compute $A$ via $A = \pi r^2$
But here the point is to compute $\pi$, which we can do by $\pi = A/r^2$
Summary: If we can estimate the area of the unit circle, then dividing by $r^2 = (1/2)^2 = 1/4$ gives an
estimate of $\pi$
We estimate the area by sampling bivariate uniforms and looking at the fraction that fall into the unit circle

n = 100000

count = 0
for i in range(n):
    u, v = np.random.uniform(), np.random.uniform()
    d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)
    if d < 0.5:
        count += 1

area_estimate = count / n

print(area_estimate * 4)  # dividing by radius**2


3.1496

Exercise 4

from numpy.random import uniform

payoff = 0
count = 0

for i in range(10):
    U = uniform()
    count = count + 1 if U < 0.5 else 0
    if count == 3:
        payoff = 1

print(payoff)

Exercise 5

The next line embeds all subsequent figures in the browser itself

%matplotlib inline

α = 0.9
ts_length = 200
current_x = 0

x_values = []
for i in range(ts_length + 1):
    x_values.append(current_x)
    current_x = α * current_x + np.random.randn()
plt.plot(x_values)
plt.show()


Exercise 6

αs = [0.0, 0.8, 0.98]
ts_length = 200

for α in αs:
    x_values = []
    current_x = 0
    for i in range(ts_length):
        x_values.append(current_x)
        current_x = α * current_x + np.random.randn()
    plt.plot(x_values, label=f'α = {α}')
plt.legend()
plt.show()


1.4 Python Essentials

Contents

• Python Essentials
– Data Types
– Input and Output
– Iterating
– Comparisons and Logical Operators
– More Functions
– Coding Style and PEP8
– Exercises
– Solutions


In this lecture we'll cover features of the language that are essential to reading and writing Python code

1.4.1 Data Types

We've already met several built-in Python data types, such as strings, integers, floats and lists
Let's learn a bit more about them

Primitive Data Types

One simple data type is the Boolean value, which can be either True or False

x = True
x

True

In the next line of code, the interpreter evaluates the expression on the right of = and binds y to this value

y = 100 < 10
y

False

type(y)

bool

In arithmetic expressions, True is converted to 1 and False is converted to 0


This is called Boolean arithmetic and is often useful in programming
Here are some examples

x + y

1

x * y

0

True + True

2


bools = [True, True, False, True] # List of Boolean values

sum(bools)

3

The two most common data types used to represent numbers are integers and floats

a, b = 1, 2
c, d = 2.5, 10.0
type(a)

int

type(c)

float

Computers distinguish between the two because, while floats can represent fractional quantities, arithmetic
operations on integers are faster and more accurate
As long as you're using Python 3.x, division of integers yields floats

1 / 2

0.5

But be careful! If you're still using Python 2.x, division of two integers returns only the integer part
For integer division in Python 3.x use this syntax:

1 // 2

0

Complex numbers are another primitive data type in Python

x = complex(1, 2)
y = complex(2, 1)
x * y

5j

Containers

Python has several basic types for storing collections of (possibly heterogeneous) data
We've already discussed lists


A related data type is tuples, which are immutable lists

x = ('a', 'b') # Parentheses instead of the square brackets


x = 'a', 'b' # Or no brackets --- the meaning is identical
x

('a', 'b')

type(x)

tuple

In Python, an object is called immutable if, once created, the object cannot be changed
Conversely, an object is mutable if it can still be altered after creation
Python lists are mutable

x = [1, 2]
x[0] = 10
x

[10, 2]

But tuples are not

x = (1, 2)
x[0] = 10

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<python-input-21-6cb4d74ca096> in <module>()
----> 1 x[0]=10

TypeError: 'tuple' object does not support item assignment

We'll say more about the role of mutable and immutable data a bit later
Tuples (and lists) can be unpacked as follows

integers = (10, 20, 30)


x, y, z = integers
x

10

y

20

You've actually seen an example of this already


Tuple unpacking is convenient and we'll use it often
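
One handy use of tuple unpacking (a standard Python idiom, added here as a quick aside) is swapping the
values of two names without a temporary variable

x, y = 1, 2
x, y = y, x   # Now x is 2 and y is 1
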

Slice Notation

To access multiple elements of a list or tuple, you can use Python's slice notation
For example,

a = [2, 4, 6, 8]
a[1:]

[4, 6, 8]

a[1:3]

[4, 6]

The general rule is that a[m:n] returns n - m elements, starting at a[m]


Negative numbers are also permissible

a[-2:] # Last two elements of the list

[6, 8]

The same slice notation works on tuples and strings

s = 'foobar'
s[-3:] # Select the last three elements

'bar'

Sets and Dictionaries

Two other container types we should mention before moving on are sets and dictionaries
Dictionaries are much like lists, except that the items are named instead of numbered

d = {'name': 'Frodo', 'age': 33}


type(d)

dict

d['age']

33


The names 'name' and 'age' are called the keys


The objects that the keys are mapped to ('Frodo' and 33) are called the values
Sets are unordered collections without duplicates, and set methods provide the usual set theoretic operations

s1 = {'a', 'b'}
type(s1)

set

s2 = {'b', 'c'}
s1.issubset(s2)

False

s1.intersection(s2)

{'b'}
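
Other standard operations, such as union and difference, are part of the same set API (element order in the
output may vary)

s1 | s2   # Union

{'a', 'b', 'c'}

s1 - s2   # Difference

{'a'}
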

The set() function creates sets from sequences

s3 = set(('foo', 'bar', 'foo'))


s3

{'foo', 'bar'}  # Unique elements only

1.4.2 Input and Output

Let's briefly review reading and writing to text files, starting with writing

f = open('newfile.txt', 'w') # Open 'newfile.txt' for writing


f.write('Testing\n') # Here '\n' means new line
f.write('Testing again')
f.close()

Here
• The built-in function open() creates a file object for writing to
• Both write() and close() are methods of file objects
Where is this file that we've created?
Recall that Python maintains a concept of the present working directory (pwd) that can be located from
within Jupyter or IPython via

%pwd

If a path is not specified, then this is where Python writes to


We can also use Python to read the contents of newfile.txt as follows

f = open('newfile.txt', 'r')
out = f.read()
out

'Testing\nTesting again'

print(out)

Testing
Testing again

Paths

Note that if newfile.txt is not in the present working directory then this call to open() fails
In this case you can shift the file to the pwd or specify the full path to the file

f = open('insert_full_path_to_file/newfile.txt', 'r')

1.4.3 Iterating

One of the most important tasks in computing is stepping through a sequence of data and performing a given
action
One of Python's strengths is its simple, flexible interface to this kind of iteration via the for loop

Looping over Different Objects

Many Python objects are iterable, in the sense that they can be looped over
To give an example, let's write the file us_cities.txt, which lists US cities and their population, to the present
working directory

%%file us_cities.txt
new york: 8244910
los angeles: 3819702
chicago: 2707120
houston: 2145146
philadelphia: 1536471
phoenix: 1469471
san antonio: 1359758
san diego: 1326179
dallas: 1223229

Suppose that we want to make the information more readable, by capitalizing names and adding commas to
mark thousands


The program us_cities.py reads the data in and makes the conversion:

data_file = open('us_cities.txt', 'r')

for line in data_file:
    city, population = line.split(':')    # Tuple unpacking
    city = city.title()                   # Capitalize city names
    population = f'{int(population):,}'   # Add commas to numbers
    print(city.ljust(15) + population)
data_file.close()

Here an f-string with the format specifier :, is used to insert commas into the population figures
The output is as follows

New York 8,244,910


Los Angeles 3,819,702
Chicago 2,707,120
Houston 2,145,146
Philadelphia 1,536,471
Phoenix 1,469,471
San Antonio 1,359,758
San Diego 1,326,179
Dallas 1,223,229

The reformatting of each line is the result of three different string methods, the details of which can be left
till later
The interesting part of this program for us is the for statement, which shows that
1. The file object data_file is iterable, in the sense that it can be placed to the right of in within a for loop
2. Iteration steps through each line in the file
This leads to the clean, convenient syntax shown in our program
Many other kinds of objects are iterable, and we'll discuss some of them later on

Looping without Indices

One thing you might have noticed is that Python tends to favor looping without explicit indexing
For example,

x_values = [1, 2, 3] # Some iterable x


for x in x_values:
    print(x * x)

is preferred to

for i in range(len(x_values)):
    print(x_values[i] * x_values[i])

When you compare these two alternatives, you can see why the first one is preferred


Python provides some facilities to simplify looping without indices


One is zip(), which is used for stepping through pairs from two sequences
For example, try running the following code
countries = ('Japan', 'Korea', 'China')
cities = ('Tokyo', 'Seoul', 'Beijing')
for country, city in zip(countries, cities):
    print(f'The capital of {country} is {city}')

The zip() function is also useful for creating dictionaries. For example,
names = ['Tom', 'John']
marks = ['E', 'F']
dict(zip(names, marks))

{'John': 'F', 'Tom': 'E'}

If we actually need the index from a list, one option is to use enumerate()
To understand what enumerate() does, consider the following example
letter_list = ['a', 'b', 'c']
for index, letter in enumerate(letter_list):
    print(f"letter_list[{index}] = '{letter}'")

The output of the loop is


letter_list[0] = 'a'
letter_list[1] = 'b'
letter_list[2] = 'c'

1.4.4 Comparisons and Logical Operators

Comparisons

Many different kinds of expressions evaluate to one of the Boolean values (i.e., True or False)
A common type is comparisons, such as
x, y = 1, 2
x < y

True

x > y

False

One of the nice features of Python is that we can chain inequalities


1 < 2 < 3

True

1 <= 2 <= 3

True

As we saw earlier, when testing for equality we use ==

x = 1 # Assignment
x == 2 # Comparison

False

For not equal use !=

1 != 2

True

Note that when testing conditions, we can use any valid Python expression

x = 'yes' if 42 else 'no'


x

'yes'

x = 'yes' if [] else 'no'


x

'no'

What's going on here?


The rule is:
• Expressions that evaluate to zero, empty sequences or containers (strings, lists, etc.) and None are all
equivalent to False
– for example, [] and () are equivalent to False in an if clause
• All other values are equivalent to True
– for example, 42 is equivalent to True in an if clause
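
As a quick check of these rules (using only the built-in bool() function, which performs the conversion
explicitly), we have

bool([]), bool(0), bool(42), bool('foo')

(False, False, True, True)
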

Combining Expressions

We can combine expressions using and, or and not


These are the standard logical connectives (conjunction, disjunction and denial)

1 < 2 and 'f' in 'foo'

True

1 < 2 and 'g' in 'foo'

False

1 < 2 or 'g' in 'foo'

True

not True

False

not not True

True

Remember
• P and Q is True if both are True, else False
• P or Q is False if both are False, else True

1.4.5 More Functions

Let's talk a bit more about functions, which are all-important for good programming style
Python has a number of built-in functions that are available without import
We have already met some

max(19, 20)

20

range(4) # in python3 this returns a range iterator object

range(0, 4)

list(range(4)) # will evaluate the range iterator and create a list


[0, 1, 2, 3]

str(22)

'22'

type(22)

int

Two more useful built-in functions are any() and all()

bools = False, True, True


all(bools) # True if all are True and False otherwise

False

any(bools) # False if all are False and True otherwise

True

The full list of Python built-ins is here


Now let's talk some more about user-defined functions constructed using the keyword def

Why Write Functions?

User-defined functions are important for improving the clarity of your code by
• separating different strands of logic
• facilitating code reuse
(Writing the same thing twice is almost always a bad idea)
The basics of user defined functions were discussed here

The Flexibility of Python Functions

As we discussed in the previous lecture, Python functions are very flexible


In particular
• Any number of functions can be defined in a given file
• Functions can be (and often are) defined inside other functions
• Any object can be passed to a function as an argument, including other functions
• A function can return any kind of object, including functions


We already gave an example of how straightforward it is to pass a function to a function


Note that a function can have arbitrarily many return statements (including zero)
Execution of the function terminates when the first return is hit, allowing code like the following example

def f(x):
    if x < 0:
        return 'negative'
    return 'nonnegative'

Functions without a return statement automatically return the special Python object None
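
For example, a small sketch: the following function has no return statement, so calling it yields None

def g(x):
    x + 1   # Computed and then discarded

print(g(2))

None
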

Docstrings

Python has a system for adding comments to functions, modules, etc. called docstrings
The nice thing about docstrings is that they are available at run-time
Try running this

def f(x):
    """
    This function squares its argument
    """
    return x**2

After running this code, the docstring is available

f?

Type: function
String Form:<function f at 0x2223320>
File: /home/john/temp/temp.py
Definition: f(x)
Docstring: This function squares its argument

f??

Type: function
String Form:<function f at 0x2223320>
File: /home/john/temp/temp.py
Definition: f(x)
Source:
def f(x):
    """
    This function squares its argument
    """
    return x**2

With one question mark we bring up the docstring, and with two we get the source code as well
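
The docstring can also be accessed programmatically via the function's __doc__ attribute, a standard
Python feature; a quick sketch, with the output shown up to surrounding whitespace

print(f.__doc__)

This function squares its argument
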


One-Line Functions: lambda

The lambda keyword is used to create simple functions on one line


For example, the definitions

def f(x):
    return x**3

and

f = lambda x: x**3

are entirely equivalent


To see why lambda is useful, suppose that we want to calculate $\int_0^2 x^3 \, dx$ (and have forgotten our
high-school calculus)
The SciPy library has a function called quad that will do this calculation for us
The syntax of the quad function is quad(f, a, b) where f is a function and a and b are numbers
To create the function f (x) = x3 we can use lambda as follows

from scipy.integrate import quad

quad(lambda x: x**3, 0, 2)

(4.0, 4.440892098500626e-14)

Here the function created by lambda is said to be anonymous, because it was never given a name

Keyword Arguments

If you did the exercises in the previous lecture, you would have come across the statement

plt.plot(x, 'b-', label="white noise")

In this call to Matplotlibs plot function, notice that the last argument is passed in name=argument
syntax
This is called a keyword argument, with label being the keyword
Non-keyword arguments are called positional arguments, since their meaning is determined by order
• plot(x, 'b-', label="white noise") is different from plot('b-', x,
label="white noise")
Keyword arguments are particularly useful when a function has a lot of arguments, in which case it's hard to
remember the right order
You can adopt keyword arguments in user-defined functions with no difficulty
The next example illustrates the syntax


def f(x, a=1, b=1):
    return a + b * x

The keyword argument values we supplied in the definition of f become the default values

f(2)

3

They can be modified as follows

f(2, a=4, b=5)

14

1.4.6 Coding Style and PEP8

To learn more about the Python programming philosophy type import this at the prompt
Among other things, Python strongly favors consistency in programming style
We've all heard the saying about consistency and little minds
In programming, as in mathematics, the opposite is true
• A mathematical paper where the symbols ∪ and ∩ were reversed would be very hard to read, even if
the author told you so on the first page
In Python, the standard style is set out in PEP8
(Occasionally we'll deviate from PEP8 in these lectures to better match mathematical notation)

1.4.7 Exercises

Solve the following exercises


(For some, the built-in function sum() comes in handy)

Exercise 1

Part 1: Given two numeric lists or tuples x_vals and y_vals of equal length, compute their inner product
using zip()
Part 2: In one line, count the number of even numbers in 0,...,99
• Hint: x % 2 returns 0 if x is even, 1 otherwise
Part 3: Given pairs = ((2, 5), (4, 2), (9, 8), (12, 10)), count the number of pairs
(a, b) such that both a and b are even


Exercise 2

Consider the polynomial

$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = \sum_{i=0}^{n} a_i x^i \tag{1.1}$$

Write a function p such that p(x, coeff) computes the value in (1.1) given a point x and a list of
coefficients coeff
Try to use enumerate() in your loop

Exercise 3

Write a function that takes a string as an argument and returns the number of capital letters in the string
Hint: 'foo'.upper() returns 'FOO'

Exercise 4

Write a function that takes two sequences seq_a and seq_b as arguments and returns True if every
element in seq_a is also an element of seq_b, else False
• By sequence we mean a list, a tuple or a string
• Do the exercise without using sets and set methods

Exercise 5

When we cover the numerical libraries, we will see they include many alternatives for interpolation and
function approximation
Nevertheless, let's write our own function approximation routine as an exercise
In particular, without using any imports, write a function linapprox that takes as arguments
• A function f mapping some interval [a, b] into R
• two scalars a and b providing the limits of this interval
• An integer n determining the number of grid points
• A number x satisfying a <= x <= b
and returns the piecewise linear interpolation of f at x, based on n evenly spaced grid points a =
point[0] < point[1] < ... < point[n-1] = b
Aim for clarity, not efficiency


1.4.8 Solutions

Exercise 1

Part 1 solution:

Here's one possible solution

x_vals = [1, 2, 3]
y_vals = [1, 1, 1]
sum([x * y for x, y in zip(x_vals, y_vals)])

6

This also works

sum(x * y for x, y in zip(x_vals, y_vals))

6

Part 2 solution:

One solution is

sum([x % 2 == 0 for x in range(100)])

50

This also works:

sum(x % 2 == 0 for x in range(100))

50

Some less natural alternatives that nonetheless help to illustrate the flexibility of list comprehensions are

len([x for x in range(100) if x % 2 == 0])

50

and

sum([1 for x in range(100) if x % 2 == 0])

50


Part 3 solution

Here's one possibility

pairs = ((2, 5), (4, 2), (9, 8), (12, 10))


sum([x % 2 == 0 and y % 2 == 0 for x, y in pairs])

2

Exercise 2

def p(x, coeff):
    return sum(a * x**i for i, a in enumerate(coeff))

p(1, (2, 4))

6

Exercise 3

Here's one solution:

def f(string):
    count = 0
    for letter in string:
        if letter == letter.upper() and letter.isalpha():
            count += 1
    return count

f('The Rain in Spain')

3

Exercise 4

Here's a solution:

def f(seq_a, seq_b):
    is_subset = True
    for a in seq_a:
        if a not in seq_b:
            is_subset = False
    return is_subset

# == test == #


print(f([1, 2], [1, 2, 3]))


print(f([1, 2, 3], [1, 2]))

True
False

Of course if we use the set data type then the solution is easier

def f(seq_a, seq_b):
    return set(seq_a).issubset(set(seq_b))

Exercise 5

def linapprox(f, a, b, n, x):
    """
    Evaluates the piecewise linear interpolant of f at x on the interval
    [a, b], with n evenly spaced grid points.

    Parameters
    ==========
    f : function
        The function to approximate

    x, a, b : scalars (floats or integers)
        Evaluation point and endpoints, with a <= x <= b

    n : integer
        Number of grid points

    Returns
    =======
    A float. The interpolant evaluated at x

    """
    length_of_interval = b - a
    num_subintervals = n - 1
    step = length_of_interval / num_subintervals

    # === find first grid point larger than x === #
    point = a
    while point <= x:
        point += step

    # === x must lie between the gridpoints (point - step) and point === #
    u, v = point - step, point

    return f(u) + (x - u) * (f(v) - f(u)) / (v - u)


1.5 OOP I: Introduction to Object Oriented Programming

Contents

• OOP I: Introduction to Object Oriented Programming


– Overview
– Objects
– Summary

1.5.1 Overview

OOP is one of the major paradigms in programming


The traditional programming paradigm (think Fortran, C, MATLAB, etc.) is called procedural
It works as follows
• The program has a state corresponding to the values of its variables
• Functions are called to act on these data
• Data are passed back and forth via function calls
In contrast, in the OOP paradigm
• data and functions are bundled together into objects
(Functions in this context are referred to as methods)

Python and OOP

Python is a pragmatic language that blends object-oriented and procedural styles, rather than taking a purist
approach
However, at a foundational level, Python is object oriented
In particular, in Python, everything is an object
In this lecture we explain what that statement means and why it matters

1.5.2 Objects

In Python, an object is a collection of data and instructions held in computer memory that consists of
1. a type
2. a unique identity
3. data (i.e., content)


4. methods
These concepts are defined and discussed sequentially below

Type

Python provides for different types of objects, to accommodate different categories of data
For example

s = 'This is a string'
type(s)

str

x = 42 # Now let's create an integer


type(x)

int

The type of an object matters for many expressions


For example, the addition operator between two strings means concatenation

'300' + 'cc'

'300cc'

On the other hand, between two numbers it means ordinary addition

300 + 400

700

Consider the following expression

'300' + 400

Here we are mixing types, and it's unclear to Python whether the user wants to
• convert '300' to an integer and then add it to 400, or
• convert 400 to string and then concatenate it with '300'
Some languages might try to guess but Python is strongly typed
• Type is important, and implicit type conversion is rare
• Python will respond instead by raising a TypeError


---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-1-9b7dffd27f2d> in <module>()
----> 1 '300' + 400

TypeError: Can't convert 'int' object to str implicitly

To avoid the error, you need to clarify by changing the relevant type
For example,

int('300') + 400 # To add as numbers, change the string to an integer

700

Identity

In Python, each object has a unique identifier, which helps Python (and us) keep track of the object
The identity of an object can be obtained via the id() function

y = 2.5
z = 2.5
id(y)

166719660

id(z)

166719740

In this example, y and z happen to have the same value (i.e., 2.5), but they are not the same object
The identity of an object is in fact just the address of the object in memory
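
One way to confirm that y and z are distinct objects, despite having equal values, is Python's is operator,
which compares identities rather than values

y == z   # Equal values

True

y is z   # But distinct objects

False
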

Object Content: Data and Attributes

If we set x = 42 then we create an object of type int that contains the data 42
In fact it contains more, as the following example shows

x = 42
x

42

x.imag

0


x.__class__

int

When Python creates this integer object, it stores with it various auxiliary information, such as the imaginary
part, and the type
Any name following a dot is called an attribute of the object to the left of the dot
• e.g., imag and __class__ are attributes of x
We see from this example that objects have attributes that contain auxiliary information
They also have attributes that act like functions, called methods
These attributes are important, so lets discuss them in depth

Methods

Methods are functions that are bundled with objects


Formally, methods are attributes of objects that are callable (i.e., can be called as functions)

x = ['foo', 'bar']
callable(x.append)

True

callable(x.__doc__)

False

Methods typically act on the data contained in the object they belong to, or combine that data with other
data

x = ['a', 'b']
x.append('c')
s = 'This is a string'
s.upper()

'THIS IS A STRING'

s.lower()

'this is a string'


s.replace('This', 'That')

'That is a string'

A great deal of Python functionality is organized around method calls


For example, consider the following piece of code

x = ['a', 'b']
x[0] = 'aa' # Item assignment using square bracket notation
x

['aa', 'b']

It doesn't look like there are any methods used here, but in fact the square bracket assignment notation is just
a convenient interface to a method call
What actually happens is that Python calls the __setitem__ method, as follows

x = ['a', 'b']
x.__setitem__(0, 'aa') # Equivalent to x[0] = 'aa'
x

['aa', 'b']

(If you wanted to you could modify the __setitem__ method, so that square bracket assignment does
something totally different)

1.5.3 Summary

In Python, everything in memory is treated as an object


This includes not just lists, strings, etc., but also less obvious things, such as
• functions (once they have been read into memory)
• modules (ditto)
• files opened for reading or writing
• integers, etc.
Consider, for example, functions
When Python reads a function definition, it creates a function object and stores it in memory
The following code illustrates

def f(x): return x**2


f


<function __main__.f>

type(f)

function

id(f)

3074342220

f.__name__

'f'

We can see that f has type, identity, attributes and so on, just like any other object
It also has methods
One example is the __call__ method, which just evaluates the function

f.__call__(3)

9

Another is the __dir__ method, which returns a list of attributes
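
For instance, the built-in dir() function, which calls __dir__ behind the scenes, lists names such as
'__call__', '__doc__' and '__name__'

dir(f)   # Returns a (long) list of attribute names
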


Modules loaded into memory are also treated as objects

import math

id(math)

3074329380

This uniform treatment of data in Python (everything is an object) helps keep the language simple and
consistent



CHAPTER 2: THE SCIENTIFIC LIBRARIES

Next we cover the third party libraries most useful for scientific work in Python

2.1 NumPy

Contents

• NumPy
– Overview
– Introduction to NumPy
– NumPy Arrays
– Operations on Arrays
– Additional Functionality
– Exercises
– Solutions

Lets be clear: the work of science has nothing whatever to do with consensus. Consensus is the
business of politics. Science, on the contrary, requires only one investigator who happens to be
right, which means that he or she has results that are verifiable by reference to the real world.
In science consensus is irrelevant. What is relevant is reproducible results. – Michael Crichton

2.1.1 Overview

NumPy is a first-rate library for numerical programming


• Widely used in academia, finance and industry
• Mature, fast, stable and under continuous development
In this lecture we introduce NumPy arrays and the fundamental array processing operations provided by
NumPy


References

• The official NumPy documentation

2.1.2 Introduction to NumPy

The essential problem that NumPy solves is fast array processing


For example, suppose we want to create an array of 1 million random draws from a uniform distribution and
compute the mean
If we did this in pure Python it would be orders of magnitude slower than C or Fortran
This is because
• Loops in Python over Python data types like lists carry significant overhead
• C and Fortran code contains a lot of type information that can be used for optimization
• Various optimizations can be carried out during compilation, when the compiler sees the instructions
as a whole
However, for a task like the one described above there's no need to switch back to C or Fortran
Instead we can use NumPy, where the instructions look like this:

import numpy as np

x = np.random.uniform(0, 1, size=1000000)
x.mean()

0.49990566939719772

The operations of creating the array and computing its mean are both passed out to carefully optimized
machine code compiled from C
More generally, NumPy sends operations in batches to optimized C and Fortran code
This is similar in spirit to Matlab, which provides an interface to fast Fortran routines

A Comment on Vectorization

NumPy is great for operations that are naturally vectorized


Vectorized operations are precompiled routines that can be sent in batches, like
• matrix multiplication and other linear algebra routines
• generating a vector of random numbers
• applying a fixed transformation (e.g., sine or cosine) to an entire array
In a later lecture we'll discuss code that isn't easy to vectorize and how such routines can also be optimized


2.1.3 NumPy Arrays

The most important thing that NumPy defines is an array data type formally called a numpy.ndarray
NumPy arrays power a large proportion of the scientific Python ecosystem
To create a NumPy array containing only zeros we use np.zeros

a = np.zeros(3)
a

array([0., 0., 0.])

type(a)

numpy.ndarray

NumPy arrays are somewhat like native Python lists, except that
• Data must be homogeneous (all elements of the same type)
• These types must be one of the data types (dtypes) provided by NumPy
The most important of these dtypes are:
• float64: 64 bit floating point number
• int64: 64 bit integer
• bool: 8 bit True or False
There are also dtypes to represent complex numbers, unsigned integers, etc
On modern machines, the default dtype for arrays is float64

a = np.zeros(3)
type(a[0])

numpy.float64

If we want to use integers we can specify as follows:

a = np.zeros(3, dtype=int)
type(a[0])

numpy.int64

Shape and Dimension

Consider the following assignment


z = np.zeros(10)

Here z is a flat array with no dimension: neither a row nor a column vector
The dimension is recorded in the shape attribute, which is a tuple

z.shape

(10,)

Here the shape tuple has only one element, which is the length of the array (tuples with one element end
with a comma)
To give it dimension, we can change the shape attribute

z.shape = (10, 1)
z

array([[0.],
[0.],
[0.],
[0.],
[0.],
[0.],
[0.],
[0.],
[0.],
[0.]])

z = np.zeros(4)
z.shape = (2, 2)
z

array([[0., 0.],
[0., 0.]])

In the last case, to make the 2 by 2 array, we could also pass a tuple to the zeros() function, as in z =
np.zeros((2, 2))

Creating Arrays

As we've seen, the np.zeros function creates an array of zeros


You can probably guess what np.ones creates
Related is np.empty, which creates arrays in memory that can later be populated with data

z = np.empty(3)
z


array([0., 0., 1.])

The numbers you see here are garbage values


(Python allocates 3 contiguous 64 bit pieces of memory, and the existing contents of those memory slots are
interpreted as float64 values)
To set up a grid of evenly spaced numbers use np.linspace

z = np.linspace(2, 4, 5) # From 2 to 4, with 5 elements

To create an identity matrix use either np.identity or np.eye

z = np.identity(2)
z

array([[1., 0.],
[0., 1.]])

In addition, NumPy arrays can be created from Python lists, tuples, etc. using np.array

z = np.array([10, 20]) # ndarray from Python list


z

array([10, 20])

type(z)

numpy.ndarray

z = np.array((10, 20), dtype=float)  # Here 'float' is equivalent to 'np.float64'
z

array([ 10., 20.])

z = np.array([[1, 2], [3, 4]]) # 2D array from a list of lists


z

array([[1, 2],
[3, 4]])

See also np.asarray, which performs a similar function, but does not make a distinct copy of data already
in a NumPy array

na = np.linspace(10, 20, 2)
na is np.asarray(na) # Does not copy NumPy arrays


True

na is np.array(na) # Does make a new copy --- perhaps unnecessarily

False

To read in the array data from a text file containing numeric data use np.loadtxt or np.genfromtxt;
see the documentation for details

Array Indexing

For a flat array, indexing is the same as Python sequences:

z = np.linspace(1, 2, 5)
z

array([1. , 1.25, 1.5 , 1.75, 2. ])

z[0]

1.0

z[0:2] # Two elements, starting at element 0

array([ 1. , 1.25])

z[-1]

2.0

For 2D arrays the index syntax is as follows:

z = np.array([[1, 2], [3, 4]])


z

array([[1, 2],
[3, 4]])

z[0, 0]

1

z[0, 1]

2


And so on
Note that indices are still zero-based, to maintain compatibility with Python sequences
Columns and rows can be extracted as follows

z[0, :]

array([1, 2])

z[:, 1]

array([2, 4])

NumPy arrays of integers can also be used to extract elements

z = np.linspace(2, 4, 5)
z

array([2. , 2.5, 3. , 3.5, 4. ])

indices = np.array((0, 2, 3))


z[indices]

array([2. , 3. , 3.5])

Finally, an array of dtype bool can be used to extract elements

z

array([2. , 2.5, 3. , 3.5, 4. ])

d = np.array([0, 1, 1, 0, 0], dtype=bool)


d

array([False, True, True, False, False])

z[d]

array([2.5, 3. ])

We'll see why this is useful below


An aside: all elements of an array can be set equal to one number using slice notation


z = np.empty(3)
z

array([2. , 3. , 3.5])

z[:] = 42
z

array([42., 42., 42.])

Array Methods

Arrays have useful methods, all of which are carefully optimized

a = np.array((4, 3, 2, 1))
a

array([4, 3, 2, 1])

a.sort() # Sorts a in place


a

array([1, 2, 3, 4])

a.sum() # Sum

10

a.mean() # Mean

2.5

a.max()  # Max

4

a.argmax()  # Returns the index of the maximal element

3

a.cumsum()  # Cumulative sum of the elements of a

array([ 1, 3, 6, 10])


a.cumprod() # Cumulative product of the elements of a

array([ 1, 2, 6, 24])

a.var() # Variance

1.25

a.std() # Standard deviation

1.1180339887498949

a.shape = (2, 2)
a.T # Equivalent to a.transpose()

array([[1, 3],
[2, 4]])

Another method worth knowing is searchsorted()


If z is a nondecreasing array, then z.searchsorted(a) returns the index of the first element of z that
is >= a

z = np.linspace(2, 4, 5)
z

array([2. , 2.5, 3. , 3.5, 4. ])

z.searchsorted(2.2)

1

Many of the methods discussed above have equivalent functions in the NumPy namespace

a = np.array((4, 3, 2, 1))

np.sum(a)

10

np.mean(a)

2.5


2.1.4 Operations on Arrays

Arithmetic Operations

The operators +, -, *, / and ** all act elementwise on arrays

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
a + b

array([ 6, 8, 10, 12])

a * b

array([ 5, 12, 21, 32])

We can add a scalar to each element as follows

a + 10

array([11, 12, 13, 14])

Scalar multiplication is similar

a * 10

array([10, 20, 30, 40])

The two dimensional arrays follow the same general rules

A = np.ones((2, 2))
B = np.ones((2, 2))
A + B

array([[2., 2.],
[2., 2.]])

A + 10

array([[11., 11.],
[11., 11.]])

A * B

array([[1., 1.],
[1., 1.]])

In particular, A * B is not the matrix product, it is an element-wise product


Matrix Multiplication

With Anaconda's scientific Python package based around Python 3.5 and above, one can use the @ symbol
for matrix multiplication, as follows:

A = np.ones((2, 2))
B = np.ones((2, 2))
A @ B

array([[2., 2.],
[2., 2.]])

(For older versions of Python and NumPy you need to use the np.dot function)
We can also use @ to take the inner product of two flat arrays

A = np.array((1, 2))
B = np.array((10, 20))
A @ B

50

In fact, we can use @ when one element is a Python list or tuple

A = np.array(((1, 2), (3, 4)))


A

array([[1, 2],
[3, 4]])

A @ (0, 1)

array([2, 4])

Since we are postmultiplying, the tuple is treated as a column vector
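
By the same logic, premultiplying treats the tuple as a row vector; for example, with A as defined above

(0, 1) @ A

array([3, 4])
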

Mutability and Copying Arrays

NumPy arrays are mutable data types, like Python lists


In other words, their contents can be altered (mutated) in memory after initialization
We already saw examples above
Heres another example:

a = np.array([42, 44])
a


array([42, 44])

a[-1] = 0 # Change last element to 0


a

array([42, 0])

Mutability leads to the following behavior (which can be shocking to MATLAB programmers)

a = np.random.randn(3)
a

array([ 0.69695818, -0.05165053, -1.12617761])

b = a
b[0] = 0.0
a

array([ 0. , -0.05165053, -1.12617761])

What's happened is that we have changed a by changing b


The name b is bound to a and becomes just another reference to the array (the Python assignment model is
described in more detail later in the course)
Hence, it has equal rights to make changes to that array
This is in fact the most sensible default behavior!
It means that we pass around only pointers to data, rather than making copies
Making copies is expensive in terms of both speed and memory

Making Copies

It is of course possible to make b an independent copy of a when required


This can be done using np.copy

a = np.random.randn(3)
a

array([ 0.67357176, -0.16532174, 0.36539759])

b = np.copy(a)
b

array([ 0.67357176, -0.16532174, 0.36539759])


Now b is an independent copy (called a deep copy)

b[:] = 1
b

array([1., 1., 1.])

a

array([ 0.67357176, -0.16532174, 0.36539759])

Note that the change to b has not affected a

2.1.5 Additional Functionality

Lets look at some other useful things we can do with NumPy

Vectorized Functions

NumPy provides versions of the standard functions log, exp, sin, etc. that act element-wise on arrays

z = np.array([1, 2, 3])
np.sin(z)

array([ 0.84147098, 0.90929743, 0.14112001])

This eliminates the need for explicit element-by-element loops such as

n = len(z)
y = np.empty(n)
for i in range(n):
    y[i] = np.sin(z[i])

Because they act element-wise on arrays, these functions are called vectorized functions
In NumPy-speak, they are also called ufuncs, which stands for universal functions
As we saw above, the usual arithmetic operations (+, *, etc.) also work element-wise, and combining these
with the ufuncs gives a very large set of fast element-wise functions

z

array([1, 2, 3])

(1 / np.sqrt(2 * np.pi)) * np.exp(- 0.5 * z**2)

array([0.24197072, 0.05399097, 0.00443185])


Not all user-defined functions will act element-wise


For example, passing the function f defined below a NumPy array causes a ValueError

def f(x):
    return 1 if x > 0 else 0

The NumPy function np.where provides a vectorized alternative:

x = np.random.randn(4)
x

array([-0.25521782, 0.38285891, -0.98037787, -0.083662 ])

np.where(x > 0, 1, 0) # Insert 1 if x > 0 true, otherwise 0

array([0, 1, 0, 0])

You can also use np.vectorize to vectorize a given function

def f(x): return 1 if x > 0 else 0

f = np.vectorize(f)
f(x) # Passing the same vector x as in the previous example

array([0, 1, 0, 0])

However, this approach doesn't always obtain the same speed as a more carefully crafted vectorized function

Comparisons

As a rule, comparisons on arrays are done element-wise

z = np.array([2, 3])
y = np.array([2, 3])
z == y

array([ True, True])

y[0] = 5
z == y

array([False, True])

z != y

array([ True, False])


The situation is similar for >, <, >= and <=


We can also do comparisons against scalars

z = np.linspace(0, 10, 5)
z

array([ 0. , 2.5, 5. , 7.5, 10. ])

z > 3

array([False, False, True, True, True])

This is particularly useful for conditional extraction

b = z > 3
b

array([False, False, True, True, True])

z[b]

array([ 5. , 7.5, 10. ])

Of course we can, and frequently do, perform this in one step

z[z > 3]

array([ 5. , 7.5, 10. ])

Subpackages

NumPy provides some additional functionality related to scientific programming through its subpackages
We've already seen how we can generate random variables using np.random

z = np.random.randn(10000) # Generate standard normals


y = np.random.binomial(10, 0.5, size=1000) # 1,000 draws from Bin(10, 0.5)
y.mean()

5.096

Another commonly used subpackage is np.linalg

A = np.array([[1, 2], [3, 4]])

np.linalg.det(A) # Compute the determinant


-2.0000000000000004

np.linalg.inv(A) # Compute the inverse

array([[-2. , 1. ],
[ 1.5, -0.5]])

Much of this functionality is also available in SciPy, a collection of modules that are built on top of NumPy
We'll cover the SciPy versions in more detail soon
For a comprehensive list of what's available in NumPy see this documentation

2.1.6 Exercises

Exercise 1

Consider the polynomial expression

$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_N x^N = \sum_{n=0}^{N} a_n x^n \tag{2.1}$$

Earlier, you wrote a simple function p(x, coeff) to evaluate (2.1) without considering efficiency
Now write a new function that does the same job, but uses NumPy arrays and array operations for its
computations, rather than any form of Python loop
(Such functionality is already implemented as np.poly1d, but for the sake of the exercise don't use this
class)
• Hint: Use np.cumprod()

Exercise 2

Let q be a NumPy array of length n with q.sum() == 1


Suppose that q represents a probability mass function
We wish to generate a discrete random variable x such that $\mathbb{P}\{x = i\} = q_i$
In other words, x takes values in range(len(q)) and x = i with probability q[i]
The standard (inverse transform) algorithm is as follows:
• Divide the unit interval [0, 1] into n subintervals $I_0, I_1, \ldots, I_{n-1}$ such that the length of $I_i$ is $q_i$
• Draw a uniform random variable U on [0, 1] and return the i such that $U \in I_i$
The probability of drawing i is the length of $I_i$, which is equal to $q_i$
We can implement the algorithm as follows


from random import uniform

def sample(q):
    a = 0.0
    U = uniform(0, 1)
    for i in range(len(q)):
        if a < U <= a + q[i]:
            return i
        a = a + q[i]

If you can't see how this works, try thinking through the flow for a simple example, such as q = [0.25,
0.75]. It helps to sketch the intervals on paper
Your exercise is to speed it up using NumPy, avoiding explicit loops
• Hint: Use np.searchsorted and np.cumsum
If you can, implement the functionality as a class called DiscreteRV, where
• the data for an instance of the class is the vector of probabilities q
• the class has a draw() method, which returns one draw according to the algorithm described above
If you can, write the method so that draw(k) returns k draws from q

Exercise 3

Recall our earlier discussion of the empirical cumulative distribution function


Your task is to
1. Make the __call__ method more efficient using NumPy
2. Add a method that plots the ECDF over [a, b], where a and b are method parameters

2.1.7 Solutions

import matplotlib.pyplot as plt

Exercise 1

This code does the job

def p(x, coef):
    X = np.empty(len(coef))
    X[0] = 1
    X[1:] = x
    y = np.cumprod(X)   # y = [1, x, x**2,...]
    return coef @ y

Lets test it


coef = np.ones(3)
print(coef)
print(p(1, coef))
# For comparison
q = np.poly1d(coef)
print(q(1))

[1. 1. 1.]
3.0
3.0

Exercise 2

Here's our first pass at a solution:

from numpy import cumsum
from numpy.random import uniform

class DiscreteRV:
    """
    Generates an array of draws from a discrete random variable with vector of
    probabilities given by q.
    """

    def __init__(self, q):
        """
        The argument q is a NumPy array, or array like, nonnegative and sums
        to 1
        """
        self.q = q
        self.Q = cumsum(q)

    def draw(self, k=1):
        """
        Returns k draws from q. For each such draw, the value i is returned
        with probability q[i].
        """
        return self.Q.searchsorted(uniform(0, 1, size=k))

The logic is not obvious, but if you take your time and read it slowly, you will understand
There is a problem here, however
Suppose that q is altered after an instance of DiscreteRV is created, for example by

q = (0.1, 0.9)
d = DiscreteRV(q)
d.q = (0.5, 0.5)

The problem is that Q does not change accordingly, and Q is the data used in the draw method
To deal with this, one option is to compute Q every time the draw method is called


But this is inefficient relative to computing Q just once


A better option is to use descriptors
A solution from the quantecon library using descriptors that behaves as we desire can be found here
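
As a rough sketch of the underlying idea (this is not the quantecon implementation, which differs in detail),
a Python property can recompute Q automatically whenever q is reset

class DiscreteRV:

    def __init__(self, q):
        self.q = q                      # Goes through the setter below

    @property
    def q(self):
        return self._q

    @q.setter
    def q(self, val):
        # Recompute the cumulative sum whenever q is reset
        self._q = np.asarray(val)
        self.Q = np.cumsum(self._q)

    def draw(self, k=1):
        return self.Q.searchsorted(np.random.uniform(0, 1, size=k))
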

Exercise 3

An example solution is given below.


In essence we've just taken this code from QuantEcon and added in a plot method

"""
Modifies ecdf.py from QuantEcon to add in a plot method

"""

class ECDF:
"""
One-dimensional empirical distribution function given a vector of
observations.

Parameters
----------
observations : array_like
An array of observations

Attributes
----------
observations : array_like
An array of observations

"""

def __init__(self, observations):


self.observations = np.asarray(observations)

def __call__(self, x):


"""
Evaluates the ecdf at x

Parameters
----------
x : scalar(float)
The x at which the ecdf is evaluated

Returns
-------
scalar(float)
Fraction of the sample less than x

"""
return np.mean(self.observations <= x)


    def plot(self, a=None, b=None):
        """
        Plot the ecdf on the interval [a, b].

        Parameters
        ----------
        a : scalar(float), optional(default=None)
            Lower end point of the plot interval
        b : scalar(float), optional(default=None)
            Upper end point of the plot interval

        """

        # === choose reasonable interval if [a, b] not specified === #
        if a is None:
            a = self.observations.min() - self.observations.std()
        if b is None:
            b = self.observations.max() + self.observations.std()

        # === generate plot === #
        x_vals = np.linspace(a, b, num=100)
        f = np.vectorize(self.__call__)
        plt.plot(x_vals, f(x_vals))
        plt.show()

Here's an example of usage

X = np.random.randn(1000)
F = ECDF(X)
F.plot()


2.2 Matplotlib

Contents

• Matplotlib
– Overview
– The APIs
– More Features
– Further Reading
– Exercises
– Solutions

2.2.1 Overview

We've already generated quite a few figures in these lectures using Matplotlib
Matplotlib is an outstanding graphics library, designed for scientific computing, with
• high quality 2D and 3D plots
• output in all the usual formats (PDF, PNG, etc.)
• LaTeX integration
• fine grained control over all aspects of presentation
• animation, etc.

Matplotlib's Split Personality

Matplotlib is unusual in that it offers two different interfaces to plotting


One is a simple MATLAB-style API (Application Programming Interface) that was written to help
MATLAB refugees find a ready home
The other is a more Pythonic object-oriented API
For reasons described below, we recommend that you use the second API
But first, let's discuss the difference

2.2.2 The APIs

The MATLAB-style API

Here's the kind of easy example you might find in introductory treatments


import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 200)
y = np.sin(x)

plt.plot(x, y, 'b-', linewidth=2)
plt.show()

This is simple and convenient, but also somewhat limited and un-Pythonic
For example, in the function calls, a lot of objects get created and passed around without making themselves
known to the programmer
Python programmers tend to prefer a more explicit style of programming (run import this in a code
block and look at the second line)
This leads us to the alternative, object oriented Matplotlib API

The Object Oriented API

Here's the code corresponding to the preceding figure using the object-oriented API

fig, ax = plt.subplots()
ax.plot(x, y, 'b-', linewidth=2)
plt.show()


Here the call fig, ax = plt.subplots() returns a pair, where
• fig is a Figure instance, like a blank canvas
• ax is an AxesSubplot instance; think of a frame for plotting in
The plot() function is actually a method of ax
While there's a bit more typing, the more explicit use of objects gives us better control
This will become more clear as we go along

Tweaks

Here we've changed the line to red and added a legend

fig, ax = plt.subplots()
ax.plot(x, y, 'r-', linewidth=2, label='sine function', alpha=0.6)
ax.legend()
plt.show()


We've also used alpha to make the line slightly transparent, which makes it look smoother
The location of the legend can be changed by replacing ax.legend() with ax.legend(loc='upper
center')

fig, ax = plt.subplots()
ax.plot(x, y, 'r-', linewidth=2, label='sine function', alpha=0.6)
ax.legend(loc='upper center')
plt.show()


If everything is properly configured, then adding LaTeX is trivial

fig, ax = plt.subplots()
ax.plot(x, y, 'r-', linewidth=2, label='$y=\sin(x)$', alpha=0.6)
ax.legend(loc='upper center')
plt.show()

The figure now looks as follows

Controlling the ticks, adding titles and so on is also straightforward

fig, ax = plt.subplots()
ax.plot(x, y, 'r-', linewidth=2, label='$y=\sin(x)$', alpha=0.6)
ax.legend(loc='upper center')
ax.set_yticks([-1, 0, 1])
ax.set_title('Test plot')
plt.show()

Here's the figure


2.2.3 More Features

Matplotlib has a huge array of functions and features, which you can discover over time as you have need
for them
We mention just a few

Multiple Plots on One Axis

It's straightforward to generate multiple plots on the same axes


Here's an example that randomly generates three normal densities and adds a label with their mean

from scipy.stats import norm
from random import uniform

fig, ax = plt.subplots()
x = np.linspace(-4, 4, 150)
for i in range(3):
    m, s = uniform(-1, 1), uniform(1, 2)
    y = norm.pdf(x, loc=m, scale=s)
    current_label = f'$\mu = {m:.2}$'
    ax.plot(x, y, linewidth=2, alpha=0.6, label=current_label)
ax.legend()
plt.show()


Multiple Subplots

Sometimes we want multiple subplots in one figure


Here's an example that generates 6 histograms

num_rows, num_cols = 3, 2
fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 12))
for i in range(num_rows):
    for j in range(num_cols):
        m, s = uniform(-1, 1), uniform(1, 2)
        x = norm.rvs(loc=m, scale=s, size=100)
        axes[i, j].hist(x, alpha=0.6, bins=20)
        t = f'$\mu = {m:.2}, \quad \sigma = {s:.2}$'
        axes[i, j].set(title=t, xticks=[-4, 0, 4], yticks=[])
plt.show()

The output looks as follows


3D Plots

Matplotlib does a nice job of 3D plots; here is one example


from mpl_toolkits.mplot3d.axes3d import Axes3D
from matplotlib import cm

def f(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

xgrid = np.linspace(-3, 3, 50)
ygrid = xgrid
x, y = np.meshgrid(xgrid, ygrid)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(x,
                y,
                f(x, y),
                rstride=2, cstride=2,
                cmap=cm.jet,
                alpha=0.7,
                linewidth=0.25)
ax.set_zlim(-0.5, 1.0)
plt.show()


A Customizing Function

Perhaps you will find a set of customizations that you regularly use
Suppose we usually prefer our axes to go through the origin, and to have a grid
Here's a nice example from Matthew Doty of how the object-oriented API can be used to build a custom
subplots function that implements these changes
Read carefully through the code and see if you can follow what's going on

def subplots():
    "Custom subplots with axes through the origin"
    fig, ax = plt.subplots()

    # Set the axes through the origin
    for spine in ['left', 'bottom']:
        ax.spines[spine].set_position('zero')
    for spine in ['right', 'top']:
        ax.spines[spine].set_color('none')

    ax.grid()
    return fig, ax

fig, ax = subplots()  # Call the local version, not plt.subplots()
x = np.linspace(-2, 10, 200)
y = np.sin(x)
ax.plot(x, y, 'r-', linewidth=2, label='sine function', alpha=0.6)
ax.legend(loc='lower right')
plt.show()

Here's the figure it produces (note axes through the origin and the grid)


The custom subplots function


1. calls the standard plt.subplots function internally to generate the fig, ax pair,
2. makes the desired customizations to ax, and
3. passes the fig, ax pair back to the calling code

2.2.4 Further Reading

• The Matplotlib gallery provides many examples


• A nice Matplotlib tutorial by Nicolas P. Rougier, Mike Müller and Gaël Varoquaux
• mpltools allows easy switching between plot styles
• Seaborn facilitates common statistics plots in Matplotlib

2.2.5 Exercises

Exercise 1

Plot the function

f(x) = cos(πθx) exp(−x)

over the interval [0, 5] for each θ in np.linspace(0, 2, 10)


Place all the curves in the same figure
The output should look like this


2.2.6 Solutions

Exercise 1

Here's one solution

θ_vals = np.linspace(0, 2, 10)
x = np.linspace(0, 5, 200)
fig, ax = plt.subplots()

for θ in θ_vals:
    ax.plot(x, np.cos(np.pi * θ * x) * np.exp(- x))

plt.show()

2.3 SciPy

Contents

• SciPy
– SciPy versus NumPy
– Statistics
– Roots and Fixed Points
– Optimization
– Integration
– Linear Algebra
– Exercises
– Solutions

SciPy builds on top of NumPy to provide common tools for scientific programming, such as
• linear algebra
• numerical integration
• interpolation
• optimization
• distributions and random number generation
• signal processing
• etc., etc


Like NumPy, SciPy is stable, mature and widely used


Many SciPy routines are thin wrappers around industry-standard Fortran libraries such as LAPACK, BLAS,
etc.
It's not really necessary to learn SciPy as a whole
A more common approach is to get some idea of what's in the library and then look up documentation as
required
In this lecture we aim only to highlight some useful parts of the package

2.3.1 SciPy versus NumPy

SciPy is a package that contains various tools that are built on top of NumPy, using its array data type and
related functionality
In fact, when we import SciPy we also get NumPy, as can be seen from the SciPy initialization file

# Import numpy symbols to scipy name space
import numpy as _num
linalg = None
from numpy import *
from numpy.random import rand, randn
from numpy.fft import fft, ifft
from numpy.lib.scimath import *

__all__ = []
__all__ += _num.__all__
__all__ += ['randn', 'rand', 'fft', 'ifft']

del _num
# Remove the linalg imported from numpy so that the scipy.linalg package can
# be imported.
del linalg
__all__.remove('linalg')

However, it's more common and better practice to use NumPy functionality explicitly

import numpy as np

a = np.identity(3)

What is useful in SciPy is the functionality in its subpackages


• scipy.optimize, scipy.integrate, scipy.stats, etc.
These subpackages and their attributes need to be imported separately

from scipy.integrate import quad


from scipy.optimize import brentq
# etc


Let's explore some of the major subpackages

2.3.2 Statistics

The scipy.stats subpackage supplies


• numerous random variable objects (densities, cumulative distributions, random sampling, etc.)
• some estimation procedures
• some statistical tests

Random Variables and Distributions

Recall that numpy.random provides functions for generating random variables

np.random.beta(5, 5, size=3)

array([ 0.6167565 , 0.67994589, 0.32346476])

This generates a draw from the distribution below when a, b = 5, 5

f(x; a, b) = x^(a−1) (1 − x)^(b−1) / ∫_0^1 u^(a−1) (1 − u)^(b−1) du,    0 ≤ x ≤ 1    (2.2)

Sometimes we need access to the density itself, or the cdf, the quantiles, etc.
For this we can use scipy.stats, which provides all of this functionality as well as random number
generation in a single consistent interface
Here's an example of usage

from scipy.stats import beta
import matplotlib.pyplot as plt

q = beta(5, 5)       # Beta(a, b), with a = b = 5
obs = q.rvs(2000)    # 2000 observations
grid = np.linspace(0.01, 0.99, 100)

fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(obs, bins=40, density=True)   # normed=True in older Matplotlib versions
ax.plot(grid, q.pdf(grid), 'k-', linewidth=2)
plt.show()

The following plot is produced


In this code we created a so-called rv_frozen object, via the call q = beta(5, 5)
The frozen part of the notation implies that q represents a particular distribution with a particular set of
parameters
Once we've done so, we can then generate random numbers, evaluate the density, etc., all from this fixed
distribution

q.cdf(0.4) # Cumulative distribution function

0.2665676800000002

q.pdf(0.4) # Density function

2.0901888000000004

q.ppf(0.8) # Quantile (inverse cdf) function

0.63391348346427079

q.mean()

0.5

The general syntax for creating these objects is


identifier = scipy.stats.distribution_name(shape_parameters)
where distribution_name is one of the distribution names in scipy.stats
There are also two keyword arguments, loc and scale, which, following our example above, are called as
identifier = scipy.stats.distribution_name(shape_parameters,
loc=c, scale=d)
These transform the original random variable X into Y = c + dX
The methods rvs, pdf, cdf, etc. are transformed accordingly
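For instance, here is a minimal illustration of our own (not from the lecture) using the normal distribution

from scipy.stats import norm

g = norm(loc=1.0, scale=2.0)   # Y = 1 + 2 Z, where Z is standard normal
g.mean(), g.std()              # (1.0, 2.0)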
Before finishing this section, we note that there is an alternative way of calling the methods described above
For example, the previous code can be replaced by

obs = beta.rvs(5, 5, size=2000)
grid = np.linspace(0.01, 0.99, 100)

fig, ax = plt.subplots()
ax.hist(obs, bins=40, density=True)   # normed=True in older Matplotlib versions
ax.plot(grid, beta.pdf(grid, 5, 5), 'k-', linewidth=2)
plt.show()

Other Goodies in scipy.stats

There are a variety of statistical functions in scipy.stats


For example, scipy.stats.linregress implements simple linear regression

from scipy.stats import linregress

x = np.random.randn(200)
y = 2 * x + 0.1 * np.random.randn(200)
gradient, intercept, r_value, p_value, std_err = linregress(x, y)
gradient, intercept

(1.9962554379482236, 0.008172822032671799)

To see the full list, consult the documentation

2.3.3 Roots and Fixed Points

A root of a real function f on [a, b] is an x ∈ [a, b] such that f(x) = 0


For example, if we plot the function

f(x) = sin(4(x − 1/4)) + x + x^20 − 1    (2.3)

with x ∈ [0, 1] we get


f = lambda x: np.sin(4 * (x - 1/4)) + x + x**20 - 1


x = np.linspace(0, 1, 100)

plt.figure(figsize=(10, 8))
plt.plot(x, f(x))
plt.axhline(ls='--', c='k')
plt.show()

The unique root is approximately 0.408

Let's consider some numerical techniques for finding roots

Bisection

One of the most common algorithms for numerical root finding is bisection
To understand the idea, recall the well known game where
• Player A thinks of a secret number between 1 and 100


• Player B asks if it's less than 50

  – If yes, B asks if it's less than 25
  – If no, B asks if it's less than 75

And so on
This is bisection
Here's a fairly simplistic implementation of the algorithm in Python
It works for all sufficiently well behaved increasing continuous functions with f(a) < 0 < f(b)

def bisect(f, a, b, tol=10e-5):
    """
    Implements the bisection root finding algorithm, assuming that f is a
    real-valued function on [a, b] satisfying f(a) < 0 < f(b).
    """
    lower, upper = a, b

    while upper - lower > tol:
        middle = 0.5 * (upper + lower)
        # === if root is between lower and middle === #
        if f(middle) > 0:
            lower, upper = lower, middle
        # === if root is between middle and upper === #
        else:
            lower, upper = middle, upper

    return 0.5 * (upper + lower)

In fact SciPy provides its own bisection function, which we now test using the function f defined in (2.3)

from scipy.optimize import bisect

bisect(f, 0, 1)

0.40829350427936706

The Newton-Raphson Method

Another very common root-finding algorithm is the Newton-Raphson method


In SciPy this algorithm is implemented by scipy.optimize.newton
Unlike bisection, the Newton-Raphson method uses local slope information
This is a double-edged sword:
• When the function is well-behaved, the Newton-Raphson method is faster than bisection
• When the function is less well-behaved, the Newton-Raphson might fail


Let's investigate this using the same function f, first looking at potential instability

from scipy.optimize import newton

newton(f, 0.2) # Start the search at initial condition x = 0.2

0.40829350427935679

newton(f, 0.7) # Start the search at x = 0.7 instead

0.70017000000002816

The second initial condition leads to failure of convergence


On the other hand, using IPython's timeit magic, we see that newton can be much faster

%timeit bisect(f, 0, 1)

1000 loops, best of 3: 261 us per loop

%timeit newton(f, 0.2)

10000 loops, best of 3: 60.2 us per loop

Hybrid Methods

So far we have seen that the Newton-Raphson method is fast but not robust
The bisection algorithm is robust but relatively slow
This illustrates a general principle
• If you have specific knowledge about your function, you might be able to exploit it to generate effi-
ciency
• If not, then the algorithm choice involves a trade-off between speed of convergence and robustness
In practice, most default algorithms for root finding, optimization and fixed points use hybrid methods
These methods typically combine a fast method with a robust method in the following manner:
1. Attempt to use a fast method
2. Check diagnostics
3. If diagnostics are bad, then switch to a more robust algorithm
In scipy.optimize, the function brentq is such a hybrid method, and a good default

brentq(f, 0, 1)


0.40829350427936706

%timeit brentq(f, 0, 1)

10000 loops, best of 3: 63.2 us per loop

Here the correct solution is found and the speed is almost the same as newton

Multivariate Root Finding

Use scipy.optimize.fsolve, a wrapper for a hybrid method in MINPACK


See the documentation for details
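As a minimal sketch (this particular system is our own choice, not from the lecture), fsolve finds a root of a system of equations given an initial guess

from scipy.optimize import fsolve

def equations(z):
    x, y = z
    # Unit circle intersected with the 45 degree line
    return [x**2 + y**2 - 1, x - y]

fsolve(equations, [1.0, 1.0])   # array([0.70710678, 0.70710678])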

Fixed Points

SciPy has a function for finding (scalar) fixed points too

from scipy.optimize import fixed_point

fixed_point(lambda x: x**2, 10.0) # 10.0 is an initial guess

array(1.0)

If you don't get good results, you can always switch back to the brentq root finder, since the fixed point of
a function f is the root of g(x) := x − f(x)
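For instance, here is a small sketch of our own using this idea (the function is a toy example)

from scipy.optimize import brentq

f = lambda x: x**0.5       # f has a fixed point at x = 1
g = lambda x: x - f(x)     # A root of g is a fixed point of f
brentq(g, 0.5, 2.0)        # Approximately 1.0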

2.3.4 Optimization

Most numerical packages provide only functions for minimization


Maximization can be performed by recalling that the maximizer of a function f on domain D is the
minimizer of −f on D
Minimization is closely related to root finding: For smooth functions, interior optima correspond to roots of
the first derivative
The speed/robustness trade-off described above is present with numerical optimization too
Unless you have some prior information you can exploit, it's usually best to use hybrid methods
For constrained, univariate (i.e., scalar) minimization, a good hybrid option is fminbound

from scipy.optimize import fminbound

fminbound(lambda x: x**2, -1, 2) # Search in [-1, 2]


0.0
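To maximize, we can minimize the negative of the objective; a small sketch of our own

from scipy.optimize import fminbound

f = lambda x: -(x - 1)**2 + 3                  # Toy objective, maximized at x = 1
maximizer = fminbound(lambda x: -f(x), -1, 2)  # Minimize -f to maximize f
maximizer                                      # Approximately 1.0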

Multivariate Optimization

Multivariate local optimizers include minimize, fmin, fmin_powell, fmin_cg, fmin_bfgs, and
fmin_ncg
Constrained multivariate local optimizers include fmin_l_bfgs_b, fmin_tnc, fmin_cobyla
See the documentation for details
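As a quick sketch of ours, minimize applies a sensible default method given an objective and an initial guess

from scipy.optimize import minimize

# Minimize the Rosenbrock function, whose minimizer is (1, 1)
result = minimize(lambda z: (1 - z[0])**2 + 100 * (z[1] - z[0]**2)**2,
                  [0.0, 0.0])
result.x    # Approximately array([1., 1.])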

2.3.5 Integration

Most numerical integration methods work by computing the integral of an approximating polynomial
The resulting error depends on how well the polynomial fits the integrand, which in turn depends on how
regular the integrand is
In SciPy, the relevant module for numerical integration is scipy.integrate
A good default for univariate integration is quad

from scipy.integrate import quad

integral, error = quad(lambda x: x**2, 0, 1)


integral

0.33333333333333337

In fact quad is an interface to a very standard numerical integration routine in the Fortran library
QUADPACK
It uses adaptive quadrature based on Gauss–Kronrod rules
There are other options for univariate integration. A useful one is fixed_quad, which is fast and hence
works well inside for loops
There are also functions for multivariate integration
See the documentation for more details
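For example, here is a small sketch of our own using fixed_quad, which applies fixed-order Gaussian quadrature

from scipy.integrate import fixed_quad

integral, _ = fixed_quad(lambda x: x**2, 0, 1, n=5)   # Order-5 rule
integral                                              # Approximately 0.3333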

2.3.6 Linear Algebra

We saw that NumPy provides a module for linear algebra called linalg
SciPy also provides a module for linear algebra with the same name
The latter is not an exact superset of the former, but overall it has more functionality
We leave you to investigate the set of available routines
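As a starting point, here is a tiny sample of ours

import numpy as np
from scipy import linalg

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
linalg.det(A)                           # 5.0
linalg.solve(A, np.array([1.0, 0.0]))   # Solves Ax = b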


2.3.7 Exercises

Exercise 1

Previously we discussed the concept of recursive function calls

Write a recursive implementation of the bisection function described above, which we repeat here for
convenience

def bisect(f, a, b, tol=10e-5):
    """
    Implements the bisection root finding algorithm, assuming that f is a
    real-valued function on [a, b] satisfying f(a) < 0 < f(b).
    """
    lower, upper = a, b

    while upper - lower > tol:
        middle = 0.5 * (upper + lower)
        # === if root is between lower and middle === #
        if f(middle) > 0:
            lower, upper = lower, middle
        # === if root is between middle and upper === #
        else:
            lower, upper = middle, upper

    return 0.5 * (upper + lower)

Test it on the function f = lambda x: np.sin(4 * (x - 0.25)) + x + x**20 - 1
discussed above

2.3.8 Solutions

Exercise 1

Here's a reasonable solution:

def bisect(f, a, b, tol=10e-5):
    """
    Implements the bisection root finding algorithm, assuming that f is a
    real-valued function on [a, b] satisfying f(a) < 0 < f(b).
    """
    lower, upper = a, b
    if upper - lower < tol:
        return 0.5 * (upper + lower)
    else:
        middle = 0.5 * (upper + lower)
        print(f'Current mid point = {middle}')
        if f(middle) > 0:    # Implies root is between lower and middle
            return bisect(f, lower, middle)
        else:                # Implies root is between middle and upper
            return bisect(f, middle, upper)

We can test it as follows

f = lambda x: np.sin(4 * (x - 0.25)) + x + x**20 - 1


bisect(f, 0, 1)

Current mid point = 0.5


Current mid point = 0.25
Current mid point = 0.375
Current mid point = 0.4375
Current mid point = 0.40625
Current mid point = 0.421875
Current mid point = 0.4140625
Current mid point = 0.41015625
Current mid point = 0.408203125
Current mid point = 0.4091796875
Current mid point = 0.40869140625
Current mid point = 0.408447265625
Current mid point = 0.4083251953125
Current mid point = 0.40826416015625

2.4 Numba

Contents

• Numba
– Overview
– Where are the Bottlenecks?
– Vectorization
– Numba

2.4.1 Overview

In our lecture on NumPy we learned one method to improve speed and efficiency in numerical work
That method, called vectorization, involved sending array processing operations in batch to efficient low
level code
This clever idea dates back to Matlab, which uses it extensively
Unfortunately, vectorization is limited and has several weaknesses
One weakness is that it is highly memory intensive


Another problem is that only some algorithms can be vectorized


In the last few years, a new Python library called Numba has appeared that solves many of these problems
It does so through something called just in time (JIT) compilation
JIT compilation is effective in many numerical settings and can generate extremely fast, efficient code
It can also do other tricks such as facilitate multithreading (a form of parallelization well suited to numerical
work)

The Need for Speed

To understand what Numba does and why, we need some background knowledge
Let's start by thinking about higher level languages, such as Python
These languages are optimized for humans
This means that the programmer can leave many details to the runtime environment
• specifying variable types
• memory allocation/deallocation, etc.
The upside is that, compared to low-level languages, Python is typically faster to write, less error prone and
easier to debug
The downside is that Python is harder to optimize (that is, turn into fast machine code) than languages like
C or Fortran
Indeed, the standard implementation of Python (called CPython) cannot match the speed of compiled
languages such as C or Fortran
Does that mean that we should just switch to C or Fortran for everything?
The answer is no, no and one hundred times no
High productivity languages should be chosen over high speed languages for the great majority of scientific
computing tasks
This is because
1. Of any given program, relatively few lines are ever going to be time-critical
2. For those lines of code that are time-critical, we can achieve C-like speed using a combination of
NumPy and Numba
This lecture provides a guide

2.4.2 Where are the Bottlenecks?

Let's start by trying to understand why high level languages like Python are slower than compiled code


Dynamic Typing

Consider this Python operation

a, b = 10, 10
a + b

20

Even for this simple operation, the Python interpreter has a fair bit of work to do
For example, in the statement a + b, the interpreter has to know which operation to invoke
If a and b are strings, then a + b requires string concatenation

a, b = 'foo', 'bar'
a + b

'foobar'

If a and b are lists, then a + b requires list concatenation

a, b = ['foo'], ['bar']
a + b

['foo', 'bar']

(We say that the operator + is overloaded; its action depends on the type of the objects on which it acts)
As a result, Python must check the type of the objects and then call the correct operation
This involves substantial overheads

Static Types

Compiled languages avoid these overheads with explicit, static types


For example, consider the following C code, which sums the integers from 1 to 10

#include <stdio.h>

int main(void) {
    int i;
    int sum = 0;
    for (i = 1; i <= 10; i++) {
        sum = sum + i;
    }
    printf("sum = %d\n", sum);
    return 0;
}


The variables i and sum are explicitly declared to be integers


Hence, the meaning of addition here is completely unambiguous

Data Access

Another drag on speed for high level languages is data access


To illustrate, let's consider the problem of summing some data, say, a collection of integers

Summing with Compiled Code

In C or Fortran, these integers would typically be stored in an array, which is a simple data structure for
storing homogeneous data
Such an array is stored in a single contiguous block of memory
• In modern computers, memory addresses are allocated to each byte (one byte = 8 bits)
• For example, a 64 bit integer is stored in 8 bytes of memory
• An array of n such integers occupies 8n consecutive memory slots
Moreover, the compiler is made aware of the data type by the programmer
• In this case 64 bit integers
Hence, each successive data point can be accessed by shifting forward in memory space by a known and
fixed amount
• In this case 8 bytes
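NumPy arrays follow the same contiguous layout, which we can inspect directly (a quick illustration of ours)

import numpy as np

a = np.array([1, 2, 3], dtype=np.int64)
a.itemsize   # 8 bytes per 64 bit integer
a.nbytes     # 24 bytes in one contiguous block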

Summing in Pure Python

Python tries to replicate these ideas to some degree


For example, in the standard Python implementation (CPython), list elements are placed in memory
locations that are in a sense contiguous
However, these list elements are more like pointers to data rather than actual data
Hence, there is still overhead involved in accessing the data values themselves
This is a considerable drag on speed
In fact, it's generally true that memory traffic is a major culprit when it comes to slow execution
Let's look at some ways around these problems


2.4.3 Vectorization

Vectorization is about sending batches of related operations to native machine code


• The machine code itself is typically compiled from carefully optimized C or Fortran
This can greatly accelerate many (but not all) numerical computations

Operations on Arrays

First let's run some imports

import random
import numpy as np
import quantecon as qe

Now let's try this non-vectorized code

qe.util.tic()   # Start timing
n = 100_000
sum = 0
for i in range(n):
    x = random.uniform(0, 1)
    sum += x**2
qe.util.toc()   # End timing

TOC: Elapsed: 0.055155277252197266 seconds.

Now compare this vectorized code

qe.util.tic()
n = 100_000
x = np.random.uniform(0, 1, n)
np.sum(x**2)
qe.util.toc()

TOC: Elapsed: 0.0016531944274902344 seconds.

The second code block, which achieves the same thing as the first, runs much faster
The reason is that in the second implementation we have broken the loop down into three basic operations
1. draw n uniforms
2. square them
3. sum them
These are sent as batch operators to optimized machine code
Apart from minor overheads associated with sending data back and forth, the result is C or Fortran-like
speed
When we run batch operations on arrays like this, we say that the code is vectorized


Vectorized code is typically fast and efficient


It is also surprisingly flexible, in the sense that many operations can be vectorized
The next section illustrates this point

Universal Functions

Many functions provided by NumPy are so-called universal functions, also called ufuncs
This means that they
• map scalars into scalars, as expected
• map arrays into arrays, acting element-wise
For example, np.cos is a ufunc:

np.cos(1.0)

0.54030230586813977

np.cos(np.linspace(0, 1, 3))

array([ 1., 0.87758256, 0.54030231])

By exploiting ufuncs, many operations can be vectorized

For example, consider the problem of maximizing a function f of two variables (x, y) over the square
[−a, a] × [−a, a]
For f and a let's choose

f(x, y) = cos(x^2 + y^2) / (1 + x^2 + y^2)    and    a = 3

Here's a plot of f

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.axes3d import Axes3D
from matplotlib import cm

def f(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

xgrid = np.linspace(-3, 3, 50)
ygrid = xgrid
x, y = np.meshgrid(xgrid, ygrid)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(x,
                y,
                f(x, y),
                rstride=2, cstride=2,
                cmap=cm.jet,
                alpha=0.7,
                linewidth=0.25)
ax.set_zlim(-0.5, 1.0)
plt.show()

To maximize it, we're going to use a naive grid search:

1. Evaluate f for all (x, y) in a grid on the square
2. Return the maximum of observed values

Here's a non-vectorized version that uses Python loops

def f(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

grid = np.linspace(-3, 3, 1000)
m = -np.inf

qe.tic()
for x in grid:
    for y in grid:
        z = f(x, y)
        if z > m:
            m = z
qe.toc()

TOC: Elapsed: 2.508265256881714 seconds.

And here's a vectorized version

def f(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

grid = np.linspace(-3, 3, 1000)
x, y = np.meshgrid(grid, grid)

qe.tic()
np.max(f(x, y))
qe.toc()

TOC: Elapsed: 0.048872947692871094 seconds.

In the vectorized version, all the looping takes place in compiled code
As you can see, the second version is much faster
(We'll make it even faster again below, when we discuss Numba)

Pros and Cons of Vectorization

At its best, vectorization yields fast, simple code


However, it's not without disadvantages
One issue is that it can be highly memory intensive
For example, the vectorized maximization routine above is far more memory intensive than the
non-vectorized version that preceded it
Another issue is that not all algorithms can be vectorized
In these kinds of settings, we need to go back to loops
Fortunately, there are nice ways to speed up Python loops

2.4.4 Numba

One exciting development in this direction is Numba


Numba aims to automatically compile functions to native machine code instructions on the fly
The process isn't flawless, since Numba needs to infer type information on all variables to generate pure
machine instructions


Such inference isn't possible in every setting


But for simple routines Numba infers types very well
Moreover, the hot loops at the heart of our code that we need to speed up are often such simple routines

Prerequisites

If you followed our set up instructions, then Numba should be installed


Make sure you have the latest version of Anaconda by running conda update anaconda from a
terminal (Mac, Linux) / Anaconda command prompt (Windows)

An Example

Let's consider some problems that are difficult to vectorize

One is generating the trajectory of a difference equation given an initial condition
Let's take the difference equation to be the quadratic map

x_{t+1} = 4 x_t (1 − x_t)

Here's the plot of a typical trajectory, starting from x_0 = 0.1, with t on the x-axis

def qm(x0, n):
    x = np.empty(n+1)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4 * x[t] * (1 - x[t])
    return x

x = qm(0.1, 250)
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(x, 'b-', lw=2, alpha=0.8)
ax.set_xlabel('time', fontsize=16)
plt.show()


Speeding this up is trivial using Numba's jit function

from numba import jit

qm_numba = jit(qm) # qm_numba is now a 'compiled' version of qm

Let's time and compare identical function calls across these two versions:

qe.util.tic()
qm(0.1, int(10**5))
time1 = qe.util.toc()

TOC: Elapsed: 0.07170653343200684 seconds.

qe.util.tic()
qm_numba(0.1, int(10**5))
time2 = qe.util.toc()

TOC: Elapsed: 0.06515693664550781 seconds.

The first execution is relatively slow because of JIT compilation (see below)
Next time and all subsequent times it runs much faster:

qe.util.tic()
qm_numba(0.1, int(10**5))
time2 = qe.util.toc()

TOC: Elapsed: 0.0003921985626220703 seconds.

time1 / time2 # Calculate speed gain

182.8322188449848

That's a speed increase of two orders of magnitude!


Your mileage will of course vary depending on hardware and so on
Nonetheless, two orders of magnitude is huge relative to how simple and clear the implementation is

Decorator Notation

If you don't need a separate name for the numbafied version of qm, you can just put @jit before the function

@jit
def qm(x0, n):
    x = np.empty(n+1)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4 * x[t] * (1 - x[t])
    return x

This is equivalent to qm = jit(qm)

How and When it Works

Numba attempts to generate fast machine code using the infrastructure provided by the LLVM Project
It does this by inferring type information on the fly
As you can imagine, this is easier for simple Python objects (simple scalar data types, such as floats, integers,
etc.)
Numba also plays well with NumPy arrays, which it treats as typed memory regions
In an ideal setting, Numba can infer all necessary type information
This allows it to generate native machine code, without having to call the Python runtime environment
In such a setting, Numba will be on par with machine code from low level languages
When Numba cannot infer all type information, some Python objects are given generic object status, and
some code is generated using the Python runtime
In this second setting, Numba typically provides only minor speed gains or none at all
Hence, it's prudent when using Numba to focus on speeding up small, time-critical snippets of code


This will give you much better performance than blanketing your Python programs with @jit statements
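If you want an error rather than a silent fall back to object mode, Numba lets you request nopython mode explicitly; here is a minimal sketch of ours

import numpy as np
from numba import njit   # njit is shorthand for jit(nopython=True)

@njit
def qm_strict(x0, n):
    # Compilation fails loudly if Numba cannot infer all types
    x = np.empty(n+1)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4 * x[t] * (1 - x[t])
    return x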

A Gotcha: Global Variables

Consider the following example

a = 1

@jit
def add_x(x):
    return a + x

print(add_x(10))

11

a = 2

print(add_x(10))

11

Notice that changing the global had no effect on the value returned by the function
When Numba compiles machine code for functions, it treats global variables as constants to ensure type
stability
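A simple workaround (our suggestion, not from the lecture) is to pass such values in as arguments, so the compiled function always sees the current value

@jit
def add_x2(x, a):   # a is now an argument rather than a global
    return a + x

print(add_x2(10, 2))   # 12, and it tracks whatever value we pass in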

Numba for vectorization

Numba can also be used to create custom ufuncs with the @vectorize decorator
To illustrate the advantage of using Numba to vectorize a function, we return to a maximization problem
discussed above

from numba import vectorize

@vectorize
def f_vec(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

grid = np.linspace(-3, 3, 1000)


x, y = np.meshgrid(grid, grid)

np.max(f_vec(x, y)) # Run once to compile

qe.tic()
np.max(f_vec(x, y))
qe.toc()


TOC: Elapsed: 0.011091232299804688 seconds.

This is faster than our vectorized version using NumPy's ufuncs

Why should that be? After all, anything vectorized with NumPy will be running in fast C or Fortran code
The reason is that it's much less memory intensive
For example, when NumPy computes np.cos(x**2 + y**2) it first creates the intermediate arrays x**2 and
y**2, then it creates the array np.cos(x**2 + y**2)
In our @vectorize version using Numba, the entire operation is reduced to a single vectorized process and
none of these intermediate arrays are created
We can gain further speed improvements using Numba's automatic parallelization feature by specifying
target='parallel'
In this case, we need to specify the types of our inputs and outputs

@vectorize('float64(float64, float64)', target='parallel')
def f_vec(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

np.max(f_vec(x, y)) # Run once to compile

qe.tic()
np.max(f_vec(x, y))
qe.toc()

TOC: Elapsed: 0.008301734924316406 seconds.

This is a striking speed up with very little effort

2.5 Other Scientific Libraries

Contents

• Other Scientific Libraries


– Overview
– Cython
– Joblib
– Other Options
– Exercises
– Solutions


2.5.1 Overview

In this lecture we review some other scientific libraries that are useful for economic research and analysis
We have, however, already picked most of the low hanging fruit in terms of economic research
Hence you should feel free to skip this lecture on first pass

2.5.2 Cython

Like Numba, Cython provides an approach to generating fast compiled code that can be used from Python
As was the case with Numba, a key problem is the fact that Python is dynamically typed
As you'll recall, Numba solves this problem (where possible) by inferring type
Cython's approach is different: programmers add type definitions directly to their Python code
As such, the Cython language can be thought of as Python with type definitions
In addition to a language specification, Cython is also a language translator, transforming Cython code into
optimized C and C++ code
Cython also takes care of building language extensions, the wrapper code that interfaces between the
resulting compiled code and Python
Important Note:
In what follows code is executed in a Jupyter notebook
This is to take advantage of a Cython cell magic that makes Cython particularly easy to use
Some modifications are required to run the code outside a notebook
• See the book Cython by Kurt Smith or the online documentation

A First Example

Let's start with a rather artificial example

Suppose that we want to compute the sum ∑_{i=0}^{n} α^i for given α, n
Suppose further that we've forgotten the basic formula

∑_{i=0}^{n} α^i = (1 − α^{n+1}) / (1 − α)

for a geometric progression and hence have resolved to rely on a loop

Python vs C

Here's a pure Python function that does the job


def geo_prog(alpha, n):
    current = 1.0
    sum = current
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum

This works fine but for large n it is slow

Here's a C function that will do the same thing

double geo_prog(double alpha, int n) {
    double current = 1.0;
    double sum = current;
    int i;
    for (i = 1; i <= n; i++) {
        current = current * alpha;
        sum = sum + current;
    }
    return sum;
}

If you're not familiar with C, the main thing you should take notice of is the type definitions
• int means integer
• double means double precision floating point number
• the double in double geo_prog(... indicates that the function will return a double
Not surprisingly, the C code is faster than the Python code

A Cython Implementation

Cython implementations look like a convex combination of Python and C

We're going to run our Cython code in the Jupyter notebook, so we'll start by loading the Cython extension
in a notebook cell

%load_ext Cython

In the next cell, we execute the following

%%cython
def geo_prog_cython(double alpha, int n):
    cdef double current = 1.0
    cdef double sum = current
    cdef int i
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum


Here cdef is a Cython keyword indicating a variable declaration, and is followed by a type
The %%cython line at the top is not actually Cython code; it's a Jupyter cell magic indicating the start of
Cython code
After executing the cell, you can now call the function geo_prog_cython from within Python
What you are in fact calling is compiled C code with a Python call interface

import quantecon as qe
qe.util.tic()
geo_prog(0.99, int(10**6))
qe.util.toc()

TOC: Elapsed: 0.11026620864868164 seconds.

qe.util.tic()
geo_prog_cython(0.99, int(10**6))
qe.util.toc()

TOC: Elapsed: 0.038515567779541016 seconds.

Example 2: Cython with NumPy Arrays

Let's go back to the first problem that we worked with: generating the iterates of the quadratic map

x_{t+1} = 4 x_t (1 − x_t)

The problem of computing iterates and returning a time series requires us to work with arrays
The natural array type to work with is NumPy arrays
Heres a Cython implementation that initializes, populates and returns a NumPy array

%%cython
import numpy as np

def qm_cython_first_pass(double x0, int n):
    cdef int t
    x = np.zeros(n+1, float)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4.0 * x[t] * (1 - x[t])
    return np.asarray(x)

If you run this code and time it, you will see that its performance is disappointing: nothing like the speed
gain we got from Numba

qe.util.tic()
qm_cython_first_pass(0.1, int(10**5))
qe.util.toc()


TOC: Elapsed: 0.03628730773925781 seconds.

This example was also computed in the Numba lecture, and you can see Numba is around 90 times faster
The reason is that working with NumPy arrays incurs substantial Python overheads
We can do better by using Cython's typed memoryviews, which provide more direct access to arrays in
memory
When using them, the first step is to create a NumPy array
Next, we declare a memoryview and bind it to the NumPy array
Here's an example:

%%cython
import numpy as np
from numpy cimport float_t

def qm_cython(double x0, int n):
    cdef int t
    x_np_array = np.zeros(n+1, dtype=float)
    cdef float_t [:] x = x_np_array
    x[0] = x0
    for t in range(n):
        x[t+1] = 4.0 * x[t] * (1 - x[t])
    return np.asarray(x)

Here
• cimport pulls in some compile-time information from NumPy
• cdef float_t [:] x = x_np_array creates a memoryview on the NumPy array
x_np_array
• the return statement uses np.asarray(x) to convert the memoryview back to a NumPy array
Let's time it:

qe.util.tic()
qm_cython(0.1, int(10**5))
qe.util.toc()

TOC: Elapsed: 0.0007178783416748047 seconds.

This is fast, although still slightly slower than qm_numba

Summary

Cython requires more expertise than Numba, and is a little more fiddly in terms of getting good performance
In fact, it's surprising how difficult it is to beat the speed improvements provided by Numba
Nonetheless,


• Cython is a very mature, stable and widely used tool


• Cython can be more useful than Numba when working with larger, more sophisticated applications

2.5.3 Joblib

Joblib is a popular Python library for caching and parallelization


To install it, start Jupyter and type

!pip install joblib

from within a notebook


Here we review just the basics

Caching

Perhaps, like us, you sometimes run a long computation that simulates a model at a given set of parameters
to generate a figure, say, or a table
20 minutes later you realize that you want to tweak the figure and now you have to do it all again
What caching will do is automatically store results at each parameterization
With Joblib, results are compressed and stored on file, and automatically served back up to you when you
repeat the calculation

An Example

Let's look at a toy example, related to the quadratic map model discussed above
Let's say we want to generate a long trajectory from a certain initial condition x_0 and see what fraction of
the sample is below 0.1
(We'll omit JIT compilation or other speed ups for simplicity)
Here's our code

from joblib import Memory

memory = Memory(cachedir='./joblib_cache')

@memory.cache
def qm(x0, n):
    x = np.empty(n+1)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4 * x[t] * (1 - x[t])
    return np.mean(x < 0.1)


We are using joblib to cache the result of calling qm at a given set of parameters
With the argument cachedir='./joblib_cache', any call to this function results in both the input values and
output values being stored in a subdirectory joblib_cache of the present working directory
(In UNIX shells, . refers to the present working directory)
The first time we call the function with a given set of parameters we see some extra output that notes
information being cached

qe.util.tic()
n = int(1e7)
qm(0.2, n)
qe.util.toc()

[Memory] Calling __main__ [truncated]


______________________________________qm - 6.9s, 0.1min

TOC: Elapsed: 6.922961473464966 seconds.

The next time we call the function with the same set of parameters, the result is returned almost
instantaneously

qe.util.tic()
n = int(1e7)
qm(0.2, n)
qe.util.toc()

0.204758079524
TOC: Elapsed: 0.0009872913360595703 seconds.

2.5.4 Other Options

There are in fact many other approaches to speeding up your Python code
One is interfacing with Fortran
If you are comfortable writing Fortran you will find it very easy to create extension modules from Fortran
code using F2Py
F2Py is a Fortran-to-Python interface generator that is particularly simple to use
Robert Johansson provides a very nice introduction to F2Py, among other things
Recently, a Jupyter cell magic for Fortran has been developed; you might want to give it a try
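As a rough sketch (the file and module names here are hypothetical), the F2Py workflow from a notebook looks like this

# Hypothetical example: compile fortran_code.f90 into an importable module
!f2py -c fortran_code.f90 -m fortran_module

import fortran_module   # Its subroutines can then be called like Python functions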

2.5.5 Exercises

Exercise 1

Later we'll learn all about finite state Markov chains


For now, let's just concentrate on simulating a very simple example of such a chain
Suppose that the volatility of returns on an asset can be in one of two regimes: high or low
The transition probabilities across states are as follows: the chain leaves the low state with probability 0.1
and leaves the high state with probability 0.2
(The online lecture displays these probabilities in a transition diagram)
For example, let the period length be one month, and suppose the current state is high
The state next month will be
• high with probability 0.8
• low with probability 0.2
Your task is to simulate a sequence of monthly volatility states according to this rule
Set the length of the sequence to n = 100000 and start in the high state
Implement a pure Python version, a Numba version and a Cython version, and compare speeds
To test your code, evaluate the fraction of time that the chain spends in the low state
If your code is correct, it should be about 2/3

2.5.6 Solutions

Exercise 1

We let
• 0 represent low
• 1 represent high

p, q = 0.1, 0.2 # Prob of leaving low and high state respectively

Here's a pure Python version of the function

def compute_series(n):
    x = np.empty(n, dtype=int)
    x[0] = 1   # Start in state 1
    U = np.random.uniform(0, 1, size=n)
    for t in range(1, n):
        current_x = x[t-1]
        if current_x == 0:
            x[t] = U[t] < p
        else:
            x[t] = U[t] > q
    return x

Let's run this code and check that the fraction of time spent in the low state is about 0.666

n = 100000
x = compute_series(n)
print(np.mean(x == 0)) # Fraction of time x is in state 0

0.66951

Now let's time it

qe.util.tic()
compute_series(n)
qe.util.toc()

TOC: Elapsed: 0.07770729064941406 seconds.

Next let's implement a Numba version, which is easy

from numba import jit

compute_series_numba = jit(compute_series)

Let's check we still get the right numbers

x = compute_series_numba(n)
print(np.mean(x == 0))

0.66764

Let's see the time

qe.util.tic()
compute_series_numba(n)
qe.util.toc()

TOC: Elapsed: 0.0017528533935546875 seconds.

This is a nice speed improvement for one line of code

Now let's implement a Cython version

%load_ext Cython

%%cython
import numpy as np
from numpy cimport int_t, float_t

def compute_series_cy(int n):
    # == Create NumPy arrays first == #
    x_np = np.empty(n, dtype=int)
    U_np = np.random.uniform(0, 1, size=n)
    # == Now create memoryviews of the arrays == #
    cdef int_t [:] x = x_np
    cdef float_t [:] U = U_np
    # == Other variable declarations == #
    cdef float p = 0.1
    cdef float q = 0.2
    cdef int t
    # == Main loop == #
    x[0] = 1
    for t in range(1, n):
        current_x = x[t-1]
        if current_x == 0:
            x[t] = U[t] < p
        else:
            x[t] = U[t] > q
    return np.asarray(x)

compute_series_cy(10)

array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])

x = compute_series_cy(n)
print(np.mean(x == 0))

0.66927

qe.util.tic()
compute_series_cy(n)
qe.util.toc()

TOC: Elapsed: 0.0025839805603027344 seconds.

The Cython implementation is fast, but not as fast as Numba



CHAPTER THREE

ADVANCED PYTHON PROGRAMMING

This part provides a look at more advanced concepts in Python programming

3.1 Writing Good Code

Contents

• Writing Good Code


– Overview
– An Example of Bad Code
– Good Coding Practice
– Revisiting the Example
– Summary

3.1.1 Overview

When computer programs are small, poorly written code is not overly costly
But more data, more sophisticated models, and more computer power are enabling us to take on more
challenging problems that involve writing longer programs
For such programs, investment in good coding practices will pay high returns
The main payoffs are higher productivity and faster code
In this lecture, we review some elements of good coding practice
We also touch on modern developments in scientific computing such as just in time compilation and how
they affect good program design


3.1.2 An Example of Bad Code

Let's have a look at some poorly written code

The job of the code is to generate and plot time series of the simplified Solow model

k_{t+1} = s k_t^α + (1 − δ) k_t,    t = 0, 1, 2, . . .    (3.1)

Here
• k_t is capital at time t and
• s, α, δ are parameters (savings, a productivity parameter and depreciation)
For each parameterization, the code
1. sets k_0 = 1
2. iterates using (3.1) to produce a sequence k_0, k_1, k_2, . . . , k_T
3. plots the sequence
The plots will be grouped into three subfigures
In each subfigure, two parameters are held fixed while another varies

import numpy as np
import matplotlib.pyplot as plt

# Allocate memory for time series
k = np.empty(50)

fig, axes = plt.subplots(3, 1, figsize=(12, 15))

# Trajectories with different α
δ = 0.1
s = 0.4
α = (0.25, 0.33, 0.45)

for j in range(3):
    k[0] = 1
    for t in range(49):
        k[t+1] = s * k[t]**α[j] + (1 - δ) * k[t]
    axes[0].plot(k, 'o-', label=rf"$\alpha = {α[j]},\; s = {s},\; \delta={δ}$")

axes[0].grid(lw=0.2)
axes[0].set_ylim(0, 18)
axes[0].set_xlabel('time')
axes[0].set_ylabel('capital')
axes[0].legend(loc='upper left', frameon=True, fontsize=14)

# Trajectories with different s
δ = 0.1
α = 0.33
s = (0.3, 0.4, 0.5)

for j in range(3):
    k[0] = 1
    for t in range(49):
        k[t+1] = s[j] * k[t]**α + (1 - δ) * k[t]
    axes[1].plot(k, 'o-', label=rf"$\alpha = {α},\; s = {s[j]},\; \delta={δ}$")

axes[1].grid(lw=0.2)
axes[1].set_xlabel('time')
axes[1].set_ylabel('capital')
axes[1].set_ylim(0, 18)
axes[1].legend(loc='upper left', frameon=True, fontsize=14)

# Trajectories with different δ
δ = (0.05, 0.1, 0.15)
α = 0.33
s = 0.4

for j in range(3):
    k[0] = 1
    for t in range(49):
        k[t+1] = s * k[t]**α + (1 - δ[j]) * k[t]
    axes[2].plot(k, 'o-', label=rf"$\alpha = {α},\; s = {s},\; \delta={δ[j]}$")

axes[2].set_ylim(0, 18)
axes[2].set_xlabel('time')
axes[2].set_ylabel('capital')
axes[2].grid(lw=0.2)
axes[2].legend(loc='upper left', frameon=True, fontsize=14)

plt.show()


True, the code more or less follows PEP8

At the same time, it's very poorly structured
Let's talk about why that's the case, and what we can do about it


3.1.3 Good Coding Practice

There are usually many different ways to write a program that accomplishes a given task
For small programs, like the one above, the way you write code doesn't matter too much
But if you are ambitious and want to produce useful things, you'll write medium to large programs too
In those settings, coding style matters a great deal
Fortunately, lots of smart people have thought about the best way to write code
Here are some basic precepts

Don't Use Magic Numbers

If you look at the code above, you'll see numbers like 50 and 49 and 3 scattered through the code
These kinds of numeric literals in the body of your code are sometimes called magic numbers
This is not a compliment
While numeric literals are not all evil, the numbers shown in the program above should certainly be replaced
by named constants
For example, the code above could declare the variable time_series_length = 50
Then in the loops, 49 should be replaced by time_series_length - 1 (a minimal sketch appears after the list below)
The advantages are:
• the meaning is much clearer throughout
• to alter the time series length, you only need to change one value
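Here is the minimal sketch of that change (assuming s, α, δ are defined as in the code above)

time_series_length = 50    # Named constant replacing the magic numbers

k = np.empty(time_series_length)
k[0] = 1
for t in range(time_series_length - 1):
    k[t+1] = s * k[t]**α + (1 - δ) * k[t]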

Don't Repeat Yourself

The other mortal sin in the code snippet above is repetition


Blocks of logic (such as the loop to generate time series) are repeated with only minor changes
This violates a fundamental tenet of programming: Don't repeat yourself (DRY)
• Also called DIE (duplication is evil)
Yes, we realize that you can just cut and paste and change a few symbols
But as a programmer your aim should be to automate repetition, not do it yourself
More importantly, repeating the same logic in different places means that eventually one of them will likely
be wrong
If you want to know more, read the excellent summary found on this page
We'll talk about how to avoid repetition below


Minimize Global Variables

Sure, global variables (i.e., names assigned to values outside of any function or class) are convenient
Rookie programmers typically use global variables with abandon, as we once did ourselves
But global variables are dangerous, especially in medium to large size programs, since
• they can affect what happens in any part of your program
• they can be changed by any function
This makes it much harder to be certain about what some small part of a given piece of code actually
commands
Here's a useful discussion on the topic
While the odd global in small scripts is no big deal, we recommend that you teach yourself to avoid them
(We'll discuss how just below)

JIT Compilation

In fact, there's now another good reason to avoid global variables

In scientific computing, we're witnessing the rapid growth of just in time (JIT) compilation
JIT compilation can generate excellent performance for scripting languages like Python and Julia
But the task of the compiler used for JIT compilation becomes much harder when many global variables are
present
(This is because data type instability hinders the generation of efficient machine code; we'll learn more about
such topics later on)

Use Functions or Classes

Fortunately, we can easily avoid the evils of global variables and WET code
• WET stands for we love typing and is the opposite of DRY
We can do this by making frequent use of functions or classes
In fact, functions and classes are designed specifically to help us avoid shaming ourselves by repeating code
or excessive use of global variables

Which one, functions or classes?

Both can be useful, and in fact they work well with each other
Well learn more about these topics over time
(Personal preference is part of the story too)


What's really important is that you use one or the other or both

3.1.4 Revisiting the Example

Here's some code that reproduces the plot above with better coding style
It uses a function to avoid repetition
Note also that
• global variables are quarantined by collecting them together at the end, not the start of the program
• magic numbers are avoided
• the loop at the end where the actual work is done is short and relatively simple

from itertools import product

def plot_path(ax, αs, s_vals, δs, series_length=50):
    """
    Add a time series plot to the axes ax for all given parameters.
    """
    k = np.empty(series_length)

    for (α, s, δ) in product(αs, s_vals, δs):
        k[0] = 1
        for t in range(series_length-1):
            k[t+1] = s * k[t]**α + (1 - δ) * k[t]
        ax.plot(k, 'o-', label=rf"$\alpha = {α},\; s = {s},\; \delta = {δ}$")

    ax.grid(lw=0.2)
    ax.set_xlabel('time')
    ax.set_ylabel('capital')
    ax.set_ylim(0, 18)
    ax.legend(loc='upper left', frameon=True, fontsize=14)

fig, axes = plt.subplots(3, 1, figsize=(12, 15))

# Parameters (αs, s_vals, δs)
set_one = ([0.25, 0.33, 0.45], [0.4], [0.1])
set_two = ([0.33], [0.3, 0.4, 0.5], [0.1])
set_three = ([0.33], [0.4], [0.05, 0.1, 0.15])

for (ax, params) in zip(axes, (set_one, set_two, set_three)):
    αs, s_vals, δs = params
    plot_path(ax, αs, s_vals, δs)

plt.show()


3.1.5 Summary

Writing decent code isn't hard

It's also fun and intellectually satisfying


We recommend that you cultivate good habits and style even when you write relatively short programs

3.2 OOP II: Building Classes

Contents

• OOP II: Building Classes


– Overview
– OOP Review
– Defining Your Own Classes
– Special Methods
– Exercises
– Solutions

3.2.1 Overview

In an earlier lecture we learned some foundations of object oriented programming


The objectives of this lecture are
• cover OOP in more depth
• learn how to build our own objects, specialized to our needs
For example, you already know how to
• create lists, strings and other Python objects
• use their methods to modify their contents
So imagine now you want to write a program with consumers, who can
• hold and spend cash
• consume goods
• work and earn cash
A natural solution in Python would be to create consumers as objects with
• data, such as cash on hand
• methods, such as buy or work that affect this data
Python makes it easy to do this, by providing you with class definitions
Classes are blueprints that help you build objects according to your own specifications
It takes a little while to get used to the syntax so we'll provide plenty of examples


3.2.2 OOP Review

OOP is supported in many languages:


• JAVA and Ruby are relatively pure OOP
• Python supports both procedural and object-oriented programming
• Fortran and MATLAB are mainly procedural, some OOP recently tacked on
• C is a procedural language, while C++ is C with OOP added on top
Let's cover general OOP concepts before we specialize to Python

Key Concepts

As discussed in an earlier lecture, in the OOP paradigm, data and functions are bundled together into objects
An example is a Python list, which not only stores data, but also knows how to sort itself, etc.

x = [1, 5, 4]
x.sort()
x

[1, 4, 5]

As we now know, sort is a function that is part of the list object and hence called a method
If we want to make our own types of objects we need to use class definitions
A class definition is a blueprint for a particular class of objects (e.g., lists, strings or complex numbers)
It describes
• What kind of data the class stores
• What methods it has for acting on these data
An object or instance is a realization of the class, created from the blueprint
• Each instance has its own unique data
• Methods set out in the class definition act on this (and other) data
In Python, the data and methods of an object are collectively referred to as attributes
Attributes are accessed via dotted attribute notation
• object_name.data
• object_name.method_name()
In the example

x = [1, 5, 4]
x.sort()
x.__class__


list

• x is an object or instance, created from the definition for Python lists, but with its own particular data
• x.sort() and x.__class__ are two attributes of x
• dir(x) can be used to view all the attributes of x

Why is OOP Useful?

OOP is useful for the same reason that abstraction is useful: for recognizing and exploiting common
structure
For example,
• a Markov chain consists of a set of states and a collection of transition probabilities for moving across
states
• a general equilibrium theory consists of a commodity space, preferences, technologies, and an
equilibrium definition
• a game consists of a list of players, lists of actions available to each player, player payoffs as functions
of all players' actions, and a timing protocol
These are all abstractions that collect together objects of the same type
Recognizing common structure allows us to employ common tools
In economic theory, this might be a proposition that applies to all games of a certain type
In Python, this might be a method thats useful for all Markov chains (e.g., simulate)
When we use OOP, the simulate method is conveniently bundled together with the Markov chain object

3.2.3 Defining Your Own Classes

Let's build some simple classes to start off

Example: A Consumer Class

First we'll build a Consumer class with

• a wealth attribute that stores the consumer's wealth (data)
• an earn method, where earn(y) increments the consumer's wealth by y
• a spend method, where spend(x) either decreases wealth by x or returns an error if insufficient
funds exist
Admittedly a little contrived, this example of a class helps us internalize some new syntax
Here's one implementation


class Consumer:

    def __init__(self, w):
        "Initialize consumer with w dollars of wealth"
        self.wealth = w

    def earn(self, y):
        "The consumer earns y dollars"
        self.wealth += y

    def spend(self, x):
        "The consumer spends x dollars if feasible"
        new_wealth = self.wealth - x
        if new_wealth < 0:
            print("Insufficient funds")
        else:
            self.wealth = new_wealth

There's some special syntax here so let's step through carefully


• The class keyword indicates that we are building a class
This class defines instance data wealth and three methods: __init__, earn and spend
• wealth is instance data because each consumer we create (each instance of the Consumer class) will
have its own separate wealth data
The ideas behind the earn and spend methods were discussed above
Both of these act on the instance data wealth
The __init__ method is a constructor method
Whenever we create an instance of the class, this method will be called automatically
Calling __init__ sets up a namespace to hold the instance data; more on this soon
We'll also discuss the role of self just below

Usage

Here's an example of usage

c1 = Consumer(10)   # Create instance with initial wealth 10
c1.spend(5)
c1.wealth

c1.earn(15)
c1.spend(100)


Insufficient funds

We can of course create multiple instances each with its own data

c1 = Consumer(10)
c2 = Consumer(12)
c2.spend(4)
c2.wealth

c1.wealth

10

In fact each instance stores its data in a separate namespace dictionary

c1.__dict__

{'wealth': 10}

c2.__dict__

{'wealth': 8}

When we access or set attributes we're actually just modifying the dictionary maintained by the instance

Self

If you look at the Consumer class definition again you'll see the word self throughout the code
The rules with self are that
• Any instance data should be prepended with self
– e.g., the earn method references self.wealth rather than just wealth
• Any method defined within the class should have self as its first argument
– e.g., def earn(self, y) rather than just def earn(y)
• Any method referenced within the class should be called as self.method_name
There are no examples of the last rule in the preceding code but we will see some shortly
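As a preview, here is a toy class of our own (not one of the lecture's examples) in which one method calls another via self.method_name

class Account:

    def __init__(self, balance):
        self.balance = balance

    def is_solvent(self):
        return self.balance >= 0

    def report(self):
        # A method referenced within the class is called as self.method_name
        print("solvent" if self.is_solvent() else "insolvent")

Account(50).report()  # prints: solvent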

Details

In this section we look at some more formal details related to classes and self
• You might wish to skip to the next section on first pass of this lecture


• You can return to these details after you've familiarized yourself with more examples
Methods actually live inside a class object formed when the interpreter reads the class definition

print(Consumer.__dict__) # Show __dict__ attribute of class object

{'earn': <function Consumer.earn at 0x7f2590054d90>,


'spend': <function Consumer.spend at 0x7f2590054e18>,
'__doc__': None,
'__weakref__': <attribute '__weakref__' of 'Consumer' objects>,
'__init__': <function Consumer.__init__ at 0x7f2590054d08>,
'__module__': '__main__',
'__dict__': <attribute '__dict__' of 'Consumer' objects>}

Note how the three methods __init__, earn and spend are stored in the class object
Consider the following code

c1 = Consumer(10)
c1.earn(10)
c1.wealth

20

When you call earn via c1.earn(10) the interpreter passes the instance c1 and the argument 10 to Consumer.earn
In fact the following are equivalent
• c1.earn(10)
• Consumer.earn(c1, 10)
In the function call Consumer.earn(c1, 10) note that c1 is the first argument
Recall that in the definition of the earn method, self is the first parameter

def earn(self, y):


"The consumer earns y dollars"
self.wealth += y

The end result is that self is bound to the instance c1 inside the function call
That's why the statement self.wealth += y inside earn ends up modifying c1.wealth
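We can confirm this equivalence directly (a quick check, reusing the Consumer class defined above)

c1 = Consumer(10)
c1.earn(10)            # bound method call
Consumer.earn(c1, 10)  # equivalent call, passing the instance explicitly
c1.wealth              # returns 30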

Example: The Solow Growth Model

For our next example, let's write a simple class to implement the Solow growth model
The Solow growth model is a neoclassical growth model where the amount of capital stock per capita kt
evolves according to the rule


    k_{t+1} = [s z k_t^α + (1 − δ) k_t] / (1 + n)    (3.2)
Here
• s is an exogenously given savings rate
• z is a productivity parameter
• α is capital's share of income
• n is the population growth rate
• δ is the depreciation rate
The steady state of the model is the k that solves (3.2) when kt+1 = kt = k
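Setting k_{t+1} = k_t = k in (3.2) and rearranging gives the closed form that the steady_state method below implements:

    k = [s z k^α + (1 − δ) k] / (1 + n)  ⟹  (n + δ) k = s z k^α  ⟹  k = ((s z) / (n + δ))^{1/(1−α)}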
Here's a class that implements this model
Some points of interest in the code are
• An instance maintains a record of its current capital stock in the variable self.k
• The h method implements the right hand side of (3.2)
• The update method uses h to update capital as per (3.2)
– Notice how inside update the reference to the local method h is self.h
The methods steady_state and generate_sequence are fairly self explanatory

class Solow:
r"""
Implements the Solow growth model with update rule

k_{t+1} = [(s z k^α_t) + (1 - δ)k_t] /(1 + n)

"""
def __init__(self, n=0.05, # population growth rate
s=0.25, # savings rate
δ=0.1, # depreciation rate
                 α=0.3,    # share of capital
z=2.0, # productivity
k=1.0): # current capital stock

self.n, self.s, self.δ, self.α, self.z = n, s, δ, α, z


self.k = k

def h(self):
"Evaluate the h function"
# Unpack parameters (get rid of self to simplify notation)
n, s, δ, α, z = self.n, self.s, self.δ, self.α, self.z
# Apply the update rule
return (s * z * self.k**α + (1 - δ) * self.k) / (1 + n)

def update(self):


"Update the current state (i.e., the capital stock)."


self.k = self.h()

def steady_state(self):
"Compute the steady state value of capital."
# Unpack parameters (get rid of self to simplify notation)
n, s, δ, α, z = self.n, self.s, self.δ, self.α, self.z
# Compute and return steady state
return ((s * z) / (n + δ))**(1 / (1 - α))

def generate_sequence(self, t):


"Generate and return a time series of length t"
path = []
for i in range(t):
path.append(self.k)
self.update()
return path

Here's a little program that uses the class to compute time series from two different initial conditions
The common steady state is also plotted for comparison

import matplotlib.pyplot as plt

s1 = Solow()
s2 = Solow(k=8.0)

T = 60
fig, ax = plt.subplots(figsize=(9, 6))

# Plot the common steady state value of capital


ax.plot([s1.steady_state()]*T, 'k-', label='steady state')

# Plot time series for each economy


for s in s1, s2:
lb = f'capital series from initial state {s.k}'
ax.plot(s.generate_sequence(T), 'o-', lw=2, alpha=0.6, label=lb)

ax.legend()
plt.show()

Here's the figure it produces


Example: A Market

Next let's write a class for a simple one-good market where agents are price takers
The market consists of the following objects:
• A linear demand curve Q = a_d − b_d p
• A linear supply curve Q = a_z + b_z (p − t)
Here
• p is price paid by the consumer, Q is quantity, and t is a per unit tax
• Other symbols are demand and supply parameters
The class provides methods to compute various values of interest, including the competitive equilibrium price
and quantity, tax revenue raised, consumer surplus and producer surplus
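For reference, equating demand and supply pins down the equilibrium price that the price method below returns:

    a_d − b_d p = a_z + b_z (p − t)  ⟹  p = (a_d − a_z + b_z t) / (b_d + b_z)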
Here's our implementation
from scipy.integrate import quad

class Market:

def __init__(self, ad, bd, az, bz, tax):


"""
Set up market parameters. All parameters are scalars. See
https://lectures.quantecon.org/py/python_oop.html for interpretation.

"""
self.ad, self.bd, self.az, self.bz, self.tax = ad, bd, az, bz, tax
if ad < az:


raise ValueError('Insufficient demand.')

def price(self):
"Return equilibrium price"
return (self.ad - self.az + self.bz * self.tax) / (self.bd + self.bz)

def quantity(self):
"Compute equilibrium quantity"
return self.ad - self.bd * self.price()

def consumer_surp(self):
"Compute consumer surplus"
# == Compute area under inverse demand function == #
integrand = lambda x: (self.ad / self.bd) - (1 / self.bd) * x
area, error = quad(integrand, 0, self.quantity())
return area - self.price() * self.quantity()

def producer_surp(self):
"Compute producer surplus"
# == Compute area above inverse supply curve, excluding tax == #
integrand = lambda x: -(self.az / self.bz) + (1 / self.bz) * x
area, error = quad(integrand, 0, self.quantity())
return (self.price() - self.tax) * self.quantity() - area

def taxrev(self):
"Compute tax revenue"
return self.tax * self.quantity()

def inverse_demand(self, x):


"Compute inverse demand"
return self.ad / self.bd - (1 / self.bd)* x

def inverse_supply(self, x):


"Compute inverse supply curve"
return -(self.az / self.bz) + (1 / self.bz) * x + self.tax

def inverse_supply_no_tax(self, x):


"Compute inverse supply curve without tax"
return -(self.az / self.bz) + (1 / self.bz) * x

Here's a sample of usage

baseline_params = 15, .5, -2, .5, 3


m = Market(*baseline_params)
print("equilibrium price = ", m.price())

equilibrium price = 18.5

print("consumer surplus = ", m.consumer_surp())

consumer surplus = 33.0625


Here's a short program that uses this class to plot an inverse demand curve together with inverse supply
curves with and without taxes
import numpy as np

# Baseline ad, bd, az, bz, tax


baseline_params = 15, .5, -2, .5, 3
m = Market(*baseline_params)

q_max = m.quantity() * 2
q_grid = np.linspace(0.0, q_max, 100)
pd = m.inverse_demand(q_grid)
ps = m.inverse_supply(q_grid)
psno = m.inverse_supply_no_tax(q_grid)

fig, ax = plt.subplots()
ax.plot(q_grid, pd, lw=2, alpha=0.6, label='demand')
ax.plot(q_grid, ps, lw=2, alpha=0.6, label='supply')
ax.plot(q_grid, psno, '--k', lw=2, alpha=0.6, label='supply without tax')
ax.set_xlabel('quantity', fontsize=14)
ax.set_xlim(0, q_max)
ax.set_ylabel('price', fontsize=14)
ax.legend(loc='lower right', frameon=False, fontsize=14)
plt.show()

The figure produced looks as follows

The next program provides a function that


• takes an instance of Market as a parameter


• computes deadweight loss from the imposition of the tax

def deadw(m):
"Computes deadweight loss for market m."
# == Create analogous market with no tax == #
m_no_tax = Market(m.ad, m.bd, m.az, m.bz, 0)
# == Compare surplus, return difference == #
surp1 = m_no_tax.consumer_surp() + m_no_tax.producer_surp()
surp2 = m.consumer_surp() + m.producer_surp() + m.taxrev()
return surp1 - surp2

Here's an example of usage

baseline_params = 15, .5, -2, .5, 3


m = Market(*baseline_params)
deadw(m) # Show deadweight loss

1.125

Example: Chaos

Let's look at one more example, related to chaotic dynamics in nonlinear systems
One simple transition rule that can generate complex dynamics is the logistic map

xt+1 = rxt (1 − xt ), x0 ∈ [0, 1], r ∈ [0, 4] (3.3)

Let's write a class for generating time series from this model
Here's one implementation

class Chaos:
"""
Models the dynamical system with :math:`x_{t+1} = r x_t (1 - x_t)`
"""
def __init__(self, x0, r):
"""
Initialize with state x0 and parameter r
"""
self.x, self.r = x0, r

def update(self):
"Apply the map to update state."
self.x = self.r * self.x *(1 - self.x)

def generate_sequence(self, n):


"Generate and return a sequence of length n."
path = []
for i in range(n):
path.append(self.x)


self.update()
return path

Here's an example of usage

ch = Chaos(0.1, 4.0)     # x0 = 0.1 and r = 4.0


ch.generate_sequence(5) # First 5 iterates

[0.1, 0.36000000000000004, 0.9216, 0.28901376000000006, 0.8219392261226498]

This piece of code plots a longer trajectory

ch = Chaos(0.1, 4.0)
ts_length = 250

fig, ax = plt.subplots()
ax.set_xlabel('$t$', fontsize=14)
ax.set_ylabel('$x_t$', fontsize=14)
x = ch.generate_sequence(ts_length)
ax.plot(range(ts_length), x, 'bo-', alpha=0.5, lw=2, label='$x_t$')
plt.show()

The resulting figure looks as follows

The next piece of code provides a bifurcation diagram

fig, ax = plt.subplots()
ch = Chaos(0.1, 4)


r = 2.5
while r < 4:
ch.r = r
t = ch.generate_sequence(1000)[950:]
ax.plot([r] * len(t), t, 'b.', ms=0.6)
r = r + 0.005

ax.set_xlabel('$r$', fontsize=16)
plt.show()

Here is the figure it generates

On the horizontal axis is the parameter r in (3.3)


The vertical axis is the state space [0, 1]
For each r we compute a long time series and then plot the tail (the last 50 points)
The tail of the sequence shows us where the trajectory concentrates after settling down to some kind of
steady state, if a steady state exists
Whether it settles down, and the character of the steady state to which it does settle down, depend on the
value of r
For r between about 2.5 and 3, the time series settles into a single fixed point plotted on the vertical axis
For r between about 3 and 3.45, the time series settles down to oscillating between the two values plotted
on the vertical axis


For r a little bit higher than 3.45, the time series settles down to oscillating among the four values plotted
on the vertical axis
Notice that there is no value of r that leads to a steady state oscillating among three values
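As a quick sanity check on the first of these claims (a small sketch of our own, reusing the Chaos class above), the single fixed point solves x = r x (1 − x), i.e., x = 1 − 1/r

ch = Chaos(0.1, 2.8)
print(ch.generate_sequence(1000)[-1])  # approximately 0.6428571...
print(1 - 1/2.8)                       # the fixed point 1 - 1/r = 0.6428571...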

3.2.4 Special Methods

Python provides special methods with which some neat tricks can be performed
For example, recall that lists and tuples have a notion of length, and that this length can be queried via the
len function

x = (10, 20)
len(x)

If you want to provide a return value for the len function when applied to your user-defined object, use the
__len__ special method

class Foo:

def __len__(self):
return 42

Now we get

f = Foo()
len(f)

42

A special method we will use regularly is the __call__ method


This method can be used to make your instances callable, just like functions

class Foo:

def __call__(self, x):


return x + 42

After running we get

f = Foo()
f(8) # Exactly equivalent to f.__call__(8)

50

Exercise 1 provides a more useful example


3.2.5 Exercises

Exercise 1

The empirical cumulative distribution function (ecdf) corresponding to a sample X1 , . . . , Xn is defined as

    Fn(x) := (1/n) Σ_{i=1}^{n} 1{Xi ≤ x}    (x ∈ R)    (3.4)

Here 1{Xi ≤ x} is an indicator function (one if Xi ≤ x and zero otherwise) and hence Fn (x) is the fraction
of the sample that falls below x
The Glivenko–Cantelli Theorem states that, provided that the sample is iid, the ecdf Fn converges to the
true distribution function F
Implement Fn as a class called ECDF, where
• A given sample X1 , . . . , Xn are the instance data, stored as self.observations
• The class implements a __call__ method that returns Fn (x) for any x
Your code should work as follows (modulo randomness)

from random import uniform

samples = [uniform(0, 1) for i in range(10)]


F = ECDF(samples)
F(0.5) # Evaluate ecdf at x = 0.5

0.29

F.observations = [uniform(0, 1) for i in range(1000)]


F(0.5)

0.479

Aim for clarity, not efficiency

Exercise 2

In an earlier exercise, you wrote a function for evaluating polynomials


This exercise is an extension, where the task is to build a simple class called Polynomial for representing
and manipulating polynomial functions such as


    p(x) = a_0 + a_1 x + a_2 x^2 + ··· + a_N x^N = Σ_{n=0}^{N} a_n x^n    (x ∈ R)    (3.5)


The instance data for the class Polynomial will be the coefficients (in the case of (3.5), the numbers
a0 , . . . , aN )
Provide methods that
1. Evaluate the polynomial (3.5), returning p(x) for any x
2. Differentiate the polynomial, replacing the original coefficients with those of its derivative p′
Avoid using any import statements
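(For reference, differentiating (3.5) term by term gives

    p′(x) = Σ_{n=1}^{N} n a_n x^{n−1}

so the new coefficient list is a_1, 2 a_2, . . . , N a_N ; this is what the solution below computes)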

3.2.6 Solutions

Exercise 1

class ECDF:

def __init__(self, observations):


self.observations = observations

def __call__(self, x):


counter = 0.0
for obs in self.observations:
if obs <= x:
counter += 1
return counter / len(self.observations)

# == test == #

from random import uniform

samples = [uniform(0, 1) for i in range(10)]


F = ECDF(samples)

print(F(0.5)) # Evaluate ecdf at x = 0.5

F.observations = [uniform(0, 1) for i in range(1000)]

print(F(0.5))

0.5
0.486

Exercise 2

class Polynomial:

def __init__(self, coefficients):


"""
Creates an instance of the Polynomial class representing


p(x) = a_0 x^0 + ... + a_N x^N,

where a_i = coefficients[i].


"""
self.coefficients = coefficients

def __call__(self, x):


"Evaluate the polynomial at x."
y = 0
for i, a in enumerate(self.coefficients):
y += a * x**i
return y

def differentiate(self):
"Reset self.coefficients to those of p' instead of p."
new_coefficients = []
for i, a in enumerate(self.coefficients):
new_coefficients.append(i * a)
# Remove the first element, which is zero
del new_coefficients[0]
# And reset coefficients data to new values
self.coefficients = new_coefficients
return new_coefficients

3.3 OOP III: The Samuelson Accelerator

Contents

• OOP III: The Samuelson Accelerator


– Overview
– Details
– Implementation
– Stochastic shocks
– Government spending
– Wrapping everything into a class
– Using the LinearStateSpace class
– Pure multiplier model
– Summary

Co-author: Natasha Watkins


3.3.1 Overview

This lecture creates nonstochastic and stochastic versions of Paul Samuelson's celebrated multiplier-accelerator model [Sam39]
In doing so, we extend the example of the Solow model class in our second OOP lecture
Our objectives are to
• provide a more detailed example of OOP and classes
• review a famous model
• review linear difference equations, both deterministic and stochastic

Samuelson's Model

Samuelson used a second-order linear difference equation to represent a model of national output based on
three components:
• a national output identity asserting that national output is the sum of consumption plus investment
plus government purchases
• a Keynesian consumption function asserting that consumption at time t is equal to a constant times
national output at time t − 1
• an investment accelerator asserting that investment at time t equals a constant called the accelerator
coefficient times the difference in output between period t − 1 and t − 2
• the idea that consumption plus investment plus government purchases constitute aggregate demand,
which automatically calls forth an equal amount of aggregate supply
(To read about linear difference equations see here or chapter IX of [Sar87])
Samuelson used the model to analyze how particular values of the marginal propensity to consume and the
accelerator coefficient might give rise to transient business cycles in national output
Possible dynamic properties include
• smooth convergence to a constant level of output
• damped business cycles that eventually converge to a constant level of output
• persistent business cycles that neither dampen nor explode
Later we present an extension that adds a random shock to the right side of the national income identity
representing random fluctuations in aggregate demand
This modification makes national output become governed by a second-order stochastic linear difference
equation that, with appropriate parameter values, gives rise to recurrent irregular business cycles.
(To read about stochastic linear difference equations see chapter XI of [Sar87])


3.3.2 Details

Let's assume that


• {Gt } is a sequence of levels of government expenditures. We'll start by setting Gt = G for all t
• {Ct } is a sequence of levels of aggregate consumption expenditures, a key endogenous variable in the
model
• {It } is a sequence of rates of investment, another key endogenous variable
• {Yt } is a sequence of levels of national income, yet another endogenous variable
• a is the marginal propensity to consume in the Keynesian consumption function Ct = aYt−1 + γ
• b is the accelerator coefficient in the investment accelerator It = b(Yt−1 − Yt−2 )
• {ϵt } is an IID sequence of standard normal random variables
• σ ≥ 0 is a volatility parameter; setting σ = 0 recovers the nonstochastic case that we'll start with
The model combines the consumption function

Ct = aYt−1 + γ (3.6)

with the investment accelerator

It = b(Yt−1 − Yt−2 ) (3.7)

and the national income identity

Yt = Ct + It + Gt (3.8)

• The parameter a is people's marginal propensity to consume out of income - equation (3.6) asserts that
people consume a fraction a ∈ (0, 1) of each additional dollar of income
• The parameter b > 0 is the investment accelerator coefficient - equation (3.7) asserts that people
invest in physical capital when income is increasing and disinvest when it is decreasing
Equations (3.6), (3.7), and (3.8) imply the following second-order linear difference equation for national
income:

Yt = (a + b)Yt−1 − bYt−2 + (γ + Gt )

or

Yt = ρ1 Yt−1 + ρ2 Yt−2 + (γ + Gt ) (3.9)

where ρ1 = (a + b) and ρ2 = −b
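To verify this, substitute the consumption function (3.6) and the accelerator (3.7) into the income identity (3.8):

    Yt = (a Yt−1 + γ) + b (Yt−1 − Yt−2 ) + Gt = (a + b) Yt−1 − b Yt−2 + (γ + Gt )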


To complete the model, we require two initial conditions


If the model is to generate time series for t = 0, . . . , T , we require initial values

Y−1 = Ȳ−1 , Y−2 = Ȳ−2

We'll ordinarily set the parameters (a, b) so that starting from an arbitrary pair of initial conditions
(Ȳ−1 , Ȳ−2 ), national income Yt converges to a constant value as t becomes large
We are interested in studying
• the transient fluctuations in Yt as it converges to its steady state level
• the rate at which it converges to a steady state level
The deterministic version of the model described so far (meaning that no random shocks hit aggregate
demand) has only transient fluctuations
We can convert the model to one that has persistent irregular fluctuations by adding a random shock to
aggregate demand

Stochastic version of the model

We create a random or stochastic version of the model by adding a random process of shocks or
disturbances {σϵt } to the right side of equation (3.9), leading to the second-order scalar linear stochastic
difference equation:

    Yt = ρ1 Yt−1 + ρ2 Yt−2 + (γ + Gt ) + σϵt    (3.10)

Mathematical analysis of the model

To get started, let's set Gt ≡ 0, σ = 0, and γ = 0.


Then we can write equation (3.10) as

Yt = ρ1 Yt−1 + ρ2 Yt−2

or

Yt+2 − ρ1 Yt+1 − ρ2 Yt = 0 (3.11)

To discover the properties of the solution of (3.11), it is useful first to form the characteristic polynomial
for (3.11):

z 2 − ρ1 z − ρ2 (3.12)

where z is possibly a complex number


We want to find the two zeros (a.k.a. roots) – namely λ1 , λ2 – of the characteristic polynomial
These are two special values of z, say z = λ1 and z = λ2 , such that if we set z equal to one of these values
in expression (3.12), the characteristic polynomial (3.12) equals zero:

z 2 − ρ1 z − ρ2 = (z − λ1 )(z − λ2 ) = 0 (3.13)

Equation (3.13) is said to factor the characteristic polynomial


When the roots are complex, they will occur as a complex conjugate pair
When the roots are complex, it is convenient to represent them in the polar form

    λ1 = r e^{iω} ,    λ2 = r e^{−iω}

where r is the amplitude of the complex number and ω is its angle or phase
These can also be represented as

λ1 = r(cos(ω) + i sin(ω))

λ2 = r(cos(ω) − i sin(ω))
(To read about the polar form, see here)
Given initial conditions Y−1 , Y−2 , we want to generate a solution of the difference equation (3.11)
It can be represented as

    Yt = λ1^t c1 + λ2^t c2

where c1 and c2 are constants that depend on the two initial conditions and on ρ1 , ρ2
When the roots are complex, it is useful to pursue the following calculations
Notice that

    Yt = c1 (r e^{iω})^t + c2 (r e^{−iω})^t                                     (3.14)
       = c1 r^t e^{iωt} + c2 r^t e^{−iωt}                                       (3.15)
       = c1 r^t [cos(ωt) + i sin(ωt)] + c2 r^t [cos(ωt) − i sin(ωt)]            (3.16)
       = (c1 + c2) r^t cos(ωt) + i (c1 − c2) r^t sin(ωt)                        (3.17)

The only way that Yt can be a real number for each t is if c1 + c2 is a real number and c1 − c2 is an
imaginary number
This happens only when c1 and c2 are complex conjugates, in which case they can be written in the polar
forms

    c1 = v e^{iθ} ,    c2 = v e^{−iθ}

So we can write

    Yt = v e^{iθ} r^t e^{iωt} + v e^{−iθ} r^t e^{−iωt}                          (3.18)
       = v r^t [e^{i(ωt+θ)} + e^{−i(ωt+θ)}]                                     (3.19)
       = 2 v r^t cos(ωt + θ)                                                    (3.20)


where v and θ are constants that must be chosen to satisfy the initial conditions for Y−1 , Y−2
This formula shows that when the roots are complex, Yt displays oscillations with period p̌ = 2π/ω and
damping factor r
We say that p̌ is the period because in that amount of time the cosine wave cos(ωt + θ) goes through exactly
one complete cycle
(Draw a cosine function to convince yourself of this please)
Remark: Following [Sam39], we want to choose the parameters a, b of the model so that the absolute values
(of the possibly complex) roots λ1 , λ2 of the characteristic polynomial are both strictly less than one:

|λj | < 1 for j = 1, 2

Remark: When both roots λ1 , λ2 of the characteristic polynomial have absolute values strictly less than
one, the absolute value of the larger one governs the rate of convergence to the steady state of the
nonstochastic version of the model

Things this lecture does

We write a function to generate simulations of a {Yt } sequence as a function of time


The function requires that we put in initial conditions for Y−1 , Y−2
The function checks that a, b are set so that λ1 , λ2 are less than unity in absolute value (also called
modulus)
The function also tells us whether the roots are complex, and, if they are complex, returns both their real
and complex parts
If the roots are both real, the function returns their values
We use our function written to simulate paths that are stochastic (when σ > 0)
We have written the function in a way that allows us to input {Gt } paths of a few simple forms, e.g.,
• one time jumps in G at some time
• a permanent jump in G that occurs at some time
We proceed to use the Samuelson multiplier-accelerator model as a laboratory to make a simple OOP
example
The state that determines next period's Yt+1 is now not just the current value Yt but also the once lagged
value Yt−1
This involves a little more bookkeeping than is required in the Solow model class definition
We use the Samuelson multiplier-accelerator model as a vehicle for teaching how we can gradually add
more features to the class
We want to have a method in the class that automatically generates a simulation, either nonstochastic (σ = 0)
or stochastic (σ > 0)
We also show how to map the Samuelson model into a simple instance of the LinearStateSpace class
described here


We can use a LinearStateSpace instance to do various things that we did above with our homemade function
and class
Among other things, we show by example that the eigenvalues of the matrix A that we use to form the
instance of the LinearStateSpace class for the Samuelson model equal the roots of the characteristic
polynomial (3.12) for the Samuelson multiplier-accelerator model
Here is the formula for the matrix A in the linear state space system in the case that government expenditures
are a constant G:

        ⎡   1     0   0  ⎤
    A = ⎢ γ + G  ρ1   ρ2 ⎥
        ⎣   0     1   0  ⎦

3.3.3 Implementation

We'll start by drawing an informative graph from page 189 of [Sar87]


import numpy as np
import matplotlib.pyplot as plt

def param_plot():

    """this function creates the graph on page 189 of Sargent Macroeconomic Theory, second edition, 1987"""

fig, ax = plt.subplots(figsize=(12, 8))


ax.set_aspect('equal')

# Set axis
xmin, ymin = -3, -2
xmax, ymax = -xmin, -ymin
plt.axis([xmin, xmax, ymin, ymax])

# Set axis labels


ax.set(xticks=[], yticks=[])
ax.set_xlabel(r'$\rho_2$', fontsize=16)
ax.xaxis.set_label_position('top')
ax.set_ylabel(r'$\rho_1$', rotation=0, fontsize=16)
ax.yaxis.set_label_position('right')

    # Draw (ρ1, ρ2) points


ρ1 = np.linspace(-2, 2, 100)
ax.plot(ρ1, -abs(ρ1) + 1, c='black')
ax.plot(ρ1, np.ones_like(ρ1) * -1, c='black')
ax.plot(ρ1, -(ρ1**2 / 4), c='black')

# Turn normal axes off


for spine in ['left', 'bottom', 'top', 'right']:
ax.spines[spine].set_visible(False)

# Add arrows to represent axes


axes_arrows = {'arrowstyle': '<|-|>', 'lw': 1.3}


ax.annotate('', xy=(xmin, 0), xytext=(xmax, 0), arrowprops=axes_arrows)


ax.annotate('', xy=(0, ymin), xytext=(0, ymax), arrowprops=axes_arrows)

# Annotate the plot with equations


plot_arrowsl = {'arrowstyle': '-|>', 'connectionstyle': "arc3, rad=-0.2"}
plot_arrowsr = {'arrowstyle': '-|>', 'connectionstyle': "arc3, rad=0.2"}
ax.annotate(r'$\rho_1 + \rho_2 < 1$', xy=(0.5, 0.3), xytext=(0.8, 0.6),
arrowprops=plot_arrowsr, fontsize='12')
ax.annotate(r'$\rho_1 + \rho_2 = 1$', xy=(0.38, 0.6), xytext=(0.6, 0.8),
arrowprops=plot_arrowsr, fontsize='12')
ax.annotate(r'$\rho_2 < 1 + \rho_1$', xy=(-0.5, 0.3), xytext=(-1.3, 0.6),
arrowprops=plot_arrowsl, fontsize='12')
ax.annotate(r'$\rho_2 = 1 + \rho_1$', xy=(-0.38, 0.6), xytext=(-1, 0.8),
arrowprops=plot_arrowsl, fontsize='12')
ax.annotate(r'$\rho_2 = -1$', xy=(1.5, -1), xytext=(1.8, -1.3),
arrowprops=plot_arrowsl, fontsize='12')
ax.annotate(r'${\rho_1}^2 + 4\rho_2 = 0$', xy=(1.15, -0.35),
xytext=(1.5, -0.3), arrowprops=plot_arrowsr, fontsize='12')
ax.annotate(r'${\rho_1}^2 + 4\rho_2 < 0$', xy=(1.4, -0.7),
xytext=(1.8, -0.6), arrowprops=plot_arrowsr, fontsize='12')

# Label categories of solutions


ax.text(1.5, 1, 'Explosive\n growth', ha='center', fontsize=16)
ax.text(-1.5, 1, 'Explosive\n oscillations', ha='center', fontsize=16)
ax.text(0.05, -1.5, 'Explosive oscillations', ha='center', fontsize=16)
ax.text(0.09, -0.5, 'Damped oscillations', ha='center', fontsize=16)

# Add small marker to y-axis


ax.axhline(y=1.005, xmin=0.495, xmax=0.505, c='black')
ax.text(-0.12, -1.12, '-1', fontsize=10)
ax.text(-0.12, 0.98, '1', fontsize=10)

return fig

param_plot()
plt.show()


The graph portrays regions in which the (λ1 , λ2 ) root pairs implied by the (ρ1 = (a+b), ρ2 = −b) difference
equation parameter pairs in the Samuelson model are such that:
• (λ1 , λ2 ) are complex with modulus less than 1 - in this case, the {Yt } sequence displays damped
oscillations
• (λ1 , λ2 ) are both real, but one is strictly greater than 1 - this leads to explosive growth
• (λ1 , λ2 ) are both real, but one is strictly less than −1 - this leads to explosive oscillations
• (λ1 , λ2 ) are both real and both are less than 1 in absolute value - in this case, there is smooth conver-
gence to the steady state without damped cycles
Later we'll present the graph with a red mark showing the particular point implied by the setting of (a, b)

Function to describe implications of characteristic polynomial

def categorize_solution(ρ1, ρ2):

    """this function takes values of ρ1 and ρ2 and uses them to classify the type of solution"""

discriminant = ρ1 ** 2 + 4 * ρ2
if ρ2 > 1 + ρ1 or ρ2 < -1:
print('Explosive oscillations')
elif ρ1 + ρ2 > 1:


print('Explosive growth')
elif discriminant < 0:
        print('Roots are complex with modulus less than one; therefore damped oscillations')
    else:
        print('Roots are real and absolute values are less than one; therefore get smooth convergence to a steady state')

### Test the categorize_solution function

categorize_solution(1.3, -.4)

Roots are real and absolute values are less than one; therefore get smooth convergence to a steady state

Function for plotting Yt paths

A useful function for our work below

def plot_y(function=None):
"""function plots path of Y_t"""
plt.subplots(figsize=(12, 8))
plt.plot(function)
plt.xlabel('Time $t$')
plt.ylabel('$Y_t$', rotation=0)
plt.grid()
plt.show()

Manual or by hand root calculations

The following function calculates roots of the characteristic polynomial using high school algebra
(We'll calculate the roots in other ways later)
The function also plots a Yt starting from initial conditions that we set

from cmath import sqrt

##=== This is a 'manual' method ===#

def y_nonstochastic(y_0=100, y_1=80, α=.92, β=.5, γ=10, n=80):

    """Takes values of parameters and computes roots of characteristic polynomial.
    It tells whether they are real or complex and whether they are less than unity
    in absolute value.

    It also computes a simulation of length n starting from the two given initial
    conditions for national income"""


roots = []

ρ1 = α + β
ρ2 = -β

print(f'ρ_1 is {ρ1}')
print(f'ρ_2 is {ρ2}')

discriminant = ρ1 ** 2 + 4 * ρ2

    # Roots of z**2 - ρ1 z - ρ2 = 0 via the quadratic formula
    if discriminant == 0:
        roots.append(ρ1 / 2)
        print('Single real root: ')
        print(''.join(str(roots)))
    elif discriminant > 0:
        roots.append((ρ1 + sqrt(discriminant).real) / 2)
        roots.append((ρ1 - sqrt(discriminant).real) / 2)
        print('Two real roots: ')
        print(''.join(str(roots)))
    else:
        roots.append((ρ1 + sqrt(discriminant)) / 2)
        roots.append((ρ1 - sqrt(discriminant)) / 2)
        print('Two complex roots: ')
        print(''.join(str(roots)))

if all(abs(root) < 1 for root in roots):


print('Absolute values of roots are less than one')
else:
print('Absolute values of roots are not less than one')

def transition(x, t): return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ

y_t = [y_0, y_1]

for t in range(2, n):


y_t.append(transition(y_t, t))

return y_t

plot_y(y_nonstochastic())

ρ_1 is 1.42
ρ_2 is -0.5
Two real roots:
[0.7740312423743284, 0.6459687576256715]
Absolute values of roots are less than one


Reverse engineering parameters to generate damped cycles

The next cell writes code that takes as inputs the modulus r and phase ϕ of a conjugate pair of complex
numbers in polar form

λ1 = r exp(iϕ), λ2 = r exp(−iϕ)

• The code assumes that these two complex numbers are the roots of the characteristic polynomial
• It then reverse engineers (a, b) and (ρ1 , ρ2 ), pairs that would generate those roots
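Since z^2 − ρ1 z − ρ2 = (z − λ1)(z − λ2), the roots satisfy λ1 + λ2 = ρ1 and λ1 λ2 = −ρ2, so the mapping the function below implements is

    ρ1 = λ1 + λ2 = 2 r cos(ϕ) ,    ρ2 = −λ1 λ2 = −r² ,    b = −ρ2 = r² ,    a = ρ1 − b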

### code to reverse engineer a cycle


### y_t = r^t (c_1 cos(ϕ t) + c_2 sin(ϕ t))
###

import cmath
import math

def f(r, ϕ):
    """
    Takes modulus r and angle ϕ of complex number r exp(jϕ)
    and creates ρ1 and ρ2 of characteristic polynomial for which
    r exp(jϕ) and r exp(-jϕ) are complex roots.

    Returns the multiplier coefficient a and the accelerator coefficient b
    that verifies those roots.
    """
    g1 = cmath.rect(r, ϕ)   # Generate two complex roots
    g2 = cmath.rect(r, -ϕ)
    ρ1 = g1 + g2            # Implied ρ1, ρ2
    ρ2 = -g1 * g2
    b = -ρ2                 # Reverse engineer a and b that validate these
    a = ρ1 - b
    return ρ1, ρ2, a, b

## Now let's use the function in an example


## Here are the example parameters

r = .95
period = 10 # Length of cycle in units of time
ϕ = 2 * math.pi/period

## Apply the function

ρ1, ρ2, a, b = f(r, ϕ)

print(f"a, b = {a}, {b}")


print(f"ρ1, ρ2 = {ρ1}, {ρ2}")

a, b = (0.6346322893124001+0j), (0.9024999999999999-0j)
ρ1, ρ2 = (1.5371322893124+0j), (-0.9024999999999999+0j)

## Print the real components of ρ1 and ρ2

ρ1 = ρ1.real
ρ2 = ρ2.real

ρ1, ρ2

(1.5371322893124, -0.9024999999999999)

Root finding using numpy

Here we'll use numpy to compute the roots of the characteristic polynomial

r1, r2 = np.roots([1, -ρ1, -ρ2])

p1 = cmath.polar(r1)
p2 = cmath.polar(r2)

print(f"r, ϕ = {r}, {ϕ}")


print(f"p1, p2 = {p1}, {p2}")
# print(f"g1, g2 = {g1}, {g2}")

print(f"a, b = {a}, {b}")


print(f"ρ1, ρ2 = {ρ1}, {ρ2}")

r, ϕ = 0.95, 0.6283185307179586
p1, p2 = (0.95, 0.6283185307179586), (0.95, -0.6283185307179586)
a, b = (0.6346322893124001+0j), (0.9024999999999999-0j)
ρ1, ρ2 = 1.5371322893124, -0.9024999999999999

##=== This method uses numpy to calculate roots ===#

def y_nonstochastic(y_0=100, y_1=80, α=.9, β=.8, γ=10, n=80):

    """Rather than computing the roots of the characteristic polynomial by hand
    as we did earlier, this function enlists numpy to do the work for us"""

# Useful constants
ρ1 = α + β
ρ2 = -β

categorize_solution(ρ1, ρ2)

# Find roots of polynomial


roots = np.roots([1, -ρ1, -ρ2])
print(f'Roots are {roots}')

# Check if real or complex


if all(isinstance(root, complex) for root in roots):
print('Roots are complex')
else:
print('Roots are real')

# Check if roots are less than one


if all(abs(root) < 1 for root in roots):
print('Roots are less than one')
else:
print('Roots are not less than one')

# Define transition equation


def transition(x, t): return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ

# Set initial conditions


y_t = [y_0, y_1]

# Generate y_t series


for t in range(2, n):
y_t.append(transition(y_t, t))

return y_t

plot_y(y_nonstochastic())


Roots are complex with modulus less than one; therefore damped oscillations
Roots are [ 0.85+0.27838822j 0.85-0.27838822j]
Roots are complex
Roots are less than one

Reverse engineered complex roots: example

The next cell studies the implications of reverse engineered complex roots
We'll generate an undamped cycle of period 10

r = 1 # generates undamped, nonexplosive cycles

period = 10 # length of cycle in units of time


ϕ = 2 * math.pi/period

## Apply the reverse engineering function f

ρ1, ρ2, a, b = f(r, ϕ)

a = a.real  # drop the imaginary part so that it is a valid input into y_nonstochastic
b = b.real


print(f"a, b = {a}, {b}")

ytemp = y_nonstochastic(α=a, β=b, y_0=20, y_1=30)


plot_y(ytemp)

a, b = 0.6180339887498949, 1.0
Roots are complex with modulus less than one; therefore damped oscillations
Roots are [ 0.80901699+0.58778525j 0.80901699-0.58778525j]
Roots are complex
Roots are less than one

Digression: using sympy to find roots

We can also use sympy to compute analytic formulas for the roots
import sympy
from sympy import Symbol, init_printing
init_printing()

r1 = Symbol("ρ_1")
r2 = Symbol("ρ_2")
z = Symbol("z")

sympy.solve(z**2 - r1*z - r2, z)


    [ρ1/2 − (1/2)√(ρ1² + 4ρ2) ,  ρ1/2 + (1/2)√(ρ1² + 4ρ2)]

a = Symbol("α")
b = Symbol("β")
r1 = a + b
r2 = -b

sympy.solve(z**2 - r1*z - r2, z)


    [α/2 + β/2 − (1/2)√(α² + 2αβ + β² − 4β) ,  α/2 + β/2 + (1/2)√(α² + 2αβ + β² − 4β)]

3.3.4 Stochastic shocks

Now we'll construct some code to simulate the stochastic version of the model that emerges when we add a
random shock process to aggregate demand

def y_stochastic(y_0=0, y_1=0, α=0.8, β=0.2, γ=10, n=100, σ=5):

    """This function takes parameters of a stochastic version of the model
    and proceeds to analyze the roots of the characteristic polynomial and
    also generate a simulation"""

# Useful constants
ρ1 = α + β
ρ2 = -β

# Categorize solution
categorize_solution(ρ1, ρ2)

# Find roots of polynomial


roots = np.roots([1, -ρ1, -ρ2])
print(roots)

# Check if real or complex


if all(isinstance(root, complex) for root in roots):
print('Roots are complex')
else:
print('Roots are real')

# Check if roots are less than one


if all(abs(root) < 1 for root in roots):
print('Roots are less than one')
else:
print('Roots are not less than one')

# Generate shocks
    ϵ = np.random.normal(0, 1, n)


# Define transition equation


    def transition(x, t): return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ + σ * ϵ[t]

# Set initial conditions


y_t = [y_0, y_1]

# Generate y_t series


for t in range(2, n):
y_t.append(transition(y_t, t))

return y_t

plot_y(y_stochastic())

Roots are real and absolute values are less than one; therefore get smooth convergence to a steady state
[ 0.7236068 0.2763932]
Roots are real
Roots are less than one

Let's do a simulation in which there are shocks and the characteristic polynomial has complex roots


r = .97

period = 10 # length of cycle in units of time


ϕ = 2 * math.pi/period

### apply the reverse engineering function f

ρ1, ρ2, a, b = f(r, ϕ)

a = a.real  # drop the imaginary part so that it is a valid input into y_stochastic
b = b.real

print(f"a, b = {a}, {b}")


plot_y(y_stochastic(y_0=40, y_1 = 42, α=a, β=b, σ=2, n=100))

a, b = 0.6285929690873979, 0.9409000000000001
Roots are complex with modulus less than one; therefore damped oscillations
[ 0.78474648+0.57015169j 0.78474648-0.57015169j]
Roots are complex
Roots are less than one


3.3.5 Government spending

This function computes a response to either a permanent or one-off increase in government expenditures

def y_stochastic_g(y_0=20,
y_1=20,
α=0.8,
β=0.2,
γ=10,
n=100,
σ=2,
g=0,
g_t=0,
                   duration='permanent'):

    """This program computes a response to a permanent or one-off increase in
    government expenditures that occurs at time g_t"""

# Useful constants
ρ1 = α + β
ρ2 = -β

# Categorize solution
categorize_solution(ρ1, ρ2)

# Find roots of polynomial


roots = np.roots([1, -ρ1, -ρ2])
print(roots)

# Check if real or complex


if all(isinstance(root, complex) for root in roots):
print('Roots are complex')
else:
print('Roots are real')

# Check if roots are less than one


if all(abs(root) < 1 for root in roots):
print('Roots are less than one')
else:
print('Roots are not less than one')

# Generate shocks
    ϵ = np.random.normal(0, 1, n)

    def transition(x, t, g=0):

        # Non-stochastic - separated to avoid generating random series when not needed
        if σ == 0:
            return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ + g

# Stochastic


else:
            ϵ = np.random.normal(0, 1, n)
            return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ + g + σ * ϵ[t]

# Create list and set initial conditions


y_t = [y_0, y_1]

# Generate y_t series


for t in range(2, n):

# No government spending
if g == 0:
y_t.append(transition(y_t, t))

# Government spending (no shock)


elif g != 0 and duration == None:
y_t.append(transition(y_t, t))

# Permanent government spending shock


elif duration == 'permanent':
if t < g_t:
y_t.append(transition(y_t, t, g=0))
else:
y_t.append(transition(y_t, t, g=g))

# One-off government spending shock


elif duration == 'one-off':
if t == g_t:
y_t.append(transition(y_t, t, g=g))
else:
y_t.append(transition(y_t, t, g=0))
return y_t

A permanent government spending shock can be simulated as follows

plot_y(y_stochastic_g(g=10, g_t=20, duration='permanent'))

Roots are real and absolute values are less than one; therefore get smooth convergence to a steady state

[ 0.7236068 0.2763932]
Roots are real
Roots are less than one


We can also see the response to a one time jump in government expenditures

plot_y(y_stochastic_g(g=500, g_t=50, duration='one-off'))

Roots are real and absolute values are less than one; therefore get smooth convergence to a steady state

[ 0.7236068 0.2763932]
Roots are real
Roots are less than one


3.3.6 Wrapping everything into a class

Up to now we have written functions to do the work


Now we'll roll up our sleeves and write a Python class called Samuelson for the Samuelson model

class Samuelson():

    r"""This class represents the Samuelson model, otherwise known as the
    multiplier-accelerator model. The model combines the Keynesian multiplier
    with the accelerator theory of investment.

    The path of output is governed by a linear second-order difference equation

    .. math::

        Y_t = \gamma + g + (\alpha + \beta) Y_{t-1} - \beta Y_{t-2}

Parameters
----------
y_0 : scalar
Initial condition for Y_0
y_1 : scalar
Initial condition for Y_1


α : scalar
Marginal propensity to consume
β : scalar
Accelerator coefficient
n : int
Number of iterations
σ : scalar
Volatility parameter. Must be greater than or equal to 0. Set
equal to 0 for non-stochastic model.
g : scalar
Government spending shock
g_t : int
Time at which government spending shock occurs. Must be specified
when duration != None.
duration : {None, 'permanent', 'one-off'}
Specifies type of government spending shock. If none, government
spending equal to g for all t.

"""

def __init__(self,
y_0=100,
y_1=50,
α=1.3,
β=0.2,
γ=10,
n=100,
σ=0,
g=0,
g_t=0,
duration=None):

self.y_0, self.y_1, self.α, self.β = y_0, y_1, α, β


self.n, self.g, self.g_t, self.duration = n, g, g_t, duration
self.γ, self.σ = γ, σ
self.ρ1 = α + β
self.ρ2 = -β
self.roots = np.roots([1, -self.ρ1, -self.ρ2])

def root_type(self):
if all(isinstance(root, complex) for root in self.roots):
return 'Complex conjugate'
elif len(self.roots) > 1:
return 'Double real'
else:
return 'Single real'

def root_less_than_one(self):
if all(abs(root) < 1 for root in self.roots):
return True

def solution_type(self):
ρ1, ρ2 = self.ρ1, self.ρ2


discriminant = ρ1 ** 2 + 4 * ρ2
if ρ2 >= 1 + ρ1 or ρ2 <= -1:
return 'Explosive oscillations'
elif ρ1 + ρ2 >= 1:
return 'Explosive growth'
elif discriminant < 0:
return 'Damped oscillations'
else:
return 'Steady state'

    def _transition(self, x, t, g=0):

        # Non-stochastic - separated to avoid generating random series when not needed
        if self.σ == 0:
            return self.ρ1 * x[t - 1] + self.ρ2 * x[t - 2] + self.γ + g

# Stochastic
else:
            ϵ = np.random.normal(0, 1, self.n)
            return self.ρ1 * x[t - 1] + self.ρ2 * x[t - 2] + self.γ + g + self.σ * ϵ[t]

def generate_series(self):

# Create list and set initial conditions


y_t = [self.y_0, self.y_1]

# Generate y_t series


for t in range(2, self.n):

# No government spending
if self.g == 0:
y_t.append(self._transition(y_t, t))

# Government spending (no shock)


elif self.g != 0 and self.duration == None:
y_t.append(self._transition(y_t, t))

# Permanent government spending shock


elif self.duration == 'permanent':
if t < self.g_t:
y_t.append(self._transition(y_t, t, g=0))
else:
y_t.append(self._transition(y_t, t, g=self.g))

# One-off government spending shock


elif self.duration == 'one-off':
if t == self.g_t:
y_t.append(self._transition(y_t, t, g=self.g))
else:
y_t.append(self._transition(y_t, t, g=0))
return y_t


def summary(self):
print('Summary\n' + '-' * 50)
print(f'Root type: {self.root_type()}')
print(f'Solution type: {self.solution_type()}')
print(f'Roots: {str(self.roots)}')

if self.root_less_than_one() == True:
print('Absolute value of roots is less than one')
else:
print('Absolute value of roots is not less than one')

if self.σ > 0:
print('Stochastic series with σ = ' + str(self.σ))
else:
print('Non-stochastic series')

if self.g != 0:
print('Government spending equal to ' + str(self.g))

if self.duration != None:
print(self.duration.capitalize() +
' government spending shock at t = ' + str(self.g_t))

def plot(self):
fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(self.generate_series())
ax.set(xlabel='Iteration', xlim=(0, self.n))
ax.set_ylabel('$Y_t$', rotation=0)
ax.grid()

# Add parameter values to plot


        paramstr = f'$\\alpha={self.α:.2f}$ \n $\\beta={self.β:.2f}$ \n $\\gamma={self.γ:.2f}$ \n \
        $\\sigma={self.σ:.2f}$ \n $\\rho_1={self.ρ1:.2f}$ \n $\\rho_2={self.ρ2:.2f}$'


props = dict(fc='white', pad=10, alpha=0.5)
ax.text(0.87, 0.05, paramstr, transform=ax.transAxes,
fontsize=12, bbox=props, va='bottom')

return fig

def param_plot(self):

# Uses the param_plot() function defined earlier (it is then able


# to be used standalone or as part of the model)

fig = param_plot()
ax = fig.gca()

# Add λ values to legend


for i, root in enumerate(self.roots):
if isinstance(root, complex):
                operator = ['+', '']  # Need to fill operator for positive as string is split apart


                label = rf'$\lambda_{i+1} = {self.roots[i].real:.2f} {operator[i]} {self.roots[i].imag:.2f}i$'
            else:
                label = rf'$\lambda_{i+1} = {self.roots[i].real:.2f}$'
ax.scatter(0, 0, 0, label=label) # dummy to add to legend

# Add ρ pair to plot


        ax.scatter(self.ρ1, self.ρ2, 100, 'red', '+', label=r'$(\ \rho_1, \ \rho_2 \ )$', zorder=5)

plt.legend(fontsize=12, loc=3)

return fig

Illustration of Samuelson class

Now we'll put our Samuelson class to work on an example

sam = Samuelson(α=0.8, β=0.5, σ=2, g=10, g_t=20, duration='permanent')


sam.summary()

Summary
---------------------------------------------------------
Root type: Complex conjugate
Solution type: Damped oscillations
Roots: [ 0.65+0.27838822j 0.65-0.27838822j]
Absolute value of roots is less than one
Stochastic series with σ = 2
Government spending equal to 10
Permanent government spending shock at t = 20

sam.plot()
plt.show()


Using the graph

We'll use our graph to show where the roots lie and how their location is consistent with the behavior of the
path just graphed
The red + sign shows the location of the roots

sam.param_plot()
plt.show()


3.3.7 Using the LinearStateSpace class

It turns out that we can use the QuantEcon.py LinearStateSpace class to do much of the work that we have
done from scratch above
Here is how we map the Samuelson model into an instance of a LinearStateSpace class
from quantecon import LinearStateSpace

""" This script maps the Samuelson model into the LinearStateSpace class """
α = 0.8
β = 0.9
ρ1 = α + β
ρ2 = -β
γ = 10
σ = 1
g = 10
n = 100

A = [[1, 0, 0],
[γ + g, ρ1, ρ2],
[0, 1, 0]]

G = [[γ + g, ρ1, ρ2], # this is Y_{t+1}


[γ, α, 0], # this is C_{t+1}


[0, β, -β]] # this is I_{t+1}

µ_0 = [1, 100, 100]


C = np.zeros((3,1))
C[1] = σ # stochastic

sam_t = LinearStateSpace(A, C, G, mu_0=µ_0)

x, y = sam_t.simulate(ts_length=n)

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(15, 8))


titles = ['Output ($Y_t$)', 'Consumption ($C_t$)', 'Investment ($I_t$)']
colors = ['darkblue', 'red', 'purple']
for ax, series, title, color in zip(axes, y, titles, colors):
ax.plot(series, color=color)
ax.set(title=title, xlim=(0, n))
ax.grid()

axes[-1].set_xlabel('Iteration')

plt.show()

Other methods in the LinearStateSpace class

Let's plot impulse response functions for the instance of the Samuelson model using a method in the
LinearStateSpace class


imres = sam_t.impulse_response()
imres = np.asarray(imres)
y1 = imres[:, :, 0]
y2 = imres[:, :, 1]
y1.shape

(2, 6, 1)

Now let's compute the zeros of the characteristic polynomial by simply calculating the eigenvalues of A

A = np.asarray(A)
w, v = np.linalg.eig(A)
print(w)

[ 0.85+0.42130749j 0.85-0.42130749j 1.00+0.j ]
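As a quick cross-check (a one-line sketch of our own, reusing the ρ1 and ρ2 already defined in this script), the two nontrivial eigenvalues coincide with the roots of the characteristic polynomial, while the remaining unit eigenvalue comes from the constant term in the state vector

print(np.roots([1, -ρ1, -ρ2]))  # matches the two complex eigenvalues of A above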

Inheriting methods from LinearStateSpace

We could also create a subclass of LinearStateSpace (inheriting all its methods and attributes) to add more
functions to use

class SamuelsonLSS(LinearStateSpace):

"""
this subclass creates a Samuelson multiplier-accelerator model
as a linear state space system
"""
def __init__(self,
y_0=100,
y_1=100,
α=0.8,
β=0.9,
γ=10,
σ=1,
g=10):

self.α, self.β = α, β
self.y_0, self.y_1, self.g = y_0, y_1, g
self.γ, self.σ = γ, σ

        # Define initial conditions


self.µ_0 = [1, y_0, y_1]

self.ρ1 = α + β
self.ρ2 = -β

# Define transition matrix


self.A = [[1, 0, 0],
[γ + g, self.ρ1, self.ρ2],
[0, 1, 0]]


# Define output matrix


self.G = [[γ + g, self.ρ1, self.ρ2], # this is Y_{t+1}
[γ, α, 0], # this is C_{t+1}
[0, β, -β]] # this is I_{t+1}

self.C = np.zeros((3, 1))


self.C[1] = σ # stochastic

        # Initialize LSS with parameters from Samuelson model


LinearStateSpace.__init__(self, self.A, self.C, self.G, mu_0=self.µ_0)

def plot_simulation(self, ts_length=100, stationary=True):

# Temporarily store original parameters


temp_µ = self.µ_0
temp_Σ = self.Sigma_0

        # Set distribution parameters equal to their stationary values for simulation
        if stationary == True:
            try:
                self.µ_x, self.µ_y, self.σ_x, self.σ_y = self.stationary_distributions()
                self.µ_0 = self.µ_y
                self.Σ_0 = self.σ_y
            # Exception where no convergence achieved when calculating stationary distributions
            except ValueError:
                print('Stationary distribution does not exist')

x, y = self.simulate(ts_length)

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(15, 8))


        titles = ['Output ($Y_t$)', 'Consumption ($C_t$)', 'Investment ($I_t$)']
colors = ['darkblue', 'red', 'purple']
for ax, series, title, color in zip(axes, y, titles, colors):
ax.plot(series, color=color)
ax.set(title=title, xlim=(0, n))
ax.grid()

axes[-1].set_xlabel('Iteration')

# Reset distribution parameters to their initial values


self.µ_0 = temp_µ
self.Sigma_0 = temp_Σ

return fig

def plot_irf(self, j=5):

x, y = self.impulse_response(j)


# Reshape into 3 x j matrix for plotting purposes


yimf = np.array(y).flatten().reshape(j+1, 3).T

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(15, 8))


labels = ['$Y_t$', '$C_t$', '$I_t$']
colors = ['darkblue', 'red', 'purple']
for ax, series, label, color in zip(axes, yimf, labels, colors):
ax.plot(series, color=color)
ax.set(xlim=(0, j))
ax.set_ylabel(label, rotation=0, fontsize=14, labelpad=10)
ax.grid()

axes[0].set_title('Impulse Response Functions')


axes[-1].set_xlabel('Iteration')

return fig

def multipliers(self, j=5):


x, y = self.impulse_response(j)
return np.sum(np.array(y).flatten().reshape(j+1, 3), axis=0)

Illustrations

Let's show how we can use the SamuelsonLSS

samlss = SamuelsonLSS()

samlss.plot_simulation(100, stationary=False)
plt.show()


samlss.plot_simulation(100, stationary=True)
plt.show()

samlss.plot_irf(100)
plt.show()


samlss.multipliers()

array([ 7.414389, 6.835896, 0.578493])

3.3.8 Pure multiplier model

Let's shut down the accelerator by setting β = 0 to get a pure multiplier model
• the absence of cycles gives an idea about why Samuelson included the accelerator

pure_multiplier = SamuelsonLSS(α=0.95, β=0)

pure_multiplier.plot_simulation()

Stationary distribution does not exist


pure_multiplier = SamuelsonLSS(α=0.8, β=0)

pure_multiplier.plot_simulation()


pure_multiplier.plot_irf(100)

3.3.9 Summary

In this lecture, we wrote functions and classes to represent non-stochastic and stochastic versions of the
Samuelson (1939) multiplier-accelerator model, described in [Sam39]
We saw that different parameter values led to different output paths, which could either be stationary,
explosive, or oscillating
We also were able to represent the model using the QuantEcon.py LinearStateSpace class

3.4 More Language Features

Contents

• More Language Features


– Overview
– Iterables and Iterators
– Names and Name Resolution
– Handling Errors
– Decorators and Descriptors


– Generators
– Recursive Function Calls
– Exercises
– Solutions

3.4.1 Overview

With this last lecture, our advice is to skip it on first pass, unless you have a burning desire to read it
It's here
1. as a reference, so we can link back to it when required, and
2. for those who have worked through a number of applications, and now want to learn more about the
Python language
A variety of topics are treated in the lecture, including generators, exceptions and descriptors

3.4.2 Iterables and Iterators

We've already said something about iterating in Python


Now let's look more closely at how it all works, focusing on Python's implementation of the for loop

Iterators

Iterators are a uniform interface to stepping through elements in a collection


Here we'll talk about using iterators; later we'll learn how to build our own
Formally, an iterator is an object with a __next__ method
For example, file objects are iterators
To see this, let's have another look at the US cities data

f = open('us_cities.txt')
f.__next__()

'new york: 8244910\n'

f.__next__()

'los angeles: 3819702\n'

We see that file objects do indeed have a __next__ method, and that calling this method returns the next
line in the file


The __next__ method can also be accessed via the built-in function next(), which directly calls this method

next(f)

'chicago: 2707120 \n'

The objects returned by enumerate() are also iterators

e = enumerate(['foo', 'bar'])
next(e)

(0, 'foo')

next(e)

(1, 'bar')

as are the reader objects from the csv module

from csv import reader

f = open('test_table.csv', 'r')
nikkei_data = reader(f)
next(nikkei_data)

['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']

next(nikkei_data)

['2009-05-21', '9280.35', '9286.35', '9189.92', '9264.15', '133200', '9264.15']

Iterators in For Loops

All iterators can be placed to the right of the in keyword in for loop statements
In fact this is how the for loop works: If we write

for x in iterator:
<code block>

then the interpreter


• calls iterator.__next__() and binds x to the result
• executes the code block
• repeats until a StopIteration error occurs
So now you know how this magical looking syntax works


f = open('somefile.txt', 'r')
for line in f:
    pass   # do something with line

The interpreter just keeps


1. calling f.__next__() and binding line to the result
2. executing the body of the loop
This continues until a StopIteration error occurs
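
To make the mechanics concrete, here is a minimal sketch that emulates the loop by hand; the name manual_for is our own invention, not part of any library

def manual_for(iterator):
    while True:
        try:
            line = iterator.__next__()   # what the for loop calls behind the scenes
        except StopIteration:            # raised when the iterator is exhausted
            break
        print(line, end='')              # stands in for the body of the loop

manual_for(open('somefile.txt', 'r'))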

Iterables

You already know that we can put a Python list to the right of in in a for loop

for i in ['spam', 'eggs']:
    print(i)

spam
eggs

So does that mean that a list is an iterator?


The answer is no:

x = ['foo', 'bar']
type(x)

list

next(x)

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-5e4e57af3a97> in <module>()
----> 1 next(x)

TypeError: 'list' object is not an iterator

So why can we iterate over a list in a for loop?


The reason is that a list is iterable (as opposed to an iterator)
Formally, an object is iterable if it can be converted to an iterator using the built-in function iter()
Lists are one such object

x = ['foo', 'bar']
type(x)


list

y = iter(x)
type(y)

list_iterator

next(y)

'foo'

next(y)

'bar'

next(y)

---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-62-75a92ee8313a> in <module>()
----> 1 next(y)

StopIteration:

Many other objects are iterable, such as dictionaries and tuples


Of course, not all objects are iterable

iter(42)

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-63-826bbd6e91fc> in <module>()
----> 1 iter(42)

TypeError: 'int' object is not iterable

To conclude our discussion of for loops


• for loops work on either iterators or iterables
• In the second case, the iterable is converted into an iterator before the loop starts

Iterators and built-ins

Some built-in functions that act on sequences also work with iterables
• max(), min(), sum(), all(), any()


For example

x = [10, -10]
max(x)

10

y = iter(x)
type(y)

list_iterator

max(y)

10

One thing to remember about iterators is that they are depleted by use

x = [10, -10]
y = iter(x)
max(y)

10

max(y)

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-72-1d3b6314f310> in <module>()
----> 1 max(y)

ValueError: max() arg is an empty sequence

3.4.3 Names and Name Resolution

Variable Names in Python

Consider the Python statement

x = 42

We now know that when this statement is executed, Python creates an object of type int in your computer's
memory, containing
• the value 42
• some associated attributes


But what is x itself?


In Python, x is called a name, and the statement x = 42 binds the name x to the integer object we have
just discussed
Under the hood, this process of binding names to objects is implemented as a dictionary; more about this in a
moment
There is no problem binding two or more names to the same object, regardless of what that object is

def f(string):        # Create a function called f
    print(string)     # that prints any string it's passed

g = f
id(g) == id(f)

True

g('test')

test

In the first step, a function object is created, and the name f is bound to it
After binding the name g to the same object, we can use it anywhere we would use f
What happens when the number of names bound to an object goes to zero?
Here's an example of this situation, where the name x is first bound to one object and then rebound to another

x = 'foo'
id(x)

164994764

x = 'bar' # No names bound to object 164994764

What happens here is that the first object, with identity 164994764 is garbage collected
In other words, the memory slot that stores that object is deallocated, and returned to the operating system
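
We can peek at the bookkeeping behind this with sys.getrefcount, which reports roughly how many references an object currently has; this is only an illustrative sketch, since the exact counts can vary (the function call itself adds a temporary reference)

import sys

x = ['some', 'object']       # a fresh object with a single name bound to it
print(sys.getrefcount(x))    # typically 2: the name x plus the call's temporary reference

y = x                        # bind a second name to the same object
print(sys.getrefcount(x))    # the count rises by one

del y                        # remove one binding and the count falls again
print(sys.getrefcount(x))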

Namespaces

Recall from the preceding discussion that the statement

x = 42

binds the name x to the integer object on the right-hand side


We also mentioned that this process of binding x to the correct object is implemented as a dictionary
This dictionary is called a namespace


Definition: A namespace is a symbol table that maps names to objects in memory


Python uses multiple namespaces, creating them on the fly as necessary
For example, every time we import a module, Python creates a namespace for that module
To see this in action, suppose we write a script math2.py like this

# Filename: math2.py
pi = 'foobar'

Now we start the Python interpreter and import it

import math2

Next let's import the math module from the standard library

import math

Both of these modules have an attribute called pi

math.pi

3.1415926535897931

math2.pi

'foobar'

These two different bindings of pi exist in different namespaces, each one implemented as a dictionary
We can look at the dictionary directly, using module_name.__dict__

import math

math.__dict__

{'pow': <built-in function pow>, ..., 'pi': 3.1415926535897931, ...} # Edited output

import math2

math2.__dict__

{..., '__file__': 'math2.py', 'pi': 'foobar',...} # Edited output

As you know, we access elements of the namespace using the dotted attribute notation

math.pi


3.1415926535897931

In fact this is entirely equivalent to math.__dict__['pi']

math.__dict__['pi'] == math.pi

True

Viewing Namespaces

As we saw above, the math namespace can be printed by typing math.__dict__


Another way to see its contents is to type vars(math)

vars(math)

{'pow': <built-in function pow>,...

If you just want to see the names, you can type

dir(math)

['__doc__', '__name__', 'acos', 'asin', 'atan',...

Notice the special names __doc__ and __name__


These are initialized in the namespace when any module is imported
• __doc__ is the doc string of the module
• __name__ is the name of the module

print(math.__doc__)

This module is always available. It provides access to the
mathematical functions defined by the C standard.

math.__name__

'math'

Interactive Sessions

In Python, all code executed by the interpreter runs in some module


What about commands typed at the prompt?
These are also regarded as being executed within a module; in this case, a module called __main__


To check this, we can look at the current module name via the value of __name__ given at the prompt

print(__name__)

__main__

When we run a script using IPython's run command, the contents of the file are executed as part of
__main__ too
To see this, let's create a file mod.py that prints its own __name__ attribute

# Filename: mod.py
print(__name__)

Now let's look at two different ways of running it in IPython

import mod # Standard import

mod

%run mod.py # Run interactively

__main__

In the second case, the code is executed as part of __main__, so __name__ is equal to __main__
To see the contents of the namespace of __main__ we use vars() rather than vars(__main__)
If you do this in IPython, you will see a whole lot of variables that IPython needs, and has initialized when
you started up your session
If you prefer to see only the variables you have initialized, use %whos

x = 2
y = 3

import numpy as np

%whos

Variable Type Data/Info


------------------------------
np module <module 'numpy' from '/us<...>ages/numpy/__init__.pyc'>
x int 2
y int 3

The Global Namespace

Python documentation often makes reference to the global namespace


The global namespace is the namespace of the module currently being executed


For example, suppose that we start the interpreter and begin making assignments
We are now working in the module __main__, and hence the namespace for __main__ is the global
namespace
Next, we import a module called amodule

import amodule

At this point, the interpreter creates a namespace for the module amodule and starts executing commands
in the module
While this occurs, the namespace amodule.__dict__ is the global namespace
Once execution of the module finishes, the interpreter returns to the module from where the import statement
was made
In this case it's __main__, so the namespace of __main__ again becomes the global namespace

Local Namespaces

Important fact: When we call a function, the interpreter creates a local namespace for that function, and
registers the variables in that namespace
The reason for this will be explained in just a moment
Variables in the local namespace are called local variables
After the function returns, the namespace is deallocated and lost
While the function is executing, we can view the contents of the local namespace with locals()
For example, consider

def f(x):
a = 2
print(locals())
return a * x

Now lets call the function

f(1)

{'a': 2, 'x': 1}

You can see the local namespace of f before it is destroyed

The __builtins__ Namespace

We have been using various built-in functions, such as max(), dir(), str(), list(), len(),
range(), type(), etc.
How does access to these names work?


• These definitions are stored in a module called builtins
• They have their own namespace called __builtins__

dir()

[..., '__builtins__', '__doc__', ...] # Edited output

dir(__builtins__)

[... 'iter', 'len', 'license', 'list', 'locals', ...] # Edited output

We can access elements of the namespace as follows

__builtins__.max

<built-in function max>

But __builtins__ is special, because we can always access its names directly as well

max

<built-in function max>

__builtins__.max == max

True

The next section explains how this works

Name Resolution

Namespaces are great because they help us organize variable names


(Type import this at the prompt and look at the last item that's printed)
However, we do need to understand how the Python interpreter works with multiple namespaces
At any point of execution, there are in fact at least two namespaces that can be accessed directly
(Accessed directly means without using a dot, as in pi rather than math.pi)
These namespaces are
• The global namespace (of the module being executed)
• The builtin namespace
If the interpreter is executing a function, then the directly accessible namespaces are
• The local namespace of the function
• The global namespace (of the module being executed)


• The builtin namespace


Sometimes functions are defined within other functions, like so

def f():
    a = 2
    def g():
        b = 4
        print(a * b)
    g()

Here f is the enclosing function for g, and each function gets its own namespace
Now we can give the rule for how namespace resolution works:
The order in which the interpreter searches for names is
1. the local namespace (if it exists)
2. the hierarchy of enclosing namespaces (if they exist)
3. the global namespace
4. the builtin namespace
If the name is not in any of these namespaces, the interpreter raises a NameError
This is called the LEGB rule (local, enclosing, global, builtin)
Here's an example that helps to illustrate
Consider a script test.py that looks as follows
def g(x):
a = 1
x = x + a
return x

a = 0
y = g(10)
print("a = ", a, "y = ", y)

What happens when we run this script?


%run test.py

a = 0 y = 11

If we then ask the interpreter for the value of x at the prompt, we get a NameError, because the local name x
was discarded when g returned
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-2-401b30e3b8b5> in <module>()
----> 1 x

NameError: name 'x' is not defined


First,
• The global namespace {} is created
• The function object is created, and g is bound to it within the global namespace
• The name a is bound to 0, again in the global namespace
Next g is called via y = g(10), leading to the following sequence of actions
• The local namespace for the function is created
• Local names x and a are bound, so that the local namespace becomes {'x': 10, 'a': 1}
• Statement x = x + a uses the local a and local x to compute x + a, and binds local name x to
the result
• This value is returned, and y is bound to it in the global namespace
• Local x and a are discarded (and the local namespace is deallocated)
Note that the global a was not affected by the local a
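
If we genuinely want a function to rebind a global name (usually considered poor style), Python requires an explicit declaration; here is a small sketch using the global keyword

a = 0

def h():
    global a    # declare that a refers to the global name, not a new local
    a = 1

h()
print(a)        # 1 -- the global binding was changed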

Mutable Versus Immutable Parameters

This is a good time to say a little more about mutable vs immutable objects
Consider the code segment

def f(x):
x = x + 1
return x

x = 1
print(f(x), x)

We now understand what will happen here: The code prints 2 as the value of f(x) and 1 as the value of x
First f and x are registered in the global namespace
The call f(x) creates a local namespace and adds x to it, bound to 1
Next, this local x is rebound to the new integer object 2, and this value is returned
None of this affects the global x
However, it's a different story when we use a mutable data type such as a list

def f(x):
x[0] = x[0] + 1
return x

x = [1]
print(f(x), x)


This prints [2] as the value of f(x) and the same for x

Here's what happens
• f is registered as a function in the global namespace
• x bound to [1] in the global namespace
• The call f(x)
– Creates a local namespace
– Adds x to local namespace, bound to [1]
– The list [1] is modified to [2]
– Returns the list [2]
– The local namespace is deallocated, and local x is lost
• Global x has been modified
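
One way to convince yourself that the local and global names are bound to the very same list is to compare identities with id(); this check is our own addition, not part of the original example

def f(x):
    print(id(x))        # identity of the object the local name x is bound to
    x[0] = x[0] + 1
    return x

x = [1]
print(id(x))            # the same identity is printed inside f below
f(x)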

3.4.4 Handling Errors

Sometimes it's possible to anticipate errors as we're writing code


For example, the unbiased sample variance of sample $y_1, \ldots, y_n$ is defined as

$$
s^2 := \frac{1}{n-1} \sum_{i=1}^n (y_i - \bar y)^2, \qquad \bar y = \text{sample mean}
$$

This can be calculated in NumPy using np.var


But if you were writing a function to handle such a calculation, you might anticipate a divide-by-zero error
when the sample size is one
One possible action is to do nothing; the program will just crash, and spit out an error message
But sometimes it's worth writing your code in a way that anticipates and deals with runtime errors that you
think might arise
Why?
• Because the debugging information provided by the interpreter is often less useful than the information
on possible errors you have in your head when writing code
• Because errors causing execution to stop are frustrating if you're in the middle of a large computation
• Because it reduces confidence in your code on the part of your users (if you are writing for others)

Assertions

A relatively easy way to handle checks is with the assert keyword


For example, pretend for a moment that the np.var function doesn't exist and we need to write our own


def var(y):
n = len(y)
assert n > 1, 'Sample size must be greater than one.'
return np.sum((y - y.mean())**2) / float(n-1)

If we run this with an array of length one, the program will terminate and print our error message

var([1])

---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-20-0032ff8a150f> in <module>()
----> 1 var([1])

<ipython-input-19-cefafaec3555> in var(y)
1 def var(y):
2 n = len(y)
----> 3 assert n > 1, 'Sample size must be greater than one.'
4 return np.sum((y - y.mean())**2) / float(n-1)

AssertionError: Sample size must be greater than one.

The advantage is that we can


• fail early, as soon as we know there will be a problem
• supply specific information on why a program is failing

Handling Errors During Runtime

The approach used above is a bit limited, because it always leads to termination
Sometimes we can handle errors more gracefully, by treating special cases
Let's look at how this is done

Exceptions

Here's an example of a common error type

def f:

File "<ipython-input-5-f5bdb6d29788>", line 1


def f:
^
SyntaxError: invalid syntax

Since illegal syntax cannot be executed, a syntax error terminates execution of the program
Here's a different kind of error, unrelated to syntax


1 / 0

---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-17-05c9758a9c21> in <module>()
----> 1 1/0

ZeroDivisionError: division by zero

Here's another

x1 = y1

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-23-142e0509fbd6> in <module>()
----> 1 x1 = y1

NameError: name 'y1' is not defined

And another

'foo' + 6

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-20-44bbe7e963e7> in <module>()
----> 1 'foo' + 6

TypeError: can only concatenate str (not "int") to str

And another

X = []
x = X[0]

---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-22-018da6d9fc14> in <module>()
----> 1 x = X[0]

IndexError: list index out of range

On each occasion, the interpreter informs us of the error type


• NameError, TypeError, IndexError, ZeroDivisionError, etc.
In Python, these errors are called exceptions


Catching Exceptions

We can catch and deal with exceptions using try – except blocks
Here's a simple example

def f(x):
try:
return 1.0 / x
except ZeroDivisionError:
print('Error: division by zero. Returned None')
return None

When we call f we get the following output

f(2)

0.5

f(0)

Error: division by zero. Returned None

f(0.0)

Error: division by zero. Returned None

The error is caught and execution of the program is not terminated


Note that other error types are not caught
If we are worried the user might pass in a string, we can catch that error too

def f(x):
try:
return 1.0 / x
except ZeroDivisionError:
print('Error: Division by zero. Returned None')
except TypeError:
print('Error: Unsupported operation. Returned None')
return None

Here's what happens

f(2)

0.5

f(0)


Error: Division by zero. Returned None

f('foo')

Error: Unsupported operation. Returned None

If we feel lazy we can catch these errors together

def f(x):
try:
return 1.0 / x
except (TypeError, ZeroDivisionError):
print('Error: Unsupported operation. Returned None')
return None

Here's what happens

f(2)

0.5

f(0)

Error: Unsupported operation. Returned None

f('foo')

Error: Unsupported operation. Returned None

If we feel extra lazy we can catch all error types as follows

def f(x):
try:
return 1.0 / x
except:
print('Error. Returned None')
return None

In general it's better to be specific
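
If you do catch several error types in one handler, you can still report the underlying problem by binding the exception instance to a name; here is a sketch of this pattern

def f(x):
    try:
        return 1.0 / x
    except (TypeError, ZeroDivisionError) as e:
        # e is the exception instance, so the original message is preserved
        print(f'Error ({e}). Returned None')
        return None

f('foo')   # prints the TypeError details and returns None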

3.4.5 Decorators and Descriptors

Let's look at some special syntax elements that are routinely used by Python developers
You might not need the following concepts immediately, but you will see them in other people's code
Hence you need to understand them at some stage of your Python education


Decorators

Decorators are a bit of syntactic sugar that, while easily avoided, have turned out to be popular
It's very easy to say what decorators do
On the other hand it takes a bit of effort to explain why you might use them

An Example

Suppose we are working on a program that looks something like this

import numpy as np

def f(x):
return np.log(np.log(x))

def g(x):
return np.sqrt(42 * x)

# Program continues with various calculations using f and g

Now suppose there's a problem: occasionally negative numbers get fed to f and g in the calculations that
follow
If you try it, you'll see that when these functions are called with negative numbers they return a NumPy
object called nan
This stands for "not a number" (and indicates that you are trying to evaluate a mathematical function at a point
where it is not defined)
Perhaps this isn't what we want, because it causes other problems that are hard to pick up later on
Suppose that instead we want the program to terminate whenever this happens, with a sensible error message
This change is easy enough to implement

import numpy as np

def f(x):
assert x >= 0, "Argument must be nonnegative"
return np.log(np.log(x))

def g(x):
assert x >= 0, "Argument must be nonnegative"
return np.sqrt(42 * x)

# Program continues with various calculations using f and g

Notice however that there is some repetition here, in the form of two identical lines of code
Repetition makes our code longer and harder to maintain, and hence is something we try hard to avoid


Here it's not a big deal, but imagine now that instead of just f and g, we have 20 such functions that we need
to modify in exactly the same way
This means we need to repeat the test logic (i.e., the assert line testing nonnegativity) 20 times
The situation is still worse if the test logic is longer and more complicated
In this kind of scenario the following approach would be neater

import numpy as np

def check_nonneg(func):
def safe_function(x):
assert x >= 0, "Argument must be nonnegative"
return func(x)
return safe_function

def f(x):
return np.log(np.log(x))

def g(x):
return np.sqrt(42 * x)

f = check_nonneg(f)
g = check_nonneg(g)
# Program continues with various calculations using f and g

This looks complicated, so let's work through it slowly


To unravel the logic, consider what happens when we say f = check_nonneg(f)
This calls the function check_nonneg with parameter func set equal to f
Now check_nonneg creates a new function called safe_function that verifies x as nonnegative and
then calls func on it (which is the same as f)
Finally, the global name f is then set equal to safe_function
Now the behavior of f is as we desire, and the same is true of g
At the same time, the test logic is written only once

Enter Decorators

The last version of our code is still not ideal


For example, if someone is reading our code and wants to know how f works, they will be looking for the
function definition, which is

def f(x):
return np.log(np.log(x))

They may well miss the line f = check_nonneg(f)


For this and other reasons, decorators were introduced to Python


With decorators, we can replace the lines

def f(x):
return np.log(np.log(x))

def g(x):
return np.sqrt(42 * x)

f = check_nonneg(f)
g = check_nonneg(g)

with

@check_nonneg
def f(x):
return np.log(np.log(x))

@check_nonneg
def g(x):
return np.sqrt(42 * x)

These two pieces of code do exactly the same thing


If they do the same thing, do we really need decorator syntax?
Well, notice that the decorators sit right on top of the function definitions
Hence anyone looking at the definition of the function will see them and be aware that the function is
modified
In the opinion of many people, this makes the decorator syntax a significant improvement to the language
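
One wrinkle worth knowing about: as written, the wrapped function loses its original name and docstring, because f ends up bound to safe_function. The standard library's functools.wraps decorator fixes this; here is a sketch

import functools
import numpy as np

def check_nonneg(func):
    @functools.wraps(func)              # copy func's name and docstring to the wrapper
    def safe_function(x):
        assert x >= 0, "Argument must be nonnegative"
        return func(x)
    return safe_function

@check_nonneg
def f(x):
    "Return log(log(x))."
    return np.log(np.log(x))

print(f.__name__)    # 'f', not 'safe_function'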

Descriptors

Descriptors solve a common problem regarding management of variables


To understand the issue, consider a Car class that simulates a car
Suppose that this class defines the variables miles and kms, which give the distance traveled in miles and
kilometers respectively
A highly simplified version of the class might look as follows

class Car:

    def __init__(self, miles=1000):
        self.miles = miles
        self.kms = miles * 1.61

    # Some other functionality, details omitted

One potential problem we might have here is that a user alters one of these variables but not the other


car = Car()
car.miles

1000

car.kms

1610.0

car.miles = 6000
car.kms

1610.0

In the last two lines we see that miles and kms are out of sync
What we really want is some mechanism whereby each time a user sets one of these variables, the other is
automatically updated

A Solution

In Python, this issue is solved using descriptors


A descriptor is just a Python object that implements certain methods
These methods are triggered when the object is accessed through dotted attribute notation
The best way to understand this is to see it in action
Consider this alternative version of the Car class
class Car:

    def __init__(self, miles=1000):
        self._miles = miles
        self._kms = miles * 1.61

    def set_miles(self, value):
        self._miles = value
        self._kms = value * 1.61

    def set_kms(self, value):
        self._kms = value
        self._miles = value / 1.61

    def get_miles(self):
        return self._miles

    def get_kms(self):
        return self._kms

    miles = property(get_miles, set_miles)
    kms = property(get_kms, set_kms)

First let's check that we get the desired behavior

car = Car()
car.miles

1000

car.miles = 6000
car.kms

9660.0

Yep, that's what we want: car.kms is automatically updated

How it Works

The names _miles and _kms are arbitrary names we are using to store the values of the variables
The objects miles and kms are properties, a common kind of descriptor
The methods get_miles, set_miles, get_kms and set_kms define what happens when you get
(i.e. access) or set (bind) these variables
• So-called getter and setter methods
The builtin Python function property takes getter and setter methods and creates a property
For example, after car is created as an instance of Car, the object car.miles is a property
Being a property, when we set its value via car.miles = 6000 its setter method is triggered; in this
case, set_miles

Decorators and Properties

These days it's very common to see the property function used via a decorator
Here's another version of our Car class that works as before but now uses decorators to set up the properties

class Car:

    def __init__(self, miles=1000):
        self._miles = miles
        self._kms = miles * 1.61

    @property
    def miles(self):
        return self._miles

    @property
    def kms(self):
        return self._kms

    @miles.setter
    def miles(self, value):
        self._miles = value
        self._kms = value * 1.61

    @kms.setter
    def kms(self, value):
        self._kms = value
        self._miles = value / 1.61

We won't go through all the details here


For further information you can refer to the descriptor documentation
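
For a taste of what a hand-written descriptor looks like, here is a minimal sketch of a descriptor class that refuses negative values; the names NonNegative and Account are our own inventions, and the __set_name__ hook used here is available from Python 3.6

class NonNegative:

    def __set_name__(self, owner, name):
        self.name = '_' + name           # where the value is actually stored

    def __get__(self, instance, owner):
        return getattr(instance, self.name)

    def __set__(self, instance, value):
        assert value >= 0, 'value must be nonnegative'
        setattr(instance, self.name, value)

class Account:

    balance = NonNegative()              # attribute managed by the descriptor

    def __init__(self, balance=0):
        self.balance = balance           # triggers NonNegative.__set__

a = Account(100)
a.balance = 50       # fine
# a.balance = -1     # would raise AssertionError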

3.4.6 Generators

A generator is a kind of iterator (i.e., it works with a next function)


We will study two ways to build generators: generator expressions and generator functions

Generator Expressions

The easiest way to build generators is using generator expressions


Just like a list comprehension, but with round brackets
Here is the list comprehension:

singular = ('dog', 'cat', 'bird')


type(singular)

tuple

plural = [string + 's' for string in singular]


plural

['dogs', 'cats', 'birds']

type(plural)

list

And here is the generator expression


singular = ('dog', 'cat', 'bird')


plural = (string + 's' for string in singular)
type(plural)

generator

next(plural)

'dogs'

next(plural)

'cats'

next(plural)

'birds'

Since sum() can be called on iterators, we can do this

sum((x * x for x in range(10)))

285

The function sum() calls next() to get the items, adds successive terms
In fact, we can omit the outer brackets in this case

sum(x * x for x in range(10))

285

Generator Functions

The most flexible way to create generator objects is to use generator functions
Let's look at some examples

Example 1

Here's a very simple example of a generator function

def f():
yield 'start'
yield 'middle'
yield 'end'


It looks like a function, but uses a keyword yield that we haven't met before
Let's see how it works after running this code

type(f)

function

gen = f()
gen

<generator object f at 0x3b66a50>

next(gen)

'start'

next(gen)

'middle'

next(gen)

'end'

next(gen)

---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-21-b2c61ce5e131> in <module>()
----> 1 next(gen)

StopIteration:

The generator function f() is used to create generator objects (in this case gen)
Generators are iterators, because they support a __next__ method
The first call to next(gen)
• Executes code in the body of f() until it meets a yield statement
• Returns that value to the caller of next(gen)
The second call to next(gen) starts executing from the next line

def f():
yield 'start'
yield 'middle' # This line!
yield 'end'


and continues until the next yield statement


At that point it returns the value following yield to the caller of next(gen), and so on
When the code block ends, the generator throws a StopIteration error

Example 2

Our next example receives an argument x from the caller

def g(x):
while x < 100:
yield x
x = x * x

Let's see how it works

g

<function __main__.g>

gen = g(2)
type(gen)

generator

next(gen)

2

next(gen)

4

next(gen)

16

next(gen)

---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-32-b2c61ce5e131> in <module>()
----> 1 next(gen)

StopIteration:


The call gen = g(2) binds gen to a generator


Inside the generator, the name x is bound to 2
When we call next(gen)
• The body of g() executes until the line yield x, and the value of x is returned
Note that the value of x is retained inside the generator
When we call next(gen) again, execution continues from where it left off

def g(x):
while x < 100:
yield x
x = x * x # execution continues from here

When x < 100 fails, the generator throws a StopIteration error


Incidentally, the loop inside the generator can be infinite

def g(x):
while 1:
yield x
x = x * x
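
With an infinite generator it is up to the caller to stop drawing values; one convenient tool is itertools.islice from the standard library, as in this sketch

from itertools import islice

def g(x):
    while 1:
        yield x
        x = x * x

list(islice(g(2), 5))    # take only the first five values

[2, 4, 16, 256, 65536]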

Advantages of Iterators

What's the advantage of using an iterator here?


Suppose we want to sample a binomial(n,0.5)
One way to do it is as follows

import random
n = 10000000
draws = [random.uniform(0, 1) < 0.5 for i in range(n)]
sum(draws)

But we are creating a huge list here, draws, holding one element for every draw
This uses lots of memory and is very slow
If we make n even bigger then this happens

n = 1000000000
draws = [random.uniform(0, 1) < 0.5 for i in range(n)]

---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-9-20d1ec1dae24> in <module>()
----> 1 draws = [random.uniform(0, 1) < 0.5 for i in range(n)]


We can avoid these problems using iterators


Here is the generator function

def f(n):
i = 1
while i <= n:
yield random.uniform(0, 1) < 0.5
i += 1

Now let's do the sum

n = 10000000
draws = f(n)
draws

<generator object at 0xb7d8b2cc>

sum(draws)

4999141
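
Incidentally, since range(n) is itself lazy in Python 3, the same memory-friendly computation can be written even more compactly with a generator expression; a sketch

import random

n = 10000000
sum(random.uniform(0, 1) < 0.5 for i in range(n))   # no list is ever built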

In summary, iterables
• avoid the need to create big lists/tuples, and
• provide a uniform interface to iteration that can be used transparently in for loops

3.4.7 Recursive Function Calls

This is not something that you will use every day, but it is still useful, and you should learn it at some stage
Basically, a recursive function is a function that calls itself
For example, consider the problem of computing $x_t$ for some t when

$$x_{t+1} = 2 x_t, \qquad x_0 = 1 \tag{3.21}$$

Obviously the answer is $2^t$


We can compute this easily enough with a loop

def x_loop(t):
x = 1
for i in range(t):
x = 2 * x
return x

We can also use a recursive solution, as follows


def x(t):
if t == 0:
return 1
else:
return 2 * x(t-1)

What happens here is that each successive call uses its own frame in the stack
• a frame is where the local variables of a given function call are held
• the stack is memory used to process function calls
– a First In Last Out (FILO) structure
This example is somewhat contrived, since the first (iterative) solution would usually be preferred to the
recursive solution
We'll meet less contrived applications of recursion later on

3.4.8 Exercises

Exercise 1

The Fibonacci numbers are defined by

$$x_{t+1} = x_t + x_{t-1}, \qquad x_0 = 0, \; x_1 = 1 \tag{3.22}$$

The first few numbers in the sequence are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55
Write a function to recursively compute the t-th Fibonacci number for any t

Exercise 2

Complete the following code, and test it using this csv file, which we assume that you've put in your current
working directory.

def column_iterator(target_file, column_number):
    """A generator function for CSV files.
    When called with a file name target_file (string) and column number
    column_number (integer), the generator function returns a generator
    that steps through the elements of column column_number in file
    target_file.
    """
    # put your code here

dates = column_iterator('test_table.csv', 1)

for date in dates:
    print(date)


Exercise 3

Suppose we have a text file numbers.txt containing the following lines

prices
3
8

7
21

Using try – except, write a program to read in the contents of the file and sum the numbers, ignoring
lines without numbers

3.4.9 Solutions

Exercise 1

Here's the standard solution

def x(t):
if t == 0:
return 0
if t == 1:
return 1
else:
return x(t-1) + x(t-2)

Let's test it

print([x(i) for i in range(10)])

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
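
Note that the naive recursion recomputes the same values many times, so its running time grows exponentially in t; as an optional refinement (not part of the original solution), caching the results with functools.lru_cache makes it fast

from functools import lru_cache

@lru_cache(maxsize=None)
def x_fast(t):
    if t < 2:
        return t                        # x(0) = 0, x(1) = 1
    return x_fast(t-1) + x_fast(t-2)    # each value now computed only once

print([x_fast(i) for i in range(10)])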

Exercise 2

A small sample from test_table.csv is included (and saved) in the code below for convenience

%%file test_table.csv
Date,Open,High,Low,Close,Volume,Adj Close
2009-05-21,9280.35,9286.35,9189.92,9264.15,133200,9264.15
2009-05-20,9372.72,9399.40,9311.61,9344.64,143200,9344.64
2009-05-19,9172.56,9326.75,9166.97,9290.29,167000,9290.29
2009-05-18,9167.05,9167.82,8997.74,9038.69,147800,9038.69
2009-05-15,9150.21,9272.08,9140.90,9265.02,172000,9265.02
2009-05-14,9212.30,9223.77,9052.41,9093.73,169400,9093.73
2009-05-13,9305.79,9379.47,9278.89,9340.49,176000,9340.49
2009-05-12,9358.25,9389.61,9298.61,9298.61,188400,9298.61
2009-05-11,9460.72,9503.91,9342.75,9451.98,230800,9451.98
2009-05-08,9351.40,9464.43,9349.57,9432.83,220200,9432.83

One solution is as follows

def column_iterator(target_file, column_number):
    """A generator function for CSV files.
    When called with a file name target_file (string) and column number
    column_number (integer), the generator function returns a generator
    which steps through the elements of column column_number in file
    target_file.
    """
    f = open(target_file, 'r')
    for line in f:
        yield line.split(',')[column_number - 1]
    f.close()

dates = column_iterator('test_table.csv', 1)

i = 1
for date in dates:
print(date)
if i == 10:
break
i += 1

Date
2009-05-21
2009-05-20
2009-05-19
2009-05-18
2009-05-15
2009-05-14
2009-05-13
2009-05-12
2009-05-11

Exercise 3

Let's save the data first

%%file numbers.txt
prices
3
8

7
21


Writing numbers.txt

f = open('numbers.txt')

total = 0.0
for line in f:
try:
total += float(line)
except ValueError:
pass

f.close()

print(total)

39.0

3.5 Debugging

Contents

• Debugging
– Overview
– Debugging
– Other Useful Magics

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code
as cleverly as possible, you are, by definition, not smart enough to debug it. – Brian Kernighan

3.5.1 Overview

Are you one of those programmers who fills their code with print statements when trying to debug their
programs?
Hey, we all used to do that
(OK, sometimes we still do that)
But once you start writing larger programs you'll need a better system
Debugging tools for Python vary across platforms, IDEs and editors
Here we'll focus on Jupyter and leave you to explore other settings


3.5.2 Debugging

The debug Magic

Let's consider a simple (and rather contrived) example

import numpy as np
import matplotlib.pyplot as plt

def plot_log():
fig, ax = plt.subplots(2, 1)
x = np.linspace(1, 2, 10)
ax.plot(x, np.log(x))
plt.show()

plot_log() # Call the function, generate plot

This code is intended to plot the log function over the interval [1, 2]
But there's an error here: plt.subplots(2, 1) should be just plt.subplots()
(The call plt.subplots(2, 1) returns a NumPy array containing two axes objects, suitable for having
two subplots on the same figure)
Here's what happens when we run the code:

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-1-ef5c75a58138> in <module>()
8 plt.show()
9
---> 10 plot_log() # Call the function, generate plot

<ipython-input-1-ef5c75a58138> in plot_log()
5 fig, ax = plt.subplots(2, 1)
6 x = np.linspace(1, 2, 10)
----> 7 ax.plot(x, np.log(x))
8 plt.show()
9

AttributeError: 'numpy.ndarray' object has no attribute 'plot'

The traceback shows that the error occurs at the method call ax.plot(x, np.log(x))
The error occurs because we have mistakenly made ax a NumPy array, and a NumPy array has no plot
method
But let's pretend that we don't understand this for the moment
We might suspect there's something wrong with ax but when we try to investigate this object, we get the
following exception:

---------------------------------------------------------------------------
NameError Traceback (most recent call last)


<ipython-input-2-645aedc8a285> in <module>()
----> 1 ax

NameError: name 'ax' is not defined

The problem is that ax was defined inside plot_log(), and the name is lost once that function terminates
Let's try doing it a different way
We run the first cell block again, generating the same error

import numpy as np
import matplotlib.pyplot as plt

def plot_log():
fig, ax = plt.subplots(2, 1)
x = np.linspace(1, 2, 10)
ax.plot(x, np.log(x))
plt.show()

plot_log() # Call the function, generate plot

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-1-ef5c75a58138> in <module>()
8 plt.show()
9
---> 10 plot_log() # Call the function, generate plot

<ipython-input-1-ef5c75a58138> in plot_log()
5 fig, ax = plt.subplots(2, 1)
6 x = np.linspace(1, 2, 10)
----> 7 ax.plot(x, np.log(x))
8 plt.show()
9

AttributeError: 'numpy.ndarray' object has no attribute 'plot'

But this time we type in the following cell block

%debug

You should be dropped into a new prompt that looks something like this

ipdb>

(You might see pdb> instead)


Now we can investigate the value of our variables at this point in the program, step forward through the
code, etc.
For example, here we simply type the name ax to see what's happening with this object:


ipdb> ax
array([<matplotlib.axes.AxesSubplot object at 0x290f5d0>,
<matplotlib.axes.AxesSubplot object at 0x2930810>], dtype=object)

It's now very clear that ax is an array, which clarifies the source of the problem
To find out what else you can do from inside ipdb (or pdb), use the online help

ipdb> h

Documented commands (type help <topic>):


========================================
EOF bt cont enable jump pdef r tbreak w
a c continue exit l pdoc restart u whatis
alias cl d h list pinfo return unalias where
args clear debug help n pp run unt
b commands disable ignore next q s until
break condition down j p quit step up

Miscellaneous help topics:


==========================
exec pdb

Undocumented commands:
======================
retval rv

ipdb> h c
c(ont(inue))
Continue execution, only stop when a breakpoint is encountered.

Setting a Break Point

The preceding approach is handy but sometimes insufficient


Consider the following modified version of our function above

import numpy as np
import matplotlib.pyplot as plt

def plot_log():
fig, ax = plt.subplots()
x = np.logspace(1, 2, 10)
ax.plot(x, np.log(x))
plt.show()

plot_log()

Here the original problem is fixed, but we've accidentally written np.logspace(1, 2, 10) instead of
np.linspace(1, 2, 10)
Now there won't be any exception, but the plot won't look right


To investigate, it would be helpful if we could inspect variables like x during execution of the function
To this end, we add a break point by importing Pdb from IPython.core.debugger and inserting the
line Pdb().set_trace() inside the function code block

import numpy as np
import matplotlib.pyplot as plt
from IPython.core.debugger import Pdb

def plot_log():
Pdb().set_trace()
fig, ax = plt.subplots()
x = np.logspace(1, 2, 10)
ax.plot(x, np.log(x))
plt.show()

plot_log()

Now let's run the script, and investigate via the debugger

> <ipython-input-5-c5864f6d184b>(6)plot_log()
4 def plot_log():
5 Pdb().set_trace()
----> 6 fig, ax = plt.subplots()
7 x = np.logspace(1, 2, 10)
8 ax.plot(x, np.log(x))

ipdb> n
> <ipython-input-5-c5864f6d184b>(7)plot_log()
5 Pdb().set_trace()
6 fig, ax = plt.subplots()
----> 7 x = np.logspace(1, 2, 10)
8 ax.plot(x, np.log(x))
9 plt.show()

ipdb> n
> <ipython-input-5-c5864f6d184b>(8)plot_log()
6 fig, ax = plt.subplots()
7 x = np.logspace(1, 2, 10)
----> 8 ax.plot(x, np.log(x))
9 plt.show()
10

ipdb> x
array([ 10. , 12.91549665, 16.68100537, 21.5443469 ,
27.82559402, 35.93813664, 46.41588834, 59.94842503,
77.42636827, 100. ])

We used n twice to step forward through the code (one line at a time)
Then we printed the value of x to see what was happening with that variable
To exit from the debugger, use q
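
If you expect to need the debugger often, IPython can also drop you into it automatically after any uncaught exception; the %pdb magic toggles this behavior (a usage sketch, to be run in IPython)

%pdb on      # enter the debugger automatically whenever an exception is raised
plot_log()   # an uncaught error now lands us straight at the ipdb> prompt
%pdb off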


3.5.3 Other Useful Magics

In this lecture we used the %debug IPython magic


There are many other useful magics:
• %precision 4 sets printed precision for floats to 4 decimal places
• %whos gives a list of variables and their values
• %quickref gives a list of magics
The full list of magics is here

CHAPTER

FOUR

DATA AND EMPIRICS

This part of the course provides a set of lectures focused on Data and Empirics using Python

4.1 Pandas

Contents

• Pandas
– Overview
– Series
– DataFrames
– On-Line Data Sources
– Exercises
– Solutions

4.1.1 Overview

Pandas is a package of fast, efficient data analysis tools for Python


Its popularity has surged in recent years, coincident with the rise of fields such as data science and machine
learning
Here's a popularity comparison over time against STATA and SAS, courtesy of Stack Overflow Trends


Just as NumPy provides the basic array data type plus core array operations, pandas
1. defines fundamental structures for working with data and
2. endows them with methods that facilitate operations such as
• reading in data
• adjusting indices
• working with dates and time series
• sorting, grouping, re-ordering and general data munging¹
• dealing with missing values, etc., etc.
More sophisticated statistical functionality is left to other packages, such as statsmodels and scikit-learn,
which are built on top of pandas
This lecture will provide a basic introduction to pandas
Throughout the lecture we will assume that the following imports have taken place

import pandas as pd
import numpy as np

¹ Wikipedia defines munging as cleaning data from one raw form into a structured, purged one.


4.1.2 Series

Two important data types defined by pandas are Series and DataFrame
You can think of a Series as a column of data, such as a collection of observations on a single variable
A DataFrame is an object for storing related columns of data
Let's start with Series

s = pd.Series(np.random.randn(4), name='daily returns')


s

0 0.430271
1 0.617328
2 -0.265421
3 -0.836113
Name: daily returns, dtype: float64

Here you can imagine the indices 0, 1, 2, 3 as indexing four listed companies, and the values being
daily returns on their shares
Pandas Series are built on top of NumPy arrays, and support many similar operations

s * 100

0 43.027108
1 61.732829
2 -26.542104
3 -83.611339
Name: daily returns, dtype: float64

np.abs(s)

0 0.430271
1 0.617328
2 0.265421
3 0.836113
Name: daily returns, dtype: float64

But Series provide more than NumPy arrays


Not only do they have some additional (statistically oriented) methods

s.describe()

count 4.000000
mean -0.013484
std 0.667092
min -0.836113
25% -0.408094
50% 0.082425
75% 0.477035
max 0.617328
Name: daily returns, dtype: float64

But their indices are more flexible

s.index = ['AMZN', 'AAPL', 'MSFT', 'GOOG']


s

AMZN 0.430271
AAPL 0.617328
MSFT -0.265421
GOOG -0.836113
Name: daily returns, dtype: float64

Viewed in this way, Series are like fast, efficient Python dictionaries (with the restriction that the items in
the dictionary all have the same type; in this case, floats)
In fact, you can use much of the same syntax as Python dictionaries

s['AMZN']

0.43027108469945924

s['AMZN'] = 0
s

AMZN 0.000000
AAPL 0.617328
MSFT -0.265421
GOOG -0.836113
Name: daily returns, dtype: float64

'AAPL' in s

True

4.1.3 DataFrames

While a Series is a single column of data, a DataFrame is several columns, one for each variable
In essence, a DataFrame in pandas is analogous to a (highly optimized) Excel spreadsheet
Thus, it is a powerful tool for representing and analyzing data that are naturally organized into rows and
columns, often with descriptive indexes for individual rows and individual columns
Let's look at an example that reads data from the CSV file pandas/data/test_pwt.csv, which can be
downloaded here
Here's the contents of test_pwt.csv


"country","country isocode","year","POP","XRAT","tcgdp","cc","cg"
"Argentina","ARG","2000","37335.653","0.9995","295072.21869","75.716805379",
,→"5.5788042896"

"Australia","AUS","2000","19053.186","1.72483","541804.6521","67.759025993",
,→"6.7200975332"

"India","IND","2000","1006300.297","44.9416","1728144.3748","64.575551328",
,→"14.072205773"

"Israel","ISR","2000","6114.57","4.07733","129253.89423","64.436450847","10.
,→266688415"

"Malawi","MWI","2000","11801.505","59.543808333","5026.2217836","74.707624181
,→","11.658954494"

"South Africa","ZAF","2000","45064.098","6.93983","227242.36949","72.718710427
,→","5.7265463933"

"United States","USA","2000","282171.957","1","9898700","72.347054303","6.
,→0324539789"

"Uruguay","URY","2000","3219.793","12.099591667","25255.961693","78.978740282
,→","5.108067988"

Supposing you have this data saved as test_pwt.csv in the present working directory (type %pwd in Jupyter
to see what this is), it can be read in as follows:

df = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas/data/test_pwt.csv')

type(df)

pandas.core.frame.DataFrame

df

         country country isocode  year          POP       XRAT           tcgdp         cc         cg
0      Argentina             ARG  2000    37335.653   0.999500   295072.218690  75.716805   5.578804
1      Australia             AUS  2000    19053.186   1.724830   541804.652100  67.759026   6.720098
2          India             IND  2000  1006300.297  44.941600  1728144.374800  64.575551  14.072206
3         Israel             ISR  2000     6114.570   4.077330   129253.894230  64.436451  10.266688
4         Malawi             MWI  2000    11801.505  59.543808     5026.221784  74.707624  11.658954
5   South Africa             ZAF  2000    45064.098   6.939830   227242.369490  72.718710   5.726546
6  United States             USA  2000   282171.957   1.000000  9898700.000000  72.347054   6.032454
7        Uruguay             URY  2000     3219.793  12.099592    25255.961693  78.978740   5.108068

We can select particular rows using standard Python array slicing notation


df[2:5]

  country country isocode  year          POP       XRAT           tcgdp         cc         cg
2   India             IND  2000  1006300.297  44.941600  1728144.374800  64.575551  14.072206
3  Israel             ISR  2000     6114.570   4.077330   129253.894230  64.436451  10.266688
4  Malawi             MWI  2000    11801.505  59.543808     5026.221784  74.707624  11.658954

To select columns, we can pass a list containing the names of the desired columns represented as strings

df[['country', 'tcgdp']]

country tcgdp
0 Argentina 295072.218690
1 Australia 541804.652100
2 India 1728144.374800
3 Israel 129253.894230
4 Malawi 5026.221784
5 South Africa 227242.369490
6 United States 9898700.000000
7 Uruguay 25255.961693

To select both rows and columns using integers, the iloc attribute should be used with the format
.iloc[rows, columns]

df.iloc[2:5,0:4]

country country isocode year POP


2 India IND 2000 1006300.297
3 Israel ISR 2000 6114.570
4 Malawi MWI 2000 11801.505

To select rows and columns using a mixture of integers and labels, the loc attribute can be used in a similar
way

df.loc[df.index[2:5], ['country', 'tcgdp']]

country tcgdp
2 India 1728144.374800
3 Israel 129253.894230
4 Malawi 5026.221784

Let's imagine that we're only interested in population and total GDP (tcgdp)
One way to strip the data frame df down to only these variables is to overwrite the dataframe using the
selection method described above


df = df[['country','POP','tcgdp']]
df

country POP tcgdp


0 Argentina 37335.653 295072.218690
1 Australia 19053.186 541804.652100
2 India 1006300.297 1728144.374800
3 Israel 6114.570 129253.894230
4 Malawi 11801.505 5026.221784
5 South Africa 45064.098 227242.369490
6 United States 282171.957 9898700.000000
7 Uruguay 3219.793 25255.961693

Here the index 0, 1,..., 7 is redundant, because we can use the country names as an index
To do this, we set the index to be the country variable in the dataframe

df = df.set_index('country')
df

POP tcgdp
country
Argentina 37335.653 295072.218690
Australia 19053.186 541804.652100
India 1006300.297 1728144.374800
Israel 6114.570 129253.894230
Malawi 11801.505 5026.221784
South Africa 45064.098 227242.369490
United States 282171.957 9898700.000000
Uruguay 3219.793 25255.961693

Let's give the columns slightly better names

df.columns = 'population', 'total GDP'


df

population total GDP


country
Argentina 37335.653 295072.218690
Australia 19053.186 541804.652100
India 1006300.297 1728144.374800
Israel 6114.570 129253.894230
Malawi 11801.505 5026.221784
South Africa 45064.098 227242.369490
United States 282171.957 9898700.000000
Uruguay 3219.793 25255.961693

Population is in thousands; let's revert to single units

df['population'] = df['population'] * 1e3


df


population total GDP


country
Argentina 37335653 295072.218690
Australia 19053186 541804.652100
India 1006300297 1728144.374800
Israel 6114570 129253.894230
Malawi 11801505 5026.221784
South Africa 45064098 227242.369490
United States 282171957 9898700.000000
Uruguay 3219793 25255.961693

Next we're going to add a column showing real GDP per capita, multiplying by 1,000,000 as we go because
total GDP is in millions

df['GDP percap'] = df['total GDP'] * 1e6 / df['population']


df

population total GDP GDP percap


country
Argentina 37335653 295072.218690 7903.229085
Australia 19053186 541804.652100 28436.433261
India 1006300297 1728144.374800 1717.324719
Israel 6114570 129253.894230 21138.672749
Malawi 11801505 5026.221784 425.896679
South Africa 45064098 227242.369490 5042.647686
United States 282171957 9898700.000000 35080.381854
Uruguay 3219793 25255.961693 7843.970620

One of the nice things about pandas DataFrame and Series objects is that they have methods for plotting
and visualization that work through Matplotlib
For example, we can easily generate a bar plot of GDP per capita

import matplotlib.pyplot as plt

df['GDP percap'].plot(kind='bar')
plt.show()

The following figure is produced


At the moment the data frame is ordered alphabetically on the countries; let's change the ordering to GDP per capita

df = df.sort_values(by='GDP percap', ascending=False)


df

population total GDP GDP percap


country
United States 282171957 9898700.000000 35080.381854
Australia 19053186 541804.652100 28436.433261
Israel 6114570 129253.894230 21138.672749
Argentina 37335653 295072.218690 7903.229085
Uruguay 3219793 25255.961693 7843.970620
South Africa 45064098 227242.369490 5042.647686
India 1006300297 1728144.374800 1717.324719
Malawi 11801505 5026.221784 425.896679

Plotting as before now yields

df['GDP percap'].plot(kind='bar')
plt.show()


4.1.4 On-Line Data Sources

Python makes it straightforward to query online databases programmatically

An important database for economists is FRED, a vast collection of time series data maintained by the St.
Louis Fed
For example, suppose that we are interested in the unemployment rate
Via FRED, the entire series for the US civilian unemployment rate can be downloaded directly by entering
this URL into your browser (note that this requires an internet connection)

https://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv

(Equivalently, click here: https://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv)
This request returns a CSV file, which will be handled by your default application for this class of files
Alternatively, we can access the CSV file from within a Python program
This can be done with a variety of methods
We start with a relatively low level method, and then return to pandas


Accessing Data with requests

One option is to use requests, a standard Python library for requesting data over the Internet
To begin, try the following code on your computer

import requests

r = requests.get('http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv')

If there's no error message, then the call has succeeded


If you do get an error, then there are two likely causes
1. You are not connected to the Internet (hopefully this isn't the case)
2. Your machine is accessing the Internet through a proxy server, and Python isn't aware of this
In the second case, you can either
• switch to another machine
• solve your proxy problem by reading the documentation
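As an aside, requests can also be told about the proxy directly; a minimal sketch is below, where the
proxy address is purely hypothetical and should be replaced with your own

# A sketch of the proxy workaround (hypothetical address -- substitute your own)
proxies = {'http': 'http://10.10.1.10:3128'}
r = requests.get('http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv',
                 proxies=proxies)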
Assuming that all is working, you can now proceed to using the source object returned by the call
requests.get('http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv')

url = 'http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv'

source = requests.get(url).content.decode().split("\n")
source[0]

'DATE,VALUE\r\n'

source[1]

'1948-01-01,3.4\r\n'

source[2]

'1948-02-01,3.8\r\n'

We could now write some additional code to parse this text and store it as an array
But this is unnecessary: pandas' read_csv function can handle the task for us
We use parse_dates=True so that pandas recognizes our dates column, allowing for simple date filtering
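For the curious, a hand-rolled parse might look like the following sketch (it assumes the source list
created above; the read_csv call that follows is what we will actually use)

import datetime

observations = []
for line in source[1:]:          # skip the 'DATE,VALUE' header row
    line = line.strip()          # remove the trailing '\r\n'
    if not line:                 # skip any empty lines at the end of the file
        continue
    date, value = line.split(',')
    observations.append((datetime.datetime.strptime(date, '%Y-%m-%d'), float(value)))

observations[0]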

data = pd.read_csv(url, index_col=0, parse_dates=True)


The data has been read into a pandas DataFrame called data that we can now manipulate in the usual way

type(data)

pandas.core.frame.DataFrame

data.head() # A useful method to get a quick look at a data frame

VALUE
DATE
1948-01-01 3.4
1948-02-01 3.8
1948-03-01 4.0
1948-04-01 3.9
1948-05-01 3.5

pd.set_option('precision', 1)
data.describe() # Your output might differ slightly

VALUE
count 830.0
mean 5.8
std 1.6
min 2.5
25% 4.7
50% 5.6
75% 6.9
max 10.8

We can also plot the unemployment rate from 2006 to 2012 as follows

data['2006':'2012'].plot()
plt.show()

The resulting figure looks as follows


Accessing World Bank Data

Let's look at one more example of downloading and manipulating data, this time from the World Bank
The World Bank collects and organizes data on a huge range of indicators
For example, here's some data on government debt as a ratio to GDP
If you click on DOWNLOAD DATA you will be given the option to download the data as an Excel file
The next program does this for you, reads an Excel file into a pandas DataFrame, and plots time series for
the US and Australia

import matplotlib.pyplot as plt


import requests
import pandas as pd

# == Get data and read into file gd.xls == #

wb_data_query = "http://api.worldbank.org/v2/en/indicator/gc.dod.totl.gd.zs?downloadformat=excel"
r = requests.get(wb_data_query)
with open('gd.xls', 'wb') as output:
    output.write(r.content)

# == Parse data into a DataFrame == #

govt_debt = pd.read_excel('gd.xls', sheetname='Data', skiprows=3, index_col=1)

# == Take desired values and plot == #

govt_debt = govt_debt.transpose()
govt_debt = govt_debt[['AUS', 'USA']]
govt_debt = govt_debt[38:]
govt_debt.plot(lw=2)
plt.show()

(The file is pandas/wb_download.py, and can be downloaded here)


The figure it produces looks as follows

4.1.5 Exercises

Exercise 1

Write a program to calculate the percentage price change over 2013 for the following shares

ticker_list = {'INTC': 'Intel',
               'MSFT': 'Microsoft',
               'IBM': 'IBM',
               'BHP': 'BHP',
               'TM': 'Toyota',
               'AAPL': 'Apple',
               'AMZN': 'Amazon',
               'BA': 'Boeing',
               'QCOM': 'Qualcomm',
               'KO': 'Coca-Cola',
               'GOOG': 'Google',
               'SNE': 'Sony',
               'PTR': 'PetroChina'}

A dataset of daily closing prices for the above firms can be found in pandas/data/ticker_data.csv,
and can be downloaded here
Plot the result as a bar graph, as in the figure below

4.1.6 Solutions

Exercise 1

ticker = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas/data/ticker_data.csv')
ticker.set_index('Date', inplace=True)

ticker_list = {'INTC': 'Intel',
               'MSFT': 'Microsoft',
               'IBM': 'IBM',
               'BHP': 'BHP',
               'TM': 'Toyota',
               'AAPL': 'Apple',
               'AMZN': 'Amazon',
               'BA': 'Boeing',
               'QCOM': 'Qualcomm',
               'KO': 'Coca-Cola',
               'GOOG': 'Google',
               'SNE': 'Sony',
               'PTR': 'PetroChina'}

price_change = pd.Series()

for tick in ticker_list:
    change = 100 * (ticker.loc[ticker.index[-1], tick]
                    - ticker.loc[ticker.index[0], tick]) / ticker.loc[ticker.index[0], tick]
    name = ticker_list[tick]
    price_change[name] = change

price_change.sort_values(inplace=True)
fig, ax = plt.subplots(figsize=(10,8))
price_change.plot(kind='bar', ax=ax)
plt.show()


4.2 Pandas for Panel Data

Contents

• Pandas for Panel Data


– Overview
– Slicing and reshaping data
– Merging dataframes and filling NaNs
– Grouping and summarizing data
– Final Remarks


– Exercises
– Solutions

4.2.1 Overview

In an earlier lecture on pandas we looked at working with simple data sets


Econometricians often need to work with more complex data sets, such as panels
Common tasks include
• Importing data, cleaning it and reshaping it across several axes
• Selecting a time series or cross-section from a panel
• Grouping and summarizing data
pandas (derived from panel and data) contains powerful and easy-to-use tools for solving exactly these
kinds of problems
In what follows, we will use a panel data set of real minimum wages from the OECD to create:
• summary statistics over multiple dimensions of our data
• a time series of the average minimum wage of countries in the dataset
• kernel density estimates of wages by continent
We will begin by reading in our long format panel data from a csv file and reshaping the resulting
DataFrame with pivot_table to build a MultiIndex
Additional detail will be added to our DataFrame using pandas' merge function, and data will be
summarized with the groupby function
Most of this lecture was created by Natasha Watkins

4.2.2 Slicing and reshaping data

We will read in a dataset from the OECD of real minimum wages in 32 countries and assign it to realwage
The dataset pandas_panel/realwage.csv can be downloaded here
Make sure the file is in your current working directory
import pandas as pd

# Display 6 columns for viewing purposes


pd.set_option('display.max_columns', 6)

# Reduce decimal points to 2


pd.options.display.float_format = '{:,.2f}'.format

realwage = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas_panel/realwage.csv')


Let's have a look at what we've got to work with

realwage.head() # Show first 5 rows

The data is currently in long format, which is difficult to analyse when there are several dimensions to the
data
We will use pivot_table to create a wide format panel, with a MultiIndex to handle higher dimensional
data
pivot_table arguments should specify the data (values), the index, and the columns we want in our
resulting dataframe
By passing a list in columns, we can create a MultiIndex in our column axis

realwage = realwage.pivot_table(values='value',
                                index='Time',
                                columns=['Country', 'Series', 'Pay period'])
realwage.head()

To more easily filter our time series data later on, we will convert the index into a DateTimeIndex

realwage.index = pd.to_datetime(realwage.index)
type(realwage.index)


pandas.core.indexes.datetimes.DatetimeIndex

The columns contain multiple levels of indexing, known as a MultiIndex, with levels being ordered
hierarchically (Country > Series > Pay period)
A MultiIndex is the simplest and most flexible way to manage panel data in pandas

type(realwage.columns)

pandas.core.indexes.multi.MultiIndex

realwage.columns.names

FrozenList(['Country', 'Series', 'Pay period'])

Like before, we can select the country (the top level of our MultiIndex)

realwage['United States'].head()

Stacking and unstacking levels of the MultiIndex will be used throughout this lecture to reshape our
dataframe into a format we need
.stack() rotates the lowest level of the column MultiIndex to the row index (.unstack() works
in the opposite direction - try it out)

realwage.stack().head()
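To see that .unstack() reverses the operation, a quick round trip is sketched below (assuming realwage
is loaded as above; note that the ordering of the column levels, and any all-NaN combinations dropped by
.stack(), may differ after the round trip)

realwage.stack().unstack().head()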


We can also pass in an argument to select the level we would like to stack

realwage.stack(level='Country').head()

Using a DatetimeIndex makes it easy to select a particular time period


Selecting one year and stacking the two lower levels of the MultiIndex creates a cross-section of our
panel data

realwage['2015'].stack(level=(1, 2)).transpose().head()


For the rest of the lecture, we will work with a dataframe of the hourly real minimum wages across countries
and time, measured in 2015 US dollars
To create our filtered dataframe (realwage_f), we can use the xs method to select values at lower levels
in the multiindex, while keeping the higher levels (countries in this case)

realwage_f = realwage.xs(('Hourly', 'In 2015 constant prices at 2015 USD exchange rates'),
                         level=('Pay period', 'Series'), axis=1)
realwage_f.head()

4.2.3 Merging dataframes and filling NaNs

Similar to relational databases like SQL, pandas has built in methods to merge datasets together
Using country information from WorldData.info, we'll add the continent of each country to realwage_f
with the merge function
The csv file can be found in pandas_panel/countries.csv, and can be downloaded here

worlddata = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas_panel/countries.csv', sep=';')
worlddata.head()


First we'll select just the country and continent variables from worlddata and rename the column to
Country

worlddata = worlddata[['Country (en)', 'Continent']]


worlddata = worlddata.rename(columns={'Country (en)': 'Country'})
worlddata.head()

We want to merge our new dataframe, worlddata, with realwage_f


The pandas merge function allows dataframes to be joined together by rows
Our dataframes will be merged using country names, requiring us to use the transpose of realwage_f so
that rows correspond to country names in both dataframes

realwage_f.transpose().head()

We can use either left, right, inner, or outer join to merge our datasets:
• left join includes only countries from the left dataset
• right join includes only countries from the right dataset
• outer join includes countries that are in either the left or the right dataset
• inner join includes only countries common to both the left and right datasets
By default, merge will use an inner join
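To make the four options concrete, here is a toy example with two made-up dataframes (the country names
and values are hypothetical)

left = pd.DataFrame({'colour': ['red', 'blue']}, index=['Chile', 'Norway'])
right = pd.DataFrame({'size': ['small', 'large']}, index=['Norway', 'Fiji'])

pd.merge(left, right, how='inner', left_index=True, right_index=True)  # Norway only
pd.merge(left, right, how='outer', left_index=True, right_index=True)  # Chile, Norway and Fiji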


Here we will pass how='left' to keep all countries in realwage_f, but discard countries in
worlddata that do not have a corresponding data entry in realwage_f
This is illustrated by the red shading in the following diagram

We will also need to specify where the country name is located in each dataframe, which will be the key
that is used to merge the dataframes on
Our left dataframe (realwage_f.transpose()) contains countries in the index, so we set
left_index=True
Our right dataframe (worlddata) contains countries in the Country column, so we set
right_on='Country'

merged = pd.merge(realwage_f.transpose(), worlddata,
                  how='left', left_index=True, right_on='Country')
merged.head()

Countries that appeared in realwage_f but not in worlddata will have NaN in the Continent column
To check whether this has occurred, we can use .isnull() on the continent column and filter the merged
dataframe


merged[merged['Continent'].isnull()]

We have three missing values!


One option to deal with NaN values is to create a dictionary containing these countries and their respective
continents
.map() will match countries in merged['Country'] with their continent from the dictionary
Notice how countries not in our dictionary are mapped with NaN

missing_continents = {'Korea': 'Asia',
                      'Russian Federation': 'Europe',
                      'Slovak Republic': 'Europe'}

merged['Country'].map(missing_continents)

17 NaN
23 NaN
32 NaN
100 NaN
38 NaN
108 NaN
41 NaN
225 NaN
53 NaN
58 NaN
45 NaN
68 NaN
233 NaN
86 NaN
88 NaN
91 NaN
247 Asia
117 NaN
122 NaN
123 NaN
138 NaN
153 NaN
151 NaN
174 NaN
175 NaN
247 Europe
247 Europe


198 NaN
200 NaN
227 NaN
241 NaN
240 NaN
Name: Country, dtype: object

We don't want to overwrite the entire series with this mapping


.fillna() only fills in NaN values in merged['Continent'] with the mapping, while leaving other
values in the column unchanged

merged['Continent'] = merged['Continent'].fillna(merged['Country'].map(missing_continents))

# Check for whether continents were correctly mapped

merged[merged['Country'] == 'Korea']

We will also combine the Americas into a single continent - this will make our visualization nicer later on
To do this, we will use .replace() and loop through a list of the continent values we want to replace

replace = ['Central America', 'North America', 'South America']

for country in replace:
    merged['Continent'].replace(to_replace=country,
                                value='America',
                                inplace=True)

Now that we have all the data we want in a single DataFrame, we will reshape it back into panel form
with a MultiIndex
We should also sort the index using .sort_index() so that we can efficiently filter our dataframe
later on
By default, levels will be sorted top-down

merged = merged.set_index(['Continent', 'Country']).sort_index()


merged.head()


While merging, we lost our DatetimeIndex, as we merged columns that were not in datetime format

merged.columns

Index([2006-01-01 00:00:00, 2007-01-01 00:00:00, 2008-01-01 00:00:00,


2009-01-01 00:00:00, 2010-01-01 00:00:00, 2011-01-01 00:00:00,
2012-01-01 00:00:00, 2013-01-01 00:00:00, 2014-01-01 00:00:00,
2015-01-01 00:00:00, 2016-01-01 00:00:00],
dtype='object')

Now that we have set the merged columns as the index, we can recreate a DatetimeIndex using
.to_datetime()

merged.columns = pd.to_datetime(merged.columns)
merged.columns = merged.columns.rename('Time')
merged.columns

DatetimeIndex(['2006-01-01', '2007-01-01', '2008-01-01', '2009-01-01',


'2010-01-01', '2011-01-01', '2012-01-01', '2013-01-01',
'2014-01-01', '2015-01-01', '2016-01-01'],
dtype='datetime64[ns]', name='Time', freq=None)

The DatetimeIndex tends to work more smoothly in the row axis, so we will go ahead and transpose
merged

merged = merged.transpose()
merged.head()


4.2.4 Grouping and summarizing data

Grouping and summarizing data can be particularly useful for understanding large panel datasets
A simple way to summarize data is to call an aggregation method on the dataframe, such as .mean() or
.max()
For example, we can calculate the average real minimum wage for each country over the period 2006 to
2016 (the default is to aggregate over rows)
merged.mean().head(10)

Continent Country
America Brazil 1.09
Canada 7.82
Chile 1.62
Colombia 1.07
Costa Rica 2.53
Mexico 0.53
United States 7.15
Asia Israel 5.95
Japan 6.18
Korea 4.22
dtype: float64

Using this series, we can plot the average real minimum wage over the past decade for each country in our
data set
import matplotlib.pyplot as plt
import matplotlib
matplotlib.style.use('seaborn')
%matplotlib inline

merged.mean().sort_values(ascending=False).plot(kind='bar', title="Average real minimum wage 2006 - 2016")

# Set country labels
country_labels = merged.mean().sort_values(ascending=False).index.get_level_values('Country').tolist()
plt.xticks(range(0, len(country_labels)), country_labels)
plt.xlabel('Country')

plt.show()

Passing in axis=1 to .mean() will aggregate over columns (giving the average minimum wage for all
countries over time)


merged.mean(axis=1).head()

Time
2006-01-01 4.69
2007-01-01 4.84
2008-01-01 4.90
2009-01-01 5.08
2010-01-01 5.11
dtype: float64

We can plot this time series as a line graph

merged.mean(axis=1).plot()
plt.title('Average real minimum wage 2006 - 2016')
plt.ylabel('2015 USD')
plt.xlabel('Year')
plt.show()

We can also specify a level of the MultiIndex (in the column axis) to aggregate over

merged.mean(level='Continent', axis=1).head()


We can plot the average minimum wages in each continent as a time series

merged.mean(level='Continent', axis=1).plot()
plt.title('Average real minimum wage')
plt.ylabel('2015 USD')
plt.xlabel('Year')
plt.show()


We will drop Australia as a continent for plotting purposes

merged = merged.drop('Australia', level='Continent', axis=1)


merged.mean(level='Continent', axis=1).plot()
plt.title('Average real minimum wage')
plt.ylabel('2015 USD')
plt.xlabel('Year')
plt.show()

.describe() is useful for quickly retrieving a number of common summary statistics

merged.stack().describe()


This is a simplified way to use groupby


Using groupby generally follows a split-apply-combine process:
• split: data is grouped based on one or more keys
• apply: a function is called on each group independently
• combine: the results of the function calls are combined into a new data structure
The groupby method achieves the first step of this process, creating a new DataFrameGroupBy object
with data split into groups
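A toy example makes the split-apply-combine steps concrete (the data below is made up and unrelated to
our panel)

toy = pd.DataFrame({'continent': ['Asia', 'Asia', 'Europe'],
                    'wage': [4.0, 6.0, 9.0]})
toy.groupby('continent')['wage'].mean()   # split by continent, apply .mean(), combine into a Series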
Let's split merged by continent again, this time using the groupby function, and name the resulting object
grouped

grouped = merged.groupby(level='Continent', axis=1)


grouped

<pandas.core.groupby.DataFrameGroupBy object at 0x7f2900efd160>

Calling an aggregation method on the object applies the function to each group, the results of which are
combined in a new data structure
For example, we can return the number of countries in our dataset for each continent using .size()
In this case, our new data structure is a Series

grouped.size()

Continent
America 7
Asia 4


Europe 19
dtype: int64

Calling .get_group() to return just the countries in a single group, we can create a kernel density
estimate of the distribution of real minimum wages in 2015 for each continent
grouped.groups.keys() will return the keys from the groupby object

import seaborn as sns

continents = grouped.groups.keys()

for continent in continents:
    sns.kdeplot(grouped.get_group(continent)['2015'].unstack(), label=continent, shade=True)

plt.title('Real minimum wages in 2015')
plt.xlabel('US dollars')
plt.show()


4.2.5 Final Remarks

This lecture has provided an introduction to some of pandas' more advanced features, including multiindices,
merging, grouping and plotting
Other tools that may be useful in panel data analysis include xarray, a Python package that extends pandas
to N-dimensional data structures

4.2.6 Exercises

Exercise 1

In these exercises you'll work with a dataset of employment rates in Europe by age and sex from Eurostat
The dataset pandas_panel/employ.csv can be downloaded here
Reading in the csv file returns a panel dataset in long format. Use .pivot_table() to construct a wide
format dataframe with a MultiIndex in the columns
Start off by exploring the dataframe and the variables available in the MultiIndex levels
Write a program that quickly returns all values in the MultiIndex

Exercise 2

Filter the above dataframe to only include employment as a percentage of active population
Create a grouped boxplot using seaborn of employment rates in 2015 by age group and sex
Hint: GEO includes both areas and countries

4.2.7 Solutions

Exercise 1

employ = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas_panel/employ.csv')
employ = employ.pivot_table(values='Value',
                            index=['DATE'],
                            columns=['UNIT','AGE', 'SEX', 'INDIC_EM', 'GEO'])
employ.index = pd.to_datetime(employ.index)  # ensure that dates are datetime format
employ.head()


This is a large dataset so it is useful to explore the levels and variables available
employ.columns.names

FrozenList(['UNIT', 'AGE', 'SEX', 'INDIC_EM', 'GEO'])

Variables within levels can be quickly retrieved with a loop


for name in employ.columns.names:
    print(name, employ.columns.get_level_values(name).unique())

UNIT Index(['Percentage of total population', 'Thousand persons'], dtype='object', name='UNIT')
AGE Index(['From 15 to 24 years', 'From 25 to 54 years', 'From 55 to 64 years'], dtype='object', name='AGE')
SEX Index(['Females', 'Males', 'Total'], dtype='object', name='SEX')
INDIC_EM Index(['Active population', 'Total employment (resident population concept - LFS)'], dtype='object', name='INDIC_EM')
GEO Index(['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic',
       'Denmark', 'Estonia', 'Euro area (17 countries)',
       'Euro area (18 countries)', 'Euro area (19 countries)',
       'European Union (15 countries)', 'European Union (27 countries)',
       'European Union (28 countries)', 'Finland',
       'Former Yugoslav Republic of Macedonia, the', 'France',
       'France (metropolitan)',
       'Germany (until 1990 former territory of the FRG)', 'Greece', 'Hungary',
       'Iceland', 'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg',
       'Malta', 'Netherlands', 'Norway', 'Poland', 'Portugal', 'Romania',
       'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland', 'Turkey',
       'United Kingdom'],
      dtype='object', name='GEO')

Exercise 2

To easily filter by country, swap GEO to the top level and sort the MultiIndex

employ.columns = employ.columns.swaplevel(0,-1)
employ = employ.sort_index(axis=1)

We need to get rid of a few items in GEO which are not countries
A fast way to get rid of the EU areas is to use a list comprehension to find the level values in GEO that begin
with Euro

geo_list = employ.columns.get_level_values('GEO').unique().tolist()
countries = [x for x in geo_list if not x.startswith('Euro')]
employ = employ[countries]
employ.columns.get_level_values('GEO').unique()

Index(['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic',
       'Denmark', 'Estonia', 'Finland',
       'Former Yugoslav Republic of Macedonia, the', 'France',
       'France (metropolitan)',
       'Germany (until 1990 former territory of the FRG)', 'Greece', 'Hungary',
       'Iceland', 'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg',
       'Malta', 'Netherlands', 'Norway', 'Poland', 'Portugal', 'Romania',
       'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland', 'Turkey',
       'United Kingdom'],
      dtype='object', name='GEO')

Select only percentage employed in the active population from the dataframe

employ_f = employ.xs(('Percentage of total population', 'Active population'),
                     level=('UNIT', 'INDIC_EM'),
                     axis=1)
employ_f.head()


Drop the Total value before creating the grouped boxplot

employ_f = employ_f.drop('Total', level='SEX', axis=1)

box = employ_f['2015'].unstack().reset_index()
sns.boxplot(x="AGE", y=0, hue="SEX", data=box, palette=("husl"), showfliers=False)
plt.xlabel('')
plt.xticks(rotation=35)
plt.ylabel('Percentage of population (%)')
plt.title('Employment in Europe (2015)')
plt.legend(bbox_to_anchor=(1,0.5))
plt.show()


4.3 Linear Regression in Python

Contents

• Linear Regression in Python


– Overview
– Simple Linear Regression
– Extending the Linear Regression Model
– Endogeneity
– Summary
– Exercises
– Solutions


4.3.1 Overview

Linear regression is a standard tool for analyzing the relationship between two or more variables
In this lecture we'll use the Python package statsmodels to estimate, interpret, and visualize linear
regression models
Along the way we'll discuss a variety of topics, including
• simple and multivariate linear regression
• visualization
• endogeneity and omitted variable bias
• two-stage least squares
As an example, we will replicate results from Acemoglu, Johnson and Robinson's seminal paper [AJR01]
• You can download a copy here
In the paper, the authors emphasize the importance of institutions in economic development
The main contribution is the use of settler mortality rates as a source of exogenous variation in institutional
differences
Such variation is needed to determine whether it is institutions that give rise to greater economic growth,
rather than the other way around

Prerequisites

This lecture assumes you are familiar with basic econometrics


For an introductory text covering these topics, see, for example, [Woo15]

Comments

This lecture is coauthored with Natasha Watkins

4.3.2 Simple Linear Regression

[AJR01] wish to determine whether or not differences in institutions can help to explain observed economic
outcomes
How do we measure institutional differences and economic outcomes?
In this paper,
• economic outcomes are proxied by log GDP per capita in 1995, adjusted for exchange rates
• institutional differences are proxied by an index of protection against expropriation on average over
1985-95, constructed by the Political Risk Services Group


These variables and other data used in the paper are available for download on Daron Acemoglu's webpage
We will use pandas' .read_stata() function to read in data contained in the .dta files to dataframes
import pandas as pd

df1 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable1.dta')

df1.head()

Let's use a scatterplot to see whether any obvious relationship exists between GDP per capita and the
protection against expropriation index
import matplotlib.pyplot as plt
plt.style.use('seaborn')

df1.plot(x='avexpr', y='logpgp95', kind='scatter')


plt.show()


The plot shows a fairly strong positive relationship between protection against expropriation and log GDP
per capita
Specifically, if higher protection against expropriation is a measure of institutional quality, then better
institutions appear to be positively correlated with better economic outcomes (higher GDP per capita)
Given the plot, choosing a linear model to describe this relationship seems like a reasonable assumption
We can write our model as

logpgp95_i = β_0 + β_1 avexpr_i + u_i

where:
• β_0 is the intercept of the linear trend line on the y-axis
• β_1 is the slope of the linear trend line, representing the marginal effect of protection against risk on
log GDP per capita
• u_i is a random error term (deviations of observations from the linear trend due to factors not included
in the model)
Visually, this linear model involves choosing a straight line that best fits the data, as in the following plot
(Figure 2 in [AJR01])

import numpy as np

# Dropping NA's is required to use numpy's polyfit


df1_subset = df1.dropna(subset=['logpgp95', 'avexpr'])

# Use only 'base sample' for plotting purposes


df1_subset = df1_subset[df1_subset['baseco'] == 1]

X = df1_subset['avexpr']
y = df1_subset['logpgp95']
labels = df1_subset['shortnam']

# Replace markers with country labels


plt.scatter(X, y, marker='')

for i, label in enumerate(labels):
    plt.annotate(label, (X.iloc[i], y.iloc[i]))

# Fit a linear trend line


plt.plot(np.unique(X),
         np.poly1d(np.polyfit(X, y, 1))(np.unique(X)),
         color='black')

plt.xlim([3.3,10.5])
plt.ylim([4,10.5])
plt.xlabel('Average Expropriation Risk 1985-95')
plt.ylabel('Log GDP per capita, PPP, 1995')
plt.title('Figure 2: OLS relationship between expropriation risk and income')
plt.show()


The most common technique to estimate the parameters (the β's) of the linear model is Ordinary Least Squares
(OLS)
As the name implies, an OLS model is solved by finding the parameters that minimize the sum of squared
residuals, i.e.

\min_{β̂} \sum_{i=1}^{N} û_i^2

where û_i is the difference between the observed and the predicted value of the dependent variable
To estimate the constant term β_0, we need to add a column of 1s to our dataset (consider the equation if β_0
was replaced with β_0 x_i and x_i = 1)

df1['const'] = 1

Now we can construct our model in statsmodels using the OLS function
We will use pandas dataframes with statsmodels, however standard arrays can also be used as argu-
ments

import statsmodels.api as sm


reg1 = sm.OLS(endog=df1['logpgp95'], exog=df1[['const', 'avexpr']], missing='drop')
type(reg1)

statsmodels.regression.linear_model.OLS

So far we have simply constructed our model


We need to use .fit() to obtain parameter estimates β̂_0 and β̂_1

results = reg1.fit()
type(results)

statsmodels.regression.linear_model.RegressionResultsWrapper

We now have the fitted regression model stored in results


To view the OLS regression results, we can call the .summary() method
Note that an observation was mistakenly dropped from the results in the original paper (see the note located
in maketable2.do from Acemoglu's webpage), and thus the coefficients differ slightly

print(results.summary())

OLS Regression Results


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.611
Model: OLS Adj. R-squared: 0.608
Method: Least Squares F-statistic: 171.4
Date: Mon, 17 Jul 2017 Prob (F-statistic): 4.16e-24
Time: 18:41:28 Log-Likelihood: -119.71
No. Observations: 111 AIC: 243.4
Df Residuals: 109 BIC: 248.8
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 4.6261 0.301 15.391 0.000 4.030 5.222
avexpr 0.5319 0.041 13.093 0.000 0.451 0.612
==============================================================================
Omnibus: 9.251 Durbin-Watson: 1.689
Prob(Omnibus): 0.010 Jarque-Bera (JB): 9.170
Skew: -0.680 Prob(JB): 0.0102
Kurtosis: 3.362 Cond. No. 33.2
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

From our results, we see that


• The intercept β̂_0 = 4.63
• The slope β̂_1 = 0.53
• The positive β̂_1 parameter estimate implies that institutional quality has a positive effect on economic
outcomes, as we saw in the figure
• The p-value of 0.000 for β̂_1 implies that the effect of institutions on GDP is statistically significant
(using p < 0.05 as a rejection rule)
• The R-squared value of 0.611 indicates that around 61% of variation in log GDP per capita is explained
by protection against expropriation
Using our parameter estimates, we can now write our estimated relationship as

\widehat{logpgp95}_i = 4.63 + 0.53 avexpr_i

This equation describes the line that best fits our data, as shown in Figure 2
We can use this equation to predict the level of log GDP per capita for a value of the index of expropriation
protection
For example, for a country with an index value of 7.07 (the average for the dataset), we find that their
predicted level of log GDP per capita in 1995 is 8.38

mean_expr = np.mean(df1_subset['avexpr'])
mean_expr

6.515625

predicted_logpdp95 = 4.63 + 0.53 * 7.07


predicted_logpdp95

8.3771

An easier (and more accurate) way to obtain this result is to use .predict() and set constant = 1 and
avexpr_i = mean_expr

results.predict(exog=[1, mean_expr])

array([ 8.09156367])

We can obtain an array of predicted logpgp95_i for every value of avexpr_i in our dataset by calling
.predict() on our results
Plotting the predicted values against avexpr_i shows that the predicted values lie along the linear line that
we fitted above
The observed values of logpgp95_i are also plotted for comparison purposes

# Drop missing observations from whole sample

df1_plot = df1.dropna(subset=['logpgp95', 'avexpr'])


# Plot predicted values

plt.scatter(df1_plot['avexpr'], results.predict(), alpha=0.5, label='predicted')

# Plot observed values

plt.scatter(df1_plot['avexpr'], df1_plot['logpgp95'], alpha=0.5, label='observed')

plt.legend()
plt.title('OLS predicted values')
plt.xlabel('avexpr')
plt.ylabel('logpgp95')
plt.show()

4.3.3 Extending the Linear Regression Model

So far we have only accounted for institutions affecting economic performance - almost certainly there are
numerous other factors affecting GDP that are not included in our model


Leaving out variables that affect logpgp95_i will result in omitted variable bias, yielding biased and
inconsistent parameter estimates
We can extend our bivariate regression model to a multivariate regression model by adding in other factors
that may affect logpgp95_i
[AJR01] consider other factors such as:
• the effect of climate on economic outcomes; latitude is used to proxy this
• differences that affect both economic performance and institutions, eg. cultural, historical, etc.;
controlled for with the use of continent dummies
Lets estimate some of the extended models considered in the paper (Table 2) using data from
maketable2.dta

df2 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable2.dta')

# Add constant term to dataset


df2['const'] = 1

# Create lists of variables to be used in each regression


X1 = ['const', 'avexpr']
X2 = ['const', 'avexpr', 'lat_abst']
X3 = ['const', 'avexpr', 'lat_abst', 'asia', 'africa', 'other']

# Estimate an OLS regression for each set of variables


reg1 = sm.OLS(df2['logpgp95'], df2[X1], missing='drop').fit()
reg2 = sm.OLS(df2['logpgp95'], df2[X2], missing='drop').fit()
reg3 = sm.OLS(df2['logpgp95'], df2[X3], missing='drop').fit()

Now that we have fitted our model, we will use summary_col to display the results in a single table
(model numbers correspond to those in the paper)

from statsmodels.iolib.summary2 import summary_col

info_dict = {'R-squared' : lambda x: f"{x.rsquared:.2f}",
             'No. observations' : lambda x: f"{int(x.nobs):d}"}

results_table = summary_col(results=[reg1,reg2,reg3],
float_format='%0.2f',
stars = True,
model_names=['Model 1',
'Model 3',
'Model 4'],
info_dict=info_dict,
regressor_order=['const',
'avexpr',
'lat_abst',
'asia',
'africa'])

results_table.add_title('Table 2 - OLS Regressions')


print(results_table)

Table 2 - OLS Regressions


=========================================
Model 1 Model 3 Model 4
-----------------------------------------
const 4.63*** 4.87*** 5.85***
(0.30) (0.33) (0.34)
avexpr 0.53*** 0.46*** 0.39***
(0.04) (0.06) (0.05)
lat_abst 0.87* 0.33
(0.49) (0.45)
asia -0.15
(0.15)
africa -0.92***
(0.17)
other 0.30
(0.37)
R-squared 0.61 0.62 0.72
No. observations 111 111 111
=========================================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01

4.3.4 Endogeneity

As [AJR01] discuss, the OLS models likely suffer from endogeneity issues, resulting in biased and
inconsistent model estimates
Namely, there is likely a two-way relationship between institutions and economic outcomes:
• richer countries may be able to afford or prefer better institutions
• variables that affect income may also be correlated with institutional differences
• the construction of the index may be biased; analysts may be biased towards seeing countries with
higher income having better institutions
To deal with endogeneity, we can use two-stage least squares (2SLS) regression, which is an extension of
OLS regression
This method requires replacing the endogenous variable avexpr_i with a variable that is:
1. correlated with avexpr_i
2. not correlated with the error term (ie. it should not directly affect the dependent variable, otherwise it
would be correlated with u_i due to omitted variable bias)
The new set of regressors is called an instrument, which aims to remove endogeneity in our proxy of
institutional differences


The main contribution of [AJR01] is the use of settler mortality rates to instrument for institutional
differences
They hypothesize that higher mortality rates of colonizers led to the establishment of institutions that were
more extractive in nature (less protection against expropriation), and these institutions still persist today
Using a scatterplot (Figure 3 in [AJR01]), we can see protection against expropriation is negatively correlated
with settler mortality rates, coinciding with the authors' hypothesis and satisfying the first condition of
a valid instrument

# Dropping NA's is required to use numpy's polyfit


df1_subset2 = df1.dropna(subset=['logem4', 'avexpr'])

X = df1_subset2['logem4']
y = df1_subset2['avexpr']
labels = df1_subset2['shortnam']

# Replace markers with country labels


plt.scatter(X, y, marker='')

for i, label in enumerate(labels):
    plt.annotate(label, (X.iloc[i], y.iloc[i]))

# Fit a linear trend line


plt.plot(np.unique(X),
         np.poly1d(np.polyfit(X, y, 1))(np.unique(X)),
         color='black')

plt.xlim([1.8,8.4])
plt.ylim([3.3,10.4])
plt.xlabel('Log of Settler Mortality')
plt.ylabel('Average Expropriation Risk 1985-95')
plt.title('Figure 3: First-stage relationship between settler mortality and expropriation risk')

plt.show()


The second condition may not be satisfied if settler mortality rates in the 17th to 19th centuries have a direct
effect on current GDP (in addition to their indirect effect through institutions)
For example, settler mortality rates may be related to the current disease environment in a country, which
could affect current economic performance
[AJR01] argue this is unlikely because:
• The majority of settler deaths were due to malaria and yellow fever, and had limited effect on local
people
• The disease burden on local people in Africa or India, for example, did not appear to be higher than
average, supported by relatively high population densities in these areas before colonization
As we appear to have a valid instrument, we can use 2SLS regression to obtain consistent and unbiased
parameter estimates
First stage
The first stage involves regressing the endogenous variable (avexpr_i) on the instrument
The instrument is the set of all exogenous variables in our model (and not just the variable we have replaced)
Using model 1 as an example, our instrument is simply a constant and settler mortality rates logem4_i


Therefore, we will estimate the first-stage regression as

avexpr_i = δ_0 + δ_1 logem4_i + v_i

The data we need to estimate this equation is located in maketable4.dta (only complete data, indicated
by baseco = 1, is used for estimation)

# Import and select the data


df4 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/
,→master/ols/maketable4.dta')

df4 = df4[df4['baseco'] == 1]

# Add a constant variable


df4['const'] = 1

# Fit the first stage regression and print summary


results_fs = sm.OLS(df4['avexpr'],
                    df4[['const', 'logem4']],
                    missing='drop').fit()
print(results_fs.summary())

OLS Regression Results


==============================================================================
Dep. Variable: avexpr R-squared: 0.270
Model: OLS Adj. R-squared: 0.258
Method: Least Squares F-statistic: 22.95
Date: Mon, 17 Jul 2017 Prob (F-statistic): 1.08e-05
Time: 18:41:29 Log-Likelihood: -104.83
No. Observations: 64 AIC: 213.7
Df Residuals: 62 BIC: 218.0
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 9.3414 0.611 15.296 0.000 8.121 10.562
logem4 -0.6068 0.127 -4.790 0.000 -0.860 -0.354
==============================================================================
Omnibus: 0.035 Durbin-Watson: 2.003
Prob(Omnibus): 0.983 Jarque-Bera (JB): 0.172
Skew: 0.045 Prob(JB): 0.918
Kurtosis: 2.763 Cond. No. 19.4
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Second stage
We need to retrieve the predicted values of avexpr_i using .predict()
We then replace the endogenous variable avexpr_i with the predicted values \widehat{avexpr}_i in the
original linear model


Our second stage regression is thus

logpgp95_i = β_0 + β_1 \widehat{avexpr}_i + u_i

df4['predicted_avexpr'] = results_fs.predict()

results_ss = sm.OLS(df4['logpgp95'],
                    df4[['const', 'predicted_avexpr']]).fit()
print(results_ss.summary())

OLS Regression Results


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.477
Model: OLS Adj. R-squared: 0.469
Method: Least Squares F-statistic: 56.60
Date: Mon, 17 Jul 2017 Prob (F-statistic): 2.66e-10
Time: 18:41:29 Log-Likelihood: -72.268
No. Observations: 64 AIC: 148.5
Df Residuals: 62 BIC: 152.9
Df Model: 1
Covariance Type: nonrobust
====================================================================================
                       coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------------
const                1.9097      0.823      2.320      0.024       0.264       3.555
predicted_avexpr     0.9443      0.126      7.523      0.000       0.693       1.195

==============================================================================
Omnibus: 10.547 Durbin-Watson: 2.137
Prob(Omnibus): 0.005 Jarque-Bera (JB): 11.010
Skew: -0.790 Prob(JB): 0.00407
Kurtosis: 4.277 Cond. No. 58.1
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

The second-stage regression results give us an unbiased and consistent estimate of the effect of institutions
on economic outcomes
The result suggests a stronger positive relationship than what the OLS results indicated
Note that while our parameter estimates are correct, our standard errors are not and for this reason,
computing 2SLS manually (in stages with OLS) is not recommended
We can correctly estimate a 2SLS regression in one step using the linearmodels package, an extension of
statsmodels
To install this package, you will need to run pip install linearmodels in your command line


from linearmodels.iv import IV2SLS

Note that when using IV2SLS, the exogenous and instrument variables are split up in the function arguments
(whereas before the instrument included exogenous variables)

iv = IV2SLS(dependent=df4['logpgp95'],
            exog=df4['const'],
            endog=df4['avexpr'],
            instruments=df4['logem4']).fit(cov_type='unadjusted')

print(iv.summary)

IV-2SLS Estimation Summary


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.1870
Estimator: IV-2SLS Adj. R-squared: 0.1739
No. Observations: 64 F-statistic: 37.568
Date: Mon, Jul 17 2017 P-value (F-stat) 0.0000
Time: 18:41:29 Distribution: chi2(1)
Cov. Estimator: unadjusted

Parameter Estimates
==============================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
------------------------------------------------------------------------------
const 1.9097 1.0106 1.8897 0.0588 -0.0710 3.8903
avexpr 0.9443 0.1541 6.1293 0.0000 0.6423 1.2462
==============================================================================

Endogenous: avexpr
Instruments: logem4
Unadjusted Covariance (Homoskedastic)
Debiased: False

Given that we now have consistent and unbiased estimates, we can infer from the model we have estimated
that institutional differences (stemming from institutions set up during colonization) can help to explain
differences in income levels across countries today
[AJR01] use a marginal effect of 0.94 to calculate that the difference in the index between Chile and
Nigeria (ie. institutional quality) implies up to a 7-fold difference in income, emphasizing the significance of
institutions in economic development
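To see where a figure like this comes from, note that avexpr_i enters the model for log GDP, so an index
gap of ∆ implies an income ratio of roughly exp(0.94 · ∆); the gap used below is purely illustrative

np.exp(0.94 * 2)  # a hypothetical 2-point index gap implies a ratio of about 6.6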

4.3.5 Summary

We have demonstrated basic OLS and 2SLS regression in statsmodels and linearmodels
If you are familiar with R, you may want to use the formula interface to statsmodels, or consider using
rpy2 to call R from within Python


4.3.6 Exercises

Exercise 1

In the lecture, we think the original model suffers from endogeneity bias due to the likely effect income has
on institutional development
Although endogeneity is often best identified by thinking about the data and model, we can formally test for
endogeneity using the Hausman test
We want to test for correlation between the endogenous variable, avexpr_i, and the errors, u_i

H_0 : Cov(avexpr_i, u_i) = 0  (no endogeneity)
H_1 : Cov(avexpr_i, u_i) ≠ 0  (endogeneity)

This test is run in two stages
First, we regress avexpr_i on the instrument, logem4_i

avexpr_i = π_0 + π_1 logem4_i + υ_i

Second, we retrieve the residuals υ̂_i and include them in the original equation

logpgp95_i = β_0 + β_1 avexpr_i + α υ̂_i + u_i

If α is statistically significant (with a p-value < 0.05), then we reject the null hypothesis and conclude that
avexpr_i is endogenous
Using the above information, estimate a Hausman test and interpret your results

Exercise 2

The OLS parameter β can also be estimated using matrix algebra and numpy (you may need to review the
numpy lecture to complete this exercise)
The linear equation we want to estimate is (written in matrix form)

y = Xβ + u

To solve for the unknown parameter β, we want to minimise the sum of squared residuals
min_{β̂} û'û

Rearranging the first equation and substituting into the second equation, we can write

min_{β̂} (y − Xβ̂)'(y − Xβ̂)

Solving this optimization problem gives the solution for the β̂ coefficients

β̂ = (X'X)^{-1} X'y

Using the above information, compute β̂ from model 1 using numpy - your results should be the same as
those in the statsmodels output from earlier in the lecture


4.3.7 Solutions

Exercise 1

# Load in data
df4 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable4.dta')

# Add a constant term


df4['const'] = 1

# Estimate the first stage regression


reg1 = sm.OLS(endog=df4['avexpr'],
              exog=df4[['const', 'logem4']],
              missing='drop').fit()

# Retrieve the residuals


df4['resid'] = reg1.resid

# Estimate the second stage regression, including the first-stage residuals


reg2 = sm.OLS(endog=df4['logpgp95'],
              exog=df4[['const', 'avexpr', 'resid']],
              missing='drop').fit()

print(reg2.summary())

OLS Regression Results


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.689
Model: OLS Adj. R-squared: 0.679
Method: Least Squares F-statistic: 74.05
Date: Mon, 17 Jul 2017 Prob (F-statistic): 1.07e-17
Time: 18:41:29 Log-Likelihood: -62.031
No. Observations: 70 AIC: 130.1
Df Residuals: 67 BIC: 136.8
Df Model: 2
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 2.4782 0.547 4.530 0.000 1.386 3.570
avexpr 0.8564 0.082 10.406 0.000 0.692 1.021
resid -0.4951 0.099 -5.017 0.000 -0.692 -0.298
==============================================================================
Omnibus: 17.597 Durbin-Watson: 2.086
Prob(Omnibus): 0.000 Jarque-Bera (JB): 23.194
Skew: -1.054 Prob(JB): 9.19e-06
Kurtosis: 4.873 Cond. No. 53.8
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


The output shows that the coefficient on the residuals is statistically significant, indicating avexpr_i is
endogenous

Exercise 2

# Load in data
df1 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable1.dta')
df1 = df1.dropna(subset=['logpgp95', 'avexpr'])

# Add a constant term


df1['const'] = 1

# Define the X and y variables


y = np.asarray(df1['logpgp95'])
X = np.asarray(df1[['const', 'avexpr']])

# Compute β_hat
β_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Print out the results from the 2 x 1 vector β_hat


print(f'β_0 = {β_hat[0]:.2}')
print(f'β_1 = {β_hat[1]:.2}')

β_0 = 4.6
β_1 = 0.53

It is also possible to use np.linalg.inv(X.T @ X) @ X.T @ y to solve for β; however, .solve()
is preferred as it involves fewer computations
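As a quick sanity check that the two routes agree up to floating point error, using the X, y and β_hat
defined above

β_inv = np.linalg.inv(X.T @ X) @ X.T @ y
np.allclose(β_hat, β_inv)  # True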

4.4 Maximum Likelihood Estimation

Contents

• Maximum Likelihood Estimation


– Overview
– Set Up and Assumptions
– Conditional Distributions
– Maximum Likelihood Estimation
– MLE with Numerical Methods
– Maximum Likelihood Estimation with statsmodels


– Summary
– Exercises
– Solutions

4.4.1 Overview

In a previous lecture we estimated the relationship between dependent and explanatory variables using linear
regression
But what if a linear relationship is not an appropriate assumption for our model?
One widely used alternative is maximum likelihood estimation, which involves specifying a class of distri-
butions, indexed by unknown parameters, and then using the data to pin down these parameter values
The benefit relative to linear regression is that it allows more flexibility in the probabilistic relationships
between variables
Here we illustrate maximum likelihood by replicating Daniel Treisman's (2016) paper, Russia's Billionaires,
which connects the number of billionaires in a country to its economic characteristics
The paper concludes that Russia has a higher number of billionaires than economic factors such as market
size and tax rate predict

Prerequisites

We assume familiarity with basic probability and multivariate calculus

Comments

This lecture is coauthored with Natasha Watkins

4.4.2 Set Up and Assumptions

Let's consider the steps we need to go through in maximum likelihood estimation and how they pertain to
this study

Flow of Ideas

The first step with maximum likelihood estimation is to choose the probability distribution believed to be
generating the data
More precisely, we need to make an assumption as to which parametric class of distributions is generating
the data
• e.g., the class of all normal distributions, or the class of all gamma distributions
Each such class is a family of distributions indexed by a finite number of parameters


• e.g., the class of normal distributions is a family of distributions indexed by its mean µ ∈ (−∞, ∞)
and standard deviation σ ∈ (0, ∞)
We'll let the data pick out a particular element of the class by pinning down the parameters
The parameter estimates so produced will be called maximum likelihood estimates

Counting Billionaires

Treisman [Tre16] is interested in estimating the number of billionaires in different countries


The number of billionaires is integer valued
Hence we consider distributions that take values only in the nonnegative integers
(This is one reason least squares regression is not the best tool for the present problem, since the dependent
variable in linear regression is not restricted to integer values)
One integer distribution is the Poisson distribution, the probability mass function (pmf) of which is

f(y) = \frac{µ^y}{y!} e^{-µ}, \qquad y = 0, 1, 2, ..., ∞
We can plot the Poisson distribution over y for different values of µ as follows

from numpy import exp


from scipy.special import factorial
import matplotlib.pyplot as plt

poisson_pmf = lambda y, µ: µ**y / factorial(y) * exp(-µ)


y_values = range(0, 25)

fig, ax = plt.subplots(figsize=(12, 8))

for µ in [1, 5, 10]:
    distribution = []
    for y_i in y_values:
        distribution.append(poisson_pmf(y_i, µ))
    ax.plot(y_values,
            distribution,
            label=f'$\mu$={µ}',
            alpha=0.5,
            marker='o',
            markersize=8)

ax.grid()
ax.set_xlabel('$y$', fontsize=14)
ax.set_ylabel('$f(y \mid \mu)$', fontsize=14)
ax.axis(xmin=0, ymin=0)
ax.legend(fontsize=14)

plt.show()


Notice that the Poisson distribution begins to resemble a normal distribution as the mean of y increases
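One way to eyeball this claim is to overlay the pmf for a large µ with a normal density of matching mean
and variance (a sketch, using the standard N(µ, µ) approximation to the Poisson; µ = 50 is an arbitrary choice)

import numpy as np
from scipy.stats import norm

y_grid = np.arange(0, 100)
plt.plot(y_grid, poisson_pmf(y_grid, 50.0), 'o', alpha=0.5, label='Poisson, $\mu=50$')
plt.plot(y_grid, norm.pdf(y_grid, loc=50, scale=np.sqrt(50)), label='normal approximation')
plt.legend()
plt.show()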
Let's have a look at the distribution of the data we'll be working with in this lecture
Treisman's main source of data is Forbes' annual rankings of billionaires and their estimated net worth
The dataset mle/fp.dta can be downloaded from here or its AER page
import pandas as pd
pd.options.display.max_columns = 10

# Load in data and view


df = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/mle/fp.dta')
df.head()


Using a histogram, we can view the distribution of the number of billionaires per country, numbil0, in
2008 (the United States is dropped for plotting purposes)

numbil0_2008 = df[(df['year'] == 2008) &
                  (df['country'] != 'United States')].loc[:, 'numbil0']

plt.subplots(figsize=(12, 8))
plt.hist(numbil0_2008, bins=30)
plt.xlim(xmin=0)
plt.grid()
plt.xlabel('Number of billionaires in 2008')
plt.ylabel('Count')
plt.show()

From the histogram, it appears that the Poisson assumption is not unreasonable (albeit with a very low µ
and some outliers)

4.4.3 Conditional Distributions

In Treisman's paper, the dependent variable (the number of billionaires yi in country i) is modeled as a
function of GDP per capita, population size, and years of membership in GATT and WTO
Hence, the distribution of yi needs to be conditioned on the vector of explanatory variables xi
The standard formulation, the so-called Poisson regression model, is as follows:


$$f(y_i \mid x_i) = \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i}; \qquad y_i = 0, 1, 2, \ldots \tag{4.1}$$

where $\mu_i = \exp(x_i' \beta) = \exp(\beta_0 + \beta_1 x_{i1} + \ldots + \beta_k x_{ik})$


To illustrate the idea that the distribution of yi depends on xi let's run a simple simulation
We use our poisson_pmf function from above and arbitrary values for β and xi

import numpy as np

y_values = range(0, 20)

# Define a parameter vector with estimates
β = np.array([0.26, 0.18, 0.25, -0.1, -0.22]).T

# Create some observations X
datasets = [np.array([0, 1, 1, 1, 2]),
            np.array([2, 3, 2, 4, 0]),
            np.array([3, 4, 5, 3, 2]),
            np.array([6, 5, 4, 4, 7])]

fig, ax = plt.subplots(figsize=(12, 8))

for X in datasets:
    µ = exp(X @ β)
    distribution = []
    for y_i in y_values:
        distribution.append(poisson_pmf(y_i, µ))
    ax.plot(y_values,
            distribution,
            label=f'$\mu_i$={µ:.1}',
            marker='o',
            markersize=8,
            alpha=0.5)

ax.grid()
ax.legend()
ax.set_xlabel('$y \mid x_i$')
ax.set_ylabel(r'$f(y \mid x_i; \beta )$')
ax.axis(xmin=0, ymin=0)
plt.show()


We can see that the distribution of yi is conditional on xi (µi is no longer constant)

4.4.4 Maximum Likelihood Estimation

In our model for number of billionaires, the conditional distribution contains 4 (k = 4) parameters that we
need to estimate
We will label our entire parameter vector as β where

$$\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}$$

To estimate the model using MLE, we want to maximize the likelihood that our estimate β̂ is the true
parameter β
Intuitively, we want to find the β̂ that best fits our data
First we need to construct the likelihood function L(β), which is similar to a joint probability density
function
Assume we have some data yi = {y1 , y2 } and yi ∼ f (yi )
If y1 and y2 are independent, the joint pmf of these data is f (y1 , y2 ) = f (y1 ) · f (y2 )


If yi follows a Poisson distribution with µ = 7, we can visualize the joint pmf like so

from mpl_toolkits.mplot3d import Axes3D

def plot_joint_poisson(µ=7, y_n=20):
    yi_values = np.arange(0, y_n, 1)

    # Create coordinate points of X and Y
    X, Y = np.meshgrid(yi_values, yi_values)

    # Multiply distributions together
    Z = poisson_pmf(X, µ) * poisson_pmf(Y, µ)

    fig = plt.figure(figsize=(12, 8))
    ax = fig.add_subplot(111, projection='3d')
    ax.plot_surface(X, Y, Z.T, cmap='terrain', alpha=0.6)
    ax.scatter(X, Y, Z.T, color='black', alpha=0.5, linewidths=1)
    ax.set(xlabel='$y_1$', ylabel='$y_2$')
    ax.set_zlabel('$f(y_1, y_2)$', labelpad=10)
    plt.show()

plot_joint_poisson(µ=7, y_n=20)

Similarly, the joint pmf of our data (which is distributed as a conditional Poisson distribution) can be written
as

$$f(y_1, y_2, \ldots, y_n \mid x_1, x_2, \ldots, x_n; \beta) = \prod_{i=1}^{n} \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i}$$

yi is conditional on both the values of xi and the parameters β


The likelihood function is the same as the joint pmf, but treats the parameter β as a random variable and
takes the observations (yi , xi ) as given

$$L(\beta \mid y_1, y_2, \ldots, y_n; x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i} = f(y_1, y_2, \ldots, y_n \mid x_1, x_2, \ldots, x_n; \beta)$$

Now that we have our likelihood function, we want to find the β̂ that yields the maximum likelihood value

$$\max_{\beta} L(\beta)$$

In doing so it is generally easier to maximize the log-likelihood (consider differentiating f (x) = x exp(x)
vs. f (x) = log(x) + x)
Given that taking a logarithm is a monotone increasing transformation, a maximizer of the likelihood func-
tion will also be a maximizer of the log-likelihood function
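As a quick numerical aside (a sketch of our own, not from the original text), we can confirm that taking logs leaves the maximizer unchanged:

grid = np.linspace(0.1, 5, 500)
L = grid * np.exp(-grid)   # an arbitrary positive function standing in for a likelihood
grid[np.argmax(L)] == grid[np.argmax(np.log(L))]   # True: same maximizer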
In our case the log-likelihood is

$$\begin{aligned}
\log L(\beta) &= \log\left( f(y_1; \beta) \cdot f(y_2; \beta) \cdots f(y_n; \beta) \right) \\
&= \sum_{i=1}^{n} \log f(y_i; \beta) \\
&= \sum_{i=1}^{n} \log\left( \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i} \right) \\
&= \sum_{i=1}^{n} y_i \log \mu_i - \sum_{i=1}^{n} \mu_i - \sum_{i=1}^{n} \log y_i!
\end{aligned}$$

The MLE of the Poisson model for β̂ can be obtained by solving

$$\max_{\beta} \left( \sum_{i=1}^{n} y_i \log \mu_i - \sum_{i=1}^{n} \mu_i - \sum_{i=1}^{n} \log y_i! \right)$$

However, no analytical solution exists to the above problem – to find the MLE we need to use numerical
methods

4.4.5 MLE with Numerical Methods

Many distributions do not have nice, analytical solutions and therefore require numerical methods to solve
for parameter estimates
One such numerical method is the Newton-Raphson algorithm


Our goal is find the maximum likelihood estimate β̂


At β̂, the first derivative of the log-likelihood function will be equal to 0
Let's illustrate this by supposing

$$\log L(\beta) = -(\beta - 10)^2 - 10$$

β = np.linspace(1, 20)
logL = -(β - 10) ** 2 - 10
dlogL = -2 * β + 20

fig, (ax1, ax2) = plt.subplots(2, sharex=True, figsize=(12, 8))

ax1.plot(β, logL, lw=2)
ax2.plot(β, dlogL, lw=2)

ax1.set_ylabel(r'$log \mathcal{L(\beta)}$',
               rotation=0,
               labelpad=35,
               fontsize=15)
ax2.set_ylabel(r'$\frac{dlog \mathcal{L(\beta)}}{d \beta}$ ',
               rotation=0,
               labelpad=35,
               fontsize=19)
ax2.set_xlabel(r'$\beta$', fontsize=15)
ax1.grid(), ax2.grid()
plt.axhline(c='black')
plt.show()


The plot shows that the maximum likelihood value (the top plot) occurs when $\frac{d \log L(\beta)}{d \beta} = 0$ (the bottom plot)
Therefore, the likelihood is maximized when β = 10
We can also ensure that this value is a maximum (as opposed to a minimum) by checking that the second
derivative (slope of the bottom plot) is negative
The Newton-Raphson algorithm finds a point where the first derivative is 0
To use the algorithm, we take an initial guess at the maximum value, β0 (the OLS parameter estimates might
be a reasonable guess), then
1. Use the updating rule to iterate the algorithm

$$\beta^{(k+1)} = \beta^{(k)} - \frac{G(\beta^{(k)})}{H(\beta^{(k)})}$$

where:

$$G(\beta^{(k)}) = \frac{d \log L(\beta^{(k)})}{d \beta^{(k)}} \qquad H(\beta^{(k)}) = \frac{d^2 \log L(\beta^{(k)})}{d \beta^{(k)2}}$$

2. Check whether |β (k+1) − β (k)| < tol

• If true, then stop iterating and set β̂ = β (k+1)


• If false, then update β (k+1)

As can be seen from the updating equation, β (k+1) = β (k) only when G(β (k) ) = 0, i.e., where the first
derivative is equal to 0
(In practice, we stop iterating when the difference is below a small tolerance threshold)
Let's have a go at implementing the Newton-Raphson algorithm
First, we'll create a class called PoissonRegression so we can easily recompute the values of the
log-likelihood, gradient and Hessian for every iteration
class PoissonRegression:

    def __init__(self, y, X, β):
        self.X, self.y, self.β = X, y, β
        self.n, self.k = X.shape

    def µ(self):
        return np.exp(self.X @ self.β.T)

    def logL(self):
        y = self.y
        µ = self.µ()
        return np.sum(y * np.log(µ) - µ - np.log(factorial(y)))

    def G(self):
        µ = self.µ()
        return ((self.y - µ) @ self.X).reshape(self.k, 1)

    def H(self):
        X = self.X
        µ = self.µ()
        return -(µ * X.T @ X)

Our function newton_raphson will take a PoissonRegression object that has an initial guess of
the parameter vector β 0
The algorithm will update the parameter vector according to the updating rule, and recalculate the gradient
and Hessian matrices at the new parameter estimates
Iteration will end when either:
• The difference between the parameter and the updated parameter is below a tolerance level
• The maximum number of iterations has been achieved (meaning convergence is not achieved)
So we can get an idea of what's going on while the algorithm is running, an option display=True is
added to print out values at each iteration
def newton_raphson(model, tol=1e-3, max_iter=1000, display=True):

    i = 0
    error = 100  # Initial error value

    # Print header of output
    if display:
        header = f'{"Iteration_k":<13}{"Log-likelihood":<16}{"θ":<60}'
        print(header)
        print("-" * len(header))

    # While loop runs while any value in error is greater
    # than the tolerance until max iterations are reached
    while np.any(error > tol) and i < max_iter:
        H, G = model.H(), model.G()
        β_new = model.β - (np.linalg.inv(H) @ G).T
        error = β_new - model.β
        model.β = β_new.flatten()

        # Print iterations
        if display:
            β_list = [f'{t:.3}' for t in list(model.β)]
            update = f'{i:<13}{model.logL():<16.8}{β_list}'
            print(update)

        i += 1

    print(f'Number of iterations: {i}')
    print(f'β_hat = {model.β}')

    return model.β

Let's try out our algorithm with a small dataset of 5 observations and 3 variables in X
X = np.array([[1, 2, 5],
              [1, 1, 3],
              [1, 4, 2],
              [1, 5, 2],
              [1, 3, 1]])

y = np.array([1, 0, 1, 1, 0])

# Take a guess at initial βs
init_β = np.array([0.1, 0.1, 0.1])

# Create an object with Poisson model values
poi = PoissonRegression(y, X, β=init_β)

# Use newton_raphson to find the MLE
β_hat = newton_raphson(poi, display=True)

Iteration_k  Log-likelihood  θ
-----------------------------------------------------------
0 -4.34476224 ['-1.4890', '0.2650', '0.2440']
1 -3.5742413 ['-3.3840', '0.5280', '0.4740']
2 -3.39995256 ['-5.0640', '0.7820', '0.7020']
3 -3.37886465 ['-5.9150', '0.9090', '0.8200']
4 -3.3783559 ['-6.0740', '0.9330', '0.8430']
5 -3.37835551 ['-6.0780', '0.9330', '0.8430']


Number of iterations: 6
β_hat = [-6.07848205 0.93340226 0.84329625]

As this was a simple model with few observations, the algorithm achieved convergence in only 6 iterations
You can see that with each iteration, the log-likelihood value increased
Remember, our objective was to maximize the log-likelihood function, which the algorithm has worked to
achieve
Also note that the increase in log L(β (k) ) becomes smaller with each iteration
This is because the gradient is approaching 0 as we reach the maximum, and therefore the numerator in our
updating equation is becoming smaller
The gradient vector should be close to 0 at β̂

poi.G()

array([[ -3.95169226e-07],
[ -1.00114804e-06],
[ -7.73114556e-07]])

The iterative process can be visualized in the following diagram, where the maximum is found at β = 10

logL = lambda x: -(x - 10) ** 2 - 10

def find_tangent(β, a=0.01):
    y1 = logL(β)
    y2 = logL(β+a)
    x = np.array([[β, 1], [β+a, 1]])
    m, c = np.linalg.lstsq(x, np.array([y1, y2]), rcond=None)[0]
    return m, c

β = np.linspace(2, 18)
fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(β, logL(β), lw=2, c='black')

for β in [7, 8.5, 9.5, 10]:
    β_line = np.linspace(β-2, β+2)
    m, c = find_tangent(β)
    y = m * β_line + c
    ax.plot(β_line, y, '-', c='purple', alpha=0.8)
    ax.text(β+2.05, y[-1], f'$G({β}) = {abs(m):.0f}$', fontsize=12)
    ax.vlines(β, -24, logL(β), linestyles='--', alpha=0.5)
    ax.hlines(logL(β), 6, β, linestyles='--', alpha=0.5)

ax.set(ylim=(-24, -4), xlim=(6, 13))
ax.set_xlabel(r'$\beta$', fontsize=15)
ax.set_ylabel(r'$log \mathcal{L(\beta)}$',
              rotation=0,
              labelpad=25,
              fontsize=15)
ax.grid(alpha=0.3)
plt.show()

Note that our implementation of the Newton-Raphson algorithm is rather basic; for more robust
implementations see, for example, scipy.optimize
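As an aside (a minimal sketch only, not part of the lecture's codebase), the same estimates can be obtained by handing the negative log-likelihood to a general-purpose optimizer, reusing the X and y defined above:

from scipy.optimize import minimize

def neg_logL(β, X, y):
    # Negative of the Poisson log-likelihood derived earlier
    µ = np.exp(X @ β)
    return -np.sum(y * np.log(µ) - µ - np.log(factorial(y)))

res = minimize(neg_logL, x0=np.zeros(3), args=(X, y), method='BFGS')
print(res.x)  # should be close to β_hat = [-6.078, 0.933, 0.843]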

4.4.6 Maximum Likelihood Estimation with statsmodels

Now that we know what's going on under the hood, we can apply MLE to an interesting application
We'll use the Poisson regression model in statsmodels to obtain richer output with standard errors, test
values, and more
statsmodels uses the same algorithm as above to find the maximum likelihood estimates
Before we begin, let's re-estimate our simple model with statsmodels to confirm we obtain the same
coefficients and log-likelihood value

from statsmodels.api import Poisson
from scipy import stats
stats.chisqprob = lambda chisq, df: stats.chi2.sf(chisq, df)

X = np.array([[1, 2, 5],
              [1, 1, 3],
              [1, 4, 2],
              [1, 5, 2],
              [1, 3, 1]])

y = np.array([1, 0, 1, 1, 0])

stats_poisson = Poisson(y, X).fit()
print(stats_poisson.summary())

Optimization terminated successfully.


Current function value: 0.675671
Iterations 7
Poisson Regression Results
==============================================================================
Dep. Variable: y No. Observations: 5
Model: Poisson Df Residuals: 2
Method: MLE Df Model: 2
Date: Wed, 26 Jul 2017 Pseudo R-squ.: 0.2546
Time: 15:41:38 Log-Likelihood: -3.3784
converged: True LL-Null: -4.5325
LLR p-value: 0.3153
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -6.0785 5.279 -1.151 0.250 -16.425 4.268
x1 0.9334 0.829 1.126 0.260 -0.691 2.558
x2 0.8433 0.798 1.057 0.291 -0.720 2.407
==============================================================================

Now let's replicate results from Daniel Treisman's paper, Russia's Billionaires, mentioned earlier in the lecture
Treisman starts by estimating equation (4.1), where:
• yi is the number of billionaires in country i
• xi1 is log GDP per capita of country i
• xi2 is log population of country i
• xi3 is years of membership in GATT and WTO (to proxy access to international markets)
The paper only considers the year 2008 for estimation
We will set up our variables for estimation like so (you should have the data assigned to df from earlier in
the lecture)

# Keep only year 2008
df = df[df['year'] == 2008]

# Add a constant
df['const'] = 1

# Variable sets
reg1 = ['const', 'lngdppc', 'lnpop', 'gattwto08']
reg2 = ['const', 'lngdppc', 'lnpop',
        'gattwto08', 'lnmcap08', 'rintr', 'topint08']
reg3 = ['const', 'lngdppc', 'lnpop', 'gattwto08', 'lnmcap08',
        'rintr', 'topint08', 'nrrents', 'roflaw']

Then we can use the Poisson function from statsmodels to fit the model
We'll use robust standard errors as in the author's paper

import statsmodels.api as sm

# Specify model
poisson_reg = sm.Poisson(df[['numbil0']], df[reg1],
missing='drop').fit(cov_type='HC0')
print(poisson_reg.summary())

Warning: Maximum number of iterations has been exceeded.


Current function value: 2.226090
Iterations: 35
Poisson Regression Results
==============================================================================
Dep. Variable: numbil0 No. Observations: 197
Model: Poisson Df Residuals: 193
Method: MLE Df Model: 3
Date: Wed, 26 Jul 2017 Pseudo R-squ.: 0.8574
Time: 15:41:38 Log-Likelihood: -438.54
converged: False LL-Null: -3074.7
LLR p-value: 0.000
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -29.0495 2.578 -11.268 0.000 -34.103 -23.997
lngdppc 1.0839 0.138 7.834 0.000 0.813 1.355
lnpop 1.1714 0.097 12.024 0.000 0.980 1.362
gattwto08 0.0060 0.007 0.868 0.386 -0.008 0.019
==============================================================================

Here we received a warning message saying Maximum number of iterations has been exceeded
Let's try increasing the maximum number of iterations that the algorithm is allowed (the .fit() docstring
tells us the default number of iterations is 35)

poisson_reg = sm.Poisson(df[['numbil0']], df[reg1],
                         missing='drop').fit(cov_type='HC0', maxiter=100)
print(poisson_reg.summary())

Optimization terminated successfully.


Current function value: 2.226090
Iterations 36
Poisson Regression Results
==============================================================================
Dep. Variable: numbil0 No. Observations: 197

Model:                        Poisson   Df Residuals:                      193
Method:                           MLE   Df Model:                            3
Date:                Wed, 26 Jul 2017   Pseudo R-squ.:                  0.8574
Time:                        15:41:38   Log-Likelihood:                -438.54
converged:                       True   LL-Null:                       -3074.7
                                        LLR p-value:                     0.000
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -29.0495 2.578 -11.268 0.000 -34.103 -23.997
lngdppc 1.0839 0.138 7.834 0.000 0.813 1.355
lnpop 1.1714 0.097 12.024 0.000 0.980 1.362
gattwto08 0.0060 0.007 0.868 0.386 -0.008 0.019
==============================================================================

Success! The algorithm was able to achieve convergence in 36 iterations

Our output indicates that GDP per capita, population, and years of membership in the General Agreement
on Tariffs and Trade (GATT) are positively related to the number of billionaires a country has, as expected
Let's also estimate the author's more full-featured models and display them in a single table

from statsmodels.iolib.summary2 import summary_col

regs = [reg1, reg2, reg3]
reg_names = ['Model 1', 'Model 2', 'Model 3']
info_dict = {'Pseudo R-squared': lambda x: f"{x.prsquared:.2f}",
             'No. observations': lambda x: f"{int(x.nobs):d}"}
regressor_order = ['const',
                   'lngdppc',
                   'lnpop',
                   'gattwto08',
                   'lnmcap08',
                   'rintr',
                   'topint08',
                   'nrrents',
                   'roflaw']
results = []

for reg in regs:
    result = sm.Poisson(df[['numbil0']], df[reg],
                        missing='drop').fit(cov_type='HC0',
                                            maxiter=100, disp=0)
    results.append(result)

results_table = summary_col(results=results,
                            float_format='%0.3f',
                            stars=True,
                            model_names=reg_names,
                            info_dict=info_dict,
                            regressor_order=regressor_order)
results_table.add_title('Table 1 - Explaining the Number of Billionaires in 2008')


print(results_table)

Table 1 - Explaining the Number of Billionaires in 2008


=================================================
Model 1 Model 2 Model 3
-------------------------------------------------
const -29.050*** -19.444*** -20.858***
(2.578) (4.820) (4.255)
lngdppc 1.084*** 0.717*** 0.737***
(0.138) (0.244) (0.233)
lnpop 1.171*** 0.806*** 0.929***
(0.097) (0.213) (0.195)
gattwto08 0.006 0.007 0.004
(0.007) (0.006) (0.006)
lnmcap08 0.399** 0.286*
(0.172) (0.167)
rintr -0.010 -0.009
(0.010) (0.010)
topint08 -0.051*** -0.058***
(0.011) (0.012)
nrrents -0.005
(0.010)
roflaw 0.203
(0.372)
Pseudo R-squared 0.86 0.90 0.90
No. observations 197 131 131
=================================================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01

The output suggests that the frequency of billionaires is positively correlated with GDP per capita, population
size, and stock market capitalization, and negatively correlated with top marginal income tax rate
To analyze our results by country, we can plot the difference between the predicted and actual values, then
sort from highest to lowest and plot the first 15
data = ['const', 'lngdppc', 'lnpop', 'gattwto08', 'lnmcap08', 'rintr',
        'topint08', 'nrrents', 'roflaw', 'numbil0', 'country']
results_df = df[data].dropna()

# Use last model (model 3)
results_df['prediction'] = results[-1].predict()

# Calculate difference
results_df['difference'] = results_df['numbil0'] - results_df['prediction']

# Sort in descending order
results_df.sort_values('difference', ascending=False, inplace=True)

# Plot the first 15 data points
results_df[:15].plot('country', 'difference', kind='bar', figsize=(12, 8),
                     legend=False)
plt.ylabel('Number of billionaires above predicted level')
plt.xlabel('Country')
plt.show()

As we can see, Russia has by far the highest number of billionaires in excess of what is predicted by the
model (around 50 more than expected)
Treisman uses this empirical result to discuss possible reasons for Russia's excess of billionaires, including
the origination of wealth in Russia, the political climate, and the history of privatization in the years after
the USSR

4.4.7 Summary

In this lecture we used Maximum Likelihood Estimation to estimate the parameters of a Poisson model
statsmodels contains other built-in likelihood models such as Probit and Logit
For further flexibility, statsmodels provides a way to specify the distribution manually using the
GenericLikelihoodModel class - an example notebook can be found here
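As an illustrative sketch (the class name ManualPoisson and the reuse of the small X, y example are our own choices, not from the linked notebook), a manual specification might look like:

import numpy as np
from scipy.special import factorial
from statsmodels.base.model import GenericLikelihoodModel

class ManualPoisson(GenericLikelihoodModel):
    # Override loglike with the Poisson log-likelihood derived in this lecture
    def loglike(self, params):
        µ = np.exp(self.exog @ params)
        return np.sum(self.endog * np.log(µ) - µ - np.log(factorial(self.endog)))

# Usage, reusing the 5-observation X, y example from earlier:
# res = ManualPoisson(y, X).fit(start_params=np.zeros(3))
# print(res.params)  # should be close to β_hat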


4.4.8 Exercises

Exercise 1

Suppose we wanted to estimate the probability of an event yi occurring, given some observations
We could use a probit regression model, where the pmf of yi is

$$f(y_i; \beta) = \mu_i^{y_i} (1 - \mu_i)^{1 - y_i}, \qquad y_i = 0, 1$$

where $\mu_i = \Phi(x_i' \beta)$

Φ represents the cumulative normal distribution and constrains the predicted yi to be between 0 and 1 (as
required for a probability)
β is a vector of coefficients
Following the example in the lecture, write a class to represent the Probit model
To begin, find the log-likelihood function and derive the gradient and Hessian
The scipy module stats.norm contains the functions needed to compute the cdf and pdf of the normal
distribution
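For reference, a quick sketch of the two calls you will likely need:

from scipy.stats import norm

norm.cdf(0.0)   # Φ(0) = 0.5
norm.pdf(0.0)   # φ(0) ≈ 0.3989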

Exercise 2

Use the following dataset and initial values of β to estimate the MLE with the Newton-Raphson algorithm
developed earlier in the lecture

$$X = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 1 & 1 \\ 1 & 4 & 3 \\ 1 & 5 & 6 \\ 1 & 3 & 5 \end{bmatrix} \qquad y = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix} \qquad \beta^{(0)} = \begin{bmatrix} 0.1 \\ 0.1 \\ 0.1 \end{bmatrix}$$

Verify your results with statsmodels - you can import the Probit function with the following import
statement

from statsmodels.discrete.discrete_model import Probit

Note that the simple Newton-Raphson algorithm developed in this lecture is very sensitive to initial values,
and therefore you may fail to achieve convergence with different starting values

4.4.9 Solutions

Exercise 1

The log-likelihood can be written as

$$\log L = \sum_{i=1}^{n} \left[ y_i \log \Phi(x_i' \beta) + (1 - y_i) \log(1 - \Phi(x_i' \beta)) \right]$$


Using the fundamental theorem of calculus, the derivative of a cumulative probability distribution is its
marginal distribution

$$\frac{\partial}{\partial s} \Phi(s) = \phi(s)$$

where ϕ is the marginal normal distribution
The gradient vector of the Probit model is

$$\frac{\partial \log L}{\partial \beta} = \sum_{i=1}^{n} \left[ y_i \frac{\phi(x_i' \beta)}{\Phi(x_i' \beta)} - (1 - y_i) \frac{\phi(x_i' \beta)}{1 - \Phi(x_i' \beta)} \right] x_i$$

The Hessian of the Probit model is

$$\frac{\partial^2 \log L}{\partial \beta \partial \beta'} = -\sum_{i=1}^{n} \phi(x_i' \beta) \left[ y_i \frac{\phi(x_i' \beta) + x_i' \beta \Phi(x_i' \beta)}{[\Phi(x_i' \beta)]^2} + (1 - y_i) \frac{\phi(x_i' \beta) - x_i' \beta (1 - \Phi(x_i' \beta))}{[1 - \Phi(x_i' \beta)]^2} \right] x_i x_i'$$

Using these results, we can write a class for the Probit model as follows

from scipy.stats import norm

class ProbitRegression:

    def __init__(self, y, X, β):
        self.X, self.y, self.β = X, y, β
        self.n, self.k = X.shape

    def µ(self):
        return norm.cdf(self.X @ self.β.T)

    def ϕ(self):
        return norm.pdf(self.X @ self.β.T)

    def logL(self):
        µ = self.µ()
        return np.sum(y * np.log(µ) + (1 - y) * np.log(1 - µ))

    def G(self):
        µ = self.µ()
        ϕ = self.ϕ()
        return np.sum((X.T * y * ϕ / µ - X.T * (1 - y) * ϕ / (1 - µ)),
                      axis=1)

    def H(self):
        X = self.X
        β = self.β
        µ = self.µ()
        ϕ = self.ϕ()
        a = (ϕ + (X @ β.T) * µ) / µ**2
        b = (ϕ - (X @ β.T) * (1 - µ)) / (1 - µ)**2
        return -(ϕ * (y * a + (1 - y) * b) * X.T) @ X


Exercise 2

X = np.array([[1, 2, 4],
              [1, 1, 1],
              [1, 4, 3],
              [1, 5, 6],
              [1, 3, 5]])

y = np.array([1, 0, 1, 1, 0])

# Take a guess at initial βs
β = np.array([0.1, 0.1, 0.1])

# Create instance of Probit regression class
prob = ProbitRegression(y, X, β)

# Run Newton-Raphson algorithm
newton_raphson(prob)

Iteration_k Log-likelihood θ
-----------------------------------------------------------
0 -2.37968841 ['-1.3400', '0.7750', '-0.1570']
1 -2.36875259 ['-1.5350', '0.7750', '-0.0980']
2 -2.36872942 ['-1.5460', '0.7780', '-0.0970']
3 -2.36872942 ['-1.5460', '0.7780', '-0.0970']
Number of iterations: 4
β_hat = [-1.54625858 0.77778952 -0.09709757]

array([-1.54625858, 0.77778952, -0.09709757])

# Use statsmodels to verify results

print(Probit(y, X).fit().summary())

Optimization terminated successfully.


Current function value: 0.473746
Iterations 6
Probit Regression Results
==============================================================================
Dep. Variable: y No. Observations: 5
Model: Probit Df Residuals: 2
Method: MLE Df Model: 2
Date: Wed, 26 Jul 2017 Pseudo R-squ.: 0.2961
Time: 15:41:39 Log-Likelihood: -2.3687
converged: True LL-Null: -3.3651
LLR p-value: 0.3692
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -1.5463 1.866 -0.829 0.407 -5.204 2.111
x1 0.7778 0.788 0.986 0.324 -0.768 2.323
x2 -0.0971 0.590 -0.165 0.869 -1.254 1.060


==============================================================================



CHAPTER

FIVE

TOOLS AND TECHNIQUES

This section of the course contains foundational mathematical and statistical tools and techniques

5.1 Linear Algebra

Contents

• Linear Algebra
– Overview
– Vectors
– Matrices
– Solving Systems of Equations
– Eigenvalues and Eigenvectors
– Further Topics
– Exercises
– Solutions

5.1.1 Overview

Linear algebra is one of the most useful branches of applied mathematics for economists to invest in
For example, many applied problems in economics and finance require the solution of a linear system of
equations, such as

y1 = ax1 + bx2
y2 = cx1 + dx2

or, more generally,

$$\begin{aligned}
y_1 &= a_{11} x_1 + a_{12} x_2 + \cdots + a_{1k} x_k \\
&\;\vdots \\
y_n &= a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nk} x_k
\end{aligned} \tag{5.1}$$

The objective here is to solve for the unknowns x1 , . . . , xk given a11 , . . . , ank and y1 , . . . , yn
When considering such problems, it is essential that we first consider at least some of the following questions
• Does a solution actually exist?
• Are there in fact many solutions, and if so how should we interpret them?
• If no solution exists, is there a best approximate solution?
• If a solution exists, how should we compute it?
These are the kinds of topics addressed by linear algebra
In this lecture we will cover the basics of linear and matrix algebra, treating both theory and computation
We admit some overlap with this lecture, where operations on NumPy arrays were first explained
Note that this lecture is more theoretical than most, and contains background material that will be used in
applications as we go along

5.1.2 Vectors

A vector of length n is just a sequence (or array, or tuple) of n numbers, which we write as x = (x1 , . . . , xn )
or x = [x1 , . . . , xn ]
We will write these sequences either horizontally or vertically as we please
(Later, when we wish to perform certain matrix operations, it will become necessary to distinguish between
the two)
The set of all n-vectors is denoted by Rn
For example, R2 is the plane, and a vector in R2 is just a point in the plane
Traditionally, vectors are represented visually as arrows from the origin to the point
The following figure represents three vectors in this manner

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 8))

# Set the axes through the origin
for spine in ['left', 'bottom']:
    ax.spines[spine].set_position('zero')
for spine in ['right', 'top']:
    ax.spines[spine].set_color('none')

ax.set(xlim=(-5, 5), ylim=(-5, 5))
ax.grid()

vecs = ((2, 4), (-3, 3), (-4, -3.5))
for v in vecs:
    ax.annotate('', xy=v, xytext=(0, 0),
                arrowprops=dict(facecolor='blue',
                                shrink=0,
                                alpha=0.7,
                                width=0.5))
    ax.text(1.1 * v[0], 1.1 * v[1], str(v))
plt.show()

Vector Operations

The two most common operators for vectors are addition and scalar multiplication, which we now describe
As a matter of definition, when we add two vectors, we add them element by element

$$x + y = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} := \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}$$


Scalar multiplication is an operation that takes a number γ and a vector x and produces

$$\gamma x := \begin{bmatrix} \gamma x_1 \\ \gamma x_2 \\ \vdots \\ \gamma x_n \end{bmatrix}$$

Scalar multiplication is illustrated in the next figure

import numpy as np

fig, ax = plt.subplots(figsize=(10, 8))

# Set the axes through the origin
for spine in ['left', 'bottom']:
    ax.spines[spine].set_position('zero')
for spine in ['right', 'top']:
    ax.spines[spine].set_color('none')

ax.set(xlim=(-5, 5), ylim=(-5, 5))

x = (2, 2)
ax.annotate('', xy=x, xytext=(0, 0),
            arrowprops=dict(facecolor='blue',
                            shrink=0,
                            alpha=1,
                            width=0.5))
ax.text(x[0] + 0.4, x[1] - 0.2, '$x$', fontsize='16')

scalars = (-2, 2)
x = np.array(x)

for s in scalars:
    v = s * x
    ax.annotate('', xy=v, xytext=(0, 0),
                arrowprops=dict(facecolor='red',
                                shrink=0,
                                alpha=0.5,
                                width=0.5))
    ax.text(v[0] + 0.4, v[1] - 0.2, f'${s} x$', fontsize='16')
plt.show()


In Python, a vector can be represented as a list or tuple, such as x = (2, 4, 6), but is more commonly
represented as a NumPy array
One advantage of NumPy arrays is that scalar multiplication and addition have very natural syntax

x = np.ones(3)            # Vector of three ones
y = np.array((2, 4, 6))   # Converts tuple (2, 4, 6) into array
x + y

array([ 3., 5., 7.])

4 * x

array([ 4., 4., 4.])


Inner Product and Norm

The inner product of vectors x, y ∈ Rn is defined as

$$x' y := \sum_{i=1}^{n} x_i y_i$$

Two vectors are called orthogonal if their inner product is zero

The norm of a vector x represents its length (i.e., its distance from the zero vector) and is defined as

$$\|x\| := \sqrt{x' x} := \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}$$

The expression ∥x − y∥ is thought of as the distance between x and y

Continuing on from the previous example, the inner product and norm can be computed as follows

np.sum(x * y) # Inner product of x and y

12.0

np.sqrt(np.sum(x**2)) # Norm of x, take one

1.7320508075688772

np.linalg.norm(x) # Norm of x, take two

1.7320508075688772

Span

Given a set of vectors A := {a1 , . . . , ak } in Rn , it's natural to think about the new vectors we can create by
performing linear operations
New vectors created in this manner are called linear combinations of A
In particular, y ∈ Rn is a linear combination of A := {a1 , . . . , ak } if

y = β1 a1 + · · · + βk ak for some scalars β1 , . . . , βk

In this context, the values β1 , . . . , βk are called the coefficients of the linear combination
The set of linear combinations of A is called the span of A
The next figure shows the span of A = {a1 , a2 } in R3
The span is a 2 dimensional plane passing through these two points and the origin


from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from scipy.interpolate import interp2d

fig = plt.figure(figsize=(10, 8))
ax = fig.gca(projection='3d')

x_min, x_max = -5, 5
y_min, y_max = -5, 5

α, β = 0.2, 0.1

ax.set(xlim=(x_min, x_max), ylim=(x_min, x_max), zlim=(x_min, x_max),
       xticks=(0,), yticks=(0,), zticks=(0,))

gs = 3
z = np.linspace(x_min, x_max, gs)
x = np.zeros(gs)
y = np.zeros(gs)
ax.plot(x, y, z, 'k-', lw=2, alpha=0.5)
ax.plot(z, x, y, 'k-', lw=2, alpha=0.5)
ax.plot(y, z, x, 'k-', lw=2, alpha=0.5)

# Fixed linear function, to generate a plane
def f(x, y):
    return α * x + β * y

# Vector locations, by coordinate
x_coords = np.array((3, 3))
y_coords = np.array((4, -4))
z = f(x_coords, y_coords)
for i in (0, 1):
    ax.text(x_coords[i], y_coords[i], z[i], f'$a_{i+1}$', fontsize=14)

# Lines to vectors
for i in (0, 1):
    x = (0, x_coords[i])
    y = (0, y_coords[i])
    z = (0, f(x_coords[i], y_coords[i]))
    ax.plot(x, y, z, 'b-', lw=1.5, alpha=0.6)

# Draw the plane
grid_size = 20
xr2 = np.linspace(x_min, x_max, grid_size)
yr2 = np.linspace(y_min, y_max, grid_size)
x2, y2 = np.meshgrid(xr2, yr2)
z2 = f(x2, y2)
ax.plot_surface(x2, y2, z2, rstride=1, cstride=1, cmap=cm.jet,
                linewidth=0, antialiased=True, alpha=0.2)
plt.show()


Examples

If A contains only one vector a1 ∈ R2 , then its span is just the scalar multiples of a1 , which is the unique
line passing through both a1 and the origin
If A = {e1 , e2 , e3 } consists of the canonical basis vectors of R3 , that is

$$e_1 := \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad e_2 := \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad e_3 := \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

then the span of A is all of R3 , because, for any x = (x1 , x2 , x3 ) ∈ R3 , we can write

x = x1 e1 + x2 e2 + x3 e3

Now consider A0 = {e1 , e2 , e1 + e2 }


If y = (y1 , y2 , y3 ) is any linear combination of these vectors, then y3 = 0 (check it)
Hence A0 fails to span all of R3
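A quick numerical sanity check of this claim (a sketch of our own, not from the original text):

A0 = np.array([[1, 0, 1],
               [0, 1, 1],
               [0, 0, 0]])      # columns are e1, e2 and e1 + e2
np.linalg.matrix_rank(A0)       # 2 < 3, so these vectors cannot span R^3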


Linear Independence

As we'll see, it's often desirable to find families of vectors with relatively large span, so that many vectors
can be described by linear operators on a few vectors
The condition we need for a set of vectors to have a large span is what's called linear independence
In particular, a collection of vectors A := {a1 , . . . , ak } in Rn is said to be
• linearly dependent if some strict subset of A has the same span as A
• linearly independent if it is not linearly dependent
Put differently, a set of vectors is linearly independent if no vector is redundant to the span, and linearly
dependent otherwise
To illustrate the idea, recall the figure that showed the span of vectors {a1 , a2 } in R3 as a plane through the
origin
If we take a third vector a3 and form the set {a1 , a2 , a3 }, this set will be
• linearly dependent if a3 lies in the plane
• linearly independent otherwise
As another illustration of the concept, since Rn can be spanned by n vectors (see the discussion of canonical
basis vectors above), any collection of m > n vectors in Rn must be linearly dependent
The following statements are equivalent to linear independence of A := {a1 , . . . , ak } ⊂ Rn
1. No vector in A can be formed as a linear combination of the other elements
2. If β1 a1 + · · · + βk ak = 0 for scalars β1 , . . . , βk , then β1 = · · · = βk = 0
(The zero in the first expression is the origin of Rn )

Unique Representations

Another nice thing about sets of linearly independent vectors is that each element in the span has a unique
representation as a linear combination of these vectors
In other words, if A := {a1 , . . . , ak } ⊂ Rn is linearly independent and

y = β1 a1 + · · · + βk ak

then no other coefficient sequence γ1 , . . . , γk will produce the same vector y


Indeed, if we also have y = γ1 a1 + · · · + γk ak , then

(β1 − γ1 )a1 + · · · + (βk − γk )ak = 0

Linear independence now implies γi = βi for all i
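As a small numerical sketch (with hypothetical vectors of our own choosing), the unique coefficients can be recovered by a linear solve:

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])      # linearly independent columns a1, a2
y = 3 * A[:, 0] - 1 * A[:, 1]   # a known linear combination
np.linalg.solve(A, y)           # array([ 3., -1.]): the coefficients are pinned down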


5.1.3 Matrices

Matrices are a neat way of organizing data for use in linear operations
An n × k matrix is a rectangular array A of numbers with n rows and k columns:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} \end{bmatrix}$$

Often, the numbers in the matrix represent coefficients in a system of linear equations, as discussed at the
start of this lecture
For obvious reasons, the matrix A is also called a vector if either n = 1 or k = 1
In the former case, A is called a row vector, while in the latter it is called a column vector
If n = k, then A is called square
The matrix formed by replacing aij by aji for every i and j is called the transpose of A, and denoted A′ or
A⊤
If A = A′ , then A is called symmetric
For a square matrix A, the n elements of the form aii for i = 1, . . . , n are called the principal diagonal
A is called diagonal if the only nonzero entries are on the principal diagonal
If, in addition to being diagonal, each element along the principal diagonal is equal to 1, then A is called the
identity matrix, and denoted by I

Matrix Operations

Just as was the case for vectors, a number of algebraic operations are defined for matrices
Scalar multiplication and addition are immediate generalizations of the vector case:

$$\gamma A = \gamma \begin{bmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix} := \begin{bmatrix} \gamma a_{11} & \cdots & \gamma a_{1k} \\ \vdots & \ddots & \vdots \\ \gamma a_{n1} & \cdots & \gamma a_{nk} \end{bmatrix}$$

and

$$A + B = \begin{bmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix} + \begin{bmatrix} b_{11} & \cdots & b_{1k} \\ \vdots & \ddots & \vdots \\ b_{n1} & \cdots & b_{nk} \end{bmatrix} := \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1k} + b_{1k} \\ \vdots & \ddots & \vdots \\ a_{n1} + b_{n1} & \cdots & a_{nk} + b_{nk} \end{bmatrix}$$

In the latter case, the matrices must have the same shape in order for the definition to make sense
We also have a convention for multiplying two matrices
The rule for matrix multiplication generalizes the idea of inner products discussed above, and is designed to
make multiplication play well with basic linear operations


If A and B are two matrices, then their product AB is formed by taking as its i, j-th element the inner
product of the i-th row of A and the j-th column of B
There are many tutorials to help you visualize this operation, such as this one, or the discussion on the
Wikipedia page
If A is n × k and B is j × m, then to multiply A and B we require k = j, and the resulting matrix AB is
n×m
As perhaps the most important special case, consider multiplying n × k matrix A and k × 1 column vector
x
According to the preceding rule, this gives us an n × 1 column vector

$$Ax = \begin{bmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} := \begin{bmatrix} a_{11} x_1 + \cdots + a_{1k} x_k \\ \vdots \\ a_{n1} x_1 + \cdots + a_{nk} x_k \end{bmatrix} \tag{5.2}$$

Note: AB and BA are not generally the same thing

Another important special case is the identity matrix


You should check that if A is n × k and I is the k × k identity matrix, then AI = A
If I is the n × n identity matrix, then IA = A

Matrices in NumPy

NumPy arrays are also used as matrices, and have fast, efficient functions and methods for all the standard
matrix operations¹
You can create them manually from tuples of tuples (or lists of lists) as follows

A = ((1, 2),
(3, 4))

type(A)

tuple

A = np.array(A)

type(A)

numpy.ndarray

¹ Although there is a specialized matrix data type defined in NumPy, it's more standard to work with ordinary NumPy arrays. See this discussion.


A.shape

(2, 2)

The shape attribute is a tuple giving the number of rows and columns; see here for more discussion
To get the transpose of A, use A.transpose() or, more simply, A.T
There are many convenient functions for creating common matrices (matrices of zeros, ones, etc.); see here
Since operations are performed elementwise by default, scalar multiplication and addition have very natural
syntax

A = np.identity(3)
B = np.ones((3, 3))
2 * A

array([[ 2., 0., 0.],


[ 0., 2., 0.],
[ 0., 0., 2.]])

A + B

array([[ 2., 1., 1.],


[ 1., 2., 1.],
[ 1., 1., 2.]])

To multiply matrices we use the @ symbol


In particular, A @ B is matrix multiplication, whereas A * B is element by element multiplication
See here for more discussion
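A quick contrast of the two operations (a sketch with a hypothetical pair of matrices):

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
A @ B   # matrix product:             [[2, 1], [4, 3]]
A * B   # element by element product: [[0, 2], [3, 0]]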

Matrices as Maps

Each n × k matrix A can be identified with a function f (x) = Ax that maps x ∈ Rk into y = Ax ∈ Rn
These kinds of functions have a special property: they are linear
A function f : Rk → Rn is called linear if, for all x, y ∈ Rk and all scalars α, β, we have

f (αx + βy) = αf (x) + βf (y)

You can check that this holds for the function f (x) = Ax + b when b is the zero vector, and fails when b is
nonzero
In fact, it's known that f is linear if and only if there exists a matrix A such that f (x) = Ax for all x
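As a numerical illustration (a sketch with arbitrary values, not part of the original text), linearity is easy to spot-check:

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x, y = np.array([1.0, -1.0]), np.array([0.5, 2.0])
α, β = 2.0, -3.0
np.allclose(A @ (α * x + β * y), α * (A @ x) + β * (A @ y))  # True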

5.1.4 Solving Systems of Equations

Recall again the system of equations (5.1)


If we compare (5.1) and (5.2), we see that (5.1) can now be written more conveniently as

y = Ax (5.3)

The problem we face is to determine a vector x ∈ Rk that solves (5.3), taking y and A as given
This is a special case of a more general problem: Find an x such that y = f (x)
Given an arbitrary function f and a y, is there always an x such that y = f (x)?
If so, is it always unique?
The answer to both these questions is negative, as the next figure shows

def f(x):
    return 0.6 * np.cos(4 * x) + 1.4

xmin, xmax = -1, 1
x = np.linspace(xmin, xmax, 160)
y = f(x)
ya, yb = np.min(y), np.max(y)

fig, axes = plt.subplots(2, 1, figsize=(10, 10))

for ax in axes:
    # Set the axes through the origin
    for spine in ['left', 'bottom']:
        ax.spines[spine].set_position('zero')
    for spine in ['right', 'top']:
        ax.spines[spine].set_color('none')

    ax.set(ylim=(-0.6, 3.2), xlim=(xmin, xmax),
           yticks=(), xticks=())

    ax.plot(x, y, 'k-', lw=2, label='$f$')
    ax.fill_between(x, ya, yb, facecolor='blue', alpha=0.05)
    ax.vlines([0], ya, yb, lw=3, color='blue', label='range of $f$')
    ax.text(0.04, -0.3, '$0$', fontsize=16)

ax = axes[0]
ax.legend(loc='upper right', frameon=False)
ybar = 1.5
ax.plot(x, x * 0 + ybar, 'k--', alpha=0.5)
ax.text(0.05, 0.8 * ybar, '$y$', fontsize=16)
for i, z in enumerate((-0.35, 0.35)):
    ax.vlines(z, 0, f(z), linestyle='--', alpha=0.5)
    ax.text(z, -0.2, f'$x_{i}$', fontsize=16)

ax = axes[1]
ybar = 2.6
ax.plot(x, x * 0 + ybar, 'k--', alpha=0.5)
ax.text(0.04, 0.91 * ybar, '$y$', fontsize=16)

plt.show()

In the first plot there are multiple solutions, as the function is not one-to-one, while in the second there are
no solutions, since y lies outside the range of f
Can we impose conditions on A in (5.3) that rule out these problems?
In this context, the most important thing to recognize about the expression Ax is that it corresponds to a
linear combination of the columns of A


In particular, if a1 , . . . , ak are the columns of A, then

Ax = x1 a1 + · · · + xk ak

Hence the range of f (x) = Ax is exactly the span of the columns of A


We want the range to be large, so that it contains arbitrary y
As you might recall, the condition that we want for the span to be large is linear independence
A happy fact is that linear independence of the columns of A also gives us uniqueness
Indeed, it follows from our earlier discussion that if {a1 , . . . , ak } are linearly independent and y = Ax =
x1 a1 + · · · + xk ak , then no z ̸= x satisfies y = Az

The n × n Case

Lets discuss some more details, starting with the case where A is n × n
This is the familiar case where the number of unknowns equals the number of equations
For arbitrary y ∈ Rn , we hope to find a unique x ∈ Rn such that y = Ax
In view of the observations immediately above, if the columns of A are linearly independent, then their span,
and hence the range of f (x) = Ax, is all of Rn
Hence there always exists an x such that y = Ax
Moreover, the solution is unique
In particular, the following are equivalent
1. The columns of A are linearly independent
2. For any y ∈ Rn , the equation y = Ax has a unique solution
The property of having linearly independent columns is sometimes expressed as having full column rank

Inverse Matrices

Can we give some sort of expression for the solution?


If y and A are scalar with A ̸= 0, then the solution is x = A−1 y
A similar expression is available in the matrix case
In particular, if square matrix A has full column rank, then it possesses a multiplicative inverse matrix A−1 ,
with the property that AA−1 = A−1 A = I
As a consequence, if we pre-multiply both sides of y = Ax by A−1 , we get x = A−1 y
This is the solution that were looking for


Determinants

Another quick comment about square matrices is that to every such matrix we assign a unique number called
the determinant of the matrix; you can find the expression for it here
If the determinant of A is not zero, then we say that A is nonsingular
Perhaps the most important fact about determinants is that A is nonsingular if and only if A is of full column
rank
This gives us a useful one-number summary of whether or not a square matrix can be inverted

More Rows than Columns

This is the n × k case with n > k


This case is very important in many settings, not least in the setting of linear regression (where n is the
number of observations, and k is the number of explanatory variables)
Given arbitrary y ∈ Rn , we seek an x ∈ Rk such that y = Ax
In this setting, existence of a solution is highly unlikely
Without much loss of generality, lets go over the intuition focusing on the case where the columns of A are
linearly independent
It follows that the span of the columns of A is a k-dimensional subspace of Rn
This span is very unlikely to contain arbitrary y ∈ Rn
To see why, recall the figure above, where k = 2 and n = 3
Imagine an arbitrarily chosen y ∈ R3 , located somewhere in that three dimensional space
Whats the likelihood that y lies in the span of {a1 , a2 } (i.e., the two dimensional plane through these points)?
In a sense it must be very small, since this plane has zero thickness
As a result, in the n > k case we usually give up on existence
However, we can still seek a best approximation, for example an x that makes the distance ∥y − Ax∥ as
small as possible
To solve this problem, one can use either calculus or the theory of orthogonal projections
The solution is known to be x̂ = (A′ A)−1 A′ y; see, for example, chapter 3 of these notes
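A small numerical sketch (with toy data of our own choosing) comparing the normal-equations formula with SciPy's least squares routine:

from scipy.linalg import lstsq

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])   # n = 3 > k = 2
y = np.array([1.0, 2.0, 2.0])
x_hat = np.linalg.inv(A.T @ A) @ A.T @ y   # (A'A)^{-1} A'y
np.allclose(x_hat, lstsq(A, y)[0])         # True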

More Columns than Rows

This is the n × k case with n < k, so there are fewer equations than unknowns
In this case there are either no solutions or infinitely many in other words, uniqueness never holds
For example, consider the case where k = 3 and n = 2
Thus, the columns of A consists of 3 vectors in R2


This set can never be linearly independent, since it is possible to find two vectors that span R2
(For example, use the canonical basis vectors)
It follows that one column is a linear combination of the other two
For example, lets say that a1 = αa2 + βa3
Then if y = Ax = x1 a1 + x2 a2 + x3 a3 , we can also write

y = x1 (αa2 + βa3 ) + x2 a2 + x3 a3 = (x1 α + x2 )a2 + (x1 β + x3 )a3

In other words, uniqueness fails

Linear Equations with SciPy

Here's an illustration of how to solve linear equations with SciPy's linalg submodule
All of these routines are Python front ends to time-tested and highly optimized FORTRAN code

from scipy.linalg import inv, solve, det

A = ((1, 2), (3, 4))


A = np.array(A)
y = np.ones((2, 1)) # Column vector
det(A) # Check that A is nonsingular, and hence invertible

-2.0

A_inv = inv(A) # Compute the inverse


A_inv

array([[-2. , 1. ],
[ 1.5, -0.5]])

x = A_inv @ y # Solution
A @ x # Should equal y

array([[ 1.],
[ 1.]])

solve(A, y) # Produces same solution

array([[-1.],
[ 1.]])

Observe how we can solve for x = A−1 y either via inv(A) @ y or via solve(A, y)
The latter method uses a different algorithm (LU decomposition) that is numerically more stable, and hence
should almost always be preferred
To obtain the least squares solution x̂ = (A′ A)−1 A′ y, use scipy.linalg.lstsq(A, y)


5.1.5 Eigenvalues and Eigenvectors

Let A be an n × n square matrix


If λ is scalar and v is a non-zero vector in Rn such that

Av = λv

then we say that λ is an eigenvalue of A, and v is an eigenvector


Thus, an eigenvector of A is a vector such that when the map f (x) = Ax is applied, v is merely scaled
The next figure shows two eigenvectors (blue arrows) and their images under A (red arrows)
As expected, the image Av of each v is just a scaled version of the original
from scipy.linalg import eig

A = ((1, 2),
     (2, 1))
A = np.array(A)
evals, evecs = eig(A)
evecs = evecs[:, 0], evecs[:, 1]

fig, ax = plt.subplots(figsize=(10, 8))

# Set the axes through the origin
for spine in ['left', 'bottom']:
    ax.spines[spine].set_position('zero')
for spine in ['right', 'top']:
    ax.spines[spine].set_color('none')
ax.grid(alpha=0.4)

xmin, xmax = -3, 3
ymin, ymax = -3, 3
ax.set(xlim=(xmin, xmax), ylim=(ymin, ymax))

# Plot each eigenvector
for v in evecs:
    ax.annotate('', xy=v, xytext=(0, 0),
                arrowprops=dict(facecolor='blue',
                                shrink=0,
                                alpha=0.6,
                                width=0.5))

# Plot the image of each eigenvector
for v in evecs:
    v = A @ v
    ax.annotate('', xy=v, xytext=(0, 0),
                arrowprops=dict(facecolor='red',
                                shrink=0,
                                alpha=0.6,
                                width=0.5))

# Plot the lines they run through
x = np.linspace(xmin, xmax, 3)
for v in evecs:
    a = v[1] / v[0]
    ax.plot(x, a * x, 'b-', lw=0.4)

plt.show()

The eigenvalue equation is equivalent to (A − λI)v = 0, and this has a nonzero solution v only when the
columns of A − λI are linearly dependent
This in turn is equivalent to stating that the determinant is zero
Hence to find all eigenvalues, we can look for λ such that the determinant of A − λI is zero
This problem can be expressed as one of solving for the roots of a polynomial in λ of degree n
This in turn implies the existence of n solutions in the complex plane, although some might be repeated
Some nice facts about the eigenvalues of a square matrix A are as follows
1. The determinant of A equals the product of the eigenvalues
2. The trace of A (the sum of the elements on the principal diagonal) equals the sum of the eigenvalues


3. If A is symmetric, then all of its eigenvalues are real


4. If A is invertible and λ1 , . . . , λn are its eigenvalues, then the eigenvalues of A−1 are 1/λ1 , . . . , 1/λn
A corollary of the first statement is that a matrix is invertible if and only if all its eigenvalues are nonzero
Using SciPy, we can solve for the eigenvalues and eigenvectors of a matrix as follows

A = ((1, 2),
(2, 1))

A = np.array(A)
evals, evecs = eig(A)
evals

array([ 3.+0.j, -1.+0.j])

evecs

array([[ 0.70710678, -0.70710678],


[ 0.70710678, 0.70710678]])

Note that the columns of evecs are the eigenvectors


Since any scalar multiple of an eigenvector is an eigenvector with the same eigenvalue (check it), the eig
routine normalizes the length of each eigenvector to one
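As a quick numerical confirmation (a sketch reusing A, evals and evecs from just above):

for i in range(len(evals)):
    v = evecs[:, i]
    print(np.allclose(A @ v, evals[i] * v))  # True, True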

Generalized Eigenvalues

It is sometimes useful to consider the generalized eigenvalue problem, which, for given matrices A and B,
seeks generalized eigenvalues λ and eigenvectors v such that

Av = λBv

This can be solved in SciPy via scipy.linalg.eig(A, B)


Of course, if B is square and invertible, then we can treat the generalized eigenvalue problem as an ordinary
eigenvalue problem B −1 Av = λv, but this is not always the case
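For illustration only (B below is an arbitrary invertible matrix of our own choosing, with A reused from above):

B = np.array([[2.0, 0.0],
              [0.0, 1.0]])
gen_evals, gen_evecs = eig(A, B)   # the two-argument form of scipy.linalg.eig
v = gen_evecs[:, 0]
np.allclose(A @ v, gen_evals[0] * (B @ v))  # True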

5.1.6 Further Topics

We round out our discussion by briefly mentioning several other important topics

Series Expansions

Recall the usual summation formula for a geometric progression, which states that if |a| < 1, then $\sum_{k=0}^{\infty} a^k = (1 - a)^{-1}$

A generalization of this idea exists in the matrix setting


Matrix Norms

Let A be a square matrix, and let

$$\|A\| := \max_{\|x\| = 1} \|Ax\|$$

The norms on the right-hand side are ordinary vector norms, while the norm on the left-hand side is a matrix
norm in this case, the so-called spectral norm
For example, for a square matrix S, the condition ∥S∥ < 1 means that S is contractive, in the sense that it
pulls all vectors towards the origin²

Neumanns Theorem

Let A be a square matrix and let Ak := AAk−1 with A1 := A

In other words, Ak is the k-th power of A
Neumann's theorem states the following: If ∥Ak ∥ < 1 for some k ∈ N, then I − A is invertible, and

$$(I - A)^{-1} = \sum_{k=0}^{\infty} A^k \tag{5.4}$$
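A quick numerical check of (5.4) (a sketch with an arbitrary contractive matrix of our own choosing):

A = np.array([[0.1, 0.2],
              [0.3, 0.4]])   # spectral radius well below 1
series = sum(np.linalg.matrix_power(A, k) for k in range(50))
np.allclose(series, np.linalg.inv(np.eye(2) - A))  # True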

Spectral Radius

A result known as Gelfand's formula tells us that, for any square matrix A,

$$\rho(A) = \lim_{k \to \infty} \|A^k\|^{1/k}$$

Here ρ(A) is the spectral radius, defined as maxi |λi |, where {λi }i is the set of eigenvalues of A
As a consequence of Gelfand's formula, if all eigenvalues are strictly less than one in modulus, there exists a
k with ∥Ak ∥ < 1
In which case (5.4) is valid
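To see Gelfand's formula at work numerically (a sketch reusing the contractive A from the Neumann example above):

ρ = max(abs(np.linalg.eigvals(A)))   # spectral radius, here ≈ 0.537
for k in (1, 10, 50):
    print(np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1 / k))
# the printed values converge to ρ from above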

Positive Definite Matrices

Let A be a symmetric n × n matrix


We say that A is
1. positive definite if x′ Ax > 0 for every x ∈ Rn \ {0}
2. positive semi-definite or nonnegative definite if x′ Ax ≥ 0 for every x ∈ Rn
² Suppose that ∥S∥ < 1. Take any nonzero vector x, and let r := ∥x∥. We have ∥Sx∥ = r∥S(x/r)∥ ≤ r∥S∥ < r = ∥x∥. Hence every point is pulled towards the origin.


Analogous definitions exist for negative definite and negative semi-definite matrices
It is notable that if A is positive definite, then all of its eigenvalues are strictly positive, and hence A is
invertible (with positive definite inverse)
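A short sketch (our own example matrix) of this eigenvalue characterization:

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric
np.linalg.eigvalsh(A)        # array([1., 3.]): strictly positive eigenvalues,
                             # so A is positive definite and invertible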

Differentiating Linear and Quadratic forms

The following formulas are useful in many economic contexts. Let
• z, x and a all be n × 1 vectors
• A be an n × n matrix
• B be an m × n matrix and y be an m × 1 vector
Then
1. $\frac{\partial a'x}{\partial x} = a$
2. $\frac{\partial Ax}{\partial x} = A'$
3. $\frac{\partial x'Ax}{\partial x} = (A + A')x$
4. $\frac{\partial y'Bz}{\partial y} = Bz$
5. $\frac{\partial y'Bz}{\partial B} = yz'$
Exercise 1 below asks you to apply these formulas

Further Reading

The documentation of the scipy.linalg submodule can be found here


Chapters 2 and 3 of the Econometric Theory contain a discussion of linear algebra along the same lines as
above, with solved exercises
If you don't mind a slightly abstract approach, a nice intermediate-level text on linear algebra is [Janich94]

5.1.7 Exercises

Exercise 1

Let x be a given n × 1 vector and consider the problem

$$v(x) = \max_{y, u} \left\{ -y' P y - u' Q u \right\}$$

subject to the linear constraint

y = Ax + Bu

Here
• P is an n × n matrix and Q is an m × m matrix


• A is an n × n matrix and B is an n × m matrix


• both P and Q are symmetric and positive semidefinite
(What must the dimensions of y and u be to make this a well-posed problem?)
One way to solve the problem is to form the Lagrangian

L = −y ′ P y − u′ Qu + λ′ [Ax + Bu − y]

where λ is an n × 1 vector of Lagrange multipliers


Try applying the formulas given above for differentiating quadratic and linear forms to obtain the first-order
conditions for maximizing L with respect to y, u and minimizing it with respect to λ
Show that these conditions imply that
1. λ = −2P y
2. The optimizing choice of u satisfies u = −(Q + B ′ P B)−1 B ′ P Ax
3. The function v satisfies v(x) = −x′ P̃ x where P̃ = A′ P A − A′ P B(Q + B ′ P B)−1 B ′ P A
As we will see, in economic contexts Lagrange multipliers often are shadow prices

Note: If we dont care about the Lagrange multipliers, we can substitute the constraint into the objective
function, and then just maximize −(Ax + Bu)′ P (Ax + Bu) − u′ Qu with respect to u. You can verify that
this leads to the same maximizer.

5.1.8 Solutions

Solution to Exercise 1

We have an optimization problem:

$$v(x) = \max_{y, u} \{ -y' P y - u' Q u \}$$

s.t.

$$y = Ax + Bu$$

with primitives
• P be a symmetric and positive semidefinite n × n matrix
• Q be a symmetric and positive semidefinite m × m matrix
• A an n × n matrix
• B an n × m matrix
The associated Lagrangian is:

L = −y ′ P y − u′ Qu + λ′ [Ax + Bu − y]


1.

Differentiating the Lagrangian w.r.t. y and setting its derivative equal to zero yields

$$\frac{\partial L}{\partial y} = -(P + P') y - \lambda = -2 P y - \lambda = 0,$$

since P is symmetric
Accordingly, the first-order condition for maximizing L w.r.t. y implies

λ = −2P y

2.

Differentiating the Lagrangian w.r.t. u and setting its derivative equal to zero yields

$$\frac{\partial L}{\partial u} = -(Q + Q') u + B' \lambda = -2 Q u + B' \lambda = 0$$
Substituting λ = −2P y gives

Qu + B ′ P y = 0

Substituting the linear constraint y = Ax + Bu into the above equation gives

Qu + B′P(Ax + Bu) = 0

or, after rearranging,

(Q + B′PB)u + B′PAx = 0

which is the first-order condition for maximizing L w.r.t. u
Thus, provided that Q + B′PB is invertible, the optimal choice of u must satisfy

u = −(Q + B′PB)−1B′PAx

3.

Rewriting our problem by substituting the constraint into the objective function, we get

v(x) = max_{u} {−(Ax + Bu)′P(Ax + Bu) − u′Qu}

Since we know the optimal choice of u satisfies u = −(Q + B′PB)−1B′PAx, we have

v(x) = −(Ax + Bu)′P(Ax + Bu) − u′Qu   with   u = −(Q + B′PB)−1B′PAx


To evaluate the function


v(x) = −(Ax + Bu)′ P (Ax + Bu) − u′ Qu
= −(x′ A′ + u′ B ′ )P (Ax + Bu) − u′ Qu
= −x′ A′ P Ax − u′ B ′ P Ax − x′ A′ P Bu − u′ B ′ P Bu − u′ Qu
= −x′ A′ P Ax − 2u′ B ′ P Ax − u′ (Q + B ′ P B)u

For simplicity, denote S := (Q + B′PB)−1B′PA, so that u = −Sx


Regarding the second term −2u′B′PAx, note that

−2u′B′PAx = 2x′S′B′PAx
          = 2x′A′PB(Q + B′PB)−1B′PAx

Notice that the term (Q + B ′ P B)−1 is symmetric as both P and Q are symmetric
Regarding the third term −u′ (Q + B ′ P B)u,

−u′ (Q + B ′ P B)u = −x′ S ′ (Q + B ′ P B)Sx


= −x′ A′ P B(Q + B ′ P B)−1 B ′ P Ax

Hence, the sum of the second and third terms is x′A′PB(Q + B′PB)−1B′PAx


This implies that

v(x) = −x′ A′ P Ax − 2u′ B ′ P Ax − u′ (Q + B ′ P B)u


= −x′ A′ P Ax + x′ A′ P B(Q + B ′ P B)−1 B ′ P Ax
= −x′ [A′ P A − A′ P B(Q + B ′ P B)−1 B ′ P A]x

Therefore, setting P̃ := A′PA − A′PB(Q + B′PB)−1B′PA, the above result gives v(x) = −x′P̃x, as claimed
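As a final check, we can verify the closed-form solution numerically. The sketch below uses small hypothetical matrices (chosen only for illustration) and compares the formulas above against brute-force numerical maximization:

import numpy as np
from scipy.optimize import minimize

# Hypothetical primitives satisfying the assumptions of the exercise
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # symmetric positive definite
Q = np.array([[1.0]])             # 1 x 1 positive definite
A = np.array([[1.0, 0.2],
              [0.0, 0.9]])
B = np.array([[1.0],
              [0.5]])
x = np.array([1.0, -2.0])

# Closed-form objects from the solution above
S = np.linalg.solve(Q + B.T @ P @ B, B.T @ P @ A)   # u = -Sx
u_star = -S @ x
P_tilde = A.T @ P @ A - A.T @ P @ B @ S
v_closed = -x @ P_tilde @ x

# Brute force: minimize the negative of the objective over u
def neg_objective(u):
    y = A @ x + B @ u
    return y @ P @ y + u @ Q @ u

res = minimize(neg_objective, np.zeros(1))
print(np.allclose(u_star, res.x, atol=1e-5))        # True
print(np.allclose(v_closed, -res.fun, atol=1e-6))   # True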

5.2 Orthogonal Projections and Their Applications

Contents

• Orthogonal Projections and Their Applications


– Overview
– Key Definitions
– The Orthogonal Projection Theorem
– Orthonormal Basis
– Projection Using Matrix Algebra
– Least Squares Regression


– Orthogonalization and Decomposition


– Exercises
– Solutions

5.2.1 Overview

Orthogonal projection is a cornerstone of vector space methods, with many diverse applications
These include, but are not limited to,
• Least squares projection, also known as linear regression
• Conditional expectations for multivariate normal (Gaussian) distributions
• Gram–Schmidt orthogonalization
• QR decomposition
• Orthogonal polynomials
• etc
In this lecture we focus on
• key ideas
• least squares regression

Further Reading

For background and foundational concepts, see our lecture on linear algebra
For more proofs and greater theoretical detail, see A Primer in Econometric Theory
For a complete set of proofs in a general setting, see, for example, [Rom05]
For an advanced treatment of projection in the context of least squares prediction, see this book chapter

5.2.2 Key Definitions

Assume x, z ∈ Rn

Define ⟨x, z⟩ = ∑_{i} xi zi
Recall ∥x∥2 = ⟨x, x⟩
The law of cosines states that ⟨x, z⟩ = ∥x∥∥z∥ cos(θ) where θ is the angle between the vectors x and z
When ⟨x, z⟩ = 0, then cos(θ) = 0 and x and z are said to be orthogonal and we write x ⊥ z
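For instance (a minimal sketch with two arbitrary vectors):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
z = np.array([3.0, 0.0, -1.0])

inner = x @ z                        # <x, z> = sum_i x_i z_i
norm_x = np.sqrt(x @ x)              # ||x||
cos_theta = inner / (norm_x * np.sqrt(z @ z))
print(inner, cos_theta)              # 0.0 0.0 -- here x ⊥ z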


For a linear subspace S ⊂ Rn , we call x ∈ Rn orthogonal to S if x ⊥ z for all z ∈ S, and write x ⊥ S


The orthogonal complement of linear subspace S ⊂ Rn is the set S⊥ := {x ∈ Rn : x ⊥ S}


S ⊥ is a linear subspace of Rn
• To see this, fix x, y ∈ S ⊥ and α, β ∈ R
• Observe that if z ∈ S, then
⟨αx + βy, z⟩ = α⟨x, z⟩ + β⟨y, z⟩ = α × 0 + β × 0 = 0

• Hence αx + βy ∈ S ⊥ , as was to be shown


A set of vectors {x1 , . . . , xk } ⊂ Rn is called an orthogonal set if xi ⊥ xj whenever i ̸= j
If {x1 , . . . , xk } is an orthogonal set, then the Pythagorean Law states that

∥x1 + · · · + xk ∥2 = ∥x1 ∥2 + · · · + ∥xk ∥2

For example, when k = 2, x1 ⊥ x2 implies

∥x1 + x2 ∥2 = ⟨x1 + x2 , x1 + x2 ⟩ = ⟨x1 , x1 ⟩ + 2⟨x2 , x1 ⟩ + ⟨x2 , x2 ⟩ = ∥x1 ∥2 + ∥x2 ∥2

Linear Independence vs Orthogonality

If X ⊂ Rn is an orthogonal set and 0 ∉ X, then X is linearly independent
Proving this is a nice exercise
While the converse is not true, a kind of partial converse holds, as we'll see below


5.2.3 The Orthogonal Projection Theorem

What vector within a linear subspace of Rn best approximates a given vector in Rn ?


The next theorem answers this question
Theorem (OPT) Given y ∈ Rn and linear subspace S ⊂ Rn , there exists a unique solution to the minimization problem

ŷ := arg min_{z∈S} ∥y − z∥

The minimizer ŷ is the unique vector in Rn that satisfies


• ŷ ∈ S
• y − ŷ ⊥ S
The vector ŷ is called the orthogonal projection of y onto S
The next figure provides some intuition

Proof of sufficiency

We'll omit the full proof.

But we will prove sufficiency of the asserted conditions


To this end, let y ∈ Rn and let S be a linear subspace of Rn


Let ŷ be a vector in Rn such that ŷ ∈ S and y − ŷ ⊥ S
Let z be any other point in S and use the fact that S is a linear subspace (so that ŷ − z ∈ S, and hence y − ŷ ⊥ ŷ − z) to deduce, via the Pythagorean law,

∥y − z∥2 = ∥(y − ŷ) + (ŷ − z)∥2 = ∥y − ŷ∥2 + ∥ŷ − z∥2

Hence ∥y − z∥ ≥ ∥y − ŷ∥, which completes the proof

Orthogonal Projection as a Mapping

For a linear space Y and a fixed linear subspace S, we have a functional relationship
y ∈ Y ↦ its orthogonal projection ŷ ∈ S
By the OPT, this is a well-defined mapping or operator from Rn to Rn
In what follows we denote this operator by a matrix P
• P y represents the projection ŷ
• This is sometimes expressed as ÊS y = P y, where Ê denotes a wide-sense expectations operator
and the subscript S indicates that we are projecting y onto the linear subspace S
The operator P is called the orthogonal projection mapping onto S

It is immediate from the OPT that for any y ∈ Rn


1. P y ∈ S and
2. y − P y ⊥ S
From this we can deduce additional useful properties, such as
1. ∥y∥2 = ∥P y∥2 + ∥y − P y∥2 and
2. ∥P y∥ ≤ ∥y∥
For example, to prove 1, observe that y = P y + y − P y and apply the Pythagorean law

Orthogonal Complement

Let S ⊂ Rn .
The orthogonal complement of S is the linear subspace S ⊥ that satisfies x1 ⊥ x2 for every x1 ∈ S and
x2 ∈ S ⊥
Let Y be a linear space with linear subspace S and its orthogonal complement S ⊥
We write

Y = S ⊕ S⊥

to indicate that for every y ∈ Y there is a unique x1 ∈ S and a unique x2 ∈ S⊥ such that y = x1 + x2.
Moreover, x1 = ÊS y and x2 = y − ÊS y
This amounts to another version of the OPT:
Theorem. If S is a linear subspace of Rn , ÊS y = P y and ÊS ⊥ y = M y, then

Py ⊥ My and y = P y + M y for all y ∈ Rn

The next figure illustrates


5.2.4 Orthonormal Basis

An orthogonal set of vectors O ⊂ Rn is called an orthonormal set if ∥u∥ = 1 for all u ∈ O


Let S be a linear subspace of Rn and let O ⊂ S
If O is orthonormal and span O = S, then O is called an orthonormal basis of S
O is necessarily a basis of S (being independent by orthogonality and the fact that no element is the zero
vector)
One example of an orthonormal set is the canonical basis {e1 , . . . , en }, which forms an orthonormal basis of
Rn, where ei is the i-th unit vector
If {u1 , . . . , uk } is an orthonormal basis of linear subspace S, then


x = ∑_{i=1}^{k} ⟨x, ui⟩ui   for all x ∈ S

To see this, observe that since x ∈ span{u1 , . . . , uk }, we can find scalars α1 , . . . , αk that verify


x = ∑_{j=1}^{k} αj uj    (5.5)


Taking the inner product with respect to ui gives


⟨x, ui⟩ = ∑_{j=1}^{k} αj ⟨uj, ui⟩ = αi

Combining this result with (5.5) verifies the claim
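To illustrate this expansion numerically (a sketch in which the orthonormal basis is obtained from a hypothetical random matrix via QR factorization):

import numpy as np

rng = np.random.default_rng(42)
Z = rng.standard_normal((4, 2))     # spans a 2-dimensional subspace of R^4
U, _ = np.linalg.qr(Z)              # columns of U: an orthonormal basis of span(Z)

x = Z @ np.array([0.5, -1.5])       # an arbitrary element of the subspace

# Reconstruct x from the expansion x = sum_i <x, u_i> u_i
x_rebuilt = sum((x @ U[:, i]) * U[:, i] for i in range(U.shape[1]))
print(np.allclose(x, x_rebuilt))    # True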

Projection onto an Orthonormal Basis

When we have an orthonormal basis for the subspace onto which we are projecting, computing the projection simplifies:
Theorem If {u1 , . . . , uk } is an orthonormal basis for S, then


P y = ∑_{i=1}^{k} ⟨y, ui⟩ui,   ∀ y ∈ Rn    (5.6)

Proof: Fix y ∈ Rn and let P y be defined as in (5.6)


Clearly, P y ∈ S
We claim that y − P y ⊥ S also holds
It suffices to show that y − P y ⊥ any basis vector ui (why?)
This is true because
⟨y − ∑_{i=1}^{k} ⟨y, ui⟩ui , uj⟩ = ⟨y, uj⟩ − ∑_{i=1}^{k} ⟨y, ui⟩⟨ui, uj⟩ = 0

5.2.5 Projection Using Matrix Algebra

Let S be a linear subspace of Rn and let y ∈ Rn .


We want to compute the matrix P that verifies

ÊS y = P y

Evidently P y is a linear function from y ∈ Rn to P y ∈ Rn


This reference is useful https://en.wikipedia.org/wiki/Linear_map#Matrices
Theorem. Let the columns of n × k matrix X form a basis of S. Then

P = X(X ′ X)−1 X ′

Proof: Given arbitrary y ∈ Rn and P = X(X ′ X)−1 X ′ , our claim is that


1. P y ∈ S, and
2. y − P y ⊥ S


Claim 1 is true because

P y = X(X ′ X)−1 X ′ y = Xa when a := (X ′ X)−1 X ′ y

An expression of the form Xa is precisely a linear combination of the columns of X, and hence an element
of S
Claim 2 is equivalent to the statement

y − X(X ′ X)−1 X ′ y ⊥ Xb for all b ∈ RK

This is true: If b ∈ RK , then

(Xb)′ [y − X(X ′ X)−1 X ′ y] = b′ [X ′ y − X ′ y] = 0

The proof is now complete

Starting with X

It is common in applications to start with n × k matrix X with linearly independent columns and let

S := span X := span{col1 X, . . . , colk X}

Then the columns of X form a basis of S


From the preceding theorem, P = X(X′X)−1X′ projects y onto S
In this context, P is often called the projection matrix
• The matrix M = I − P satisfies M y = ÊS ⊥ y and is sometimes called the annihilator matrix
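The following sketch (with a hypothetical X and y, chosen only for illustration) computes P and M directly and verifies the defining properties:

import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])             # linearly independent columns
y = np.array([1.0, 0.0, -1.0])

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix
M = np.eye(3) - P                      # annihilator matrix

y_hat = P @ y
print(np.allclose(X.T @ (y - y_hat), 0))   # y - Py ⊥ span(X): True
print(np.allclose(P @ y_hat, y_hat))       # projecting twice changes nothing: True
print(np.allclose(y_hat + M @ y, y))       # y = Py + My: True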

The Orthonormal Case

Suppose that U is n × k with orthonormal columns


Let ui := coli U for each i, let S := span U and let y ∈ Rn
We know that the projection of y onto S is

P y = U (U ′ U )−1 U ′ y

Since U has orthonormal columns, we have U ′ U = I


Hence


P y = U U′y = ∑_{i=1}^{k} ⟨ui, y⟩ui

We have recovered our earlier result about projecting onto the span of an orthonormal basis


Application: Overdetermined Systems of Equations

Let y ∈ Rn and let X be n × k with linearly independent columns


Given X and y, we seek b ∈ Rk satisfying the system of linear equations Xb = y
If n > k (more equations than unknowns), then the system is said to be overdetermined
Intuitively, we may not be able to find a b that satisfies all n equations
The best approach here is to
• Accept that an exact solution may not exist
• Look instead for an approximate solution
By approximate solution, we mean a b ∈ Rk such that Xb is as close to y as possible
The next theorem shows that the solution is well defined and unique
The proof uses the OPT
Theorem The unique minimizer of ∥y − Xb∥ over b ∈ RK is

β̂ := (X ′ X)−1 X ′ y

Proof: Note that

X β̂ = X(X ′ X)−1 X ′ y = P y

Since P y is the orthogonal projection onto span(X) we have

∥y − P y∥ ≤ ∥y − z∥ for any z ∈ span(X)

Because Xb ∈ span(X)

∥y − X β̂∥ ≤ ∥y − Xb∥ for any b ∈ RK

This is what we aimed to show
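In NumPy, the closed-form expression can be computed directly, though in practice a dedicated least squares routine is preferred for numerical stability. A sketch with hypothetical data:

import numpy as np

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])          # n = 4 equations, k = 2 unknowns
y = np.array([1.0, 2.0, 2.0, 4.0])

# Closed form: beta_hat = (X'X)^{-1} X'y, computed via a linear solve
beta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Library least squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(beta_closed, beta_lstsq))  # True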

5.2.6 Least Squares Regression

Let's apply the theory of orthogonal projection to least squares regression


This approach provides insights about many geometric properties of linear regression
We treat only some examples

Squared risk measures

Given pairs (x, y) ∈ RK × R, consider choosing f : RK → R to minimize the risk


R(f ) := E [(y − f (x))2 ]

364 Chapter 5. Tools and Techniques


QuantEcon.lectures-python3 PDF, Release 2018-Sep-29

If probabilities and hence E are unknown, we cannot solve this problem directly
However, if a sample is available, we can estimate the risk with the empirical risk:

min_{f∈F} (1/N) ∑_{n=1}^{N} (yn − f(xn))2

Minimizing this expression is called empirical risk minimization


The set F is sometimes called the hypothesis space
The theory of statistical learning tells us that to prevent overfitting we should take the set F to be relatively
simple
If we let F be the class of linear functions and drop the constant 1/N (which does not affect the minimizer), the problem becomes

min_{b∈RK} ∑_{n=1}^{N} (yn − b′xn)2

This is the sample linear least squares problem

Solution

Define the matrices

y := ⎡ y1 ⎤         xn := ⎡ xn1 ⎤
     ⎢ y2 ⎥               ⎢ xn2 ⎥  = n-th obs on all regressors
     ⎢ ⋮  ⎥               ⎢  ⋮  ⎥
     ⎣ yN ⎦               ⎣ xnK ⎦

and

X := ⎡ x′1 ⎤ =: ⎡ x11  x12  ⋯  x1K ⎤
     ⎢ x′2 ⎥    ⎢ x21  x22  ⋯  x2K ⎥
     ⎢  ⋮  ⎥    ⎢  ⋮    ⋮        ⋮ ⎥
     ⎣ x′N ⎦    ⎣ xN1  xN2  ⋯  xNK ⎦
We assume throughout that N > K and X is full column rank
If you work through the algebra, you will be able to verify that ∥y − Xb∥2 = ∑_{n=1}^{N} (yn − b′xn)2
Since monotone transforms don't affect minimizers, we have

arg min_{b∈RK} ∑_{n=1}^{N} (yn − b′xn)2 = arg min_{b∈RK} ∥y − Xb∥
By our results about overdetermined linear systems of equations, the solution is

β̂ := (X ′ X)−1 X ′ y

Let P and M be the projection and annihilator associated with X:

P := X(X ′ X)−1 X ′ and M := I − P


The vector of fitted values is

ŷ := X β̂ = P y

The vector of residuals is

û := y − ŷ = y − P y = M y

Here are some more standard definitions:


• The total sum of squares is TSS := ∥y∥2
• The sum of squared residuals is SSR := ∥û∥2
• The explained sum of squares is ESS := ∥ŷ∥2
A fundamental identity is

TSS = ESS + SSR
We can prove this easily using the OPT
From the OPT we have y = ŷ + û and û ⊥ ŷ
Applying the Pythagorean law completes the proof
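A quick numerical confirmation (a sketch with hypothetical data):

import numpy as np

X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0],
              [1.0, 7.0]])
y = np.array([2.0, 3.0, 4.0, 8.0])

P = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = P @ y                       # fitted values
u_hat = y - y_hat                   # residuals

tss = y @ y
ess = y_hat @ y_hat
ssr = u_hat @ u_hat
print(np.allclose(tss, ess + ssr))  # True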

5.2.7 Orthogonalization and Decomposition

Let's return to the connection between linear independence and orthogonality touched on above
A result of much interest is a famous algorithm for constructing orthonormal sets from linearly independent
sets
The next section gives details

Gram-Schmidt Orthogonalization

Theorem For each linearly independent set {x1 , . . . , xk } ⊂ Rn, there exists an orthonormal set
{u1 , . . . , uk } with

span{x1 , . . . , xi } = span{u1 , . . . , ui } for i = 1, . . . , k

The Gram-Schmidt orthogonalization procedure constructs an orthonormal set {u1 , u2 , . . . , uk } with these properties


One description of this procedure is as follows:
• For i = 1, . . . , k, form Si := span{x1 , . . . , xi } and Si⊥
• Set v1 = x1
• For i ≥ 2, set vi := Ê_{S⊥_{i−1}} xi and ui := vi/∥vi∥

The sequence u1 , . . . , uk has the stated properties


A Gram-Schmidt orthogonalization construction is a key idea behind the Kalman filter described in A First
Look at the Kalman filter
In some exercises below you are asked to implement this algorithm and test it using projection


QR Decomposition

The following result uses the preceding algorithm to produce a useful decomposition
Theorem If X is n × k with linearly independent columns, then there exists a factorization X = QR where
• R is k × k, upper triangular, and nonsingular
• Q is n × k with orthonormal columns
Proof sketch: Let
• xj := colj (X)
• {u1 , . . . , uk } be orthonormal with same span as {x1 , . . . , xk } (to be constructed using
Gram–Schmidt)
• Q be formed from cols ui
Since xj ∈ span{u1 , . . . , uj }, we have


xj = ∑_{i=1}^{j} ⟨ui, xj⟩ui   for j = 1, . . . , k

Some rearranging gives X = QR

Linear Regression via QR Decomposition

For matrices X and y that overdetermine β in the linear equation system y = Xβ, we found the least
squares approximator β̂ = (X′X)−1X′y
Using the QR decomposition X = QR gives

β̂ = (R′ Q′ QR)−1 R′ Q′ y
= (R′ R)−1 R′ Q′ y
= R−1 (R′ )−1 R′ Q′ y = R−1 Q′ y

Numerical routines would in this case use the alternative form Rβ̂ = Q′ y and back substitution
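A sketch of this approach (with hypothetical data), using scipy.linalg for the QR factorization and back substitution:

import numpy as np
from scipy.linalg import qr, solve_triangular

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.0, 2.0, 2.0, 4.0])

Q, R = qr(X, mode='economic')           # X = QR with R k x k upper triangular

# Solve R beta = Q'y by back substitution rather than forming (X'X)^{-1}
beta_qr = solve_triangular(R, Q.T @ y)

beta_normal = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(beta_qr, beta_normal))  # True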

5.2.8 Exercises

Exercise 1

Show that, for any linear subspace S ⊂ Rn , S ∩ S ⊥ = {0}

Exercise 2

Let P = X(X ′ X)−1 X ′ and let M = I − P . Show that P and M are both idempotent and symmetric. Can
you give any intuition as to why they should be idempotent?


Exercise 3

Using Gram-Schmidt orthogonalization, produce a linear projection of y onto the column space of X and
verify this using the projection matrix P := X(X′X)−1X′ and also using QR decomposition, where:

y := ⎡ 1 ⎤
     ⎢ 3 ⎥
     ⎣−3 ⎦

and

X := ⎡ 1   0 ⎤
     ⎢ 0  −6 ⎥
     ⎣ 2   2 ⎦

5.2.9 Solutions

Exercise 1

If x ∈ S and x ∈ S ⊥ , then we have in particular that ⟨x, x⟩ = 0. But then x = 0.

Exercise 2

Symmetry and idempotence of M and P can be established using standard rules for matrix algebra. The
intuition behind idempotence of M and P is that both are orthogonal projections. After a point is projected
into a given subspace, applying the projection again makes no difference. (A point inside the subspace is
not shifted by orthogonal projection onto that space because it is already the closest point in the subspace to
itself.)

Exercise 3

Here's a function that computes the orthonormal vectors using the GS algorithm given in the lecture.

import numpy as np


def gram_schmidt(X):
    """
    Implements Gram-Schmidt orthogonalization.

    Parameters
    ----------
    X : an n x k array with linearly independent columns

    Returns
    -------
    U : an n x k array with orthonormal columns
    """
    n, k = X.shape
    U = np.empty((n, k))
    I = np.eye(n)

    # The first column of U is just the normalized first column of X
    v1 = X[:, 0]
    U[:, 0] = v1 / np.sqrt(np.sum(v1 * v1))

    for i in range(1, k):
        # Set up
        b = X[:, i]       # The vector we're going to project
        Z = X[:, :i]      # The first i columns of X

        # Project onto the orthogonal complement of the column span of Z
        M = I - Z @ np.linalg.inv(Z.T @ Z) @ Z.T
        u = M @ b

        # Normalize
        U[:, i] = u / np.sqrt(np.sum(u * u))

    return U