
Quantitative Economics with

Python

Thomas J. Sargent and John Stachurski

October 20, 2019


Contents

I Introduction to Python 1

1 About Python 3

2 Setting up Your Python Environment 13

3 An Introductory Example 35

4 Python Essentials 55

5 OOP I: Introduction to Object Oriented Programming 73

II The Scientific Libraries 79

6 NumPy 81

7 Matplotlib 99

8 SciPy 111

9 Numba 123

10 Other Scientific Libraries 135

III Advanced Python Programming 145

11 Writing Good Code 147

12 OOP II: Building Classes 155

13 OOP III: Samuelson Multiplier Accelerator 171

14 More Language Features 205

15 Debugging 237

IV Data and Empirics 243

16 Pandas 245

17 Pandas for Panel Data 259


18 Linear Regression in Python 279

19 Maximum Likelihood Estimation 295

V Tools and Techniques 315

20 Geometric Series for Elementary Economics 317

21 Linear Algebra 337

22 Complex Numbers and Trigonometry 361

23 Orthogonal Projections and Their Applications 371

24 LLN and CLT 387

25 Linear State Space Models 405

26 Finite Markov Chains 429

27 Continuous State Markov Chains 453

28 Cass-Koopmans Optimal Growth Model 475

29 A First Look at the Kalman Filter 503

30 Reverse Engineering a la Muth 521

VI Dynamic Programming 529

31 Shortest Paths 531

32 Job Search I: The McCall Search Model 541

33 Job Search II: Search and Separation 553

34 A Problem that Stumped Milton Friedman 565

35 Job Search III: Search with Learning 583

36 Job Search IV: Modeling Career Choice 599

37 Job Search V: On-the-Job Search 611

38 Optimal Growth I: The Stochastic Optimal Growth Model 621

39 Optimal Growth II: Time Iteration 639

40 Optimal Growth III: The Endogenous Grid Method 655

41 Optimal Savings III: Occasionally Binding Constraints 663

42 Discrete State Dynamic Programming 679



VII LQ Control 701

43 LQ Dynamic Programming Problems 703

44 Optimal Savings I: The Permanent Income Model 731

45 Optimal Savings II: LQ Techniques 749

46 Consumption Smoothing with Complete and Incomplete Markets 767

47 Tax Smoothing with Complete and Incomplete Markets 783

48 Robustness 813

49 Markov Jump Linear Quadratic Dynamic Programming 833

50 How to Pay for a War: Part 1 873

51 How to Pay for a War: Part 2 883

52 How to Pay for a War: Part 3 897

53 Optimal Taxation in an LQ Economy 903

VIII Multiple Agent Models 923

54 Schelling’s Segregation Model 925

55 A Lake Model of Employment and Unemployment 937

56 Rational Expectations Equilibrium 961

57 Markov Perfect Equilibrium 975

58 Robust Markov Perfect Equilibrium 991

59 Uncertainty Traps 1009

60 The Aiyagari Model 1023

61 Default Risk and Income Fluctuations 1033

62 Globalization and Cycles 1051

63 Coase’s Theory of the Firm 1067

IX Recursive Models of Dynamic Linear Economies 1081

64 Recursive Models of Dynamic Linear Economies 1083

65 Growth in Dynamic Linear Economies 1119

66 Lucas Asset Pricing Using DLE 1131



67 IRFs in Hall Models 1139

68 Permanent Income Model using the DLE Class 1147

69 Rosen Schooling Model 1153

70 Cattle Cycles 1159

71 Shock Non Invertibility 1167

X Classic Linear Models 1175

72 Von Neumann Growth Model (and a Generalization) 1177

XI Time Series Models 1193

73 Covariance Stationary Processes 1195

74 Estimation of Spectra 1217

75 Additive and Multiplicative Functionals 1231

76 Classical Control with Linear Algebra 1253

77 Classical Prediction and Filtering With Linear Algebra 1275

XII Asset Pricing and Finance 1295

78 Asset Pricing I: Finite State Models 1297

79 Asset Pricing II: The Lucas Asset Pricing Model 1317

80 Asset Pricing III: Incomplete Markets 1327

81 Two Modifications of Mean-variance Portfolio Theory 1339

XIII Dynamic Programming Squared 1363

82 Stackelberg Plans 1365

83 Ramsey Plans, Time Inconsistency, Sustainable Plans 1389

84 Optimal Taxation with State-Contingent Debt 1413

85 Optimal Taxation without State-Contingent Debt 1443

86 Fluctuating Interest Rates Deliver Fiscal Insurance 1469

87 Fiscal Risk and Government Debt 1495

88 Competitive Equilibria of Chang Model 1521



89 Credible Government Policies in Chang Model 1551


Part I

Introduction to Python

Chapter 1

About Python

1.1 Contents

• Overview 1.2

• What’s Python? 1.3

• Scientific Programming 1.4

• Learn More 1.5

1.2 Overview

In this lecture we will

• Outline what Python is
• Showcase some of its abilities
• Compare it to some other languages

At this stage, it’s not our intention that you try to replicate all you see.
We will work through what follows at a slow pace later in the lecture series.
Our only objective for this lecture is to give you some feel for what Python is, and what it can do.

1.3 What’s Python?

Python is a general-purpose programming language conceived in 1989 by Dutch programmer Guido van Rossum.
Python is free and open source, with development coordinated through the Python Software
Foundation.
Python has experienced rapid adoption in the last decade and is now one of the most popular
programming languages.


1.3.1 Common Uses

Python is a general-purpose language used in almost all application domains

• communications
• web development
• CGI and graphical user interfaces
• games
• multimedia, data processing, security, etc., etc., etc.

Used extensively by Internet service and high tech companies such as

• Google
• Dropbox
• Reddit
• YouTube
• Walt Disney Animation, etc., etc.

Often used to teach computer science and programming.


For reasons we will discuss, Python is particularly popular within the scientific community

• academia, NASA, CERN, Wall St., etc., etc.

1.3.2 Relative Popularity

The following chart, produced using Stack Overflow Trends, shows one measure of the relative
popularity of Python

The figure indicates not only that Python is widely used but also that adoption of Python
has accelerated significantly since 2012.
We suspect this is driven at least in part by uptake in the scientific domain, particularly in
rapidly growing fields like data science.

For example, the popularity of pandas, a library for data analysis with Python, has exploded, as seen here.
(The corresponding time path for MATLAB is shown for comparison)

Note that pandas takes off in 2012, which is the same year that we see Python’s popularity begin to spike in the first figure.
Overall, it’s clear that

• Python is one of the most popular programming languages worldwide.

• Python is a major tool for scientific computing, accounting for a rapidly rising share of scientific work around the globe.

1.3.3 Features

Python is a high-level language suitable for rapid development.


It has a relatively small core language supported by many libraries.
Other features

• A multiparadigm language, in that multiple programming styles are supported (procedural, object-oriented, functional, etc.).
• Interpreted rather than compiled.

1.3.4 Syntax and Design

One nice feature of Python is its elegant syntax — we’ll see many examples later on.
Elegant code might sound superfluous but in fact it’s highly beneficial because it makes the
syntax easy to read and easy to remember.
Remembering how to read from files, sort dictionaries and other such routine tasks means
that you don’t need to break your flow in order to hunt down correct syntax.
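As a small illustration of such a routine task, sorting a dictionary by value takes a single line (the data here are arbitrary, made-up numbers):

```python
# Sort a dictionary by value -- one of the routine tasks mentioned above
prices = {'banana': 3, 'apple': 4, 'pear': 1}   # toy data for illustration

sorted_items = sorted(prices.items(), key=lambda kv: kv[1])
print(sorted_items)   # [('pear', 1), ('banana', 3), ('apple', 4)]
```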

Closely related to elegant syntax is an elegant design.


Features like iterators, generators, decorators, list comprehensions, etc. make Python highly
expressive, allowing you to get more done with less code.
Namespaces improve productivity by cutting down on bugs and syntax errors.
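For a taste of that expressiveness, here is a minimal sketch combining a list comprehension with a generator (the numbers are arbitrary):

```python
# A list comprehension builds a list in one line
squares = [n ** 2 for n in range(10)]

# A generator expression computes values lazily, one at a time, on demand
even_squares = (s for s in squares if s % 2 == 0)

total = sum(even_squares)   # consuming the generator: 0 + 4 + 16 + 36 + 64
print(total)                # 120
```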

1.4 Scientific Programming

Python has become one of the core languages of scientific computing.


It’s either the dominant player or a major player in

• Machine learning and data science

• Astronomy
• Artificial intelligence
• Chemistry
• Computational biology
• Meteorology
• etc., etc.

Its popularity in economics is also beginning to rise.


This section briefly showcases some examples of Python for scientific programming.

• All of these topics will be covered in detail later on.

1.4.1 Numerical Programming

Fundamental matrix and array processing capabilities are provided by the excellent NumPy
library.
NumPy provides the basic array data type plus some simple processing operations.
For example, let’s build some arrays
[1]:
import numpy as np                   # Load the library

a = np.linspace(-np.pi, np.pi, 100)  # Create an even grid from -π to π
b = np.cos(a)                        # Apply cosine to each element of a
c = np.sin(a)                        # Apply sine to each element of a

Now let’s take the inner product


[2]:
b @ c

[2]:
2.706168622523819e-16

The number you see here might vary slightly but it’s essentially zero.
(For older versions of Python and NumPy you need to use the np.dot function)
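As a quick check of that note, the two call styles agree (a minimal sketch, recreating the arrays from above):

```python
import numpy as np

a = np.linspace(-np.pi, np.pi, 100)
b, c = np.cos(a), np.sin(a)

# The @ operator and np.dot compute the same inner product
assert abs(b @ c - np.dot(b, c)) < 1e-12
```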
The SciPy library is built on top of NumPy and provides additional functionality.
For example, let’s calculate ∫_{-2}^{2} ϕ(z) dz where ϕ is the standard normal density.

[3]:
from scipy.stats import norm
from scipy.integrate import quad

ϕ = norm()
value, error = quad(ϕ.pdf, -2, 2)  # Integrate using Gaussian quadrature
value

[3]:
0.9544997361036417

SciPy includes many of the standard routines used in

• linear algebra
• integration
• interpolation
• optimization
• distributions and random number generation
• signal processing
• etc., etc.
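As a small taste of the optimization routines, here is a sketch using scipy.optimize.brentq to find a root (the function is an arbitrary example, not one from the text):

```python
from scipy.optimize import brentq

# Find the root of f(x) = x**2 - 2 on [0, 2], i.e. the square root of 2
f = lambda x: x**2 - 2
root = brentq(f, 0, 2)   # bracketing root-finder
print(root)              # approximately 1.41421356
```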

1.4.2 Graphics

The most popular and comprehensive Python library for creating figures and graphs is Matplotlib.

• Plots, histograms, contour images, 3D, bar charts, etc., etc.

• Output in many formats (PDF, PNG, EPS, etc.)
• LaTeX integration

Example 2D plot with embedded LaTeX annotations

Example contour plot



Example 3D plot

More examples can be found in the Matplotlib thumbnail gallery.


Other graphics libraries include

• Plotly
• Bokeh
• VPython — 3D graphics and animations

1.4.3 Symbolic Algebra

It’s useful to be able to manipulate symbolic expressions, as in Mathematica or Maple.


The SymPy library provides this functionality from within the Python shell.
[4]:
from sympy import Symbol

x, y = Symbol('x'), Symbol('y')  # Treat 'x' and 'y' as algebraic symbols
x + x + x + y

[4]:
3x + y
We can manipulate expressions
[5]:
expression = (x + y)**2
expression.expand()

[5]:
x² + 2xy + y²
solve polynomials
[6]:
from sympy import solve

solve(x**2 + x + 2)

[6]:
[-1/2 - sqrt(7)*I/2, -1/2 + sqrt(7)*I/2]

and calculate limits, derivatives and integrals


[7]:
from sympy import limit, sin, diff

limit(1 / x, x, 0)

[7]:
∞

[8]:
limit(sin(x) / x, x, 0)

[8]:
1

[9]:
diff(sin(x), x)

[9]:
cos(x)
The beauty of importing this functionality into Python is that we are working within a fully
fledged programming language.
Can easily create tables of derivatives, generate LaTeX output, add it to figures, etc., etc.
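For instance, generating LaTeX output is one function call (a minimal sketch; the exact string returned can vary slightly across SymPy versions):

```python
from sympy import Symbol, sin, cos, diff, latex

x = Symbol('x')
derivative = diff(x * sin(x), x)   # differentiate x*sin(x), giving x*cos(x) + sin(x)
tex = latex(derivative)            # LaTeX source, ready to drop into a figure or table
```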

1.4.4 Statistics

Python’s data manipulation and statistics libraries have improved rapidly over the last few
years.
Pandas
One of the most popular libraries for working with data is pandas.
Pandas is fast, efficient, flexible and well designed.

Here’s a simple example, using some fake data


[10]:
import pandas as pd
np.random.seed(1234)

data = np.random.randn(5, 2)  # 5x2 matrix of N(0, 1) random draws
dates = pd.date_range('28/12/2010', periods=5)

df = pd.DataFrame(data, columns=('price', 'weight'), index=dates)
print(df)

               price    weight
2010-12-28  0.471435 -1.190976
2010-12-29  1.432707 -0.312652
2010-12-30 -0.720589  0.887163
2010-12-31  0.859588 -0.636524
2011-01-01  0.015696 -2.242685

[11]:
df.mean()

[11]:
price     0.411768
weight   -0.699135
dtype: float64

Other Useful Statistics Libraries


- statsmodels — various statistical routines
- scikit-learn — machine learning in Python (sponsored by Google, among others)
- pyMC — for Bayesian data analysis
- pystan — Bayesian analysis based on Stan

1.4.5 Networks and Graphs

Python has many libraries for studying graphs.


One well-known example is NetworkX

• Standard graph algorithms for analyzing network structure, etc.

• Plotting routines
• etc., etc.

Here’s some example code that generates and plots a random graph, with node color determined by shortest path length from a central node.
[12]:
import networkx as nx
import matplotlib.pyplot as plt
%matplotlib inline
np.random.seed(1234)

# Generate a random graph
p = dict((i, (np.random.uniform(0, 1), np.random.uniform(0, 1)))
         for i in range(200))
g = nx.random_geometric_graph(200, 0.12, pos=p)
pos = nx.get_node_attributes(g, 'pos')

# Find node nearest the center point (0.5, 0.5)
dists = [(x - 0.5)**2 + (y - 0.5)**2 for x, y in list(pos.values())]
ncenter = np.argmin(dists)

# Plot graph, coloring by path length from central node
p = nx.single_source_shortest_path_length(g, ncenter)
plt.figure()
nx.draw_networkx_edges(g, pos, alpha=0.4)
nx.draw_networkx_nodes(g,
                       pos,
                       nodelist=list(p.keys()),
                       node_size=120, alpha=0.5,
                       node_color=list(p.values()),
                       cmap=plt.cm.jet_r)
plt.show()

/home/ubuntu/anaconda3/lib/python3.7/site-packages/networkx/drawing/nx_pylab.py:579:
MatplotlibDeprecationWarning: The iterable function was deprecated in Matplotlib 3.1
and will be removed in 3.3. Use np.iterable instead.
  if not cb.iterable(width):

1.4.6 Cloud Computing

Running your Python code on massive servers in the cloud is becoming easier and easier.
A nice example is Anaconda Enterprise.
See also
- Amazon Elastic Compute Cloud
- The Google App Engine (Python, Java, PHP or Go)
- Pythonanywhere
- Sagemath Cloud

1.4.7 Parallel Processing

Apart from the cloud computing options listed above, you might like to consider
- Parallel computing through IPython clusters.
- The Starcluster interface to Amazon’s EC2.
- GPU programming through PyCuda, PyOpenCL, Theano or similar.

1.4.8 Other Developments

There are many other interesting developments with scientific programming in Python.
Some representative examples include
- Jupyter — Python in your browser with code cells, embedded images, etc.
- Numba — make Python run at the same speed as native machine code!
- Blaze — a generalization of NumPy.
- PyTables — manage large data sets.
- CVXPY — convex optimization in Python.

1.5 Learn More

• Browse some Python projects on GitHub.

• Have a look at some of the Jupyter notebooks people have shared on various scientific topics.

- Visit the Python Package Index.


- View some of the questions people are asking about Python on Stackoverflow.
- Keep up to date on what’s happening in the Python community with the Python subreddit.
Chapter 2

Setting up Your Python Environment

2.1 Contents

• Overview 2.2

• Anaconda 2.3

• Jupyter Notebooks 2.4

• Installing Libraries 2.5

• Working with Files 2.6

• Editors and IDEs 2.7

• Exercises 2.8

2.2 Overview

In this lecture, you will learn how to

1. get a Python environment up and running with all the necessary tools
2. execute simple Python commands
3. run a sample program
4. install the code libraries that underpin these lectures

2.3 Anaconda

The core Python package is easy to install but not what you should choose for these lectures.
These lectures require the entire scientific programming ecosystem, which

• the core installation doesn’t provide
• is painful to install one piece at a time


Hence the best approach for our purposes is to install a free Python distribution that contains

1. the core Python language and


2. the most popular scientific libraries

The best such distribution is Anaconda.


Anaconda is

• very popular
• cross platform
• comprehensive
• completely unrelated to the Nicki Minaj song of the same name

Anaconda also comes with a great package management system to organize your code libraries.
All of what follows assumes that you adopt this recommendation!

2.3.1 Installing Anaconda

Installing Anaconda is straightforward: download the binary and follow the instructions.
Important points:

• Install the latest version.

• If you are asked during the installation process whether you’d like to make Anaconda your default Python installation, say yes.
• Otherwise, you can accept all of the defaults.

2.3.2 Updating Anaconda

Anaconda supplies a tool called conda to manage and upgrade your Anaconda packages.
One conda command you should execute regularly is the one that updates the whole Anaconda distribution.
As a practice run, please execute the following

1. Open up a terminal
2. Type conda update anaconda

For more information on conda, type conda help in a terminal.

2.4 Jupyter Notebooks

Jupyter notebooks are one of the many possible ways to interact with Python and the scientific libraries.
They use a browser-based interface to Python with

• The ability to write and execute Python commands.

• Formatted output in the browser, including tables, figures, animation, etc.
• The option to mix in formatted text and mathematical expressions.

Because of these possibilities, Jupyter is fast turning into a major player in the scientific computing ecosystem.
Here’s an image showing execution of some code (borrowed from here) in a Jupyter notebook

You can find a nice example of the kinds of things you can do in a Jupyter notebook (such as
include maths and text) here.
While Jupyter isn’t the only way to code in Python, it’s great for when you wish to

• start coding in Python

• test new ideas or interact with small pieces of code
• share or collaborate on scientific ideas with students or colleagues

These lectures are designed for executing in Jupyter notebooks.



2.4.1 Starting the Jupyter Notebook

Once you have installed Anaconda, you can start the Jupyter notebook.
Either

• search for Jupyter in your applications menu, or

• open up a terminal and type jupyter notebook

  – Windows users should substitute “Anaconda command prompt” for “terminal” in the previous line.

If you use the second option, you will see something like this (click to enlarge)

The output tells us the notebook is running at http://localhost:8888/

• localhost is the name of the local machine

• 8888 refers to port number 8888 on your computer

Thus, the Jupyter kernel is listening for Python commands on port 8888 of our local machine.
Hopefully, your default browser has also opened up with a web page that looks something like
this (click to enlarge)

What you see here is called the Jupyter dashboard.


If you look at the URL at the top, it should be localhost:8888 or similar, matching the
message above.
Assuming all this has worked OK, you can now click on New at the top right and select
Python 3 or similar.
Here’s what shows up on our machine:

The notebook displays an active cell, into which you can type Python commands.

2.4.2 Notebook Basics

Let’s start with how to edit code and run simple programs.
Running Cells
Notice that in the previous figure the cell is surrounded by a green border.
This means that the cell is in edit mode.
As a result, you can type in Python code and it will appear in the cell.
When you’re ready to execute the code in a cell, hit Shift-Enter instead of the usual Enter.

(Note: There are also menu and button options for running code in a cell that you can find
by exploring)
Modal Editing
The next thing to understand about the Jupyter notebook is that it uses a modal editing system.
This means that the effect of typing at the keyboard depends on which mode you are in.
The two modes are

1. Edit mode

• Indicated by a green border around one cell

• Whatever you type appears as is in that cell

2. Command mode

• The green border is replaced by a grey border

• Key strokes are interpreted as commands — for example, typing b adds a new cell below the current one

To switch to

• command mode from edit mode, hit the Esc key or Ctrl-M
• edit mode from command mode, hit Enter or click in a cell

The modal behavior of the Jupyter notebook is a little tricky at first but very efficient when
you get used to it.
User Interface Tour
At this stage, we recommend you take your time to

• look at the various options in the menus and see what they do
• take the “user interface tour”, which can be accessed through the help menu

Inserting Unicode (e.g., Greek Letters)


Python 3 introduced support for unicode characters, allowing the use of characters such as α and β in your code.
Unicode characters can be typed quickly in Jupyter using the tab key.
Try creating a new code cell and typing \alpha, then hitting the tab key on your keyboard.
A Test Program
Let’s run a test program.
Here’s an arbitrary program we can use: http://matplotlib.org/1.4.1/examples/pie_and_polar_charts/polar_bar_demo.html
On that page, you’ll see the following code
[1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

N = 20
θ = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
radii = 10 * np.random.rand(N)
width = np.pi / 4 * np.random.rand(N)

ax = plt.subplot(111, polar=True)
bars = ax.bar(θ, radii, width=width, bottom=0.0)

# Use custom colors and opacity
for r, bar in zip(radii, bars):
    bar.set_facecolor(plt.cm.jet(r / 10.))
    bar.set_alpha(0.5)

plt.show()

Don’t worry about the details for now β€” let’s just run it and see what happens.
The easiest way to run this code is to copy and paste into a cell in the notebook.
(In older versions of Jupyter you might need to add the command %matplotlib inline
before you generate the figure)

2.4.3 Working with the Notebook

Here are a few more tips on working with Jupyter notebooks.


Tab Completion
In the previous program, we executed the line import numpy as np

• NumPy is a numerical library we’ll work with in depth.

After this import command, functions in NumPy can be accessed with np.<function_name> type syntax.

• For example, try np.random.randn(3).

We can explore these attributes of np using the Tab key.


For example, here we type np.ran and hit Tab (click to enlarge)

Jupyter offers up the two possible completions, random and rank.


In this way, the Tab key helps remind you of what’s available and also saves you typing.
On-Line Help
To get help on np.rank, say, we can execute np.rank?.
Documentation appears in a split window of the browser, like so

Clicking on the top right of the lower split closes the on-line help.
Other Content
In addition to executing code, the Jupyter notebook allows you to embed text, equations, figures and even videos in the page.
For example, here we enter a mixture of plain text and LaTeX instead of code

Next we Esc to enter command mode and then type m to indicate that we are writing Markdown, a mark-up language similar to (but simpler than) LaTeX.
(You can also use your mouse to select Markdown from the Code drop-down box just below
the list of menu items)
Now we Shift+Enter to produce this

2.4.4 Sharing Notebooks

Notebook files are just text files structured in JSON and typically ending with .ipynb.
You can share them in the usual way that you share files — or by using web services such as nbviewer.
The notebooks you see on that site are static html representations.
To run one, download it as an ipynb file by clicking on the download icon at the top right.
Save it somewhere, navigate to it from the Jupyter dashboard and then run as discussed
above.
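Since notebooks are plain JSON, they are also easy to inspect programmatically; here is a minimal sketch (the field names follow the nbformat 4 schema, and the cell content is made up):

```python
import json

# A minimal notebook skeleton, as a Python dict
nb = {
    "cells": [
        {"cell_type": "code", "source": "1 + 1", "metadata": {},
         "outputs": [], "execution_count": None}
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

text = json.dumps(nb)      # this string is what an .ipynb file contains
restored = json.loads(text)
```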

2.4.5 QuantEcon Notes

QuantEcon has its own site for sharing Jupyter notebooks related to economics – QuantEcon Notes.
Notebooks submitted to QuantEcon Notes can be shared with a link, and are open to comments and votes by the community.

2.5 Installing Libraries

Most of the libraries we need come in Anaconda.


Other libraries can be installed with pip.
One library we’ll be using is QuantEcon.py.
You can install QuantEcon.py by starting Jupyter and typing

!pip install --upgrade quantecon

into a cell.
Alternatively, you can type the following into a terminal

pip install quantecon

More instructions can be found on the library page.


To upgrade to the latest version, which you should do regularly, use

pip install --upgrade quantecon

Another library we will be using is interpolation.py.


This can be installed by typing in Jupyter

!pip install interpolation

2.6 Working with Files

How does one run a locally saved Python file?


There are a number of ways to do this but let’s focus on methods using Jupyter notebooks.

2.6.1 Option 1: Copy and Paste

The steps are:

1. Navigate to your file with your mouse/trackpad using a file browser.


2. Click on your file to open it with a text editor.
3. Copy and paste into a cell and Shift-Enter.

2.6.2 Option 2: Run

Using the run command is often easier than copy and paste.

• For example, %run test.py will run the file test.py.



(You might find that the % is unnecessary — use %automagic to toggle the need for %)
Note that Jupyter only looks for test.py in the present working directory (PWD).
If test.py isn’t in that directory, you will get an error.
Let’s look at a successful example, where we run a file test.py with contents:
[2]:
for i in range(5):
    print('foobar')

foobar
foobar
foobar
foobar
foobar

Here’s the notebook (click to enlarge)

Here

• pwd asks Jupyter to show the PWD (or %pwd — see the comment about automagic above)

  – This is where Jupyter is going to look for files to run.
  – Your output will look a bit different depending on your OS.

• ls asks Jupyter to list files in the PWD (or %ls)

  – Note that test.py is there (on our computer, because we saved it there earlier).

• cat test.py asks Jupyter to print the contents of test.py (or !type test.py on Windows)

• run test.py runs the file and prints any output

2.6.3 But File X isn’t in my PWD!

If you’re trying to run a file not in the present working directory, you’ll get an error.
To fix this error you need to either

1. Shift the file into the PWD, or


2. Change the PWD to where the file lives

One way to achieve the first option is to use the Upload button

• The button is on the top level dashboard, where Jupyter first opened to
• Look where the pointer is in this picture

The second option can be achieved using the cd command

• On Windows it might look like this cd C:/Python27/Scripts/dir

• On Linux / OSX it might look like this cd /home/user/scripts/dir

Note: You can type the first letter or two of each directory name and then use the tab key to
expand.

2.6.4 Loading Files

It’s often convenient to be able to see your code before you run it.
In the following example, we execute load white_noise_plot.py where white_noise_plot.py is in the PWD.
(Use %load if automagic is off)
Now the code from the file appears in a cell ready to execute.

2.6.5 Saving Files

To save the contents of a cell as file foo.py

• put %%file foo.py as the first line of the cell
• Shift+Enter

Here %%file is an example of a cell magic.

2.7 Editors and IDEs

The preceding discussion covers most of what you need to know to interact with this website.

However, as you start to write longer programs, you might want to experiment with your
workflow.
There are many different options and we mention them only in passing.

2.7.1 JupyterLab

JupyterLab is an integrated development environment centered around Jupyter notebooks.


It is available through Anaconda and will soon be made the default environment for Jupyter
notebooks.
Reading the docs or searching for a recent YouTube video will give you more information.

2.7.2 Text Editors

A text editor is an application that is specifically designed to work with text files — such as Python programs.
Nothing beats the power and efficiency of a good text editor for working with program text.
A good text editor will provide

• efficient text editing commands (e.g., copy, paste, search and replace)
• syntax highlighting, etc.

Among the most popular are Sublime Text and Atom.


For a top quality open source text editor with a steeper learning curve, try Emacs.
If you want an outstanding free text editor and don’t mind a seemingly vertical learning
curve plus long days of pain and suffering while all your neural pathways are rewired, try
Vim.

2.7.3 Text Editors Plus IPython Shell

A text editor is for writing programs.


To run them you can continue to use Jupyter as described above.
Another option is to use the excellent IPython shell.
To use an IPython shell, open up a terminal and type ipython.
You should see something like this

The IPython shell has many of the features of the notebook: tab completion, color syntax,
etc.
It also has command history through the arrow key.
The up arrow key brings previously typed commands to the prompt.
This saves a lot of typing…
Here’s one set up, on a Linux box, with

• a file being edited in Vim

• an IPython shell next to it, to run the file

2.7.4 IDEs

IDEs are Integrated Development Environments, which allow you to edit, execute and interact with code from an integrated environment.
One of the most popular in recent times is VS Code, which is now available via Anaconda.
We hear good things about VS Code — please tell us about your experiences on the forum.

2.8 Exercises

2.8.1 Exercise 1

If Jupyter is still running, quit by using Ctrl-C at the terminal where you started it.
Now launch again, but this time using jupyter notebook --no-browser.
This should start the kernel without launching the browser.
Note also the startup message: It should give you a URL such as
http://localhost:8888 where the notebook is running.
Now

1. Start your browser — or open a new tab if it’s already running.


2. Enter the URL from above (e.g. http://localhost:8888) in the address bar at the
top.

You should now be able to run a standard Jupyter notebook session.


This is an alternative way to start the notebook that can also be handy.

2.8.2 Exercise 2

This exercise will familiarize you with git and GitHub.


Git is a version control system β€” a piece of software used to manage digital projects such as
code libraries.
In many cases, the associated collections of files β€” called repositories β€” are stored on
GitHub.
GitHub is a wonderland of collaborative coding projects.
For example, it hosts many of the scientific libraries we’ll be using later on, such as this one.
Git is the underlying software used to manage these projects.
Git is an extremely powerful tool for distributed collaboration β€” for example, we use it to
share and synchronize all the source files for these lectures.
There are two main flavors of Git

1. the plain vanilla command line Git version


2. the various point-and-click GUI versions

β€’ See, for example, the GitHub version

As an exercise, try

1. Installing Git.
2. Getting a copy of QuantEcon.py using Git.

For example, if you’ve installed the command line version, open up a terminal and enter

git clone https://github.com/QuantEcon/QuantEcon.py

(This is just git clone in front of the URL for the repository)
Even better,

1. Sign up to GitHub.
2. Look into β€˜forking’ GitHub repositories (forking means making your own copy of a
GitHub repository, stored on GitHub).
3. Fork QuantEcon.py.
4. Clone your fork to some local directory, make edits, commit them, and push them back
up to your forked GitHub repo.
5. If you made a valuable improvement, send us a pull request!

For reading on these and other topics, try

β€’ The official Git documentation.


β€’ Reading through the docs on GitHub.
β€’ Pro Git Book by Scott Chacon and Ben Straub.
β€’ One of the thousands of Git tutorials on the Net.
Chapter 3

An Introductory Example

3.1 Contents

β€’ Overview 3.2

β€’ The Task: Plotting a White Noise Process 3.3

β€’ Version 1 3.4

β€’ Alternative Versions 3.5

β€’ Exercises 3.6

β€’ Solutions 3.7

We’re now ready to start learning the Python language itself.


The level of this and the next few lectures will suit those with some basic knowledge of programming.
But don’t give up if you have noneβ€”you are not excluded.
You just need to cover a few of the fundamentals of programming before returning here.
Good references for first time programmers include:

β€’ The first 5 or 6 chapters of How to Think Like a Computer Scientist.


β€’ Automate the Boring Stuff with Python.
β€’ The start of Dive into Python 3.

Note: These references offer help on installing Python but you should probably stick with the
method on our set up page.
You’ll then have an outstanding scientific computing environment (Anaconda) and be ready
to move on to the rest of our course.

3.2 Overview

In this lecture, we will write and then pick apart small Python programs.


The objective is to introduce you to basic Python syntax and data structures.
Deeper concepts will be covered in later lectures.

3.2.1 Prerequisites

The lecture on getting started with Python.

3.3 The Task: Plotting a White Noise Process

Suppose we want to simulate and plot the white noise process πœ–_0, πœ–_1, …, πœ–_𝑇, where each draw πœ–_𝑑 is independent standard normal.
In other words, we want to generate figures that look something like this:

We’ll do this in several different ways.

3.4 Version 1

Here are a few lines of code that perform the task we set
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

x = np.random.randn(100)
plt.plot(x)
plt.show()

Let’s break this program down and see how it works.

3.4.1 Import Statements

The first two lines of the program import functionality.


The first line imports NumPy, a favorite Python package for tasks like

β€’ working with arrays (vectors and matrices)


β€’ common mathematical functions like cos and sqrt
β€’ generating random numbers
β€’ linear algebra, etc.

After import numpy as np we have access to these attributes via the syntax np.attribute_name.
Here’s another example
import numpy as np
np.sqrt(4)

2.0

We could also just write


import numpy
numpy.sqrt(4)

2.0

But the former method is convenient and more standard.


Why all the Imports?

Remember that Python is a general-purpose language.


The core language is quite small so it’s easy to learn and maintain.
When you want to do something interesting with Python, you almost always need to import
additional functionality.
Scientific work in Python is no exception.
Most of our programs start off with lines similar to the import statements seen above.
Packages
As stated above, NumPy is a Python package.
Packages are used by developers to organize a code library.
In fact, a package is just a directory containing

1. files with Python code β€” called modules in Python speak


2. possibly some compiled code that can be accessed by Python (e.g., functions compiled
from C or FORTRAN code)
3. a file called __init__.py that specifies what will be executed when we type import
package_name

In fact, you can find and explore the directory for NumPy on your computer easily enough if
you look around.
On this machine, it’s located in

anaconda3/lib/python3.6/site-packages/numpy

Subpackages
Consider the line x = np.random.randn(100).
Here np refers to the package NumPy, while random is a subpackage of NumPy.
You can see the contents here.
Subpackages are just packages that are subdirectories of another package.
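You can verify this structure from within Python itself. The following small check (our own addition, not from the lecture) confirms that both NumPy and its random subpackage are module objects:

```python
import numpy as np

# Both the package and its subpackage are module objects
print(type(np))         # <class 'module'>
print(type(np.random))  # <class 'module'>

# Functions such as randn live inside the subpackage
print(callable(np.random.randn))  # True
```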

3.4.2 Importing Names Directly

Recall this code that we saw above


import numpy as np
np.sqrt(4)

2.0

Here’s another way to access NumPy’s square root function


from numpy import sqrt
sqrt(4)

2.0

This is also fine.


The advantage is less typing if we use sqrt often in our code.
The disadvantage is that, in a long program, these two lines might be separated by many
other lines.
Then it’s harder for readers to know where sqrt came from, should they wish to.

3.5 Alternative Versions

Let’s try writing some alternative versions of our first program.


Our aim in doing this is to illustrate some more Python syntax and semantics.
The programs below are less efficient but

β€’ help us understand basic constructs like loops


β€’ illustrate common data types like lists

3.5.1 A Version with a For Loop

Here’s a version that illustrates loops and Python lists.


ts_length = 100
Ο΅_values = []  # Empty list

for i in range(ts_length):
    e = np.random.randn()
    Ο΅_values.append(e)

plt.plot(Ο΅_values)
plt.show()

In brief,

β€’ The first pair of lines import functionality as before.


β€’ The next line sets the desired length of the time series.
β€’ The next line creates an empty list called Ο΅_values that will store the πœ–_𝑑 values as we generate them.
β€’ The next three lines are the for loop, which repeatedly draws a new random number πœ–_𝑑 and appends it to the end of the list Ο΅_values.
β€’ The last two lines generate the plot and display it to the user.

Let’s study some parts of this program in more detail.

3.5.2 Lists

Consider the statement Ο΅_values = [], which creates an empty list.


Lists are a native Python data structure used to group a collection of objects.
For example, try
x = [10, 'foo', False]  # We can include heterogeneous data inside a list
type(x)

list

The first element of x is an integer, the next is a string and the third is a Boolean value.
When adding a value to a list, we can use the syntax list_name.append(some_value)
x

[10, 'foo', False]

x.append(2.5)
x

[10, 'foo', False, 2.5]

Here append() is what’s called a method, which is a function β€œattached to” an objectβ€”in
this case, the list x.
We’ll learn all about methods later on, but just to give you some idea,

β€’ Python objects such as lists, strings, etc. all have methods that are used to manipulate
the data contained in the object.
β€’ String objects have string methods, list objects have list methods, etc.

Another useful list method is pop()


x

[10, 'foo', False, 2.5]

x.pop()

2.5

x

[10, 'foo', False]

The full set of list methods can be found here.


Following C, C++, Java, etc., lists in Python are zero-based
x

[10, 'foo', False]

x[0]

10

x[1]

'foo'

3.5.3 The For Loop

Now let’s consider the for loop from the program above, which was
for i in range(ts_length):
    e = np.random.randn()
    Ο΅_values.append(e)

Python executes the two indented lines ts_length times before moving on.
These two lines are called a code block, since they comprise the β€œblock” of code that we
are looping over.
Unlike most other languages, Python knows the extent of the code block only from indentation.
In our program, indentation decreases after line Ο΅_values.append(e), telling Python that this line marks the lower limit of the code block.
More on indentation belowβ€”for now, let’s look at another example of a for loop
animals = ['dog', 'cat', 'bird']
for animal in animals:
    print("The plural of " + animal + " is " + animal + "s")

The plural of dog is dogs


The plural of cat is cats
The plural of bird is birds

This example helps to clarify how the for loop works: When we execute a loop of the form

for variable_name in sequence:
    <code block>

The Python interpreter performs the following:

β€’ For each element of the sequence, it β€œbinds” the name variable_name to that element and then executes the code block.

The sequence object can in fact be a very general object, as we’ll see soon enough.
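As a small extra illustration (not in the original program), a string is also a sequence, so it can be looped over directly:

```python
# Strings are iterable: the loop binds `letter` to each character in turn
for letter in "abc":
    print(letter)  # prints a, b, c on separate lines
```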

3.5.4 Code Blocks and Indentation

In discussing the for loop, we explained that the code blocks being looped over are delimited
by indentation.
In fact, in Python, all code blocks (i.e., those occurring inside loops, if clauses, function definitions, etc.) are delimited by indentation.
Thus, unlike most other languages, whitespace in Python code affects the output of the program.
Once you get used to it, this is a good thing: It

β€’ forces clean, consistent indentation, improving readability


β€’ removes clutter, such as the brackets or end statements used in other languages

On the other hand, it takes a bit of care to get right, so please remember:

β€’ The line before the start of a code block always ends in a colon

– for i in range(10):
– if x > y:
– while x < 100:
– etc., etc.

β€’ All lines in a code block must have the same amount of indentation.

β€’ The Python standard is 4 spaces, and that’s what you should use.

Tabs vs Spaces
One small β€œgotcha” here is the mixing of tabs and spaces, which often leads to errors.
(Important: Within text files, the internal representation of tabs and spaces is not the same)
You can use your Tab key to insert 4 spaces, but you need to make sure it’s configured to do
so.
If you are using a Jupyter notebook you will have no problems here.
Also, good text editors will allow you to configure the Tab key to insert spaces instead of tabs β€” try searching online.

3.5.5 While Loops

The for loop is the most common technique for iteration in Python.
But, for the purpose of illustration, let’s modify the program above to use a while loop instead.
ts_length = 100
Ο΅_values = []
i = 0
while i < ts_length:
    e = np.random.randn()
    Ο΅_values.append(e)
    i = i + 1
plt.plot(Ο΅_values)
plt.show()

Note that

β€’ the code block for the while loop is again delimited only by indentation
β€’ the statement i = i + 1 can be replaced by i += 1
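As an aside of ours, augmented assignment works for the other arithmetic operators as well:

```python
i = 0
i += 1   # same as i = i + 1
i *= 10  # same as i = i * 10
print(i)  # 10
```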

3.5.6 User-Defined Functions

Now let’s go back to the for loop, but restructure our program to make the logic clearer.
To this end, we will break our program into two parts:

1. A user-defined function that generates a list of random variables.

2. The main part of the program that

   1. calls this function to get data
   2. plots the data

This is accomplished in the next program


def generate_data(n):
    Ο΅_values = []
    for i in range(n):
        e = np.random.randn()
        Ο΅_values.append(e)
    return Ο΅_values

data = generate_data(100)
plt.plot(data)
plt.show()

Let’s go over this carefully, in case you’re not familiar with functions and how they work.
We have defined a function called generate_data() as follows

β€’ def is a Python keyword used to start function definitions.


β€’ def generate_data(n): indicates that the function is called generate_data and
that it has a single argument n.
β€’ The indented code is a code block called the function bodyβ€”in this case, it creates an IID list of random draws using the same logic as before.
β€’ The return keyword indicates that Ο΅_values is the object that should be returned to the calling code.

This whole function definition is read by the Python interpreter and stored in memory.
When the interpreter gets to the expression generate_data(100), it executes the function
body with n set equal to 100.
The net result is that the name data is bound to the list Ο΅_values returned by the function.

3.5.7 Conditions

Our function generate_data() is rather limited.


Let’s make it slightly more useful by giving it the ability to return either standard normals or
uniform random variables on (0, 1) as required.
This is achieved in the next piece of code.
def generate_data(n, generator_type):
    Ο΅_values = []
    for i in range(n):
        if generator_type == 'U':
            e = np.random.uniform(0, 1)
        else:
            e = np.random.randn()
        Ο΅_values.append(e)
    return Ο΅_values

data = generate_data(100, 'U')
plt.plot(data)
plt.show()

Hopefully, the syntax of the if/else clause is self-explanatory, with indentation again delimiting the extent of the code blocks.
Notes

β€’ We are passing the argument U as a string, which is why we write it as 'U'.

β€’ Notice that equality is tested with the == syntax, not =.

– For example, the statement a = 10 assigns the name a to the value 10.
– The expression a == 10 evaluates to either True or False, depending on the
value of a.

Now, there are several ways that we can simplify the code above.
For example, we can get rid of the conditionals altogether by just passing the desired generator type as a function.
To understand this, consider the following version.
def generate_data(n, generator_type):
    Ο΅_values = []
    for i in range(n):
        e = generator_type()
        Ο΅_values.append(e)
    return Ο΅_values

data = generate_data(100, np.random.uniform)
plt.plot(data)
plt.show()

Now, when we call the function generate_data(), we pass np.random.uniform as the second argument.
This object is a function.
When the function call generate_data(100, np.random.uniform) is executed, Python runs the function code block with n equal to 100 and the name generator_type β€œbound” to the function np.random.uniform.

β€’ While these lines are executed, the names generator_type and np.random.uniform are β€œsynonyms”, and can be used in identical ways.

This principle works more generallyβ€”for example, consider the following piece of code
max(7, 2, 4)  # max() is a built-in Python function

7

m = max
m(7, 2, 4)

7

Here we created another name for the built-in function max(), which could then be used in
identical ways.
In the context of our program, the ability to bind new names to functions means that there is
no problem passing a function as an argument to another functionβ€”as we did above.

3.5.8 List Comprehensions

We can also simplify the code for generating the list of random draws considerably by using
something called a list comprehension.
List comprehensions are an elegant Python tool for creating lists.
Consider the following example, where the list comprehension is on the right-hand side of the
second line
animals = ['dog', 'cat', 'bird']
plurals = [animal + 's' for animal in animals]
plurals

['dogs', 'cats', 'birds']

Here’s another example


range(8)

range(0, 8)

doubles = [2 * x for x in range(8)]
doubles

[0, 2, 4, 6, 8, 10, 12, 14]

With the list comprehension syntax, we can simplify the lines

Ο΅_values = []
for i in range(n):
    e = generator_type()
    Ο΅_values.append(e)

into

Ο΅_values = [generator_type() for i in range(n)]
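Putting this to work, the whole body of generate_data() collapses to a single line. (This rewrite, including the default argument for generator_type, is our own sketch rather than code from the lecture.)

```python
import numpy as np

def generate_data(n, generator_type=np.random.randn):
    # Build the list of n draws in one line with a comprehension
    return [generator_type() for i in range(n)]

data = generate_data(5)
print(len(data))  # 5
```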

3.6 Exercises

3.6.1 Exercise 1

Recall that 𝑛! is read as β€œπ‘› factorial” and defined as 𝑛! = 𝑛 Γ— (𝑛 βˆ’ 1) Γ— β‹― Γ— 2 Γ— 1.


There are functions to compute this in various modules, but let’s write our own version as an
exercise.
In particular, write a function factorial such that factorial(n) returns 𝑛! for any positive integer 𝑛.

3.6.2 Exercise 2

The binomial random variable π‘Œ ∼ 𝐡𝑖𝑛(𝑛, 𝑝) represents the number of successes in 𝑛 binary
trials, where each trial succeeds with probability 𝑝.

Without any import besides from numpy.random import uniform, write a function
binomial_rv such that binomial_rv(n, p) generates one draw of π‘Œ .
Hint: If π‘ˆ is uniform on (0, 1) and 𝑝 ∈ (0, 1), then the expression U < p evaluates to True
with probability 𝑝.

3.6.3 Exercise 3

Compute an approximation to πœ‹ using Monte Carlo. Use no imports besides


import numpy as np

Your hints are as follows:

β€’ If π‘ˆ is a bivariate uniform random variable on the unit square (0, 1)2 , then the proba-
bility that π‘ˆ lies in a subset 𝐡 of (0, 1)2 is equal to the area of 𝐡.
β€’ If π‘ˆ1 , … , π‘ˆπ‘› are IID copies of π‘ˆ , then, as 𝑛 gets large, the fraction that falls in 𝐡, con-
verges to the probability of landing in 𝐡.
β€’ For a circle, area = pi * radius^2.

3.6.4 Exercise 4

Write a program that prints one realization of the following random device:

β€’ Flip an unbiased coin 10 times.


β€’ If 3 consecutive heads occur one or more times within this sequence, pay one dollar.
β€’ If not, pay nothing.

Use no import besides from numpy.random import uniform.

3.6.5 Exercise 5

Your next task is to simulate and plot the correlated time series

π‘₯𝑑+1 = 𝛼 π‘₯𝑑 + πœ–π‘‘+1 where π‘₯0 = 0 and 𝑑 = 0, … , 𝑇

The sequence of shocks {πœ–π‘‘ } is assumed to be IID and standard normal.


In your solution, restrict your import statements to
import numpy as np
import matplotlib.pyplot as plt

Set 𝑇 = 200 and 𝛼 = 0.9.

3.6.6 Exercise 6

To do the next exercise, you will need to know how to produce a plot legend.
The following example should be sufficient to convey the idea

import numpy as np
import matplotlib.pyplot as plt

x = [np.random.randn() for i in range(100)]

plt.plot(x, label="white noise")
plt.legend()
plt.show()

Now, starting with your solution to exercise 5, plot three simulated time series, one for each
of the cases 𝛼 = 0, 𝛼 = 0.8 and 𝛼 = 0.98.
In particular, you should produce (modulo randomness) a figure that looks as follows

(The figure nicely illustrates how time series with the same one-step-ahead conditional volatilities, as these three processes have, can have very different unconditional volatilities.)
Use a for loop to step through the 𝛼 values.
Important hints:

β€’ If you call the plot() function multiple times before calling show(), all of the lines
you produce will end up on the same figure.

– And if you omit the argument 'b-' to the plot function, Matplotlib will automatically select different colors for each line.

β€’ The expression 'foo' + str(42) evaluates to 'foo42'.

3.7 Solutions

3.7.1 Exercise 1

def factorial(n):
    k = 1
    for i in range(n):
        k = k * (i + 1)
    return k

factorial(4)

24

3.7.2 Exercise 2
from numpy.random import uniform

def binomial_rv(n, p):
    count = 0
    for i in range(n):
        U = uniform()
        if U < p:
            count = count + 1  # Or count += 1
    return count

binomial_rv(10, 0.5)

5

3.7.3 Exercise 3

Consider the circle of diameter 1 embedded in the unit square.


Let 𝐴 be its area and let π‘Ÿ = 1/2 be its radius.
If we know πœ‹ then we can compute 𝐴 via 𝐴 = πœ‹π‘ŸΒ².
But here the point is to compute πœ‹, which we can do by πœ‹ = 𝐴/π‘ŸΒ².
Summary: If we can estimate the area of this circle, then dividing by π‘ŸΒ² = (1/2)Β² = 1/4 gives an estimate of πœ‹.
We estimate the area by sampling bivariate uniforms and looking at the fraction that falls
into the unit circle
n = 100000

count = 0
for i in range(n):
    u, v = np.random.uniform(), np.random.uniform()
    d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)
    if d < 0.5:
        count += 1

area_estimate = count / n

print(area_estimate * 4)  # dividing by radius**2

3.1448

3.7.4 Exercise 4
from numpy.random import uniform

payoff = 0
count = 0

for i in range(10):
    U = uniform()
    count = count + 1 if U < 0.5 else 0
    if count == 3:
        payoff = 1

print(payoff)

1

3.7.5 Exercise 5

Here’s one solution
Ξ± = 0.9
ts_length = 200
current_x = 0

x_values = []
for i in range(ts_length + 1):
    x_values.append(current_x)
    current_x = Ξ± * current_x + np.random.randn()
plt.plot(x_values)
plt.show()

3.7.6 Exercise 6

Ξ±s = [0.0, 0.8, 0.98]
ts_length = 200

for Ξ± in Ξ±s:
    x_values = []
    current_x = 0
    for i in range(ts_length):
        x_values.append(current_x)
        current_x = Ξ± * current_x + np.random.randn()
    plt.plot(x_values, label=f'Ξ± = {Ξ±}')
plt.legend()
plt.show()
Chapter 4

Python Essentials

4.1 Contents

β€’ Data Types 4.2


β€’ Input and Output 4.3
β€’ Iterating 4.4
β€’ Comparisons and Logical Operators 4.5
β€’ More Functions 4.6
β€’ Coding Style and PEP8 4.7
β€’ Exercises 4.8
β€’ Solutions 4.9

In this lecture, we’ll cover features of the language that are essential to reading and writing
Python code.

4.2 Data Types

We’ve already met several built-in Python data types, such as strings, integers, floats and
lists.
Let’s learn a bit more about them.

4.2.1 Primitive Data Types

One simple data type is Boolean values, which can be either True or False
x = True
x

True

In the next line of code, the interpreter evaluates the expression on the right of = and binds y
to this value


y = 100 < 10
y

False

type(y)

bool

In arithmetic expressions, True is converted to 1 and False is converted to 0.


This is called Boolean arithmetic and is often useful in programming.
Here are some examples
x + y

1

x * y

0

True + True

2

bools = [True, True, False, True]  # List of Boolean values
sum(bools)

3

The two most common data types used to represent numbers are integers and floats
a, b = 1, 2
c, d = 2.5, 10.0
type(a)

int

type(c)

float

Computers distinguish between the two because, while floats are more informative, arithmetic
operations on integers are faster and more accurate.
As long as you’re using Python 3.x, division of integers yields floats
1 / 2

0.5

But be careful! If you’re still using Python 2.x, division of two integers returns only the integer part.
For integer division in Python 3.x use this syntax:

1 // 2

0

Complex numbers are another primitive data type in Python



x = complex(1, 2)
y = complex(2, 1)
x * y

5j

4.2.2 Containers

Python has several basic types for storing collections of (possibly heterogeneous) data.
We’ve already discussed lists.
A related data type is tuples, which are β€œimmutable” lists
x = ('a', 'b')  # Parentheses instead of the square brackets
x = 'a', 'b'    # Or no brackets --- the meaning is identical
x

('a', 'b')

type(x)

tuple

In Python, an object is called immutable if, once created, the object cannot be changed.
Conversely, an object is mutable if it can still be altered after creation.
Python lists are mutable
x = [1, 2]
x[0] = 10
x

[10, 2]

But tuples are not


x = (1, 2)
x[0] = 10

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-d1b2647f6c81> in <module>
      1 x = (1, 2)
----> 2 x[0] = 10

TypeError: 'tuple' object does not support item assignment

We’ll say more about the role of mutable and immutable data a bit later.
Tuples (and lists) can be β€œunpacked” as follows
integers = (10, 20, 30)
x, y, z = integers
x

10

y

20

You’ve actually seen an example of this already.


Tuple unpacking is convenient and we’ll use it often.
Slice Notation
To access multiple elements of a list or tuple, you can use Python’s slice notation.
For example,
a = [2, 4, 6, 8]
a[1:]

[4, 6, 8]

a[1:3]

[4, 6]
[20]:

The general rule is that a[m:n] returns n - m elements, starting at a[m].


Negative numbers are also permissible
a[-2:]  # Last two elements of the list

[6, 8]

The same slice notation works on tuples and strings


s = 'foobar'
s[-3:]  # Select the last three elements

'bar'

Sets and Dictionaries


Two other container types we should mention before moving on are sets and dictionaries.
Dictionaries are much like lists, except that the items are named instead of numbered
d = {'name': 'Frodo', 'age': 33}
type(d)

dict

d['age']

33

The names 'name' and 'age' are called the keys.


The objects that the keys are mapped to ('Frodo' and 33) are called the values.
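Continuing this example, the keys() and values() methods retrieve each collection (shown here as lists):

```python
d = {'name': 'Frodo', 'age': 33}

# keys() and values() return view objects; list() materializes them
print(list(d.keys()))    # ['name', 'age']
print(list(d.values()))  # ['Frodo', 33]
```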
Sets are unordered collections without duplicates, and set methods provide the usual set-theoretic operations
s1 = {'a', 'b'}
type(s1)

set

s2 = {'b', 'c'}
s1.issubset(s2)

False

s1.intersection(s2)

{'b'}

The set() function creates sets from sequences


s3 = set(('foo', 'bar', 'foo'))
s3

{'bar', 'foo'}

4.3 Input and Output

Let’s briefly review reading and writing to text files, starting with writing
f = open('newfile.txt', 'w')  # Open 'newfile.txt' for writing
f.write('Testing\n')          # Here '\n' means new line
f.write('Testing again')
f.close()

Here

β€’ The built-in function open() creates a file object for writing to.
β€’ Both write() and close() are methods of file objects.

Where is this file that we’ve created?


Recall that Python maintains a concept of the present working directory (pwd) that can be located from within Jupyter or IPython via

%pwd

'/home/ubuntu/repos/lecture-source-py/_build/pdf/jupyter/executed'

If a path is not specified, then this is where Python writes to.


We can also use Python to read the contents of newfile.txt as follows
f = open('newfile.txt', 'r')
out = f.read()
out

'Testing\nTesting again'

print(out)

Testing
Testing again

4.3.1 Paths

Note that if newfile.txt is not in the present working directory then this call to open()
fails.

In this case, you can shift the file to the pwd or specify the full path to the file

f = open('insert_full_path_to_file/newfile.txt', 'r')
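A portable way to build such a path (our own addition, using only the standard library) is with the os module:

```python
import os

# Join the present working directory with the file name
path = os.path.join(os.getcwd(), 'newfile.txt')
print(path)  # an absolute path ending in newfile.txt
```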

4.4 Iterating

One of the most important tasks in computing is stepping through a sequence of data and
performing a given action.
One of Python’s strengths is its simple, flexible interface to this kind of iteration via the for
loop.

4.4.1 Looping over Different Objects

Many Python objects are β€œiterable”, in the sense that they can be looped over.
To give an example, let’s write the file us_cities.txt, which lists US cities and their population, to the present working directory.
%%file us_cities.txt
new york: 8244910
los angeles: 3819702
chicago: 2707120
houston: 2145146
philadelphia: 1536471
phoenix: 1469471
san antonio: 1359758
san diego: 1326179
dallas: 1223229

Overwriting us_cities.txt

Suppose that we want to make the information more readable, by capitalizing names and
adding commas to mark thousands.
The program below reads the data in and makes the conversion:
data_file = open('us_cities.txt', 'r')
for line in data_file:
    city, population = line.split(':')   # Tuple unpacking
    city = city.title()                  # Capitalize city names
    population = f'{int(population):,}'  # Add commas to numbers
    print(city.ljust(15) + population)
data_file.close()

New York 8,244,910


Los Angeles 3,819,702
Chicago 2,707,120
Houston 2,145,146
Philadelphia 1,536,471
Phoenix 1,469,471
San Antonio 1,359,758
San Diego 1,326,179
Dallas 1,223,229

Here the f-string f'{int(population):,}' is used for inserting the formatted number into a string.
The reformatting of each line is the result of three different string operations, the details of which can be left till later.
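For instance, the three string operations used above can each be tried in isolation:

```python
# title() capitalizes each word
print('new york'.title())  # New York

# The ',' format specifier groups digits in thousands
print(f'{8244910:,}')      # 8,244,910

# ljust(15) pads with spaces on the right to width 15
print('New York'.ljust(15) + '|')
```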

The interesting part of this program for us is line 2, which shows that

1. The file object data_file is iterable, in the sense that it can be placed to the right of in within a for loop.
2. Iteration steps through each line in the file.

This leads to the clean, convenient syntax shown in our program.


Many other kinds of objects are iterable, and we’ll discuss some of them later on.

4.4.2 Looping without Indices

One thing you might have noticed is that Python tends to favor looping without explicit indexing.
For example,
x_values = [1, 2, 3]  # Some iterable x
for x in x_values:
    print(x * x)

1
4
9

is preferred to
for i in range(len(x_values)):
    print(x_values[i] * x_values[i])

1
4
9

When you compare these two alternatives, you can see why the first one is preferred.
Python provides some facilities to simplify looping without indices.
One is zip(), which is used for stepping through pairs from two sequences.
For example, try running the following code
countries = ('Japan', 'Korea', 'China')
cities = ('Tokyo', 'Seoul', 'Beijing')
for country, city in zip(countries, cities):
    print(f'The capital of {country} is {city}')

The capital of Japan is Tokyo


The capital of Korea is Seoul
The capital of China is Beijing

The zip() function is also useful for creating dictionaries β€” for example
names = ['Tom', 'John']
marks = ['E', 'F']
dict(zip(names, marks))

{'Tom': 'E', 'John': 'F'}

If we actually need the index from a list, one option is to use enumerate().
To understand what enumerate() does, consider the following example
letter_list = ['a', 'b', 'c']
for index, letter in enumerate(letter_list):
    print(f"letter_list[{index}] = '{letter}'")

letter_list[0] = 'a'
letter_list[1] = 'b'
letter_list[2] = 'c'


4.5 Comparisons and Logical Operators

4.5.1 Comparisons

Many different kinds of expressions evaluate to one of the Boolean values (i.e., True or
False).
A common type is comparisons, such as
x, y = 1, 2
x < y

True

x > y

False

One of the nice features of Python is that we can chain inequalities


1 < 2 < 3

True

1 <= 2 <= 3

True

As we saw earlier, when testing for equality we use ==


x = 1   # Assignment
x == 2  # Comparison

False

For β€œnot equal” use !=


1 != 2

True

Note that when testing conditions, we can use any valid Python expression
x = 'yes' if 42 else 'no'
x

'yes'

x = 'yes' if [] else 'no'
x

'no'

What’s going on here?


The rule is:

β€’ Expressions that evaluate to zero, empty sequences or containers (strings, lists, etc.)
and None are all equivalent to False.

– for example, [] and () are equivalent to False in an if clause

β€’ All other values are equivalent to True.

– for example, 42 is equivalent to True in an if clause
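The built-in bool() function makes this rule easy to check directly:

```python
# Zero, empty containers and None are all "falsy"
print(bool(0), bool([]), bool(''), bool(None))  # False False False False

# Everything else is "truthy"
print(bool(42), bool('no'), bool([0]))          # True True True
```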

4.5.2 Combining Expressions

We can combine expressions using and, or and not.


These are the standard logical connectives (conjunction, disjunction and denial)
[49]: 1 < 2 and 'f' in 'foo'

[49]: True

[50]: 1 < 2 and 'g' in 'foo'

[50]: False

[51]: 1 < 2 or 'g' in 'foo'

[51]: True

[52]: not True

[52]: False

[53]: not not True

[53]: True

Remember

β€’ P and Q is True if both are True, else False


β€’ P or Q is False if both are False, else True
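A detail worth knowing (an aside, not covered above): and and or short-circuit, evaluating their second operand only when necessary, and they return one of the operands rather than a strict Boolean:

```python
x = 0 or 'default'    # 0 is falsy, so `or` moves on and returns 'default'
y = 42 and 'ok'       # 42 is truthy, so `and` returns the second operand
safe = False and 1/0  # the division is never evaluated, so no error is raised
print(x, y, safe)
```

This is why idioms like `value = supplied or fallback` work in Python.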

4.6 More Functions

Let’s talk a bit more about functions, which are all important for good programming style.
Python has a number of built-in functions that are available without import.
We have already met some
[54]: max(19, 20)

[54]: 20

[55]: range(4)  # in python3 this returns a range iterator object

[55]: range(0, 4)

[56]: list(range(4))  # will evaluate the range iterator and create a list

[56]: [0, 1, 2, 3]

[57]: str(22)

[57]: '22'

[58]: type(22)

[58]: int

Two more useful built-in functions are any() and all()

[59]: bools = False, True, True
      all(bools)  # True if all are True and False otherwise

[59]: False

[60]: any(bools)  # False if all are False and True otherwise

[60]: True

The full list of Python built-ins is here.


Now let’s talk some more about user-defined functions constructed using the keyword def.

4.6.1 Why Write Functions?

User-defined functions are important for improving the clarity of your code by

β€’ separating different strands of logic


β€’ facilitating code reuse

(Writing the same thing twice is almost always a bad idea)


The basics of user-defined functions were discussed here.

4.6.2 The Flexibility of Python Functions

As we discussed in the previous lecture, Python functions are very flexible.


In particular

β€’ Any number of functions can be defined in a given file.


β€’ Functions can be (and often are) defined inside other functions.
β€’ Any object can be passed to a function as an argument, including other functions.
β€’ A function can return any kind of object, including functions.

We already gave an example of how straightforward it is to pass a function to a function.


Note that a function can have arbitrarily many return statements (including zero).
Execution of the function terminates when the first return is hit, allowing code like the following example

[61]: def f(x):
          if x < 0:
              return 'negative'
          return 'nonnegative'

Functions without a return statement automatically return the special Python object None.
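For example, a function whose body consists only of a print call returns None, which we can verify directly:

```python
def greet(name):
    print(f"Hello, {name}")  # no return statement anywhere in the body

result = greet("world")
print(result is None)  # the call's value is the special object None
```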

4.6.3 Docstrings

Python has a system for adding comments to functions, modules, etc. called docstrings.
The nice thing about docstrings is that they are available at run-time.
Try running this
[62]: def f(x):
          """
          This function squares its argument
          """
          return x**2

After running this code, the docstring is available


[63]: f?

Type: function
String Form:<function f at 0x2223320>
File: /home/john/temp/temp.py
Definition: f(x)
Docstring: This function squares its argument

[64]: f??

Type: function
String Form:<function f at 0x2223320>
File: /home/john/temp/temp.py
Definition: f(x)
Source:
def f(x):
"""
This function squares its argument
"""
return x**2

With one question mark we bring up the docstring, and with two we get the source code as
well.
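The docstring can also be read directly via the __doc__ attribute, which is what these help utilities use under the hood:

```python
def f(x):
    """
    This function squares its argument
    """
    return x**2

print(f.__doc__)  # the raw docstring, including its surrounding whitespace
```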

4.6.4 One-Line Functions: lambda

The lambda keyword is used to create simple functions on one line.


For example, the definitions
[65]: def f(x):
          return x**3

and

[66]: f = lambda x: x**3

are entirely equivalent.


To see why lambda is useful, suppose that we want to calculate ∫₀² x³ dx (and have forgotten our high-school calculus).
The SciPy library has a function called quad that will do this calculation for us.
The syntax of the quad function is quad(f, a, b) where f is a function and a and b are numbers.
To create the function f(x) = x³ we can use lambda as follows

[67]: from scipy.integrate import quad

      quad(lambda x: x**3, 0, 2)

(4.0, 4.440892098500626e-14)
[67]:

Here the function created by lambda is said to be anonymous because it was never given a
name.
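Another common use of anonymous functions (a small additional example of ours) is as the key argument to sorting routines:

```python
# Sort names by their final letter, using an anonymous key function
names = ['Tom', 'John', 'Sarah']
by_last_letter = sorted(names, key=lambda s: s[-1])
print(by_last_letter)  # 'h' < 'm' < 'n', so Sarah comes first
```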

4.6.5 Keyword Arguments

If you did the exercises in the previous lecture, you would have come across the statement

plt.plot(x, 'b-', label="white noise")

In this call to Matplotlib’s plot function, notice that the last argument is passed in
name=argument syntax.
This is called a keyword argument, with label being the keyword.
Non-keyword arguments are called positional arguments, since their meaning is determined by
order

β€’ plot(x, 'b-', label="white noise") is different from plot('b-', x, label="white noise")

Keyword arguments are particularly useful when a function has a lot of arguments, in which
case it’s hard to remember the right order.

You can adopt keyword arguments in user-defined functions with no difficulty.


The next example illustrates the syntax
[68]: def f(x, a=1, b=1):
          return a + b * x

The keyword argument values we supplied in the definition of f become the default values

[69]: f(2)

[69]: 3

They can be modified as follows

[70]: f(2, a=4, b=5)

[70]: 14
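One caveat worth flagging (an aside, not part of the lecture): default values are evaluated once, at definition time, so a mutable default such as a list is shared across calls:

```python
def append_bad(x, xs=[]):      # pitfall: this one list is reused on every call
    xs.append(x)
    return xs

def append_good(x, xs=None):   # conventional fix: use None as a sentinel
    if xs is None:
        xs = []                # a fresh list is created on each call
    xs.append(x)
    return xs

append_bad(1)
print(append_bad(2))                   # the shared default still holds the 1
print(append_good(1), append_good(2))  # each call starts from an empty list
```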

4.7 Coding Style and PEP8

To learn more about the Python programming philosophy type import this at the
prompt.
Among other things, Python strongly favors consistency in programming style.
We’ve all heard the saying about consistency and little minds.
In programming, as in mathematics, the opposite is true

β€’ A mathematical paper where the symbols βˆͺ and ∩ were reversed would be very hard to
read, even if the author told you so on the first page.

In Python, the standard style is set out in PEP8.


(Occasionally we’ll deviate from PEP8 in these lectures to better match mathematical notation)

4.8 Exercises

Solve the following exercises.


(For some, the built-in function sum() comes in handy).

4.8.1 Exercise 1

Part 1: Given two numeric lists or tuples x_vals and y_vals of equal length, compute their
inner product using zip().
Part 2: In one line, count the number of even numbers in 0,…,99.

β€’ Hint: x % 2 returns 0 if x is even, 1 otherwise.

Part 3: Given pairs = ((2, 5), (4, 2), (9, 8), (12, 10)), count the number of
pairs (a, b) such that both a and b are even.

4.8.2 Exercise 2

Consider the polynomial

    p(x) = a_0 + a_1 x + a_2 x^2 + ⋯ + a_n x^n = βˆ‘_{i=0}^{n} a_i x^i        (1)

Write a function p such that p(x, coeff) computes the value in Eq. (1) given a point x and a list of coefficients coeff.
Try to use enumerate() in your loop.

4.8.3 Exercise 3

Write a function that takes a string as an argument and returns the number of capital letters
in the string.
Hint: 'foo'.upper() returns 'FOO'.

4.8.4 Exercise 4

Write a function that takes two sequences seq_a and seq_b as arguments and returns True
if every element in seq_a is also an element of seq_b, else False.

β€’ By β€œsequence” we mean a list, a tuple or a string.


β€’ Do the exercise without using sets and set methods.

4.8.5 Exercise 5

When we cover the numerical libraries, we will see they include many alternatives for interpolation and function approximation.
Nevertheless, let’s write our own function approximation routine as an exercise.
In particular, without using any imports, write a function linapprox that takes as arguments

β€’ A function f mapping some interval [π‘Ž, 𝑏] into R.


β€’ Two scalars a and b providing the limits of this interval.
β€’ An integer n determining the number of grid points.
β€’ A number x satisfying a <= x <= b.

and returns the piecewise linear interpolation of f at x, based on n evenly spaced grid points
a = point[0] < point[1] < ... < point[n-1] = b.
Aim for clarity, not efficiency.

4.9 Solutions

4.9.1 Exercise 1

Part 1 Solution:
Here’s one possible solution

[71]: x_vals = [1, 2, 3]
      y_vals = [1, 1, 1]
      sum([x * y for x, y in zip(x_vals, y_vals)])

[71]: 6

This also works

[72]: sum(x * y for x, y in zip(x_vals, y_vals))

[72]: 6

Part 2 Solution:
One solution is

[73]: sum([x % 2 == 0 for x in range(100)])

[73]: 50

This also works:

[74]: sum(x % 2 == 0 for x in range(100))

[74]: 50

Some less natural alternatives that nonetheless help to illustrate the flexibility of list comprehensions are

[75]: len([x for x in range(100) if x % 2 == 0])

[75]: 50

and

[76]: sum([1 for x in range(100) if x % 2 == 0])

[76]: 50

Part 3 Solution
Here’s one possibility

[77]: pairs = ((2, 5), (4, 2), (9, 8), (12, 10))
      sum([x % 2 == 0 and y % 2 == 0 for x, y in pairs])

[77]: 2

4.9.2 Exercise 2
[78]: def p(x, coeff):
          return sum(a * x**i for i, a in enumerate(coeff))

[79]: p(1, (2, 4))

[79]: 6

4.9.3 Exercise 3

Here’s one solution:

[80]: def f(string):
          count = 0
          for letter in string:
              if letter == letter.upper() and letter.isalpha():
                  count += 1
          return count

      f('The Rain in Spain')

[80]: 3

An alternative, more pythonic solution:

[81]: def count_uppercase_chars(s):
          return sum([c.isupper() for c in s])

      count_uppercase_chars('The Rain in Spain')

[81]: 3

4.9.4 Exercise 4

Here’s a solution:

[82]: def f(seq_a, seq_b):
          is_subset = True
          for a in seq_a:
              if a not in seq_b:
                  is_subset = False
          return is_subset

      # == test == #
      print(f([1, 2], [1, 2, 3]))
      print(f([1, 2, 3], [1, 2]))

True
False

Of course, if we use the set data type then the solution is easier

[83]: def f(seq_a, seq_b):
          return set(seq_a).issubset(set(seq_b))
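Equivalently (a small addition to the solution above), the comparison operators are overloaded for sets, so <= tests the subset relation directly:

```python
def f(seq_a, seq_b):
    return set(seq_a) <= set(seq_b)   # <= means "is a subset of" for sets

print(f([1, 2], [1, 2, 3]), f([1, 2, 3], [1, 2]))
```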

4.9.5 Exercise 5
[84]: def linapprox(f, a, b, n, x):
          """
          Evaluates the piecewise linear interpolant of f at x on the interval
          [a, b], with n evenly spaced grid points.

          Parameters
          ==========
          f : function
              The function to approximate

          x, a, b : scalars (floats or integers)
              Evaluation point and endpoints, with a <= x <= b

          n : integer
              Number of grid points

          Returns
          =======
          A float. The interpolant evaluated at x

          """
          length_of_interval = b - a
          num_subintervals = n - 1
          step = length_of_interval / num_subintervals

          # === find first grid point larger than x === #
          point = a
          while point <= x:
              point += step

          # === x must lie between the gridpoints (point - step) and point === #
          u, v = point - step, point

          return f(u) + (x - u) * (f(v) - f(u)) / (v - u)


Chapter 5

OOP I: Introduction to Object Oriented Programming

5.1 Contents

β€’ Overview 5.2

β€’ Objects 5.3

β€’ Summary 5.4

5.2 Overview

OOP is one of the major paradigms in programming.


The traditional programming paradigm (think Fortran, C, MATLAB, etc.) is called procedural.
It works as follows

β€’ The program has a state corresponding to the values of its variables.


β€’ Functions are called to act on these data.
β€’ Data are passed back and forth via function calls.

In contrast, in the OOP paradigm

β€’ data and functions are β€œbundled together” into β€œobjects”

(Functions in this context are referred to as methods)

5.2.1 Python and OOP

Python is a pragmatic language that blends object-oriented and procedural styles, rather than
taking a purist approach.
However, at a foundational level, Python is object-oriented.


In particular, in Python, everything is an object.


In this lecture, we explain what that statement means and why it matters.

5.3 Objects

In Python, an object is a collection of data and instructions held in computer memory that
consists of

1. a type
2. a unique identity
3. data (i.e., content)
4. methods

These concepts are defined and discussed sequentially below.

5.3.1 Type

Python provides for different types of objects, to accommodate different categories of data.
For example
[1]: s = 'This is a string'
     type(s)

[1]: str

[2]: x = 42  # Now let's create an integer
     type(x)

[2]: int

The type of an object matters for many expressions.


For example, the addition operator between two strings means concatenation
[3]: '300' + 'cc'

[3]: '300cc'

On the other hand, between two numbers it means ordinary addition

[4]: 300 + 400

[4]: 700

Consider the following expression

[5]: '300' + 400

---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

<ipython-input-5-263a89d2d982> in <module>
----> 1 '300' + 400

TypeError: can only concatenate str (not "int") to str

Here we are mixing types, and it’s unclear to Python whether the user wants to

β€’ convert '300' to an integer and then add it to 400, or


β€’ convert 400 to string and then concatenate it with '300'

Some languages might try to guess but Python is strongly typed

β€’ Type is important, and implicit type conversion is rare.


β€’ Python will respond instead by raising a TypeError.

To avoid the error, you need to clarify by changing the relevant type.
For example,

[6]: int('300') + 400  # To add as numbers, change the string to an integer

[6]: 700

5.3.2 Identity

In Python, each object has a unique identifier, which helps Python (and us) keep track of the object.
The identity of an object can be obtained via the id() function

[7]: y = 2.5
     z = 2.5
     id(y)

[7]: 140473280865624

[8]: id(z)

[8]: 140473280865648

In this example, y and z happen to have the same value (i.e., 2.5), but they are not the
same object.
The identity of an object is in fact just the address of the object in memory.
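The distinction between identity and value can be checked with the is operator, which compares identities, versus ==, which compares values. (A small sketch of ours; note that whether two equal objects share an identity can depend on the Python implementation, so we construct one value at run-time to keep the objects distinct in CPython.)

```python
y = 2.5
z = float('2.5')   # built at run-time, so a distinct object from y in CPython
print(y == z)      # True: the values are equal
print(y is z)      # False: the identities differ
print(id(y) == id(z))  # asks the same question as `y is z`
```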

5.3.3 Object Content: Data and Attributes

If we set x = 42 then we create an object of type int that contains the data 42.
In fact, it contains more, as the following example shows
[9]: x = 42
     x

[9]: 42

[10]: x.imag

[10]: 0

[11]: x.__class__

[11]: int

When Python creates this integer object, it stores with it various auxiliary information, such
as the imaginary part, and the type.
Any name following a dot is called an attribute of the object to the left of the dot.

β€’ e.g., imag and __class__ are attributes of x.

We see from this example that objects have attributes that contain auxiliary information.
They also have attributes that act like functions, called methods.
These attributes are important, so let’s discuss them in-depth.

5.3.4 Methods

Methods are functions that are bundled with objects.


Formally, methods are attributes of objects that are callable (i.e., can be called as functions)
[12]: x = ['foo', 'bar']
      callable(x.append)

[12]: True

[13]: callable(x.__doc__)

[13]: False
[13]:

Methods typically act on the data contained in the object they belong to, or combine that
data with other data
[14]: x = ['a', 'b']
      x.append('c')
      s = 'This is a string'
      s.upper()

[14]: 'THIS IS A STRING'

[15]: s.lower()

[15]: 'this is a string'

[16]: s.replace('This', 'That')

[16]: 'That is a string'

A great deal of Python functionality is organized around method calls.


For example, consider the following piece of code
[17]: x = ['a', 'b']
      x[0] = 'aa'  # Item assignment using square bracket notation
      x

[17]: ['aa', 'b']

It doesn’t look like there are any methods used here, but in fact the square bracket assignment notation is just a convenient interface to a method call.
What actually happens is that Python calls the __setitem__ method, as follows

[18]: x = ['a', 'b']
      x.__setitem__(0, 'aa')  # Equivalent to x[0] = 'aa'
      x

[18]: ['aa', 'b']

(If you wanted to you could modify the __setitem__ method, so that square bracket assignment does something totally different)
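To illustrate that last remark, here is a toy subclass (our own example, not from the lecture) whose square-bracket assignment quietly upper-cases whatever is stored:

```python
class ShoutyList(list):
    """A toy list whose square-bracket assignment upper-cases values."""
    def __setitem__(self, index, value):
        # Intercept x[i] = value and store an upper-cased version instead
        super().__setitem__(index, value.upper())

x = ShoutyList(['a', 'b'])
x[0] = 'aa'   # triggers ShoutyList.__setitem__
print(x)      # the stored value is 'AA', not 'aa'
```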

5.4 Summary

In Python, everything in memory is treated as an object.


This includes not just lists, strings, etc., but also less obvious things, such as

β€’ functions (once they have been read into memory)


β€’ modules (ditto)
β€’ files opened for reading or writing
β€’ integers, etc.

Consider, for example, functions.

When Python reads a function definition, it creates a function object and stores it in memory.
The following code illustrates

[19]: def f(x): return x**2
      f

[19]: <function __main__.f(x)>

[20]: type(f)

[20]: function

[21]: id(f)

[21]: 140473061744848

[22]: f.__name__

[22]: 'f'

We can see that f has type, identity, attributes and so onβ€”just like any other object.
It also has methods.
One example is the __call__ method, which just evaluates the function

[23]: f.__call__(3)

[23]: 9

Another is the __dir__ method, which returns a list of attributes.
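For instance (continuing the example, with f redefined here so the snippet is self-contained), the built-in dir() returns a sorted version of what __dir__ produces:

```python
def f(x): return x**2

attrs = dir(f)   # a sorted version of the names returned by f.__dir__()
print('__call__' in attrs, '__name__' in attrs, '__doc__' in attrs)
```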



Modules loaded into memory are also treated as objects

[24]: import math

      id(math)

[24]: 140473369783576

This uniform treatment of data in Python (everything is an object) helps keep the language
simple and consistent.
Part II

The Scientific Libraries

Chapter 6

NumPy

6.1 Contents

β€’ Overview 6.2

β€’ Introduction to NumPy 6.3

β€’ NumPy Arrays 6.4

β€’ Operations on Arrays 6.5

β€’ Additional Functionality 6.6

β€’ Exercises 6.7

β€’ Solutions 6.8

β€œLet’s be clear: the work of science has nothing whatever to do with consensus.
Consensus is the business of politics. Science, on the contrary, requires only one
investigator who happens to be right, which means that he or she has results that
are verifiable by reference to the real world. In science consensus is irrelevant.
What is relevant is reproducible results.” – Michael Crichton

6.2 Overview

NumPy is a first-rate library for numerical programming

β€’ Widely used in academia, finance and industry.


β€’ Mature, fast, stable and under continuous development.

In this lecture, we introduce NumPy arrays and the fundamental array processing operations
provided by NumPy.

6.2.1 References

β€’ The official NumPy documentation.


6.3 Introduction to NumPy

The essential problem that NumPy solves is fast array processing.


For example, suppose we want to create an array of 1 million random draws from a uniform distribution and compute the mean.
If we did this in pure Python it would be orders of magnitude slower than C or Fortran.
This is because

β€’ Loops in Python over Python data types like lists carry significant overhead.
β€’ C and Fortran code contains a lot of type information that can be used for optimization.
β€’ Various optimizations can be carried out during compilation when the compiler sees the instructions as a whole.

However, for a task like the one described above, there’s no need to switch back to C or Fortran.
Instead, we can use NumPy, where the instructions look like this:
[1]: import numpy as np

     x = np.random.uniform(0, 1, size=1000000)
     x.mean()

[1]: 0.49990572835210734

The operations of creating the array and computing its mean are both passed out to carefully
optimized machine code compiled from C.
More generally, NumPy sends operations in batches to optimized C and Fortran code.
This is similar in spirit to Matlab, which provides an interface to fast Fortran routines.
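As a rough illustration of the speed difference (our own timing sketch; exact numbers depend on your machine), we can time a pure Python loop against the NumPy equivalent:

```python
import numpy as np
from timeit import timeit

x = np.random.uniform(0, 1, size=1_000_000)
x_list = list(x)  # the same data as a plain Python list

# Average each version over 10 repetitions
t_python = timeit(lambda: sum(x_list) / len(x_list), number=10)
t_numpy = timeit(lambda: x.mean(), number=10)
print(f"pure Python: {t_python:.4f}s  NumPy: {t_numpy:.4f}s")
```

On a typical machine the NumPy version is faster by a wide margin, while both produce the same mean.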

6.3.1 A Comment on Vectorization

NumPy is great for operations that are naturally vectorized.


Vectorized operations are precompiled routines that can be sent in batches, like

β€’ matrix multiplication and other linear algebra routines


β€’ generating a vector of random numbers
β€’ applying a fixed transformation (e.g., sine or cosine) to an entire array

In a later lecture, we’ll discuss code that isn’t easy to vectorize and how such routines can also be optimized.

6.4 NumPy Arrays

The most important thing that NumPy defines is an array data type formally called a
numpy.ndarray.
NumPy arrays power a large proportion of the scientific Python ecosystem.

To create a NumPy array containing only zeros we use np.zeros

[2]: a = np.zeros(3)
     a

[2]: array([0., 0., 0.])

[3]: type(a)

[3]: numpy.ndarray

NumPy arrays are somewhat like native Python lists, except that

β€’ Data must be homogeneous (all elements of the same type).


β€’ These types must be one of the data types (dtypes) provided by NumPy.

The most important of these dtypes are:

β€’ float64: 64 bit floating-point number


β€’ int64: 64 bit integer
β€’ bool: 8 bit True or False

There are also dtypes to represent complex numbers, unsigned integers, etc.
On modern machines, the default dtype for arrays is float64

[4]: a = np.zeros(3)
     type(a[0])

[4]: numpy.float64

If we want to use integers we can specify as follows:

[5]: a = np.zeros(3, dtype=int)
     type(a[0])

[5]: numpy.int64

6.4.1 Shape and Dimension

Consider the following assignment

[6]: z = np.zeros(10)

Here z is a flat array with no dimension β€” neither row nor column vector.
The dimension is recorded in the shape attribute, which is a tuple

[7]: z.shape

[7]: (10,)

Here the shape tuple has only one element, which is the length of the array (tuples with one element end with a comma).
To give it dimension, we can change the shape attribute

[8]: z.shape = (10, 1)
     z

[8]: array([[0.],
            [0.],
            [0.],
            [0.],
            [0.],
            [0.],
            [0.],
            [0.],
            [0.],
            [0.]])

[9]: z = np.zeros(4)
     z.shape = (2, 2)
     z

[9]: array([[0., 0.],
            [0., 0.]])

In the last case, to make the 2 by 2 array, we could also pass a tuple to the zeros() function, as in z = np.zeros((2, 2)).

6.4.2 Creating Arrays

As we’ve seen, the np.zeros function creates an array of zeros.


You can probably guess what np.ones creates.
Related is np.empty, which creates arrays in memory that can later be populated with data
[10]: z = np.empty(3)
      z

[10]: array([0., 0., 0.])

The numbers you see here are garbage values.

(Python allocates 3 contiguous 64 bit pieces of memory, and the existing contents of those memory slots are interpreted as float64 values)
To set up a grid of evenly spaced numbers use np.linspace

[11]: z = np.linspace(2, 4, 5)  # From 2 to 4, with 5 elements

To create an identity matrix use either np.identity or np.eye

[12]: z = np.identity(2)
      z

[12]: array([[1., 0.],
             [0., 1.]])

In addition, NumPy arrays can be created from Python lists, tuples, etc. using np.array

[13]: z = np.array([10, 20])  # ndarray from Python list
      z

[13]: array([10, 20])

[14]: type(z)

[14]: numpy.ndarray

[15]: z = np.array((10, 20), dtype=float)  # Here 'float' is equivalent to 'np.float64'
      z

[15]: array([10., 20.])

[16]: z = np.array([[1, 2], [3, 4]])  # 2D array from a list of lists
      z

[16]: array([[1, 2],
             [3, 4]])

See also np.asarray, which performs a similar function, but does not make a distinct copy of data already in a NumPy array.

[17]: na = np.linspace(10, 20, 2)
      na is np.asarray(na)  # Does not copy NumPy arrays

[17]: True

[18]: na is np.array(na)  # Does make a new copy --- perhaps unnecessarily

[18]: False

To read in the array data from a text file containing numeric data use np.loadtxt or np.genfromtxtβ€”see the documentation for details.
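For instance, a minimal round trip through a text file looks like this (a sketch of ours; the file name is hypothetical, and we place it in the system temporary directory):

```python
import os
import tempfile
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
path = os.path.join(tempfile.gettempdir(), 'example_array.txt')  # hypothetical filename

np.savetxt(path, a)    # write the array as plain text
b = np.loadtxt(path)   # read it back
print(np.array_equal(a, b))
```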

6.4.3 Array Indexing

For a flat array, indexing is the same as Python sequences:


[19]: z = np.linspace(1, 2, 5)
      z

[19]: array([1. , 1.25, 1.5 , 1.75, 2. ])
[20]: z[0]

[20]: 1.0

[21]: z[0:2]  # Two elements, starting at element 0

[21]: array([1. , 1.25])

[22]: z[-1]

[22]: 2.0

For 2D arrays the index syntax is as follows:


[23]: z = np.array([[1, 2], [3, 4]])
      z

[23]: array([[1, 2],
             [3, 4]])

[24]: z[0, 0]

[24]: 1

[25]: z[0, 1]

[25]: 2

And so on.

Note that indices are still zero-based, to maintain compatibility with Python sequences.
Columns and rows can be extracted as follows

[26]: z[0, :]

[26]: array([1, 2])

[27]: z[:, 1]

[27]: array([2, 4])

NumPy arrays of integers can also be used to extract elements


z = np.linspace(2, 4, 5)
[28]: z

array([2. , 2.5, 3. , 3.5, 4. ])


[28]:
indices = np.array((0, 2, 3))
[29]: z[indices]

array([2. , 3. , 3.5])
[29]:

Finally, an array of dtype bool can be used to extract elements

[30]: z

[30]: array([2. , 2.5, 3. , 3.5, 4. ])
[30]:
[31]: d = np.array([0, 1, 1, 0, 0], dtype=bool)
      d

[31]: array([False, True, True, False, False])

[32]: z[d]

[32]: array([2.5, 3. ])

We’ll see why this is useful below.


An aside: all elements of an array can be set equal to one number using slice notation
[33]: z = np.empty(3)
      z

[33]: array([2. , 3. , 3.5])

[34]: z[:] = 42
      z

[34]: array([42., 42., 42.])

6.4.4 Array Methods

Arrays have useful methods, all of which are carefully optimized


[35]: a = np.array((4, 3, 2, 1))
      a

[35]: array([4, 3, 2, 1])

[36]: a.sort()  # Sorts a in place
      a

[36]: array([1, 2, 3, 4])
[37]: a.sum()  # Sum

[37]: 10

[38]: a.mean()  # Mean

[38]: 2.5

[39]: a.max()  # Max

[39]: 4

[40]: a.argmax()  # Returns the index of the maximal element

[40]: 3

[41]: a.cumsum()  # Cumulative sum of the elements of a

[41]: array([ 1, 3, 6, 10])

[42]: a.cumprod()  # Cumulative product of the elements of a

[42]: array([ 1, 2, 6, 24])

[43]: a.var()  # Variance

[43]: 1.25

[44]: a.std()  # Standard deviation

[44]: 1.118033988749895
[45]: a.shape = (2, 2)
      a.T  # Equivalent to a.transpose()

[45]: array([[1, 3],
             [2, 4]])

Another method worth knowing is searchsorted().


If z is a nondecreasing array, then z.searchsorted(a) returns the index of the first element of z that is >= a

[46]: z = np.linspace(2, 4, 5)
      z

[46]: array([2. , 2.5, 3. , 3.5, 4. ])

[47]: z.searchsorted(2.2)

[47]: 1

Many of the methods discussed above have equivalent functions in the NumPy namespace

[48]: a = np.array((4, 3, 2, 1))

[49]: np.sum(a)

[49]: 10

[50]: np.mean(a)

[50]: 2.5

6.5 Operations on Arrays

6.5.1 Arithmetic Operations

The operators +, -, *, / and ** all act elementwise on arrays


[51]: a = np.array([1, 2, 3, 4])
      b = np.array([5, 6, 7, 8])
      a + b

[51]: array([ 6, 8, 10, 12])

[52]: a * b

[52]: array([ 5, 12, 21, 32])

We can add a scalar to each element as follows

[53]: a + 10

[53]: array([11, 12, 13, 14])

Scalar multiplication is similar

[54]: a * 10

[54]: array([10, 20, 30, 40])

The two-dimensional arrays follow the same general rules


[55]: A = np.ones((2, 2))
      B = np.ones((2, 2))
      A + B

[55]: array([[2., 2.],
             [2., 2.]])

[56]: A + 10

[56]: array([[11., 11.],
             [11., 11.]])

[57]: A * B

[57]: array([[1., 1.],
             [1., 1.]])

In particular, A * B is not the matrix product, it is an element-wise product.

6.5.2 Matrix Multiplication

With Anaconda’s scientific Python package based around Python 3.5 and above, one can use
the @ symbol for matrix multiplication, as follows:

[58]: A = np.ones((2, 2))
      B = np.ones((2, 2))
      A @ B

[58]: array([[2., 2.],
             [2., 2.]])

(For older versions of Python and NumPy you need to use the np.dot function)
We can also use @ to take the inner product of two flat arrays
[59]: A = np.array((1, 2))
      B = np.array((10, 20))
      A @ B

[59]: 50

In fact, we can use @ when one element is a Python list or tuple

[60]: A = np.array(((1, 2), (3, 4)))
      A

[60]: array([[1, 2],
             [3, 4]])

[61]: A @ (0, 1)

[61]: array([2, 4])

Since we are post-multiplying, the tuple is treated as a column vector.
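Conversely (a small addition of ours), when we pre-multiply, the same tuple is treated as a row vector:

```python
import numpy as np

A = np.array(((1, 2), (3, 4)))
print((0, 1) @ A)   # row vector times A picks out the second row, [3 4]
```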

6.5.3 Mutability and Copying Arrays

NumPy arrays are mutable data types, like Python lists.


In other words, their contents can be altered (mutated) in memory after initialization.
We already saw examples above.
Here’s another example:
[62]: a = np.array([42, 44])
      a

[62]: array([42, 44])

[63]: a[-1] = 0  # Change last element to 0
      a

[63]: array([42, 0])

Mutability leads to the following behavior (which can be shocking to MATLAB programmers…)

[64]: a = np.random.randn(3)
      a

[64]: array([ 2.37408651, 1.52030237, -0.69162243])

[65]: b = a
      b[0] = 0.0
      a

[65]: array([ 0. , 1.52030237, -0.69162243])


[65]:

What’s happened is that we have changed a by changing b.

The name b is bound to a and becomes just another reference to the array (the Python assignment model is described in more detail later in the course).
Hence, it has equal rights to make changes to that array.
This is in fact the most sensible default behavior!
It means that we pass around only pointers to data, rather than making copies.
Making copies is expensive in terms of both speed and memory.

Making Copies
It is of course possible to make b an independent copy of a when required.
This can be done using np.copy

[66]: a = np.random.randn(3)
      a

[66]: array([-0.75022052, -0.91299605, -0.63730441])

[67]: b = np.copy(a)
      b

[67]: array([-0.75022052, -0.91299605, -0.63730441])

Now b is an independent copy (called a deep copy)

[68]: b[:] = 1
      b

[68]: array([1., 1., 1.])

[69]: a

[69]: array([-0.75022052, -0.91299605, -0.63730441])

Note that the change to b has not affected a.

6.6 Additional Functionality

Let’s look at some other useful things we can do with NumPy.

6.6.1 Vectorized Functions

NumPy provides versions of the standard functions log, exp, sin, etc. that act element-
wise on arrays
[70]: z = np.array([1, 2, 3])
      np.sin(z)

[70]: array([0.84147098, 0.90929743, 0.14112001])

This eliminates the need for explicit element-by-element loops such as


[71]: n = len(z)
      y = np.empty(n)
      for i in range(n):
          y[i] = np.sin(z[i])

Because they act element-wise on arrays, these functions are called vectorized functions.
In NumPy-speak, they are also called ufuncs, which stands for β€œuniversal functions”.
As we saw above, the usual arithmetic operations (+, *, etc.) also work element-wise, and
combining these with the ufuncs gives a very large set of fast element-wise functions.
z
[72]:
array([1, 2, 3])
[72]:
(1 / np.sqrt(2 * np.pi)) * np.exp(- 0.5 * z**2)
[73]:
array([0.24197072, 0.05399097, 0.00443185])
[73]:

Not all user-defined functions will act element-wise.


For example, passing the function f defined below a NumPy array causes a ValueError
[74]: def f(x):
          return 1 if x > 0 else 0

The NumPy function np.where provides a vectorized alternative:


[75]: x = np.random.randn(4)
      x

[75]: array([ 0.8414943 , -0.96286188, -1.32882881, 0.61491117])

[76]: np.where(x > 0, 1, 0)  # Insert 1 if x > 0 is true, otherwise 0

[76]: array([1, 0, 0, 1])

You can also use np.vectorize to vectorize a given function


[77]: def f(x): return 1 if x > 0 else 0

      f = np.vectorize(f)
      f(x)  # Passing the same vector x as in the previous example

[77]: array([1, 0, 0, 1])

However, this approach doesn’t always obtain the same speed as a more carefully crafted vectorized function.

6.6.2 Comparisons

As a rule, comparisons on arrays are done element-wise


[78]: z = np.array([2, 3])
      y = np.array([2, 3])
      z == y

[78]: array([ True, True])

[79]: y[0] = 5
      z == y

[79]: array([False, True])

z != y
[80]:
array([ True, False])
[80]:

The situation is similar for >, <, >= and <=.


We can also do comparisons against scalars
[81]: z = np.linspace(0, 10, 5)
      z

[81]: array([ 0. , 2.5, 5. , 7.5, 10. ])

[82]: z > 3

[82]: array([False, False, True, True, True])

This is particularly useful for conditional extraction


[83]: b = z > 3
      b

[83]: array([False, False, True, True, True])

[84]: z[b]

[84]: array([ 5. , 7.5, 10. ])

Of course we canβ€”and frequently doβ€”perform this in one step


z[z > 3]
[85]:
array([ 5. , 7.5, 10. ])
[85]:

6.6.3 Sub-packages

NumPy provides some additional functionality related to scientific programming through its sub-packages.
We’ve already seen how we can generate random variables using np.random

[86]: z = np.random.randn(10000)  # Generate standard normals
      y = np.random.binomial(10, 0.5, size=1000)  # 1,000 draws from Bin(10, 0.5)
      y.mean()

[86]: 5.013

Another commonly used subpackage is np.linalg

[87]: A = np.array([[1, 2], [3, 4]])

      np.linalg.det(A)  # Compute the determinant

[87]: -2.0000000000000004

[88]: np.linalg.inv(A)  # Compute the inverse

[88]: array([[-2. , 1. ],
             [ 1.5, -0.5]])

Much of this functionality is also available in SciPy, a collection of modules that are built on
top of NumPy.

We’ll cover the SciPy versions in more detail soon.


For a comprehensive list of what’s available in NumPy see this documentation.

6.7 Exercises

6.7.1 Exercise 1

Consider the polynomial expression

    p(x) = a_0 + a_1 x + a_2 x^2 + ⋯ + a_N x^N = βˆ‘_{n=0}^{N} a_n x^n        (1)

Earlier, you wrote a simple function p(x, coeff) to evaluate Eq. (1) without considering efficiency.
Now write a new function that does the same job, but uses NumPy arrays and array operations for its computations, rather than any form of Python loop.
(Such functionality is already implemented as np.poly1d, but for the sake of the exercise don’t use this class)

β€’ Hint: Use np.cumprod()

6.7.2 Exercise 2

Let q be a NumPy array of length n with q.sum() == 1.
Suppose that q represents a probability mass function.
We wish to generate a discrete random variable x such that P{x = i} = q_i.
In other words, x takes values in range(len(q)) and x = i with probability q[i].
The standard (inverse transform) algorithm is as follows:

• Divide the unit interval [0, 1] into n subintervals I_0, I_1, …, I_{n−1} such that the
  length of I_i is q_i.
• Draw a uniform random variable U on [0, 1] and return the i such that U ∈ I_i.

The probability of drawing i is the length of I_i, which is equal to q_i.


We can implement the algorithm as follows

[89]: from random import uniform

      def sample(q):
          a = 0.0
          U = uniform(0, 1)
          for i in range(len(q)):
              if a < U <= a + q[i]:
                  return i
              a = a + q[i]

If you can't see how this works, try thinking through the flow for a simple example, such as
q = [0.25, 0.75]. It helps to sketch the intervals on paper.

Your exercise is to speed it up using NumPy, avoiding explicit loops

• Hint: Use np.searchsorted and np.cumsum

If you can, implement the functionality as a class called DiscreteRV, where

• the data for an instance of the class is the vector of probabilities q
• the class has a draw() method, which returns one draw according to the algorithm
  described above

If you can, write the method so that draw(k) returns k draws from q.

6.7.3 Exercise 3

Recall our earlier discussion of the empirical cumulative distribution function.


Your task is to

1. Make the __call__ method more efficient using NumPy.


2. Add a method that plots the ECDF over [a, b], where a and b are method parameters.

6.8 Solutions

[90]: import matplotlib.pyplot as plt
      %matplotlib inline

6.8.1 Exercise 1

This code does the job

[91]: def p(x, coef):
          X = np.ones_like(coef)
          X[1:] = x
          y = np.cumprod(X)   # y = [1, x, x**2,...]
          return coef @ y

Let's test it

[92]: x = 2
      coef = np.linspace(2, 4, 3)
      print(coef)
      print(p(x, coef))
      # For comparison
      q = np.poly1d(np.flip(coef))
      print(q(x))

[2. 3. 4.]
24.0
24.0

6.8.2 Exercise 2

Here's our first pass at a solution:

[93]: from numpy import cumsum
      from numpy.random import uniform

      class DiscreteRV:
          """
          Generates an array of draws from a discrete random variable with vector of
          probabilities given by q.
          """

          def __init__(self, q):
              """
              The argument q is a NumPy array, or array like, nonnegative and sums
              to 1.
              """
              self.q = q
              self.Q = cumsum(q)

          def draw(self, k=1):
              """
              Returns k draws from q.  For each such draw, the value i is returned
              with probability q[i].
              """
              return self.Q.searchsorted(uniform(0, 1, size=k))

The logic is not obvious, but if you take your time and read it slowly, you will understand.
There is a problem here, however.
Suppose that q is altered after an instance of DiscreteRV is created, for example by

[94]: q = (0.1, 0.9)
      d = DiscreteRV(q)
      d.q = (0.5, 0.5)

The problem is that Q does not change accordingly, and Q is the data used in the draw
method.
To deal with this, one option is to compute Q every time the draw method is called.
But this is inefficient relative to computing Q once-off.
A better option is to use descriptors.
A solution from the quantecon library using descriptors that behaves as we desire can be
found here.
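Without going into descriptors here, the flavor of the fix can be sketched with a property (Python's simplest built-in descriptor) — the class below is an illustration, not the quantecon implementation:

```python
import numpy as np

class SyncedDiscreteRV:
    """Illustrative sketch: recompute Q whenever q is reassigned."""

    def __init__(self, q):
        self.q = q                       # routed through the property setter below

    @property
    def q(self):
        return self._q

    @q.setter
    def q(self, value):
        self._q = np.asarray(value)
        self._Q = np.cumsum(self._q)     # Q stays in sync with q

    def draw(self, k=1):
        return self._Q.searchsorted(np.random.uniform(0, 1, size=k))

d = SyncedDiscreteRV((0.1, 0.9))
d.q = (0.5, 0.5)                         # the stored cumulative sum is recomputed
```

Now reassigning d.q updates the cumulative sum automatically, so draw continues to sample from the current distribution.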

6.8.3 Exercise 3

An example solution is given below.


In essence, we’ve just taken this code from QuantEcon and added in a plot method
[95]: """
      Modifies ecdf.py from QuantEcon to add in a plot method
      """

      class ECDF:
          """
          One-dimensional empirical distribution function given a vector of
          observations.

          Parameters
          ----------
          observations : array_like
              An array of observations

          Attributes
          ----------
          observations : array_like
              An array of observations

          """

          def __init__(self, observations):
              self.observations = np.asarray(observations)

          def __call__(self, x):
              """
              Evaluates the ecdf at x

              Parameters
              ----------
              x : scalar(float)
                  The x at which the ecdf is evaluated

              Returns
              -------
              scalar(float)
                  Fraction of the sample less than x

              """
              return np.mean(self.observations <= x)

          def plot(self, a=None, b=None):
              """
              Plot the ecdf on the interval [a, b].

              Parameters
              ----------
              a : scalar(float), optional(default=None)
                  Lower endpoint of the plot interval
              b : scalar(float), optional(default=None)
                  Upper endpoint of the plot interval

              """

              # === choose reasonable interval if [a, b] not specified === #
              if a is None:
                  a = self.observations.min() - self.observations.std()
              if b is None:
                  b = self.observations.max() + self.observations.std()

              # === generate plot === #
              x_vals = np.linspace(a, b, num=100)
              f = np.vectorize(self.__call__)
              plt.plot(x_vals, f(x_vals))
              plt.show()

Here's an example of usage

[96]: X = np.random.randn(1000)
      F = ECDF(X)
      F.plot()
Chapter 7

Matplotlib

7.1 Contents

• Overview 7.2
• The APIs 7.3
• More Features 7.4
• Further Reading 7.5
• Exercises 7.6
• Solutions 7.7

7.2 Overview

We've already generated quite a few figures in these lectures using Matplotlib.
Matplotlib is an outstanding graphics library, designed for scientific computing, with

• high-quality 2D and 3D plots
• output in all the usual formats (PDF, PNG, etc.)
• LaTeX integration
• fine-grained control over all aspects of presentation
• animation, etc.

7.2.1 Matplotlib's Split Personality

Matplotlib is unusual in that it offers two different interfaces to plotting.
One is a simple MATLAB-style API (Application Programming Interface) that was written to
help MATLAB refugees find a ready home.
The other is a more "Pythonic" object-oriented API.
For reasons described below, we recommend that you use the second API.
But first, let's discuss the difference.


7.3 The APIs

7.3.1 The MATLAB-style API

Here's the kind of easy example you might find in introductory treatments

[1]: import matplotlib.pyplot as plt
     %matplotlib inline
     import numpy as np

     x = np.linspace(0, 10, 200)
     y = np.sin(x)

     plt.plot(x, y, 'b-', linewidth=2)
     plt.show()

This is simple and convenient, but also somewhat limited and un-Pythonic.
For example, in the function calls, a lot of objects get created and passed around without
making themselves known to the programmer.
Python programmers tend to prefer a more explicit style of programming (run import this
in a code block and look at the second line).
This leads us to the alternative, object-oriented Matplotlib API.

7.3.2 The Object-Oriented API

Here's the code corresponding to the preceding figure using the object-oriented API

[2]: fig, ax = plt.subplots()
     ax.plot(x, y, 'b-', linewidth=2)
     plt.show()

Here the call fig, ax = plt.subplots() returns a pair, where

• fig is a Figure instance—like a blank canvas.
• ax is an AxesSubplot instance—think of a frame for plotting in.

The plot() function is actually a method of ax.


While there's a bit more typing, the more explicit use of objects gives us better control.
This will become more clear as we go along.

7.3.3 Tweaks

Here we've changed the line to red and added a legend

[3]: fig, ax = plt.subplots()
     ax.plot(x, y, 'r-', linewidth=2, label='sine function', alpha=0.6)
     ax.legend()
     plt.show()

We've also used alpha to make the line slightly transparent—which makes it look smoother.
The location of the legend can be changed by replacing ax.legend() with
ax.legend(loc='upper center').
[4]: fig, ax = plt.subplots()
     ax.plot(x, y, 'r-', linewidth=2, label='sine function', alpha=0.6)
     ax.legend(loc='upper center')
     plt.show()

If everything is properly configured, then adding LaTeX is trivial

[5]: fig, ax = plt.subplots()
     ax.plot(x, y, 'r-', linewidth=2, label='$y=\sin(x)$', alpha=0.6)
     ax.legend(loc='upper center')
     plt.show()

Controlling the ticks, adding titles and so on is also straightforward

[6]: fig, ax = plt.subplots()
     ax.plot(x, y, 'r-', linewidth=2, label='$y=\sin(x)$', alpha=0.6)
     ax.legend(loc='upper center')
     ax.set_yticks([-1, 0, 1])
     ax.set_title('Test plot')
     plt.show()

7.4 More Features

Matplotlib has a huge array of functions and features, which you can discover over time as
you have need for them.
We mention just a few.

7.4.1 Multiple Plots on One Axis

It's straightforward to generate multiple plots on the same axes.


Here's an example that randomly generates three normal densities and adds a label with their
mean

[7]: from scipy.stats import norm
     from random import uniform

     fig, ax = plt.subplots()
     x = np.linspace(-4, 4, 150)
     for i in range(3):
         m, s = uniform(-1, 1), uniform(1, 2)
         y = norm.pdf(x, loc=m, scale=s)
         current_label = f'$\mu = {m:.2}$'
         ax.plot(x, y, linewidth=2, alpha=0.6, label=current_label)
     ax.legend()
     plt.show()

7.4.2 Multiple Subplots

Sometimes we want multiple subplots in one figure.


Here's an example that generates 6 histograms

[8]: num_rows, num_cols = 3, 2
     fig, axes = plt.subplots(num_rows, num_cols, figsize=(10, 12))
     for i in range(num_rows):
         for j in range(num_cols):
             m, s = uniform(-1, 1), uniform(1, 2)
             x = norm.rvs(loc=m, scale=s, size=100)
             axes[i, j].hist(x, alpha=0.6, bins=20)
             t = f'$\mu = {m:.2}, \quad \sigma = {s:.2}$'
             axes[i, j].set(title=t, xticks=[-4, 0, 4], yticks=[])
     plt.show()

7.4.3 3D Plots

Matplotlib does a nice job of 3D plots — here is one example

[9]: from mpl_toolkits.mplot3d.axes3d import Axes3D
     from matplotlib import cm

     def f(x, y):
         return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

     xgrid = np.linspace(-3, 3, 50)
     ygrid = xgrid
     x, y = np.meshgrid(xgrid, ygrid)

     fig = plt.figure(figsize=(8, 6))
     ax = fig.add_subplot(111, projection='3d')
     ax.plot_surface(x,
                     y,
                     f(x, y),
                     rstride=2, cstride=2,
                     cmap=cm.jet,
                     alpha=0.7,
                     linewidth=0.25)
     ax.set_zlim(-0.5, 1.0)
     plt.show()

7.4.4 A Customizing Function

Perhaps you will find a set of customizations that you regularly use.
Suppose we usually prefer our axes to go through the origin, and to have a grid.
Here's a nice example from Matthew Doty of how the object-oriented API can be used to
build a custom subplots function that implements these changes.
Read carefully through the code and see if you can follow what's going on

[10]: def subplots():
          "Custom subplots with axes through the origin"
          fig, ax = plt.subplots()

          # Set the axes through the origin
          for spine in ['left', 'bottom']:
              ax.spines[spine].set_position('zero')
          for spine in ['right', 'top']:
              ax.spines[spine].set_color('none')

          ax.grid()
          return fig, ax

      fig, ax = subplots()   # Call the local version, not plt.subplots()
      x = np.linspace(-2, 10, 200)
      y = np.sin(x)
      ax.plot(x, y, 'r-', linewidth=2, label='sine function', alpha=0.6)
      ax.legend(loc='lower right')
      plt.show()

The custom subplots function

1. calls the standard plt.subplots function internally to generate the fig, ax pair,
2. makes the desired customizations to ax, and
3. passes the fig, ax pair back to the calling code.

7.5 Further Reading

• The Matplotlib gallery provides many examples.
• A nice Matplotlib tutorial by Nicolas Rougier, Mike Muller and Gael Varoquaux.
• mpltools allows easy switching between plot styles.
• Seaborn facilitates common statistics plots in Matplotlib.

7.6 Exercises

7.6.1 Exercise 1

Plot the function

f(x) = cos(πθx) exp(−x)

over the interval [0, 5] for each θ in np.linspace(0, 2, 10).
Place all the curves in the same figure.
The output should look like this

7.7 Solutions

7.7.1 Exercise 1

Here's one solution

[11]: θ_vals = np.linspace(0, 2, 10)
      x = np.linspace(0, 5, 200)
      fig, ax = plt.subplots()

      for θ in θ_vals:
          ax.plot(x, np.cos(np.pi * θ * x) * np.exp(- x))

      plt.show()
Chapter 8

SciPy

8.1 Contents

• SciPy versus NumPy 8.2
• Statistics 8.3
• Roots and Fixed Points 8.4
• Optimization 8.5
• Integration 8.6
• Linear Algebra 8.7
• Exercises 8.8
• Solutions 8.9

SciPy builds on top of NumPy to provide common tools for scientific programming such as

• linear algebra
• numerical integration
• interpolation
• optimization
• distributions and random number generation
• signal processing
• etc., etc.

Like NumPy, SciPy is stable, mature and widely used.


Many SciPy routines are thin wrappers around industry-standard Fortran libraries such as
LAPACK, BLAS, etc.
It's not really necessary to "learn" SciPy as a whole.
A more common approach is to get some idea of what's in the library and then look up
documentation as required.
In this lecture, we aim only to highlight some useful parts of the package.

8.2 SciPy versus NumPy

SciPy is a package that contains various tools that are built on top of NumPy, using its array
data type and related functionality.
In fact, when we import SciPy we also get NumPy, as can be seen from the SciPy
initialization file

[1]: # Import numpy symbols to scipy namespace
     import numpy as _num
     linalg = None
     from numpy import *
     from numpy.random import rand, randn
     from numpy.fft import fft, ifft
     from numpy.lib.scimath import *

     __all__ = []
     __all__ += _num.__all__
     __all__ += ['randn', 'rand', 'fft', 'ifft']

     del _num
     # Remove the linalg imported from numpy so that the scipy.linalg package
     # can be imported.
     del linalg
     __all__.remove('linalg')

However, it's more common and better practice to use NumPy functionality explicitly

[2]: import numpy as np

     a = np.identity(3)

What is useful in SciPy is the functionality in its sub-packages

• scipy.optimize, scipy.integrate, scipy.stats, etc.

These sub-packages and their attributes need to be imported separately

[3]: from scipy.integrate import quad
     from scipy.optimize import brentq
     # etc

Let's explore some of the major sub-packages.

8.3 Statistics

The scipy.stats subpackage supplies

• numerous random variable objects (densities, cumulative distributions, random sampling,
  etc.)
• some estimation procedures
• some statistical tests

8.3.1 Random Variables and Distributions

Recall that numpy.random provides functions for generating random variables

[4]: np.random.beta(5, 5, size=3)

array([0.39390263, 0.5575978 , 0.34117081])

This generates a draw from the distribution below when a, b = 5, 5

π‘₯(π‘Žβˆ’1) (1 βˆ’ π‘₯)(π‘βˆ’1)
𝑓(π‘₯; π‘Ž, 𝑏) = 1
(0 ≀ π‘₯ ≀ 1) (1)
∫0 𝑒(π‘Žβˆ’1) (1 βˆ’ 𝑒)(π‘βˆ’1) 𝑑𝑒

Sometimes we need access to the density itself, or the cdf, the quantiles, etc.
For this, we can use scipy.stats, which provides all of this functionality as well as random
number generation in a single consistent interface.
Here's an example of usage

[5]: from scipy.stats import beta
     import matplotlib.pyplot as plt
     %matplotlib inline

     q = beta(5, 5)       # Beta(a, b), with a = b = 5
     obs = q.rvs(2000)    # 2000 observations
     grid = np.linspace(0.01, 0.99, 100)

     fig, ax = plt.subplots(figsize=(10, 6))
     ax.hist(obs, bins=40, density=True)
     ax.plot(grid, q.pdf(grid), 'k-', linewidth=2)
     plt.show()

In this code, we created a so-called rv_frozen object, via the call q = beta(5, 5).
The "frozen" part of the notation implies that q represents a particular distribution with a
particular set of parameters.
Once we've done so, we can then generate random numbers, evaluate the density, etc., all
from this fixed distribution
[6]: q.cdf(0.4)   # Cumulative distribution function

0.26656768000000003

[7]: q.pdf(0.4)   # Density function

2.0901888000000013

[8]: q.ppf(0.8)   # Quantile (inverse cdf) function

0.6339134834642708

[9]: q.mean()

0.5

The general syntax for creating these objects is

identifier = scipy.stats.distribution_name(shape_parameters)

where distribution_name is one of the distribution names in scipy.stats.


There are also two keyword arguments, loc and scale, which following our example above,
are called as

identifier = scipy.stats.distribution_name(shape_parameters,
loc=c, scale=d)

These transform the original random variable X into Y = c + dX.
The methods rvs, pdf, cdf, etc. are transformed accordingly.
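As a quick check of this transformation (an illustrative snippet using the normal distribution), loc=2 and scale=3 turn a standard normal X into Y = 2 + 3X:

```python
from scipy.stats import norm

# Y = 2 + 3X where X is standard normal
q = norm(loc=2, scale=3)

q.mean(), q.std(), q.cdf(2)   # mean 2, std 3, and the median sits at loc
```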
Before finishing this section, we note that there is an alternative way of calling the methods
described above.
For example, the previous code can be replaced by
[10]: obs = beta.rvs(5, 5, size=2000)
      grid = np.linspace(0.01, 0.99, 100)

      fig, ax = plt.subplots()
      ax.hist(obs, bins=40, density=True)
      ax.plot(grid, beta.pdf(grid, 5, 5), 'k-', linewidth=2)
      plt.show()

8.3.2 Other Goodies in scipy.stats

There are a variety of statistical functions in scipy.stats.


For example, scipy.stats.linregress implements simple linear regression

[11]: from scipy.stats import linregress

      x = np.random.randn(200)
      y = 2 * x + 0.1 * np.random.randn(200)
      gradient, intercept, r_value, p_value, std_err = linregress(x, y)
      gradient, intercept

(2.00290964225675, -0.003101209098539087)

To see the full list, consult the documentation.

8.4 Roots and Fixed Points

A root of a real function f on [a, b] is an x ∈ [a, b] such that f(x) = 0.


For example, if we plot the function

f(x) = sin(4(x − 1/4)) + x + x^20 − 1    (2)

with x ∈ [0, 1] we get

[12]: f = lambda x: np.sin(4 * (x - 1/4)) + x + x**20 - 1
      x = np.linspace(0, 1, 100)

      fig, ax = plt.subplots(figsize=(10, 8))
      ax.plot(x, f(x))
      ax.axhline(ls='--', c='k')
      plt.show()

The unique root is approximately 0.408.
Let's consider some numerical techniques for finding roots.

8.4.1 Bisection

One of the most common algorithms for numerical root-finding is bisection.


To understand the idea, recall the well-known game where

• Player A thinks of a secret number between 1 and 100
• Player B asks if it's less than 50

  – If yes, B asks if it's less than 25
  – If no, B asks if it's less than 75

And so on.
This is bisection.
Here's a fairly simplistic implementation of the algorithm in Python.
It works for all sufficiently well behaved increasing continuous functions with f(a) < 0 < f(b)
[13]: def bisect(f, a, b, tol=10e-5):
          """
          Implements the bisection root finding algorithm, assuming that f is a
          real-valued function on [a, b] satisfying f(a) < 0 < f(b).
          """
          lower, upper = a, b

          while upper - lower > tol:
              middle = 0.5 * (upper + lower)
              # === if root is between lower and middle === #
              if f(middle) > 0:
                  lower, upper = lower, middle
              # === if root is between middle and upper === #
              else:
                  lower, upper = middle, upper

          return 0.5 * (upper + lower)

In fact, SciPy provides its own bisection function, which we now test using the function f
defined in Eq. (2)

[14]: from scipy.optimize import bisect

      bisect(f, 0, 1)

0.4082935042806639

8.4.2 The Newton-Raphson Method

Another very common root-finding algorithm is the Newton-Raphson method.


In SciPy this algorithm is implemented by scipy.optimize.newton.
Unlike bisection, the Newton-Raphson method uses local slope information.
This is a double-edged sword:

• When the function is well-behaved, the Newton-Raphson method is faster than bisection.
• When the function is less well-behaved, the Newton-Raphson might fail.

Let's investigate this using the same function f, first looking at potential instability

[15]: from scipy.optimize import newton

      newton(f, 0.2)   # Start the search at initial condition x = 0.2

0.40829350427935673

[16]: newton(f, 0.7)   # Start the search at x = 0.7 instead

0.7001700000000279

The second initial condition leads to failure of convergence.


On the other hand, using IPython's timeit magic, we see that newton can be much faster

[17]: %timeit bisect(f, 0, 1)

98.3 µs ± 359 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

[18]: %timeit newton(f, 0.2)

198 µs ± 20.4 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

8.4.3 Hybrid Methods

So far we have seen that the Newton-Raphson method is fast but not robust.
The bisection algorithm is robust but relatively slow.
This illustrates a general principle

• If you have specific knowledge about your function, you might be able to exploit it to
  generate efficiency.
• If not, then the algorithm choice involves a trade-off between the speed of convergence
  and robustness.

In practice, most default algorithms for root-finding, optimization and fixed points use hybrid
methods.
These methods typically combine a fast method with a robust method in the following manner:

1. Attempt to use a fast method
2. Check diagnostics
3. If diagnostics are bad, then switch to a more robust algorithm

In scipy.optimize, the function brentq is such a hybrid method and a good default

[19]: brentq(f, 0, 1)

0.40829350427936706

[20]: %timeit brentq(f, 0, 1)

24.9 µs ± 71.8 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Here the correct solution is found and the speed is almost the same as newton.

8.4.4 Multivariate Root-Finding

Use scipy.optimize.fsolve, a wrapper for a hybrid method in MINPACK.


See the documentation for details.

8.4.5 Fixed Points

SciPy has a function for finding (scalar) fixed points too

[21]: from scipy.optimize import fixed_point

      fixed_point(lambda x: x**2, 10.0)   # 10.0 is an initial guess

array(1.)

If you don't get good results, you can always switch back to the brentq root finder, since
the fixed point of a function f is the root of g(x) := x − f(x).
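As a sketch of this trick (the cosine function here is illustrative, not from the lecture), the fixed point of f(x) = cos(x) can be found by handing g to brentq:

```python
import numpy as np
from scipy.optimize import brentq

f = np.cos
g = lambda x: x - f(x)      # a root of g is a fixed point of f

# g(0) = -1 < 0 < g(1), so [0, 1] brackets the root
x_star = brentq(g, 0, 1)
np.isclose(f(x_star), x_star)   # True: cos(x_star) = x_star
```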

8.5 Optimization

Most numerical packages provide only functions for minimization.


Maximization can be performed by recalling that the maximizer of a function f on domain D
is the minimizer of −f on D.
Minimization is closely related to root-finding: for smooth functions, interior optima
correspond to roots of the first derivative.
The speed/robustness trade-off described above is present with numerical optimization too.
Unless you have some prior information you can exploit, it's usually best to use hybrid
methods.
For constrained, univariate (i.e., scalar) minimization, a good hybrid option is fminbound

[22]: from scipy.optimize import fminbound

      fminbound(lambda x: x**2, -1, 2)   # Search in [-1, 2]

0.0
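Following the remark above about maximization, here is a small sketch (the objective x e^(−x), with maximizer x = 1, is purely illustrative):

```python
import numpy as np
from scipy.optimize import fminbound

f = lambda x: x * np.exp(-x)   # maximized at x = 1

# Maximize f on [0, 5] by minimizing -f
x_max = fminbound(lambda x: -f(x), 0, 5)
x_max   # close to 1.0
```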

8.5.1 Multivariate Optimization

Multivariate local optimizers include minimize, fmin, fmin_powell, fmin_cg,
fmin_bfgs, and fmin_ncg.
Constrained multivariate local optimizers include fmin_l_bfgs_b, fmin_tnc,
fmin_cobyla.
See the documentation for details.

8.6 Integration

Most numerical integration methods work by computing the integral of an approximating
polynomial.
The resulting error depends on how well the polynomial fits the integrand, which in turn
depends on how "regular" the integrand is.
In SciPy, the relevant module for numerical integration is scipy.integrate.
A good default for univariate integration is quad
[23]: from scipy.integrate import quad

      integral, error = quad(lambda x: x**2, 0, 1)
      integral

0.33333333333333337

In fact, quad is an interface to a very standard numerical integration routine in the Fortran
library QUADPACK.
It uses Clenshaw-Curtis quadrature, based on expansion in terms of Chebychev polynomials.
There are other options for univariate integration—a useful one is fixed_quad, which is fast
and hence works well inside for loops.
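As an illustrative comparison (the integrand is just an example), fixed_quad applies fixed-order Gaussian quadrature — with n=5 it is exact for polynomials up to degree 9:

```python
from scipy.integrate import quad, fixed_quad

g = lambda x: x**2

val_adaptive, err = quad(g, 0, 1)         # adaptive, with an error estimate
val_fixed, _ = fixed_quad(g, 0, 1, n=5)   # fixed-order Gaussian quadrature

# Both agree with the exact value 1/3
```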

There are also functions for multivariate integration.


See the documentation for more details.

8.7 Linear Algebra

We saw that NumPy provides a module for linear algebra called linalg.
SciPy also provides a module for linear algebra with the same name.
The latter is not an exact superset of the former, but overall it has more functionality.
We leave you to investigate the set of available routines.
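As a small taste (an illustrative snippet, not an exhaustive comparison; the matrix is arbitrary), scipy.linalg mirrors the basics and adds extras such as matrix decompositions:

```python
import numpy as np
from scipy import linalg

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

# The basics mirror np.linalg
x = linalg.solve(A, b)

# Extras include decompositions, e.g. a Cholesky factor of a
# symmetric positive definite matrix: A = L L'
L = linalg.cholesky(A, lower=True)
np.allclose(L @ L.T, A)   # True
```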

8.8 Exercises

8.8.1 Exercise 1

Previously we discussed the concept of recursive function calls.


Write a recursive implementation of the bisection function described above, which we repeat
here for convenience.
[24]: def bisect(f, a, b, tol=10e-5):
          """
          Implements the bisection root finding algorithm, assuming that f is a
          real-valued function on [a, b] satisfying f(a) < 0 < f(b).
          """
          lower, upper = a, b

          while upper - lower > tol:
              middle = 0.5 * (upper + lower)
              # === if root is between lower and middle === #
              if f(middle) > 0:
                  lower, upper = lower, middle
              # === if root is between middle and upper === #
              else:
                  lower, upper = middle, upper

          return 0.5 * (upper + lower)

Test it on the function f = lambda x: np.sin(4 * (x - 0.25)) + x + x**20 - 1
discussed above.

8.9 Solutions

8.9.1 Exercise 1

Here's a reasonable solution:

[25]: def bisect(f, a, b, tol=10e-5):
          """
          Implements the bisection root-finding algorithm, assuming that f is a
          real-valued function on [a, b] satisfying f(a) < 0 < f(b).
          """
          lower, upper = a, b
          if upper - lower < tol:
              return 0.5 * (upper + lower)
          else:
              middle = 0.5 * (upper + lower)
              print(f'Current mid point = {middle}')
              if f(middle) > 0:    # Implies root is between lower and middle
                  return bisect(f, lower, middle)
              else:                # Implies root is between middle and upper
                  return bisect(f, middle, upper)

We can test it as follows

[26]: f = lambda x: np.sin(4 * (x - 0.25)) + x + x**20 - 1
      bisect(f, 0, 1)

Current mid point = 0.5
Current mid point = 0.25
Current mid point = 0.375
Current mid point = 0.4375
Current mid point = 0.40625
Current mid point = 0.421875
Current mid point = 0.4140625
Current mid point = 0.41015625
Current mid point = 0.408203125
Current mid point = 0.4091796875
Current mid point = 0.40869140625
Current mid point = 0.408447265625
Current mid point = 0.4083251953125
Current mid point = 0.40826416015625

0.408294677734375
Chapter 9

Numba

9.1 Contents

• Overview 9.2
• Where are the Bottlenecks? 9.3
• Vectorization 9.4
• Numba 9.5

In addition to what's in Anaconda, this lecture will need the following libraries:

[1]: !pip install --upgrade quantecon

9.2 Overview

In our lecture on NumPy, we learned one method to improve speed and efficiency in
numerical work.
That method, called vectorization, involved sending array processing operations in batch to
efficient low-level code.
This clever idea dates back to Matlab, which uses it extensively.
Unfortunately, vectorization is limited and has several weaknesses.
One weakness is that it is highly memory-intensive.
Another problem is that only some algorithms can be vectorized.
In the last few years, a new Python library called Numba has appeared that solves many of
these problems.
It does so through something called just in time (JIT) compilation.
JIT compilation is effective in many numerical settings and can generate extremely fast,
efficient code.
It can also do other tricks such as facilitate multithreading (a form of parallelization well
suited to numerical work).

9.2.1 The Need for Speed

To understand what Numba does and why, we need some background knowledge.
Let's start by thinking about higher-level languages, such as Python.
These languages are optimized for humans.
This means that the programmer can leave many details to the runtime environment

• specifying variable types
• memory allocation/deallocation, etc.

The upside is that, compared to low-level languages, Python is typically faster to write, less
error-prone and easier to debug.
The downside is that Python is harder to optimize — that is, turn into fast machine code —
than languages like C or Fortran.
Indeed, the standard implementation of Python (called CPython) cannot match the speed of
compiled languages such as C or Fortran.
Does that mean that we should just switch to C or Fortran for everything?
The answer is no, no and one hundred times no.
High productivity languages should be chosen over high-speed languages for the great
majority of scientific computing tasks.
This is because

1. Of any given program, relatively few lines are ever going to be time-critical.
2. For those lines of code that are time-critical, we can achieve C-like speed using a
   combination of NumPy and Numba.

This lecture provides a guide.

9.3 Where are the Bottlenecks?

Let's start by trying to understand why high-level languages like Python are slower than
compiled code.

9.3.1 Dynamic Typing

Consider this Python operation

[2]: a, b = 10, 10
     a + b

20

Even for this simple operation, the Python interpreter has a fair bit of work to do.
For example, in the statement a + b, the interpreter has to know which operation to invoke.
If a and b are strings, then a + b requires string concatenation

[3]: a, b = 'foo', 'bar'
     a + b

'foobar'

If a and b are lists, then a + b requires list concatenation

[4]: a, b = ['foo'], ['bar']
     a + b

['foo', 'bar']

(We say that the operator + is overloaded — its action depends on the type of the objects on
which it acts)
As a result, Python must check the type of the objects and then call the correct operation.
This involves substantial overheads.
Static Types
Compiled languages avoid these overheads with explicit, static types.
For example, consider the following C code, which sums the integers from 1 to 10

#include <stdio.h>

int main(void) {
int i;
int sum = 0;
for (i = 1; i <= 10; i++) {
sum = sum + i;
}
printf("sum = %d\n", sum);
return 0;
}

The variables i and sum are explicitly declared to be integers.


Hence, the meaning of addition here is completely unambiguous.

9.3.2 Data Access

Another drag on speed for high-level languages is data access.


To illustrate, let's consider the problem of summing some data — say, a collection of integers.
Summing with Compiled Code
In C or Fortran, these integers would typically be stored in an array, which is a simple data
structure for storing homogeneous data.
Such an array is stored in a single contiguous block of memory

• In modern computers, memory addresses are allocated to each byte (one byte = 8 bits).
• For example, a 64 bit integer is stored in 8 bytes of memory.
• An array of n such integers occupies 8n consecutive memory slots.

Moreover, the compiler is made aware of the data type by the programmer.

• In this case 64 bit integers

Hence, each successive data point can be accessed by shifting forward in memory space by a
known and fixed amount.

• In this case 8 bytes
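This layout can be inspected directly from NumPy's array metadata (an illustrative check):

```python
import numpy as np

a = np.arange(10, dtype=np.int64)

a.itemsize   # 8 -- each 64-bit integer occupies 8 bytes
a.strides    # (8,) -- step 8 bytes in memory to reach the next element
a.nbytes     # 80 -- one contiguous block of 8n bytes
```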

Summing in Pure Python


Python tries to replicate these ideas to some degree.
For example, in the standard Python implementation (CPython), list elements are placed in
memory locations that are in a sense contiguous.
However, these list elements are more like pointers to data rather than actual data.
Hence, there is still overhead involved in accessing the data values themselves.
This is a considerable drag on speed.
In fact, it's generally true that memory traffic is a major culprit when it comes to slow
execution.
Let's look at some ways around these problems.

9.4 Vectorization

Vectorization is about sending batches of related operations to native machine code.

• The machine code itself is typically compiled from carefully optimized C or Fortran.

This can greatly accelerate many (but not all) numerical computations.

9.4.1 Operations on Arrays

First, let's run some imports

[5]: import random
     import numpy as np
     import quantecon as qe

Now let's try this non-vectorized code

[6]: qe.util.tic()   # Start timing
     n = 1_000_000
     sum = 0
     for i in range(n):
         x = random.uniform(0, 1)
         sum += x**2
     qe.util.toc()   # End timing

TOC: Elapsed: 0:00:0.64

0.646047830581665

Now compare this vectorized code

[7]: qe.util.tic()
     n = 1_000_000
     x = np.random.uniform(0, 1, n)
     np.sum(x**2)
     qe.util.toc()

TOC: Elapsed: 0:00:0.02

0.022841215133666992

The second code block — which achieves the same thing as the first — runs much faster.
The reason is that in the second implementation we have broken the loop down into three
basic operations

1. draw n uniforms
2. square them
3. sum them

These are sent as batch operators to optimized machine code.
Apart from minor overheads associated with sending data back and forth, the result is C or
Fortran-like speed.
When we run batch operations on arrays like this, we say that the code is vectorized.
Vectorized code is typically fast and efficient.
It is also surprisingly flexible, in the sense that many operations can be vectorized.
The next section illustrates this point.

9.4.2 Universal Functions

Many functions provided by NumPy are so-called universal functions β€” also called ufuncs.
This means that they

β€’ map scalars into scalars, as expected


β€’ map arrays into arrays, acting element-wise

For example, np.cos is a ufunc:


[8]: np.cos(1.0)
[8]: 0.5403023058681398

[9]: np.cos(np.linspace(0, 1, 3))
[9]: array([1. , 0.87758256, 0.54030231])

By exploiting ufuncs, many operations can be vectorized.



For example, consider the problem of maximizing a function 𝑓 of two variables (π‘₯, 𝑦) over the
square [βˆ’π‘Ž, π‘Ž] Γ— [βˆ’π‘Ž, π‘Ž].
For 𝑓 and π‘Ž let’s choose

cos(π‘₯2 + 𝑦2 )
𝑓(π‘₯, 𝑦) = and π‘Ž = 3
1 + π‘₯2 + 𝑦 2

Here’s a plot of 𝑓
[10]: import matplotlib.pyplot as plt
%matplotlib inline
from mpl_toolkits.mplot3d.axes3d import Axes3D
from matplotlib import cm

def f(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

xgrid = np.linspace(-3, 3, 50)
ygrid = xgrid
x, y = np.meshgrid(xgrid, ygrid)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(x,
                y,
                f(x, y),
                rstride=2, cstride=2,
                cmap=cm.jet,
                alpha=0.7,
                linewidth=0.25)
ax.set_zlim(-0.5, 1.0)
plt.show()

To maximize it, we’re going to use a naive grid search:

1. Evaluate 𝑓 for all (π‘₯, 𝑦) in a grid on the square.


2. Return the maximum of observed values.

Here’s a non-vectorized version that uses Python loops


[11]: def f(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

grid = np.linspace(-3, 3, 1000)

m = -np.inf

qe.tic()
for x in grid:
    for y in grid:
        z = f(x, y)
        if z > m:
            m = z

qe.toc()

TOC: Elapsed: 0:00:4.21

[11]: 4.219594240188599

And here’s a vectorized version


[12]: def f(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

grid = np.linspace(-3, 3, 1000)

x, y = np.meshgrid(grid, grid)

qe.tic()
np.max(f(x, y))
qe.toc()

TOC: Elapsed: 0:00:0.02

[12]: 0.02581310272216797

In the vectorized version, all the looping takes place in compiled code.
As you can see, the second version is much faster.
(We’ll make it even faster again below when we discuss Numba)

9.4.3 Pros and Cons of Vectorization

At its best, vectorization yields fast, simple code.


However, it’s not without disadvantages.
One issue is that it can be highly memory-intensive.
For example, the vectorized maximization routine above is far more memory intensive than
the non-vectorized version that preceded it.
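To get a feel for the scale (a rough sketch): with a grid of 1,000 points, the meshgrid approach materializes several 1000 Γ— 1000 float64 arrays, each 8 MB, while the loop version only ever holds a few scalars.

```python
import numpy as np

grid = np.linspace(-3, 3, 1000)
x, y = np.meshgrid(grid, grid)

# Each grid array alone is 8 MB, and evaluating f(x, y)
# allocates several more arrays of the same size
print(x.nbytes)   # 8_000_000 bytes
```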
Another issue is that not all algorithms can be vectorized.

In these kinds of settings, we need to go back to loops.


Fortunately, there are nice ways to speed up Python loops.

9.5 Numba

One exciting development in this direction is Numba.


Numba aims to automatically compile functions to native machine code instructions on the
fly.
The process isn’t flawless, since Numba needs to infer type information on all variables to
generate pure machine instructions.
Such inference isn’t possible in every setting.
But for simple routines, Numba infers types very well.
Moreover, the β€œhot loops” at the heart of our code that we need to speed up are often such
simple routines.

9.5.1 Prerequisites

If you followed our set up instructions, then Numba should be installed.


Make sure you have the latest version of Anaconda by running conda update anaconda
from a terminal (Mac, Linux) / Anaconda command prompt (Windows).

9.5.2 An Example

Let’s consider some problems that are difficult to vectorize.


One is generating the trajectory of a difference equation given an initial condition.
Let’s take the difference equation to be the quadratic map

π‘₯𝑑+1 = 4π‘₯𝑑 (1 βˆ’ π‘₯𝑑 )

Here’s the plot of a typical trajectory, starting from π‘₯0 = 0.1, with 𝑑 on the x-axis
[13]: def qm(x0, n):
    x = np.empty(n+1)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4 * x[t] * (1 - x[t])
    return x

x = qm(0.1, 250)
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(x, 'b-', lw=2, alpha=0.8)
ax.set_xlabel('time', fontsize=16)
plt.show()

To speed this up using Numba is trivial using Numba’s jit function


[14]: from numba import jit

qm_numba = jit(qm)  # qm_numba is now a 'compiled' version of qm

Let’s time and compare identical function calls across these two versions:
[15]: qe.util.tic()
qm(0.1, int(10**5))
time1 = qe.util.toc()

TOC: Elapsed: 0:00:0.12

[16]: qe.util.tic()
qm_numba(0.1, int(10**5))
time2 = qe.util.toc()

TOC: Elapsed: 0:00:0.09

The first execution is relatively slow because of JIT compilation (see below).
Next time and all subsequent times it runs much faster:
[17]: qe.util.tic()
qm_numba(0.1, int(10**5))
time2 = qe.util.toc()

TOC: Elapsed: 0:00:0.00

[18]: time1 / time2  # Calculate speed gain
[18]: 362.8663101604278

That’s a speed increase of two orders of magnitude!


Your mileage will of course vary depending on hardware and so on.
Nonetheless, two orders of magnitude is huge relative to how simple and clear the implementation is.
Decorator Notation
If you don’t need a separate name for the β€œnumbafied” version of qm, you can just put @jit
before the function
[19]: @jit
def qm(x0, n):
    x = np.empty(n+1)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4 * x[t] * (1 - x[t])
    return x

This is equivalent to qm = jit(qm).

9.5.3 How and When it Works

Numba attempts to generate fast machine code using the infrastructure provided by the
LLVM Project.
It does this by inferring type information on the fly.
As you can imagine, this is easier for simple Python objects (simple scalar data types, such as
floats, integers, etc.).
Numba also plays well with NumPy arrays, which it treats as typed memory regions.
In an ideal setting, Numba can infer all necessary type information.
This allows it to generate native machine code, without having to call the Python runtime
environment.
In such a setting, Numba will be on par with machine code from low-level languages.
When Numba cannot infer all type information, some Python objects are given generic
object status, and some code is generated using the Python runtime.
In this second setting, Numba typically provides only minor speed gains β€” or none at all.
Hence, it’s prudent when using Numba to focus on speeding up small, time-critical snippets of
code.
This will give you much better performance than blanketing your Python programs with
@jit statements.
A Gotcha: Global Variables
Consider the following example
[20]: a = 1

@jit
def add_x(x):
    return a + x

print(add_x(10))

11

[21]: a = 2
print(add_x(10))

11

Notice that changing the global had no effect on the value returned by the function.
When Numba compiles machine code for functions, it treats global variables as constants to
ensure type stability.

9.5.4 Numba for Vectorization

Numba can also be used to create custom ufuncs with the @vectorize decorator.
To illustrate the advantage of using Numba to vectorize a function, we return to a maximization problem discussed above
[22]: from numba import vectorize

@vectorize
def f_vec(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

grid = np.linspace(-3, 3, 1000)

x, y = np.meshgrid(grid, grid)

np.max(f_vec(x, y))  # Run once to compile

qe.tic()
np.max(f_vec(x, y))
qe.toc()

TOC: Elapsed: 0:00:0.02

[22]: 0.025510549545288086

This is faster than our vectorized version using NumPy’s ufuncs.


Why should that be? After all, anything vectorized with NumPy will be running in fast C or
Fortran code.
The reason is that it’s much less memory-intensive.
For example, when NumPy computes np.cos(x**2 + y**2) it first creates the intermediate arrays x**2 and y**2, then it creates the array np.cos(x**2 + y**2).
In our @vectorize version using Numba, the entire operator is reduced to a single vectorized process and none of these intermediate arrays are created.
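In plain NumPy you can trim (though not fully eliminate) this temporary-array traffic by reusing buffers with in-place operations. A sketch, reusing one work array for the expression above:

```python
import numpy as np

grid = np.linspace(-3, 3, 1000)
x, y = np.meshgrid(grid, grid)

z = x * x           # one temporary, kept as the work buffer
z += y * y          # z = x**2 + y**2, updated in place
num = np.cos(z)     # numerator cos(x**2 + y**2)
z += 1.0            # z = 1 + x**2 + y**2, again in place
result = num / z

# Same values as the fully vectorized expression
expected = np.cos(x**2 + y**2) / (1 + x**2 + y**2)
print(np.allclose(result, expected))   # True
```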
We can gain further speed improvements using Numba’s automatic parallelization feature by
specifying target='parallel'.
In this case, we need to specify the types of our inputs and outputs
[23]: @vectorize('float64(float64, float64)', target='parallel')
def f_vec(x, y):
    return np.cos(x**2 + y**2) / (1 + x**2 + y**2)

np.max(f_vec(x, y))  # Run once to compile

qe.tic()
np.max(f_vec(x, y))
qe.toc()

TOC: Elapsed: 0:00:0.01

[23]: 0.01980733871459961

This is a striking speed up with very little effort.


Chapter 10

Other Scientific Libraries

10.1 Contents

β€’ Overview 10.2

β€’ Cython 10.3

β€’ Joblib 10.4

β€’ Other Options 10.5

β€’ Exercises 10.6

β€’ Solutions 10.7

In addition to what’s in Anaconda, this lecture will need the following libraries:
[1]: !pip install --upgrade quantecon

10.2 Overview

In this lecture, we review some other scientific libraries that are useful for economic research
and analysis.
We have, however, already picked most of the low hanging fruit in terms of economic research.
Hence you should feel free to skip this lecture on first pass.

10.3 Cython

Like Numba, Cython provides an approach to generating fast compiled code that can be used
from Python.
As was the case with Numba, a key problem is the fact that Python is dynamically typed.
As you’ll recall, Numba solves this problem (where possible) by inferring type.


Cython’s approach is different β€” programmers add type definitions directly to their β€œPython”
code.
As such, the Cython language can be thought of as Python with type definitions.
In addition to a language specification, Cython is also a language translator, transforming
Cython code into optimized C and C++ code.
Cython also takes care of building language extensions β€” the wrapper code that interfaces
between the resulting compiled code and Python.
Important Note:
In what follows code is executed in a Jupyter notebook.
This is to take advantage of a Cython cell magic that makes Cython particularly easy to use.
Some modifications are required to run the code outside a notebook.

β€’ See the book Cython by Kurt Smith or the online documentation.

10.3.1 A First Example

Let’s start with a rather artificial example.


Suppose that we want to compute the sum βˆ‘_{𝑖=0}^{𝑛} 𝛼^𝑖 for given 𝛼, 𝑛.
Suppose further that we’ve forgotten the basic formula

βˆ‘_{𝑖=0}^{𝑛} 𝛼^𝑖 = (1 βˆ’ 𝛼^{𝑛+1}) / (1 βˆ’ 𝛼)

for a geometric progression and hence have resolved to rely on a loop.


Python vs C
Here’s a pure Python function that does the job
[2]: def geo_prog(alpha, n):
    current = 1.0
    sum = current
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum

This works fine but for large 𝑛 it is slow.


Here’s a C function that will do the same thing

double geo_prog(double alpha, int n) {
    double current = 1.0;
    double sum = current;
    int i;
    for (i = 1; i <= n; i++) {
        current = current * alpha;
        sum = sum + current;
    }
    return sum;
}

If you’re not familiar with C, the main thing you should take notice of is the type definitions

β€’ int means integer


β€’ double means double precision floating-point number
β€’ the double in double geo_prog(... indicates that the function will return a double

Not surprisingly, the C code is faster than the Python code.


A Cython Implementation
Cython implementations look like a convex combination of Python and C.
We’re going to run our Cython code in the Jupyter notebook, so we’ll start by loading the
Cython extension in a notebook cell
[3]: %load_ext Cython

In the next cell, we execute the following


[4]: %%cython
def geo_prog_cython(double alpha, int n):
    cdef double current = 1.0
    cdef double sum = current
    cdef int i
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum

Here cdef is a Cython keyword indicating a variable declaration and is followed by a type.
The %%cython line at the top is not actually Cython code β€” it’s a Jupyter cell magic indicating the start of Cython code.
After executing the cell, you can now call the function geo_prog_cython from within
Python.
What you are in fact calling is compiled C code with a Python call interface
[5]: import quantecon as qe
qe.util.tic()
geo_prog(0.99, int(10**6))
qe.util.toc()

TOC: Elapsed: 0:00:0.17

[5]: 0.17586088180541992
[6]: qe.util.tic()
geo_prog_cython(0.99, int(10**6))
qe.util.toc()

TOC: Elapsed: 0:00:0.04

[6]: 0.04195237159729004

10.3.2 Example 2: Cython with NumPy Arrays

Let’s go back to the first problem that we worked with: generating the iterates of the
quadratic map

π‘₯𝑑+1 = 4π‘₯𝑑 (1 βˆ’ π‘₯𝑑 )

The problem of computing iterates and returning a time series requires us to work with arrays.
The natural array type to work with is NumPy arrays.
Here’s a Cython implementation that initializes, populates and returns a NumPy array
[7]: %%cython
import numpy as np

def qm_cython_first_pass(double x0, int n):
    cdef int t
    x = np.zeros(n+1, float)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4.0 * x[t] * (1 - x[t])
    return np.asarray(x)

If you run this code and time it, you will see that its performance is disappointing β€” nothing
like the speed gain we got from Numba
[8]: qe.util.tic()
qm_cython_first_pass(0.1, int(10**5))
qe.util.toc()

TOC: Elapsed: 0:00:0.06

[8]: 0.06516456604003906

This example was also computed in the Numba lecture, and you can see Numba is around 90
times faster.
The reason is that working with NumPy arrays incurs substantial Python overheads.
We can do better by using Cython’s typed memoryviews, which provide more direct access to
arrays in memory.
When using them, the first step is to create a NumPy array.
Next, we declare a memoryview and bind it to the NumPy array.
Here’s an example:
[9]: %%cython
import numpy as np
from numpy cimport float_t

def qm_cython(double x0, int n):
    cdef int t
    x_np_array = np.zeros(n+1, dtype=float)
    cdef float_t [:] x = x_np_array
    x[0] = x0
    for t in range(n):
        x[t+1] = 4.0 * x[t] * (1 - x[t])
    return np.asarray(x)

Here

β€’ cimport pulls in some compile-time information from NumPy


β€’ cdef float_t [:] x = x_np_array creates a memoryview on the NumPy array
x_np_array
β€’ the return statement uses np.asarray(x) to convert the memoryview back to a
NumPy array

Let’s time it:


[10]: qe.util.tic()
qm_cython(0.1, int(10**5))
qe.util.toc()

TOC: Elapsed: 0:00:0.00

[10]: 0.0008153915405273438

This is fast, although still slightly slower than qm_numba.

10.3.3 Summary

Cython requires more expertise than Numba, and is a little more fiddly in terms of getting
good performance.
In fact, it’s surprising how difficult it is to beat the speed improvements provided by Numba.
Nonetheless,

β€’ Cython is a very mature, stable and widely used tool.


β€’ Cython can be more useful than Numba when working with larger, more sophisticated
applications.

10.4 Joblib

Joblib is a popular Python library for caching and parallelization.


To install it, start Jupyter and type
[11]: !pip install joblib

Requirement already satisfied: joblib in /home/ubuntu/anaconda3/lib/python3.7/site-packages (0.13.2)

from within a notebook.


Here we review just the basics.

10.4.1 Caching

Perhaps, like us, you sometimes run a long computation that simulates a model at a given set
of parameters β€” to generate a figure, say, or a table.

20 minutes later you realize that you want to tweak the figure and now you have to do it all
again.
What caching will do is automatically store results at each parameterization.
With Joblib, results are compressed and stored on file, and automatically served back up to
you when you repeat the calculation.

10.4.2 An Example

Let’s look at a toy example, related to the quadratic map model discussed above.
Let’s say we want to generate a long trajectory from a certain initial condition π‘₯0 and see
what fraction of the sample is below 0.1.
(We’ll omit JIT compilation or other speedups for simplicity)
Here’s our code
[12]: from joblib import Memory
memory = Memory(location='./joblib_cache')

@memory.cache
def qm(x0, n):
    x = np.empty(n+1)
    x[0] = x0
    for t in range(n):
        x[t+1] = 4 * x[t] * (1 - x[t])
    return np.mean(x < 0.1)

We are using joblib to cache the result of calling qm at a given set of parameters.
With the argument location='./joblib_cache', any call to this function results in both the input values and output values being stored in a subdirectory joblib_cache of the present working directory.
(In UNIX shells, . refers to the present working directory)
The first time we call the function with a given set of parameters we see some extra output
that notes information being cached
[13]: qe.util.tic()
n = int(1e7)
qm(0.2, n)
qe.util.toc()

________________________________________________________________________________
[Memory] Calling __main__--home-ubuntu-repos-lecture-source-py-_build-pdf-
jupyter-executed-__ipython-input__.qm…
qm(0.2, 10000000)
______________________________________________________________qm - 16.9s, 0.3min
TOC: Elapsed: 0:00:16.87

[13]: 16.87077283859253

The next time we call the function with the same set of parameters, the result is returned
almost instantaneously
[14]: qe.util.tic()
n = int(1e7)
qm(0.2, n)
qe.util.toc()

TOC: Elapsed: 0:00:0.00

[14]: 0.001150369644165039

10.5 Other Options

There are in fact many other approaches to speeding up your Python code.
One is interfacing with Fortran.
If you are comfortable writing Fortran you will find it very easy to create extension modules
from Fortran code using F2Py.
F2Py is a Fortran-to-Python interface generator that is particularly simple to use.
Robert Johansson provides a very nice introduction to F2Py, among other things.
Recently, a Jupyter cell magic for Fortran has been developed β€” you might want to give it a
try.

10.6 Exercises

10.6.1 Exercise 1

Later we’ll learn all about finite-state Markov chains.


For now, let’s just concentrate on simulating a very simple example of such a chain.
Suppose that the volatility of returns on an asset can be in one of two regimes β€” high or low.
The transition probabilities across states are as follows: the chain leaves the low state with probability 0.1 and leaves the high state with probability 0.2.

For example, let the period length be one month, and suppose the current state is high.
We see from these probabilities that the state next month will be

β€’ high with probability 0.8


β€’ low with probability 0.2

Your task is to simulate a sequence of monthly volatility states according to this rule.
Set the length of the sequence to n = 100000 and start in the high state.
Implement a pure Python version, a Numba version and a Cython version, and compare
speeds.

To test your code, evaluate the fraction of time that the chain spends in the low state.
If your code is correct, it should be about 2/3.

10.7 Solutions

10.7.1 Exercise 1

We let

β€’ 0 represent β€œlow”
β€’ 1 represent β€œhigh”

[15]: p, q = 0.1, 0.2  # Prob of leaving low and high state respectively

Here’s a pure Python version of the function


[16]: def compute_series(n):
    x = np.empty(n, dtype=np.int_)
    x[0] = 1  # Start in state 1
    U = np.random.uniform(0, 1, size=n)
    for t in range(1, n):
        current_x = x[t-1]
        if current_x == 0:
            x[t] = U[t] < p
        else:
            x[t] = U[t] > q
    return x

Let’s run this code and check that the fraction of time spent in the low state is about 0.666
[17]: n = 100000
x = compute_series(n)
print(np.mean(x == 0))  # Fraction of time x is in state 0

0.66673
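This matches the analytic answer: for a two-state chain the long-run fraction of time spent in the low state is q / (p + q), the stationary probability of that state. A quick check:

```python
p, q = 0.1, 0.2        # prob of leaving the low and high states respectively

pi_low = q / (p + q)   # stationary probability of the low state
print(pi_low)          # about 0.667
```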

Now let’s time it


[18]: qe.util.tic()
compute_series(n)
qe.util.toc()

TOC: Elapsed: 0:00:0.14

[18]: 0.14121103286743164

Next let’s implement a Numba version, which is easy


[19]: from numba import jit

compute_series_numba = jit(compute_series)

Let’s check we still get the right numbers


[20]: x = compute_series_numba(n)
print(np.mean(x == 0))

0.66615

Let’s see the time


[21]: qe.util.tic()
compute_series_numba(n)
qe.util.toc()

TOC: Elapsed: 0:00:0.00

[21]: 0.0012984275817871094

This is a nice speed improvement for one line of code.


Now let’s implement a Cython version
[22]: %load_ext Cython

The Cython extension is already loaded. To reload it, use:


%reload_ext Cython

[23]: %%cython
import numpy as np
from numpy cimport int_t, float_t

def compute_series_cy(int n):
    # == Create NumPy arrays first == #
    x_np = np.empty(n, dtype=int)
    U_np = np.random.uniform(0, 1, size=n)
    # == Now create memoryviews of the arrays == #
    cdef int_t [:] x = x_np
    cdef float_t [:] U = U_np
    # == Other variable declarations == #
    cdef float p = 0.1
    cdef float q = 0.2
    cdef int t
    # == Main loop == #
    x[0] = 1
    for t in range(1, n):
        current_x = x[t-1]
        if current_x == 0:
            x[t] = U[t] < p
        else:
            x[t] = U[t] > q
    return np.asarray(x)

[24]: compute_series_cy(10)
[24]: array([1, 1, 1, 1, 1, 0, 0, 0, 1, 0])
[25]: x = compute_series_cy(n)
print(np.mean(x == 0))

0.65561

[26]: qe.util.tic()
compute_series_cy(n)
qe.util.toc()

TOC: Elapsed: 0:00:0.00



[26]: 0.0029497146606445312

The Cython implementation is fast but not as fast as Numba.


Part III

Advanced Python Programming

Chapter 11

Writing Good Code

11.1 Contents

β€’ Overview 11.2

β€’ An Example of Bad Code 11.3

β€’ Good Coding Practice 11.4

β€’ Revisiting the Example 11.5

β€’ Summary 11.6

11.2 Overview

When computer programs are small, poorly written code is not overly costly.
But more data, more sophisticated models, and more computer power are enabling us to take
on more challenging problems that involve writing longer programs.
For such programs, investment in good coding practices will pay high returns.
The main payoffs are higher productivity and faster code.
In this lecture, we review some elements of good coding practice.
We also touch on modern developments in scientific computing β€” such as just in time compilation β€” and how they affect good program design.

11.3 An Example of Bad Code

Let’s have a look at some poorly written code.


The job of the code is to generate and plot time series of the simplified Solow model

π‘˜π‘‘+1 = π‘ π‘˜π‘‘π›Ό + (1 βˆ’ 𝛿)π‘˜π‘‘ , 𝑑 = 0, 1, 2, … (1)

Here


β€’ π‘˜π‘‘ is capital at time 𝑑 and


β€’ 𝑠, 𝛼, 𝛿 are parameters (savings, a productivity parameter and depreciation)

For each parameterization, the code

1. sets π‘˜0 = 1
2. iterates using Eq. (1) to produce a sequence π‘˜0 , π‘˜1 , π‘˜2 … , π‘˜π‘‡
3. plots the sequence

The plots will be grouped into three subfigures.


In each subfigure, two parameters are held fixed while another varies
[1]: import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Allocate memory for time series
k = np.empty(50)

fig, axes = plt.subplots(3, 1, figsize=(12, 15))

# Trajectories with different Ξ±
Ξ΄ = 0.1
s = 0.4
Ξ± = (0.25, 0.33, 0.45)

for j in range(3):
    k[0] = 1
    for t in range(49):
        k[t+1] = s * k[t]**Ξ±[j] + (1 - Ξ΄) * k[t]
    axes[0].plot(k, 'o-', label=rf"$\alpha = {Ξ±[j]},\; s = {s},\; \delta={Ξ΄}$")

axes[0].grid(lw=0.2)
axes[0].set_ylim(0, 18)
axes[0].set_xlabel('time')
axes[0].set_ylabel('capital')
axes[0].legend(loc='upper left', frameon=True, fontsize=14)

# Trajectories with different s
Ξ΄ = 0.1
Ξ± = 0.33
s = (0.3, 0.4, 0.5)

for j in range(3):
    k[0] = 1
    for t in range(49):
        k[t+1] = s[j] * k[t]**Ξ± + (1 - Ξ΄) * k[t]
    axes[1].plot(k, 'o-', label=rf"$\alpha = {Ξ±},\; s = {s[j]},\; \delta={Ξ΄}$")

axes[1].grid(lw=0.2)
axes[1].set_xlabel('time')
axes[1].set_ylabel('capital')
axes[1].set_ylim(0, 18)
axes[1].legend(loc='upper left', frameon=True, fontsize=14)

# Trajectories with different Ξ΄
Ξ΄ = (0.05, 0.1, 0.15)
Ξ± = 0.33
s = 0.4

for j in range(3):
    k[0] = 1
    for t in range(49):
        k[t+1] = s * k[t]**Ξ± + (1 - Ξ΄[j]) * k[t]
    axes[2].plot(k, 'o-', label=rf"$\alpha = {Ξ±},\; s = {s},\; \delta={Ξ΄[j]}$")

axes[2].set_ylim(0, 18)
axes[2].set_xlabel('time')
axes[2].set_ylabel('capital')
axes[2].grid(lw=0.2)
axes[2].legend(loc='upper left', frameon=True, fontsize=14)

plt.show()

True, the code more or less follows PEP8.


At the same time, it’s very poorly structured.
Let’s talk about why that’s the case, and what we can do about it.

11.4 Good Coding Practice

There are usually many different ways to write a program that accomplishes a given task.
For small programs, like the one above, the way you write code doesn’t matter too much.
But if you are ambitious and want to produce useful things, you’ll write medium to large programs too.
In those settings, coding style matters a great deal.
Fortunately, lots of smart people have thought about the best way to write code.
Here are some basic precepts.

11.4.1 Don’t Use Magic Numbers

If you look at the code above, you’ll see numbers like 50 and 49 and 3 scattered through the code.
These kinds of numeric literals in the body of your code are sometimes called β€œmagic numbers”.
This is not a compliment.
While numeric literals are not all evil, the numbers shown in the program above should certainly be replaced by named constants.
For example, the code above could declare the variable time_series_length = 50.
Then in the loops, 49 should be replaced by time_series_length - 1.
The advantages are:

β€’ the meaning is much clearer throughout


β€’ to alter the time series length, you only need to change one value

11.4.2 Don’t Repeat Yourself

The other mortal sin in the code snippet above is repetition.


Blocks of logic (such as the loop to generate time series) are repeated with only minor
changes.
This violates a fundamental tenet of programming: Don’t repeat yourself (DRY).

β€’ Also called DIE (duplication is evil).

Yes, we realize that you can just cut and paste and change a few symbols.
But as a programmer, your aim should be to automate repetition, not do it yourself.
More importantly, repeating the same logic in different places means that eventually one of
them will likely be wrong.
If you want to know more, read the excellent summary found on this page.
We’ll talk about how to avoid repetition below.

11.4.3 Minimize Global Variables

Sure, global variables (i.e., names assigned to values outside of any function or class) are convenient.
Rookie programmers typically use global variables with abandon β€” as we once did ourselves.
But global variables are dangerous, especially in medium to large size programs, since

β€’ they can affect what happens in any part of your program


β€’ they can be changed by any function

This makes it much harder to be certain about what some small part of a given piece of code
actually commands.
Here’s a useful discussion on the topic.
While the odd global in small scripts is no big deal, we recommend that you teach yourself to
avoid them.
(We’ll discuss how just below).
JIT Compilation
In fact, there’s now another good reason to avoid global variables.
In scientific computing, we’re witnessing the rapid growth of just in time (JIT) compilation.
JIT compilation can generate excellent performance for scripting languages like Python and
Julia.
But the task of the compiler used for JIT compilation becomes much harder when many
global variables are present.
(This is because data type instability hinders the generation of efficient machine code β€” we’ll
learn more about such topics later on)
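To see what β€œtype instability” means, here is a small hypothetical example: a function whose return type depends on the data it happens to receive β€” exactly the pattern that prevents a JIT compiler from fixing machine types in advance:

```python
def unstable(x):
    # The return type changes with the value of x:
    if x > 0:
        return 1.0            # ...a float
    return "non-positive"     # ...or a string
```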

11.4.4 Use Functions or Classes

Fortunately, we can easily avoid the evils of global variables and WET code.

β€’ WET stands for β€œwe love typing” and is the opposite of DRY.

We can do this by making frequent use of functions or classes.


In fact, functions and classes are designed specifically to help us avoid shaming ourselves by
repeating code or excessive use of global variables.
Which One, Functions or Classes?
Both can be useful, and in fact they work well with each other.
We’ll learn more about these topics over time.
(Personal preference is part of the story too)
What’s really important is that you use one or the other or both.

11.5 Revisiting the Example

Here’s some code that reproduces the plot above with better coding style.
It uses a function to avoid repetition.
Note also that

β€’ global variables are quarantined by collecting them together at the end, not the start, of the program
β€’ magic numbers are avoided
β€’ the loop at the end where the actual work is done is short and relatively simple

[2]: from itertools import product

def plot_path(ax, Ξ±s, s_vals, Ξ΄s, series_length=50):
    """
    Add a time series plot to the axes ax for all given parameters.
    """
    k = np.empty(series_length)

    for (Ξ±, s, Ξ΄) in product(Ξ±s, s_vals, Ξ΄s):
        k[0] = 1
        for t in range(series_length-1):
            k[t+1] = s * k[t]**Ξ± + (1 - Ξ΄) * k[t]
        ax.plot(k, 'o-', label=rf"$\alpha = {Ξ±},\; s = {s},\; \delta = {Ξ΄}$")

    ax.grid(lw=0.2)
    ax.set_xlabel('time')
    ax.set_ylabel('capital')
    ax.set_ylim(0, 18)
    ax.legend(loc='upper left', frameon=True, fontsize=14)

fig, axes = plt.subplots(3, 1, figsize=(12, 15))

# Parameters (Ξ±s, s_vals, Ξ΄s)
set_one = ([0.25, 0.33, 0.45], [0.4], [0.1])
set_two = ([0.33], [0.3, 0.4, 0.5], [0.1])
set_three = ([0.33], [0.4], [0.05, 0.1, 0.15])

for (ax, params) in zip(axes, (set_one, set_two, set_three)):
    Ξ±s, s_vals, Ξ΄s = params
    plot_path(ax, Ξ±s, s_vals, Ξ΄s)

plt.show()

11.6 Summary

Writing decent code isn’t hard.


It’s also fun and intellectually satisfying.
We recommend that you cultivate good habits and style even when you write relatively short
programs.
Chapter 12

OOP II: Building Classes

12.1 Contents

β€’ Overview 12.2

β€’ OOP Review 12.3

β€’ Defining Your Own Classes 12.4

β€’ Special Methods 12.5

β€’ Exercises 12.6

β€’ Solutions 12.7

12.2 Overview

In an earlier lecture, we learned some foundations of object-oriented programming.


The objectives of this lecture are

β€’ cover OOP in more depth


β€’ learn how to build our own objects, specialized to our needs

For example, you already know how to

β€’ create lists, strings and other Python objects


β€’ use their methods to modify their contents

So imagine now you want to write a program with consumers, who can

β€’ hold and spend cash


β€’ consume goods
β€’ work and earn cash

A natural solution in Python would be to create consumers as objects with


β€’ data, such as cash on hand


β€’ methods, such as buy or work that affect this data

Python makes it easy to do this, by providing you with class definitions.


Classes are blueprints that help you build objects according to your own specifications.
It takes a little while to get used to the syntax so we’ll provide plenty of examples.
We’ll use the following imports:
[1]: import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

12.3 OOP Review

OOP is supported in many languages:

β€’ JAVA and Ruby are relatively pure OOP.


β€’ Python supports both procedural and object-oriented programming.
β€’ Fortran and MATLAB are mainly procedural, some OOP recently tacked on.
β€’ C is a procedural language, while C++ is C with OOP added on top.

Let’s cover general OOP concepts before we specialize to Python.

12.3.1 Key Concepts

As discussed in an earlier lecture, in the OOP paradigm, data and functions are bundled together into β€œobjects”.
An example is a Python list, which not only stores data but also knows how to sort itself, etc.
[2]: x = [1, 5, 4]
x.sort()
x

[2]: [1, 4, 5]

As we now know, sort is a function that is β€œpart of” the list object β€” and hence called a
method.
If we want to make our own types of objects we need to use class definitions.
A class definition is a blueprint for a particular class of objects (e.g., lists, strings or complex
numbers).
It describes

β€’ What kind of data the class stores


β€’ What methods it has for acting on these data

An object or instance is a realization of the class, created from the blueprint

β€’ Each instance has its own unique data.



β€’ Methods set out in the class definition act on this (and other) data.

In Python, the data and methods of an object are collectively referred to as attributes.
Attributes are accessed via β€œdotted attribute notation”

β€’ object_name.data
β€’ object_name.method_name()

In the example
x = [1, 5, 4]
x.sort()
x.__class__

list

β€’ x is an object or instance, created from the definition for Python lists, but with its own
particular data.
β€’ x.sort() and x.__class__ are two attributes of x.
β€’ dir(x) can be used to view all the attributes of x.

12.3.2 Why is OOP Useful?

OOP is useful for the same reason that abstraction is useful: for recognizing and exploiting common structure.
For example,

β€’ a Markov chain consists of a set of states and a collection of transition probabilities for
moving across states
β€’ a general equilibrium theory consists of a commodity space, preferences, technologies,
and an equilibrium definition
β€’ a game consists of a list of players, lists of actions available to each player, player pay-
offs as functions of all players’ actions, and a timing protocol

These are all abstractions that collect together β€œobjects” of the same β€œtype”.
Recognizing common structure allows us to employ common tools.
In economic theory, this might be a proposition that applies to all games of a certain type.
In Python, this might be a method that’s useful for all Markov chains (e.g., simulate).
When we use OOP, the simulate method is conveniently bundled together with the Markov
chain object.

12.4 Defining Your Own Classes

Let’s build some simple classes to start off.



12.4.1 Example: A Consumer Class

First, we’ll build a Consumer class with

β€’ a wealth attribute that stores the consumer’s wealth (data)


β€’ an earn method, where earn(y) increments the consumer’s wealth by y
β€’ a spend method, where spend(x) either decreases wealth by x or returns an error if
insufficient funds exist

Admittedly a little contrived, this example of a class helps us internalize some new syntax.
Here’s one implementation
class Consumer:
def __init__(self, w):
"Initialize consumer with w dollars of wealth"
self.wealth = w

def earn(self, y):


"The consumer earns y dollars"
self.wealth += y

def spend(self, x):


"The consumer spends x dollars if feasible"
new_wealth = self.wealth - x
if new_wealth < 0:
            print("Insufficient funds")
else:
self.wealth = new_wealth

There’s some special syntax here so let’s step through carefully

β€’ The class keyword indicates that we are building a class.

This class defines instance data wealth and three methods: __init__, earn and spend

β€’ wealth is instance data because each consumer we create (each instance of the
Consumer class) will have its own separate wealth data.

The ideas behind the earn and spend methods were discussed above.
Both of these act on the instance data wealth.
The __init__ method is a constructor method.
Whenever we create an instance of the class, this method will be called automatically.
Calling __init__ sets up a β€œnamespace” to hold the instance data β€” more on this soon.
We’ll also discuss the role of self just below.
Usage
Here’s an example of usage
c1 = Consumer(10) # Create instance with initial wealth 10
c1.spend(5)
c1.wealth

5

c1.earn(15)
c1.spend(100)

Insufficient funds

We can of course create multiple instances each with its own data
c1 = Consumer(10)
c2 = Consumer(12)
c2.spend(4)
c2.wealth

8

c1.wealth

10

In fact, each instance stores its data in a separate namespace dictionary


c1.__dict__

{'wealth': 10}

c2.__dict__

{'wealth': 8}

When we access or set attributes we’re actually just modifying the dictionary maintained by
the instance.
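To see this concretely, here's a small sketch (it restates a minimal Consumer so the snippet stands alone) showing that dotted attribute assignment and a direct write to the namespace dictionary are two views of the same data:

```python
# A minimal version of the Consumer class from above
class Consumer:

    def __init__(self, w):
        "Initialize consumer with w dollars of wealth"
        self.wealth = w

c = Consumer(10)
c.wealth = 20                # dotted attribute assignment...
print(c.__dict__)            # {'wealth': 20}

c.__dict__['wealth'] = 50    # ...and a direct dictionary write
print(c.wealth)              # 50
```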
Self
If you look at the Consumer class definition again you’ll see the word self throughout the
code.
The rules with self are that

β€’ Any instance data should be prepended with self

– e.g., the earn method references self.wealth rather than just wealth

β€’ Any method defined within the class should have self as its first argument

– e.g., def earn(self, y) rather than just def earn(y)

β€’ Any method referenced within the class should be called as self.method_name

There are no examples of the last rule in the preceding code but we will see some shortly.
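To preview the third rule, here's a toy class of our own (not part of the lecture series) in which the deposit method calls another method through self:

```python
class Account:
    "A toy class illustrating the self.method_name calling convention"

    def __init__(self, balance):
        self.balance = balance

    def log(self, msg):
        print(msg)

    def deposit(self, amount):
        self.balance += amount
        # A method referenced within the class is called via self
        self.log(f"new balance: {self.balance}")

a = Account(100)
a.deposit(50)    # prints "new balance: 150"
```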
Details
In this section, we look at some more formal details related to classes and self

β€’ You might wish to skip to the next section on first pass of this lecture.
β€’ You can return to these details after you’ve familiarized yourself with more examples.

Methods actually live inside a class object formed when the interpreter reads the class defini-
tion

print(Consumer.__dict__) # Show __dict__ attribute of class object

{'__module__': '__main__', '__init__': <function Consumer.__init__ at
0x7f1583b13488>, 'earn': <function Consumer.earn at 0x7f1583b13510>, 'spend':
<function Consumer.spend at 0x7f1583b13598>, '__dict__': <attribute '__dict__'
of 'Consumer' objects>, '__weakref__': <attribute '__weakref__' of 'Consumer'
objects>, '__doc__': None}

Note how the three methods __init__, earn and spend are stored in the class object.
Consider the following code
c1 = Consumer(10)
c1.earn(10)
c1.wealth

20

When you call earn via c1.earn(10) the interpreter passes the instance c1 and the argu-
ment 10 to Consumer.earn.
In fact, the following are equivalent

β€’ c1.earn(10)
β€’ Consumer.earn(c1, 10)

In the function call Consumer.earn(c1, 10) note that c1 is the first argument.
Recall that in the definition of the earn method, self is the first parameter
def earn(self, y):
    "The consumer earns y dollars"
    self.wealth += y

The end result is that self is bound to the instance c1 inside the function call.
That’s why the statement self.wealth += y inside earn ends up modifying c1.wealth.
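We can confirm the equivalence directly. Here's a sketch using a stripped-down Consumer that keeps only the earn method:

```python
# Stripped-down Consumer, enough to test the two calling styles
class Consumer:

    def __init__(self, w):
        self.wealth = w

    def earn(self, y):
        self.wealth += y

c1 = Consumer(10)
c1.earn(10)             # the usual bound-method call
Consumer.earn(c1, 10)   # the same call, passing the instance explicitly
print(c1.wealth)        # 30
```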

12.4.2 Example: The Solow Growth Model

For our next example, let’s write a simple class to implement the Solow growth model.
The Solow growth model is a neoclassical growth model where the amount of capital stock
per capita π‘˜π‘‘ evolves according to the rule

π‘ π‘§π‘˜π‘‘π›Ό + (1 βˆ’ 𝛿)π‘˜π‘‘
π‘˜π‘‘+1 = (1)
1+𝑛

Here

β€’ 𝑠 is an exogenously given savings rate


β€’ 𝑧 is a productivity parameter
β€’ 𝛼 is capital’s share of income
β€’ 𝑛 is the population growth rate
β€’ 𝛿 is the depreciation rate

The steady state of the model is the π‘˜ that solves Eq. (1) when π‘˜π‘‘+1 = π‘˜π‘‘ = π‘˜.
Here’s a class that implements this model.
Some points of interest in the code are

β€’ An instance maintains a record of its current capital stock in the variable self.k.

β€’ The h method implements the right-hand side of Eq. (1).

β€’ The update method uses h to update capital as per Eq. (1).

– Notice how inside update the reference to the local method h is self.h.

The methods steady_state and generate_sequence are fairly self-explanatory


class Solow:
    r"""
Implements the Solow growth model with the update rule

k_{t+1} = [(s z k^Ξ±_t) + (1 - Ξ΄)k_t] /(1 + n)

"""
def __init__(self, n=0.05, # population growth rate
s=0.25, # savings rate
Ξ΄=0.1, # depreciation rate
Ξ±=0.3, # share of capital
z=2.0, # productivity
k=1.0): # current capital stock

self.n, self.s, self.Ξ΄, self.Ξ±, self.z = n, s, Ξ΄, Ξ±, z


self.k = k

def h(self):
"Evaluate the h function"
# Unpack parameters (get rid of self to simplify notation)
n, s, Ξ΄, Ξ±, z = self.n, self.s, self.Ξ΄, self.Ξ±, self.z
# Apply the update rule
return (s * z * self.k**Ξ± + (1 - Ξ΄) * self.k) / (1 + n)

def update(self):
"Update the current state (i.e., the capital stock)."
self.k = self.h()

def steady_state(self):
"Compute the steady state value of capital."
# Unpack parameters (get rid of self to simplify notation)
n, s, Ξ΄, Ξ±, z = self.n, self.s, self.Ξ΄, self.Ξ±, self.z
# Compute and return steady state
return ((s * z) / (n + Ξ΄))**(1 / (1 - Ξ±))

def generate_sequence(self, t):


"Generate and return a time series of length t"
path = []
for i in range(t):
path.append(self.k)
self.update()
return path

Here’s a little program that uses the class to compute time series from two different initial
conditions.
The common steady state is also plotted for comparison
s1 = Solow()
s2 = Solow(k=8.0)

T = 25  # simulation length (the exact value was lost in extraction; 25 is an illustrative choice)

fig, ax = plt.subplots(figsize=(9, 6))

# Plot the common steady state value of capital


ax.plot([s1.steady_state()]*T, 'k-', label='steady state')

# Plot time series for each economy


for s in s1, s2:
lb = f'capital series from initial state {s.k}'
ax.plot(s.generate_sequence(T), 'o-', lw=2, alpha=0.6, label=lb)

ax.legend()
plt.show()

12.4.3 Example: A Market

Next, let’s write a class for a simple one good market where agents are price takers.
The market consists of the following objects:

β€’ A linear demand curve 𝑄 = π‘Žπ‘‘ βˆ’ 𝑏𝑑 𝑝


β€’ A linear supply curve 𝑄 = π‘Žπ‘§ + 𝑏𝑧 (𝑝 βˆ’ 𝑑)

Here

β€’ 𝑝 is price paid by the consumer, 𝑄 is quantity and 𝑑 is a per-unit tax.


β€’ Other symbols are demand and supply parameters.

The class provides methods to compute various values of interest, including competitive equi-
librium price and quantity, tax revenue raised, consumer surplus and producer surplus.
Here’s our implementation

from scipy.integrate import quad

class Market:

def __init__(self, ad, bd, az, bz, tax):


"""
Set up market parameters. All parameters are scalars. See
https://lectures.quantecon.org/py/python_oop.html for interpretation.

"""
self.ad, self.bd, self.az, self.bz, self.tax = ad, bd, az, bz, tax
if ad < az:
raise ValueError('Insufficient demand.')

def price(self):
"Return equilibrium price"
return (self.ad - self.az + self.bz * self.tax) / (self.bd + self.bz)

def quantity(self):
"Compute equilibrium quantity"
return self.ad - self.bd * self.price()

def consumer_surp(self):
"Compute consumer surplus"
# == Compute area under inverse demand function == #
integrand = lambda x: (self.ad / self.bd) - (1 / self.bd) * x
area, error = quad(integrand, 0, self.quantity())
return area - self.price() * self.quantity()

def producer_surp(self):
"Compute producer surplus"
# == Compute area above inverse supply curve, excluding tax == #
integrand = lambda x: -(self.az / self.bz) + (1 / self.bz) * x
area, error = quad(integrand, 0, self.quantity())
return (self.price() - self.tax) * self.quantity() - area

def taxrev(self):
"Compute tax revenue"
return self.tax * self.quantity()

def inverse_demand(self, x):


"Compute inverse demand"
return self.ad / self.bd - (1 / self.bd)* x

def inverse_supply(self, x):


"Compute inverse supply curve"
return -(self.az / self.bz) + (1 / self.bz) * x + self.tax

def inverse_supply_no_tax(self, x):


"Compute inverse supply curve without tax"
return -(self.az / self.bz) + (1 / self.bz) * x

Here’s a sample of usage


baseline_params = 15, .5, -2, .5, 3
m = Market(*baseline_params)
print("equilibrium price = ", m.price())

equilibrium price = 18.5

print("consumer surplus = ", m.consumer_surp())

consumer surplus = 33.0625
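As a consistency check on the price method (a sketch using the baseline parameters above), the equilibrium price should equate quantity demanded and quantity supplied:

```python
# Baseline parameters: ad, bd, az, bz, tax
ad, bd, az, bz, tax = 15, .5, -2, .5, 3

# The formula used by Market.price()
p_star = (ad - az + bz * tax) / (bd + bz)

demand = ad - bd * p_star            # Q = ad - bd p
supply = az + bz * (p_star - tax)    # Q = az + bz (p - t)

print(p_star)             # 18.5
print(demand - supply)    # 0.0
```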

Here’s a short program that uses this class to plot an inverse demand curve together with in-
verse supply curves with and without taxes

# Baseline ad, bd, az, bz, tax
baseline_params = 15, .5, -2, .5, 3
m = Market(*baseline_params)

q_max = m.quantity() * 2
q_grid = np.linspace(0.0, q_max, 100)
pd = m.inverse_demand(q_grid)
ps = m.inverse_supply(q_grid)
psno = m.inverse_supply_no_tax(q_grid)

fig, ax = plt.subplots()
ax.plot(q_grid, pd, lw=2, alpha=0.6, label='demand')
ax.plot(q_grid, ps, lw=2, alpha=0.6, label='supply')
ax.plot(q_grid, psno, '--k', lw=2, alpha=0.6, label='supply without tax')
ax.set_xlabel('quantity', fontsize=14)
ax.set_xlim(0, q_max)
ax.set_ylabel('price', fontsize=14)
ax.legend(loc='lower right', frameon=False, fontsize=14)
plt.show()

The next program provides a function that

β€’ takes an instance of Market as a parameter


β€’ computes dead weight loss from the imposition of the tax

def deadw(m):
    "Computes deadweight loss for market m."
# == Create analogous market with no tax == #
m_no_tax = Market(m.ad, m.bd, m.az, m.bz, 0)
# == Compare surplus, return difference == #
surp1 = m_no_tax.consumer_surp() + m_no_tax.producer_surp()
surp2 = m.consumer_surp() + m.producer_surp() + m.taxrev()
return surp1 - surp2

Here’s an example of usage



baseline_params = 15, .5, -2, .5, 3
m = Market(*baseline_params)
deadw(m) # Show deadweight loss

1.125
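For linear demand and supply curves, deadweight loss equals the area of the β€œHarberger triangle”: half the tax times the fall in equilibrium quantity. Here's a sketch (using the baseline parameters) checking that this matches the value of 1.125 computed above:

```python
ad, bd, az, bz, tax = 15, .5, -2, .5, 3   # baseline parameters

def quantity(t):
    "Equilibrium quantity when the per-unit tax is t"
    p = (ad - az + bz * t) / (bd + bz)
    return ad - bd * p

dwl = 0.5 * tax * (quantity(0) - quantity(tax))
print(dwl)   # 1.125
```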

12.4.4 Example: Chaos

Let’s look at one more example, related to chaotic dynamics in nonlinear systems.
One simple transition rule that can generate complex dynamics is the logistic map

π‘₯𝑑+1 = π‘Ÿπ‘₯𝑑 (1 βˆ’ π‘₯𝑑 ), π‘₯0 ∈ [0, 1], π‘Ÿ ∈ [0, 4] (2)

Let’s write a class for generating time series from this model.
Here’s one implementation
class Chaos:
    """
Models the dynamical system with :math:`x_{t+1} = r x_t (1 - x_t)`
"""
def __init__(self, x0, r):
"""
Initialize with state x0 and parameter r
"""
self.x, self.r = x0, r

def update(self):
"Apply the map to update state."
self.x = self.r * self.x *(1 - self.x)

def generate_sequence(self, n):


"Generate and return a sequence of length n."
path = []
for i in range(n):
path.append(self.x)
self.update()
return path

Here’s an example of usage


ch = Chaos(0.1, 4.0) # x0 = 0.1 and r = 4.0
ch.generate_sequence(5) # First 5 iterates

[0.1, 0.36000000000000004, 0.9216, 0.28901376000000006, 0.8219392261226498]

This piece of code plots a longer trajectory


ch = Chaos(0.1, 4.0)
ts_length = 250

fig, ax = plt.subplots()
ax.set_xlabel('$t$', fontsize=14)
ax.set_ylabel('$x_t$', fontsize=14)
x = ch.generate_sequence(ts_length)
ax.plot(range(ts_length), x, 'bo-', alpha=0.5, lw=2, label='$x_t$')
plt.show()

The next piece of code provides a bifurcation diagram


fig, ax = plt.subplots()
ch = Chaos(0.1, 4)
r = 2.5
while r < 4:
ch.r = r
t = ch.generate_sequence(1000)[950:]
ax.plot([r] * len(t), t, 'b.', ms=0.6)
r = r + 0.005

ax.set_xlabel('$r$', fontsize=16)
plt.show()

On the horizontal axis is the parameter π‘Ÿ in Eq. (2).


The vertical axis is the state space [0, 1].
For each π‘Ÿ we compute a long time series and then plot the tail (the last 50 points).
The tail of the sequence shows us where the trajectory concentrates after settling down to
some kind of steady state, if a steady state exists.
Whether it settles down, and the character of the steady state to which it does settle down,
depend on the value of π‘Ÿ.
For π‘Ÿ between about 2.5 and 3, the time series settles into a single fixed point plotted on the
vertical axis.
For π‘Ÿ between about 3 and 3.45, the time series settles down to oscillating between the two
values plotted on the vertical axis.
For π‘Ÿ a little bit higher than 3.45, the time series settles down to oscillating among the four
values plotted on the vertical axis.
Notice the narrow window around π‘Ÿ β‰ˆ 3.83 in which the time series settles down to oscillating among three values (the famous period-three window).

12.5 Special Methods

Python provides special methods with which some neat tricks can be performed.
For example, recall that lists and tuples have a notion of length and that this length can be
queried via the len function
x = (10, 20)
len(x)

2

If you want to provide a return value for the len function when applied to your user-defined
object, use the __len__ special method
class Foo:

    def __len__(self):
        return 42

Now we get
f = Foo()
len(f)

42

A special method we will use regularly is the __call__ method.


This method can be used to make your instances callable, just like functions
class Foo:

    def __call__(self, x):
        return x + 42

After running we get


f = Foo()
f(8) # Exactly equivalent to f.__call__(8)

50

Exercise 1 provides a more useful example.

12.6 Exercises

12.6.1 Exercise 1

The empirical cumulative distribution function (ecdf) corresponding to a sample {𝑋𝑖 }𝑛𝑖=1 is
defined as

𝐹𝑛 (π‘₯) ∢= (1/𝑛) βˆ‘π‘›π‘–=1 1{𝑋𝑖 ≀ π‘₯} (π‘₯ ∈ R) (3)

Here 1{𝑋𝑖 ≀ π‘₯} is an indicator function (one if 𝑋𝑖 ≀ π‘₯ and zero otherwise) and hence 𝐹𝑛 (π‘₯)
is the fraction of the sample that falls below π‘₯.
The Glivenko–Cantelli Theorem states that, provided that the sample is IID, the ecdf 𝐹𝑛 con-
verges to the true distribution function 𝐹 .
Implement 𝐹𝑛 as a class called ECDF, where

β€’ A given sample {𝑋𝑖 }𝑛𝑖=1 are the instance data, stored as self.observations.
β€’ The class implements a __call__ method that returns 𝐹𝑛 (π‘₯) for any π‘₯.

Your code should work as follows (modulo randomness)

from random import uniform



samples = [uniform(0, 1) for i in range(10)]


F = ECDF(samples)
F(0.5) # Evaluate ecdf at x = 0.5

F.observations = [uniform(0, 1) for i in range(1000)]


F(0.5)

Aim for clarity, not efficiency.

12.6.2 Exercise 2

In an earlier exercise, you wrote a function for evaluating polynomials.


This exercise is an extension, where the task is to build a simple class called Polynomial for
representing and manipulating polynomial functions such as

𝑝(π‘₯) = π‘Ž0 + π‘Ž1 π‘₯ + π‘Ž2 π‘₯Β² + β‹― + π‘Žπ‘ π‘₯𝑁 = βˆ‘π‘π‘›=0 π‘Žπ‘› π‘₯𝑛 (π‘₯ ∈ R) (4)

The instance data for the class Polynomial will be the coefficients (in the case of Eq. (4),
the numbers π‘Ž0 , … , π‘Žπ‘ ).
Provide methods that

1. Evaluate the polynomial Eq. (4), returning 𝑝(π‘₯) for any π‘₯.


2. Differentiate the polynomial, replacing the original coefficients with those of its deriva-
tive 𝑝′ .

Avoid using any import statements.

12.7 Solutions

12.7.1 Exercise 1
class ECDF:
def __init__(self, observations):
self.observations = observations

def __call__(self, x):


counter = 0.0
for obs in self.observations:
if obs <= x:
counter += 1
return counter / len(self.observations)

# == test == #
from random import uniform

samples = [uniform(0, 1) for i in range(10)]


F = ECDF(samples)

print(F(0.5)) # Evaluate ecdf at x = 0.5



F.observations = [uniform(0, 1) for i in range(1000)]

print(F(0.5))

0.4
0.48

12.7.2 Exercise 2
class Polynomial:
def __init__(self, coefficients):
"""
Creates an instance of the Polynomial class representing

p(x) = a_0 x^0 + ... + a_N x^N,

where a_i = coefficients[i].


"""
self.coefficients = coefficients

def __call__(self, x):


"Evaluate the polynomial at x."
y = 0
for i, a in enumerate(self.coefficients):
y += a * x**i
return y

def differentiate(self):
"Reset self.coefficients to those of p' instead of p."
new_coefficients = []
for i, a in enumerate(self.coefficients):
new_coefficients.append(i * a)
# Remove the first element, which is zero
del new_coefficients[0]
# And reset coefficients data to new values
self.coefficients = new_coefficients
return new_coefficients
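Here's a brief usage sketch, with the class restated compactly so the snippet is self-contained, evaluating 𝑝(π‘₯) = 1 + 2π‘₯ + 3π‘₯Β² and its derivative:

```python
# Compact restatement of the Polynomial class above
class Polynomial:

    def __init__(self, coefficients):
        self.coefficients = coefficients

    def __call__(self, x):
        "Evaluate the polynomial at x."
        return sum(a * x**i for i, a in enumerate(self.coefficients))

    def differentiate(self):
        "Reset self.coefficients to those of p' instead of p."
        self.coefficients = [i * a for i, a in enumerate(self.coefficients)][1:]
        return self.coefficients

p = Polynomial([1, 2, 3])   # p(x) = 1 + 2x + 3x^2
print(p(2))                 # 17
p.differentiate()           # now p(x) = 2 + 6x
print(p(2))                 # 14
```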
Chapter 13

OOP III: Samuelson Multiplier Accelerator

13.1 Contents

β€’ Overview 13.2

β€’ Details 13.3

β€’ Implementation 13.4

β€’ Stochastic Shocks 13.5

β€’ Government Spending 13.6

β€’ Wrapping Everything Into a Class 13.7

β€’ Using the LinearStateSpace Class 13.8

β€’ Pure Multiplier Model 13.9

β€’ Summary 13.10

Co-author: Natasha Watkins


In addition to what’s in Anaconda, this lecture will need the following libraries:
!pip install --upgrade quantecon

13.2 Overview

This lecture creates non-stochastic and stochastic versions of Paul Samuelson’s celebrated
multiplier accelerator model [118].
In doing so, we extend the example of the Solow model class in our second OOP lecture.
Our objectives are to

β€’ provide a more detailed example of OOP and classes


β€’ review a famous model


β€’ review linear difference equations, both deterministic and stochastic

Let’s start with some standard imports:


import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

We’ll also use the following for various tasks described below:
from quantecon import LinearStateSpace
import cmath
import math
import sympy
from sympy import Symbol, init_printing
from cmath import sqrt

13.2.1 Samuelson’s Model

Samuelson used a second-order linear difference equation to represent a model of national out-
put based on three components:

β€’ a national output identity asserting that national output is the sum of consumption
plus investment plus government purchases.
β€’ a Keynesian consumption function asserting that consumption at time 𝑑 is equal to a
constant times national output at time 𝑑 βˆ’ 1.
β€’ an investment accelerator asserting that investment at time 𝑑 equals a constant called
the accelerator coefficient times the difference in output between period 𝑑 βˆ’ 1 and 𝑑 βˆ’ 2.
β€’ the idea that consumption plus investment plus government purchases constitute aggre-
gate demand, which automatically calls forth an equal amount of aggregate supply.

(To read about linear difference equations see here or chapter IX of [121])
Samuelson used the model to analyze how particular values of the marginal propensity to
consume and the accelerator coefficient might give rise to transient business cycles in national
output.
Possible dynamic properties include

β€’ smooth convergence to a constant level of output


β€’ damped business cycles that eventually converge to a constant level of output
β€’ persistent business cycles that neither dampen nor explode

Later we present an extension that adds a random shock to the right side of the national in-
come identity representing random fluctuations in aggregate demand.
This modification makes national output become governed by a second-order stochastic linear
difference equation that, with appropriate parameter values, gives rise to recurrent irregular
business cycles.
(To read about stochastic linear difference equations see chapter XI of [121])

13.3 Details

Let’s assume that



β€’ {𝐺𝑑 } is a sequence of levels of government expenditures – we’ll start by setting 𝐺𝑑 = 𝐺 for all 𝑑.

β€’ {𝐢𝑑 } is a sequence of levels of aggregate consumption expenditures, a key endogenous variable in the model.

β€’ {𝐼𝑑 } is a sequence of rates of investment, another key endogenous variable.

β€’ {π‘Œπ‘‘ } is a sequence of levels of national income, yet another endogenous variable.

β€’ π‘Ž is the marginal propensity to consume in the Keynesian consumption function 𝐢𝑑 = π‘Žπ‘Œπ‘‘βˆ’1 + 𝛾.

β€’ 𝑏 is the β€œaccelerator coefficient” in the β€œinvestment accelerator” 𝐼𝑑 = 𝑏(π‘Œπ‘‘βˆ’1 βˆ’ π‘Œπ‘‘βˆ’2 ).

β€’ {πœ–π‘‘ } is an IID sequence of standard normal random variables.

β€’ 𝜎 β‰₯ 0 is a β€œvolatility” parameter β€” setting 𝜎 = 0 recovers the non-stochastic case that we’ll start with.

The model combines the consumption function

𝐢𝑑 = π‘Žπ‘Œπ‘‘βˆ’1 + 𝛾 (1)

with the investment accelerator

𝐼𝑑 = 𝑏(π‘Œπ‘‘βˆ’1 βˆ’ π‘Œπ‘‘βˆ’2 ) (2)

and the national income identity

π‘Œπ‘‘ = 𝐢𝑑 + 𝐼𝑑 + 𝐺𝑑 (3)

β€’ The parameter π‘Ž is peoples’ marginal propensity to consume out of income - equation


Eq. (1) asserts that people consume a fraction of math:a in (0,1) of each additional dol-
lar of income.
β€’ The parameter 𝑏 > 0 is the investment accelerator coefficient - equation Eq. (2) asserts
that people invest in physical capital when income is increasing and disinvest when it is
decreasing.

Equations Eq. (1), Eq. (2), and Eq. (3) imply the following second-order linear difference
equation for national income:

π‘Œπ‘‘ = (π‘Ž + 𝑏)π‘Œπ‘‘βˆ’1 βˆ’ π‘π‘Œπ‘‘βˆ’2 + (𝛾 + 𝐺𝑑 )

or

π‘Œπ‘‘ = 𝜌1 π‘Œπ‘‘βˆ’1 + 𝜌2 π‘Œπ‘‘βˆ’2 + (𝛾 + 𝐺𝑑 ) (4)

where 𝜌1 = (π‘Ž + 𝑏) and 𝜌2 = βˆ’π‘.



To complete the model, we require two initial conditions.


If the model is to generate time series for 𝑑 = 0, … , 𝑇 , we require initial values

π‘Œβˆ’1 = π‘ŒΜ„βˆ’1 , π‘Œβˆ’2 = π‘ŒΜ„βˆ’2

We’ll ordinarily set the parameters (π‘Ž, 𝑏) so that starting from an arbitrary pair of initial
conditions (π‘ŒΜ„βˆ’1 , π‘ŒΜ„βˆ’2 ), national income π‘Œπ‘‘ converges to a constant value as 𝑑 becomes large.

We are interested in studying

β€’ the transient fluctuations in π‘Œπ‘‘ as it converges to its steady state level


β€’ the rate at which it converges to a steady state level

The deterministic version of the model described so far β€” meaning that no random shocks
hit aggregate demand β€” has only transient fluctuations.
We can convert the model to one that has persistent irregular fluctuations by adding a ran-
dom shock to aggregate demand.

13.3.1 Stochastic Version of the Model

We create a random or stochastic version of the model by adding a random process of


shocks or disturbances {πœŽπœ–π‘‘ } to the right side of equation Eq. (4), leading to the second-
order scalar linear stochastic difference equation:

π‘Œπ‘‘ = 𝐺𝑑 + π‘Ž(1 βˆ’ 𝑏)π‘Œπ‘‘βˆ’1 βˆ’ π‘Žπ‘π‘Œπ‘‘βˆ’2 + πœŽπœ–π‘‘ (5)

13.3.2 Mathematical Analysis of the Model

To get started, let’s set 𝐺𝑑 ≑ 0, 𝜎 = 0, and 𝛾 = 0.


Then we can write equation Eq. (5) as

π‘Œπ‘‘ = 𝜌1 π‘Œπ‘‘βˆ’1 + 𝜌2 π‘Œπ‘‘βˆ’2

or

π‘Œπ‘‘+2 βˆ’ 𝜌1 π‘Œπ‘‘+1 βˆ’ 𝜌2 π‘Œπ‘‘ = 0 (6)

To discover the properties of the solution of Eq. (6), it is useful first to form the characteris-
tic polynomial for Eq. (6):

𝑧 2 βˆ’ 𝜌1 𝑧 βˆ’ 𝜌 2 (7)

where 𝑧 is possibly a complex number.


We want to find the two zeros (a.k.a. roots) – namely πœ†1 , πœ†2 – of the characteristic polyno-
mial.

These are two special values of 𝑧, say 𝑧 = πœ†1 and 𝑧 = πœ†2 , such that if we set 𝑧 equal to one of
these values in expression Eq. (7), the characteristic polynomial Eq. (7) equals zero:

𝑧2 βˆ’ 𝜌1 𝑧 βˆ’ 𝜌2 = (𝑧 βˆ’ πœ†1 )(𝑧 βˆ’ πœ†2 ) = 0 (8)

Equation Eq. (8) is said to factor the characteristic polynomial.


When the roots are complex, they will occur as a complex conjugate pair.
When the roots are complex, it is convenient to represent them in the polar form

πœ†1 = π‘Ÿπ‘’π‘–πœ” , πœ†2 = π‘Ÿπ‘’βˆ’π‘–πœ”

where π‘Ÿ is the amplitude of the complex number and πœ” is its angle or phase.
These can also be represented as

πœ†1 = π‘Ÿ(π‘π‘œπ‘ (πœ”) + 𝑖 sin(πœ”))

πœ†2 = π‘Ÿ(π‘π‘œπ‘ (πœ”) βˆ’ 𝑖 sin(πœ”))

(To read about the polar form, see here)


Given initial conditions π‘Œβˆ’1 , π‘Œβˆ’2 , we want to generate a solution of the difference equation
Eq. (6).
It can be represented as

π‘Œπ‘‘ = πœ†π‘‘1 𝑐1 + πœ†π‘‘2 𝑐2

where 𝑐1 and 𝑐2 are constants that depend on the two initial conditions and on 𝜌1 , 𝜌2 .
When the roots are complex, it is useful to pursue the following calculations.
Notice that

π‘Œπ‘‘ = 𝑐1 (π‘Ÿπ‘’π‘–πœ” )𝑑 + 𝑐2 (π‘Ÿπ‘’βˆ’π‘–πœ” )𝑑
= 𝑐1 π‘Ÿπ‘‘ π‘’π‘–πœ”π‘‘ + 𝑐2 π‘Ÿπ‘‘ π‘’βˆ’π‘–πœ”π‘‘
= 𝑐1 π‘Ÿπ‘‘ [cos(πœ”π‘‘) + 𝑖 sin(πœ”π‘‘)] + 𝑐2 π‘Ÿπ‘‘ [cos(πœ”π‘‘) βˆ’ 𝑖 sin(πœ”π‘‘)]
= (𝑐1 + 𝑐2 )π‘Ÿπ‘‘ cos(πœ”π‘‘) + 𝑖(𝑐1 βˆ’ 𝑐2 )π‘Ÿπ‘‘ sin(πœ”π‘‘)

The only way that π‘Œπ‘‘ can be a real number for each 𝑑 is if 𝑐1 + 𝑐2 is a real number and 𝑐1 βˆ’ 𝑐2
is an imaginary number.
This happens only when 𝑐1 and 𝑐2 are complex conjugates, in which case they can be written
in the polar forms

𝑐1 = π‘£π‘’π‘–πœƒ , 𝑐2 = π‘£π‘’βˆ’π‘–πœƒ

So we can write

π‘Œπ‘‘ = π‘£π‘’π‘–πœƒ π‘Ÿπ‘‘ π‘’π‘–πœ”π‘‘ + π‘£π‘’βˆ’π‘–πœƒ π‘Ÿπ‘‘ π‘’βˆ’π‘–πœ”π‘‘


= π‘£π‘Ÿπ‘‘ [𝑒𝑖(πœ”π‘‘+πœƒ) + π‘’βˆ’π‘–(πœ”π‘‘+πœƒ) ]
= 2π‘£π‘Ÿπ‘‘ cos(πœ”π‘‘ + πœƒ)

where 𝑣 and πœƒ are constants that must be chosen to satisfy initial conditions for π‘Œβˆ’1 , π‘Œβˆ’2 .
This formula shows that when the roots are complex, π‘Œπ‘‘ displays oscillations with period
π‘ΜŒ = 2πœ‹/πœ” and damping factor π‘Ÿ.

We say that π‘ΜŒ is the period because in that amount of time the cosine wave cos(πœ”π‘‘ + πœƒ) goes
through exactly one complete cycle.
(Draw a cosine function to convince yourself of this please)
Remark: Following [118], we want to choose the parameters π‘Ž, 𝑏 of the model so that the ab-
solute values (of the possibly complex) roots πœ†1 , πœ†2 of the characteristic polynomial are both
strictly less than one:

|πœ†π‘— | < 1 for 𝑗 = 1, 2

Remark: When both roots πœ†1 , πœ†2 of the characteristic polynomial have absolute values
strictly less than one, the absolute value of the larger one governs the rate of convergence to
the steady state of the non stochastic version of the model.

13.3.3 Things This Lecture Does

We write a function to generate simulations of a {π‘Œπ‘‘ } sequence as a function of time.


The function requires that we put in initial conditions for π‘Œβˆ’1 , π‘Œβˆ’2 .
The function checks that π‘Ž, 𝑏 are set so that πœ†1 , πœ†2 are less than unity in absolute value (also
called β€œmodulus”).
The function also tells us whether the roots are complex, and, if they are complex, returns
both their real and complex parts.
If the roots are both real, the function returns their values.
We use our function written to simulate paths that are stochastic (when 𝜎 > 0).
We have written the function in a way that allows us to input {𝐺𝑑 } paths of a few simple
forms, e.g.,

β€’ one time jumps in 𝐺 at some time


β€’ a permanent jump in 𝐺 that occurs at some time

We proceed to use the Samuelson multiplier-accelerator model as a laboratory to make a sim-


ple OOP example.
The β€œstate” that determines next period’s π‘Œπ‘‘+1 is now not just the current value π‘Œπ‘‘ but also
the once lagged value π‘Œπ‘‘βˆ’1 .
This involves a little more bookkeeping than is required in the Solow model class definition.
We use the Samuelson multiplier-accelerator model as a vehicle for teaching how we can grad-
ually add more features to the class.

We want to have a method in the class that automatically generates a simulation, either non-
stochastic (𝜎 = 0) or stochastic (𝜎 > 0).
We also show how to map the Samuelson model into a simple instance of the
LinearStateSpace class described here.
We can use a LinearStateSpace instance to do various things that we did above with our
homemade function and class.
Among other things, we show by example that the eigenvalues of the matrix 𝐴 that we use to
form the instance of the LinearStateSpace class for the Samuelson model equal the roots
of the characteristic polynomial Eq. (7) for the Samuelson multiplier accelerator model.
Here is the formula for the matrix 𝐴 in the linear state space system in the case that govern-
ment expenditures are a constant 𝐺:

    ⎑  1     0    0 ⎀
𝐴 = ⎒ 𝛾 + 𝐺  𝜌1   𝜌2 βŽ₯
    ⎣  0     1    0 ⎦

13.4 Implementation

We’ll start by drawing an informative graph from page 189 of [121]


def param_plot():
"""This function creates the graph on page 189 of
Sargent Macroeconomic Theory, second edition, 1987.
"""

fig, ax = plt.subplots(figsize=(10, 6))


ax.set_aspect('equal')

# Set axis
xmin, ymin = -3, -2
xmax, ymax = -xmin, -ymin
plt.axis([xmin, xmax, ymin, ymax])

# Set axis labels


ax.set(xticks=[], yticks=[])
ax.set_xlabel(r'$\rho_2$', fontsize=16)
ax.xaxis.set_label_position('top')
ax.set_ylabel(r'$\rho_1$', rotation=0, fontsize=16)
ax.yaxis.set_label_position('right')

    # Draw (ρ1, ρ2) boundary curves


ρ1 = np.linspace(-2, 2, 100)
ax.plot(ρ1, -abs(ρ1) + 1, c='black')
ax.plot(ρ1, np.ones_like(ρ1) * -1, c='black')
ax.plot(ρ1, -(ρ1**2 / 4), c='black')

# Turn normal axes off


for spine in ['left', 'bottom', 'top', 'right']:
ax.spines[spine].set_visible(False)

# Add arrows to represent axes


axes_arrows = {'arrowstyle': '<|-|>', 'lw': 1.3}
ax.annotate('', xy=(xmin, 0), xytext=(xmax, 0), arrowprops=axes_arrows)
ax.annotate('', xy=(0, ymin), xytext=(0, ymax), arrowprops=axes_arrows)

# Annotate the plot with equations


plot_arrowsl = {'arrowstyle': '-|>', 'connectionstyle': "arc3, rad=-0.2"}
plot_arrowsr = {'arrowstyle': '-|>', 'connectionstyle': "arc3, rad=0.2"}
ax.annotate(r'$\rho_1 + \rho_2 < 1$', xy=(0.5, 0.3), xytext=(0.8, 0.6),
arrowprops=plot_arrowsr, fontsize='12')
ax.annotate(r'$\rho_1 + \rho_2 = 1$', xy=(0.38, 0.6), xytext=(0.6, 0.8),
arrowprops=plot_arrowsr, fontsize='12')
ax.annotate(r'$\rho_2 < 1 + \rho_1$', xy=(-0.5, 0.3), xytext=(-1.3, 0.6),
arrowprops=plot_arrowsl, fontsize='12')
ax.annotate(r'$\rho_2 = 1 + \rho_1$', xy=(-0.38, 0.6), xytext=(-1, 0.8),
arrowprops=plot_arrowsl, fontsize='12')
ax.annotate(r'$\rho_2 = -1$', xy=(1.5, -1), xytext=(1.8, -1.3),
arrowprops=plot_arrowsl, fontsize='12')
ax.annotate(r'${\rho_1}^2 + 4\rho_2 = 0$', xy=(1.15, -0.35),
xytext=(1.5, -0.3), arrowprops=plot_arrowsr, fontsize='12')
ax.annotate(r'${\rho_1}^2 + 4\rho_2 < 0$', xy=(1.4, -0.7),
xytext=(1.8, -0.6), arrowprops=plot_arrowsr, fontsize='12')

# Label categories of solutions


ax.text(1.5, 1, 'Explosive\n growth', ha='center', fontsize=16)
ax.text(-1.5, 1, 'Explosive\n oscillations', ha='center', fontsize=16)
ax.text(0.05, -1.5, 'Explosive oscillations', ha='center', fontsize=16)
ax.text(0.09, -0.5, 'Damped oscillations', ha='center', fontsize=16)

# Add small marker to y-axis


ax.axhline(y=1.005, xmin=0.495, xmax=0.505, c='black')
ax.text(-0.12, -1.12, '-1', fontsize=10)
ax.text(-0.12, 0.98, '1', fontsize=10)

return fig

param_plot()
plt.show()

The graph portrays regions in which the (πœ†1 , πœ†2 ) root pairs implied by the (𝜌1 = (π‘Ž + 𝑏), 𝜌2 =
βˆ’π‘) difference equation parameter pairs in the Samuelson model are such that:

β€’ (πœ†1 , πœ†2 ) are complex with modulus less than 1 - in this case, the {π‘Œπ‘‘ } sequence displays
damped oscillations.
β€’ (πœ†1 , πœ†2 ) are both real, but one is strictly greater than 1 - this leads to explosive growth.
β€’ (πœ†1 , πœ†2 ) are both real, but one is strictly less than βˆ’1 - this leads to explosive oscilla-
tions.
β€’ (πœ†1 , πœ†2 ) are both real and both are less than 1 in absolute value - in this case, there is
smooth convergence to the steady state without damped cycles.

Later we’ll present the graph with a red mark showing the particular point implied by the
setting of (π‘Ž, 𝑏).

13.4.1 Function to Describe Implications of Characteristic Polynomial

def categorize_solution(ρ1, ρ2):
"""This function takes values of ρ1 and ρ2 and uses them
to classify the type of solution
"""

discriminant = ρ1 ** 2 + 4 * ρ2
if ρ2 > 1 + ρ1 or ρ2 < -1:
print('Explosive oscillations')
elif ρ1 + ρ2 > 1:
print('Explosive growth')
elif discriminant < 0:
print('Roots are complex with modulus less than one; \
therefore damped oscillations')
else:
print('Roots are real and absolute values are less than one; \
therefore get smooth convergence to a steady state')

### Test the categorize_solution function

categorize_solution(1.3, -.4)

Roots are real and absolute values are less than one; therefore get smooth
convergence to a steady state

13.4.2 Function for Plotting Paths

A useful function for our work below is


def plot_y(function=None):
"""Function plots path of Y_t"""

plt.subplots(figsize=(10, 6))
plt.plot(function)
plt.xlabel('Time $t$')
plt.ylabel('$Y_t$', rotation=0)
plt.grid()
plt.show()

13.4.3 Manual or β€œby hand” Root Calculations

The following function calculates roots of the characteristic polynomial using high school algebra.
(We’ll calculate the roots in other ways later)
The function also plots a π‘Œπ‘‘ starting from initial conditions that we set

# This is a 'manual' method

def y_nonstochastic(y_0=100, y_1=80, Ξ±=.92, Ξ²=.5, Ξ³=10, n=80):

"""Takes values of parameters and computes the roots of the characteristic
polynomial. It tells whether they are real or complex and whether they
are less than unity in absolute value. It also computes a simulation of
length n starting from the two given initial conditions for national
income
"""

roots = []

ρ1 = α + β
ρ2 = -β

print(f'ρ_1 is {ρ1}')
print(f'ρ_2 is {ρ2}')

discriminant = ρ1 ** 2 + 4 * ρ2

# Roots of z**2 - ρ1*z - ρ2 via the quadratic formula
if discriminant == 0:
    roots.append(ρ1 / 2)
    print('Single real root: ')
    print(''.join(str(roots)))
elif discriminant > 0:
    roots.append((ρ1 + sqrt(discriminant).real) / 2)
    roots.append((ρ1 - sqrt(discriminant).real) / 2)
    print('Two real roots: ')
    print(''.join(str(roots)))
else:
    roots.append((ρ1 + sqrt(discriminant)) / 2)
    roots.append((ρ1 - sqrt(discriminant)) / 2)
    print('Two complex roots: ')
    print(''.join(str(roots)))

if all(abs(root) < 1 for root in roots):


print('Absolute values of roots are less than one')
else:
print('Absolute values of roots are not less than one')

def transition(x, t): return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ

y_t = [y_0, y_1]

for t in range(2, n):


y_t.append(transition(y_t, t))

return y_t

plot_y(y_nonstochastic())

ρ_1 is 1.42
ρ_2 is -0.5
Two real roots:
[0.7740312423743284, 0.6459687576256715]
Absolute values of roots are less than one

13.4.4 Reverse-Engineering Parameters to Generate Damped Cycles

The next cell writes code that takes as inputs the modulus π‘Ÿ and phase πœ™ of a conjugate pair
of complex numbers in polar form

πœ†1 = π‘Ÿ exp(π‘–πœ™), πœ†2 = π‘Ÿ exp(βˆ’π‘–πœ™)

β€’ The code assumes that these two complex numbers are the roots of the characteristic
polynomial
β€’ It then reverse-engineers (π‘Ž, 𝑏) and (𝜌1 , 𝜌2 ), pairs that would generate those roots

### code to reverse-engineer a cycle
###
### y_t = r^t (c_1 cos(Ο† t) + c_2 sin(Ο† t))
###

def f(r, Ο†):


"""
Takes modulus r and angle Ο† of the complex number r exp(j Ο†)
and creates ρ1 and ρ2 of the characteristic polynomial for which
r exp(j Ο†) and r exp(- j Ο†) are complex roots.

Returns the multiplier coefficient a and the accelerator coefficient b
that verify those roots.
"""
g1 = cmath.rect(r, Ο†)    # Generate two complex roots
g2 = cmath.rect(r, -Ο†)
ρ1 = g1 + g2 # Implied ρ1, ρ2
ρ2 = -g1 * g2
b = -ρ2 # Reverse-engineer a and b that validate these
a = ρ1 - b
return ρ1, ρ2, a, b

## Now let's use the function in an example

## Here are the example parameters

r = .95
period = 10   # Length of cycle in units of time
Ο† = 2 * math.pi / period

## Apply the function

ρ1, ρ2, a, b = f(r, Ο†)

print(f"a, b = {a}, {b}")


print(f"ρ1, ρ2 = {ρ1}, {ρ2}")

a, b = (0.6346322893124001+0j), (0.9024999999999999-0j)
ρ1, ρ2 = (1.5371322893124+0j), (-0.9024999999999999+0j)

## Print the real components of ρ1 and ρ2


ρ1 = ρ1.real
ρ2 = ρ2.real

ρ1, ρ2

(1.5371322893124, -0.9024999999999999)

13.4.5 Root Finding Using Numpy

Here we’ll use numpy to compute the roots of the characteristic polynomial
r1, r2 = np.roots([1, -ρ1, -ρ2])
p1 = cmath.polar(r1)
p2 = cmath.polar(r2)

print(f"r, Ο† = {r}, {Ο†}")


print(f"p1, p2 = {p1}, {p2}")
# print(f"g1, g2 = {g1}, {g2}")

print(f"a, b = {a}, {b}")


print(f"ρ1, ρ2 = {ρ1}, {ρ2}")

r, Ο† = 0.95, 0.6283185307179586
p1, p2 = (0.95, 0.6283185307179586), (0.95, -0.6283185307179586)
a, b = (0.6346322893124001+0j), (0.9024999999999999-0j)
ρ1, ρ2 = 1.5371322893124, -0.9024999999999999

#=== This method uses numpy to calculate roots ===#

def y_nonstochastic(y_0=100, y_1=80, Ξ±=.9, Ξ²=.8, Ξ³=10, n=80):

""" Rather than computing the roots of the characteristic


polynomial by hand as we did earlier, this function
enlists numpy to do the work for us
"""

# Useful constants
ρ1 = α + β
ρ2 = -β

categorize_solution(ρ1, ρ2)

# Find roots of polynomial


roots = np.roots([1, -ρ1, -ρ2])
print(f'Roots are {roots}')

# Check if real or complex


if all(isinstance(root, complex) for root in roots):


print('Roots are complex')
else:
print('Roots are real')

# Check if roots are less than one


if all(abs(root) < 1 for root in roots):
print('Roots are less than one')
else:
print('Roots are not less than one')

# Define transition equation


def transition(x, t): return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ

# Set initial conditions


y_t = [y_0, y_1]

# Generate y_t series


for t in range(2, n):
y_t.append(transition(y_t, t))

return y_t

plot_y(y_nonstochastic())

Roots are complex with modulus less than one; therefore damped oscillations
Roots are [0.85+0.27838822j 0.85-0.27838822j]
Roots are complex
Roots are less than one

13.4.6 Reverse-Engineered Complex Roots: Example

The next cell studies the implications of reverse-engineered complex roots.


We’ll generate an undamped cycle of period 10

r = 1         # Generates undamped, nonexplosive cycles
period = 10   # Length of cycle in units of time
Ο† = 2 * math.pi / period

## Apply the reverse-engineering function f

ρ1, ρ2, a, b = f(r, Ο†)

# Drop the imaginary part so that it is a valid input into y_nonstochastic


a = a.real
b = b.real

print(f"a, b = {a}, {b}")

ytemp = y_nonstochastic(Ξ±=a, Ξ²=b, y_0=20, y_1=30)


plot_y(ytemp)

a, b = 0.6180339887498949, 1.0
Roots are complex with modulus less than one; therefore damped oscillations
Roots are [0.80901699+0.58778525j 0.80901699-0.58778525j]
Roots are complex
Roots are less than one

13.4.7 Digression: Using Sympy to Find Roots

We can also use sympy to compute analytic formulas for the roots
init_printing()
r1 = Symbol("ρ_1")
r2 = Symbol("ρ_2")
z = Symbol("z")

sympy.solve(z**2 - r1*z - r2, z)


$$
\left[ \frac{\rho_1}{2} - \frac{\sqrt{\rho_1^2 + 4 \rho_2}}{2}, \
       \frac{\rho_1}{2} + \frac{\sqrt{\rho_1^2 + 4 \rho_2}}{2} \right]
$$

a = Symbol("Ξ±")
b = Symbol("Ξ²")
r1 = a + b
r2 = -b

sympy.solve(z**2 - r1*z - r2, z)

$$
\left[ \frac{\alpha}{2} + \frac{\beta}{2} - \frac{\sqrt{\alpha^2 + 2 \alpha \beta + \beta^2 - 4 \beta}}{2}, \
       \frac{\alpha}{2} + \frac{\beta}{2} + \frac{\sqrt{\alpha^2 + 2 \alpha \beta + \beta^2 - 4 \beta}}{2} \right]
$$

13.5 Stochastic Shocks

Now we’ll construct some code to simulate the stochastic version of the model that emerges
when we add a random shock process to aggregate demand
def y_stochastic(y_0=0, y_1=0, Ξ±=0.8, Ξ²=0.2, Ξ³=10, n=100, Οƒ=5):
"""This function takes parameters of a stochastic version of
the model and proceeds to analyze the roots of the characteristic
polynomial and also generate a simulation.
"""

# Useful constants
ρ1 = α + β
ρ2 = -β

# Categorize solution
categorize_solution(ρ1, ρ2)

# Find roots of polynomial


roots = np.roots([1, -ρ1, -ρ2])
print(roots)

# Check if real or complex


if all(isinstance(root, complex) for root in roots):
print('Roots are complex')
else:
print('Roots are real')

# Check if roots are less than one


if all(abs(root) < 1 for root in roots):
print('Roots are less than one')
else:
print('Roots are not less than one')

# Generate shocks
Ξ΅ = np.random.normal(0, 1, n)

# Define transition equation


def transition(x, t): return ρ1 * \
    x[t - 1] + ρ2 * x[t - 2] + Ξ³ + Οƒ * Ξ΅[t]

# Set initial conditions


y_t = [y_0, y_1]

# Generate y_t series


for t in range(2, n):


y_t.append(transition(y_t, t))

return y_t

plot_y(y_stochastic())

Roots are real and absolute values are less than one; therefore get smooth
convergence to a steady state
[0.7236068 0.2763932]
Roots are real
Roots are less than one

Let’s do a simulation in which there are shocks and the characteristic polynomial has complex
roots
r = .97
period = 10   # Length of cycle in units of time
Ο† = 2 * math.pi / period

### Apply the reverse-engineering function f

ρ1, ρ2, a, b = f(r, Ο†)

# Drop the imaginary part so that it is a valid input into y_nonstochastic


a = a.real
b = b.real

print(f"a, b = {a}, {b}")


plot_y(y_stochastic(y_0=40, y_1 = 42, Ξ±=a, Ξ²=b, Οƒ=2, n=100))

a, b = 0.6285929690873979, 0.9409000000000001
Roots are complex with modulus less than one; therefore damped oscillations
[0.78474648+0.57015169j 0.78474648-0.57015169j]
Roots are complex
Roots are less than one

13.6 Government Spending

This function computes a response to either a permanent or one-off increase in government expenditures
def y_stochastic_g(y_0=20,
y_1=20,
Ξ±=0.8,
Ξ²=0.2,
Ξ³=10,
n=100,
Οƒ=2,
g=0,
g_t=0,
duration='permanent'):

"""This function computes a response to a permanent or one-off increase
in government expenditures that occurs at time g_t
"""

# Useful constants
ρ1 = α + β
ρ2 = -β

# Categorize solution
categorize_solution(ρ1, ρ2)

# Find roots of polynomial


roots = np.roots([1, -ρ1, -ρ2])
print(roots)

# Check if real or complex


if all(isinstance(root, complex) for root in roots):
print('Roots are complex')
else:
print('Roots are real')

# Check if roots are less than one


if all(abs(root) < 1 for root in roots):


print('Roots are less than one')
else:
print('Roots are not less than one')

# Generate shocks
Ξ΅ = np.random.normal(0, 1, n)

def transition(x, t, g=0):

    # Non-stochastic - separated to avoid using random draws
    # when not needed
    if Οƒ == 0:
        return ρ1 * x[t - 1] + ρ2 * x[t - 2] + γ + g

    # Stochastic
    else:
        return ρ1 * x[t - 1] + ρ2 * x[t - 2] + Ξ³ + g + Οƒ * Ξ΅[t]

# Create list and set initial conditions


y_t = [y_0, y_1]

# Generate y_t series


for t in range(2, n):

# No government spending
if g == 0:
y_t.append(transition(y_t, t))

# Government spending (no shock)


elif g != 0 and duration == None:
y_t.append(transition(y_t, t))

# Permanent government spending shock


elif duration == 'permanent':
if t < g_t:
y_t.append(transition(y_t, t, g=0))
else:
y_t.append(transition(y_t, t, g=g))

# One-off government spending shock


elif duration == 'one-off':
if t == g_t:
y_t.append(transition(y_t, t, g=g))
else:
y_t.append(transition(y_t, t, g=0))
return y_t

A permanent government spending shock can be simulated as follows


plot_y(y_stochastic_g(g=10, g_t=20, duration='permanent'))

Roots are real and absolute values are less than one; therefore get smooth
convergence to a steady state
[0.7236068 0.2763932]
Roots are real
Roots are less than one

We can also see the response to a one time jump in government expenditures
plot_y(y_stochastic_g(g=500, g_t=50, duration='one-off'))

Roots are real and absolute values are less than one; therefore get smooth
convergence to a steady state
[0.7236068 0.2763932]
Roots are real
Roots are less than one

13.7 Wrapping Everything Into a Class

Up to now, we have written functions to do the work.


Now we’ll roll up our sleeves and write a Python class called Samuelson for the Samuelson
model
class Samuelson():
"""This class represents the Samuelson model, otherwise known as the
multiplier-accelerator model. The model combines the Keynesian multiplier
with the accelerator theory of investment.

The path of output is governed by a linear second-order difference equation

.. math::

Y_t = \gamma + g + (\alpha + \beta) Y_{t-1} - \beta Y_{t-2} + \sigma \epsilon_t

Parameters
----------
y_0 : scalar
Initial condition for Y_0
y_1 : scalar
Initial condition for Y_1
Ξ± : scalar
Marginal propensity to consume
Ξ² : scalar
Accelerator coefficient
n : int
Number of iterations
Οƒ : scalar
Volatility parameter. It must be greater than or equal to 0. Set
equal to 0 for a non-stochastic model.
g : scalar
Government spending shock
g_t : int
Time at which government spending shock occurs. Must be specified
when duration != None.
duration : {None, 'permanent', 'one-off'}
Specifies type of government spending shock. If none, government
spending equal to g for all t.

"""

def __init__(self,
y_0=100,
y_1=50,
Ξ±=1.3,
Ξ²=0.2,
Ξ³=10,
n=100,
Οƒ=0,
g=0,
g_t=0,
duration=None):

self.y_0, self.y_1, self.Ξ±, self.Ξ² = y_0, y_1, Ξ±, Ξ²


self.n, self.g, self.g_t, self.duration = n, g, g_t, duration
self.Ξ³, self.Οƒ = Ξ³, Οƒ
self.ρ1 = α + β
self.ρ2 = -β
self.roots = np.roots([1, -self.ρ1, -self.ρ2])

def root_type(self):
if all(isinstance(root, complex) for root in self.roots):
return 'Complex conjugate'
elif len(self.roots) > 1:
return 'Double real'
else:
return 'Single real'

def root_less_than_one(self):
if all(abs(root) < 1 for root in self.roots):
return True

def solution_type(self):
ρ1, ρ2 = self.ρ1, self.ρ2
discriminant = ρ1 ** 2 + 4 * ρ2
if ρ2 >= 1 + ρ1 or ρ2 <= -1:
return 'Explosive oscillations'
elif ρ1 + ρ2 >= 1:
return 'Explosive growth'
elif discriminant < 0:
return 'Damped oscillations'
else:
return 'Steady state'

def _transition(self, x, t, g=0):

    # Non-stochastic - separated to avoid using random draws
    # when not needed
    if self.Οƒ == 0:
        return self.ρ1 * x[t - 1] + self.ρ2 * x[t - 2] + self.γ + g

    # Stochastic
    else:
        Ξ΅ = np.random.normal(0, 1)
        return self.ρ1 * x[t - 1] + self.ρ2 * x[t - 2] + self.Ξ³ + g \
            + self.Οƒ * Ξ΅

def generate_series(self):

# Create list and set initial conditions


y_t = [self.y_0, self.y_1]

# Generate y_t series


for t in range(2, self.n):

# No government spending
if self.g == 0:
y_t.append(self._transition(y_t, t))

# Government spending (no shock)


elif self.g != 0 and self.duration == None:
y_t.append(self._transition(y_t, t))

# Permanent government spending shock


elif self.duration == 'permanent':
if t < self.g_t:
y_t.append(self._transition(y_t, t, g=0))
else:
y_t.append(self._transition(y_t, t, g=self.g))

# One-off government spending shock


elif self.duration == 'one-off':
if t == self.g_t:
y_t.append(self._transition(y_t, t, g=self.g))
else:
y_t.append(self._transition(y_t, t, g=0))
return y_t

def summary(self):
print('Summary\n' + '-' * 50)
print(f'Root type: {self.root_type()}')
print(f'Solution type: {self.solution_type()}')
print(f'Roots: {str(self.roots)}')

if self.root_less_than_one() == True:
print('Absolute value of roots is less than one')
else:
print('Absolute value of roots is not less than one')

if self.Οƒ > 0:
print('Stochastic series with Οƒ = ' + str(self.Οƒ))


else:
print('Non-stochastic series')

if self.g != 0:
print('Government spending equal to ' + str(self.g))

if self.duration != None:
print(self.duration.capitalize() +
' government spending shock at t = ' + str(self.g_t))

def plot(self):
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(self.generate_series())
ax.set(xlabel='Iteration', xlim=(0, self.n))
ax.set_ylabel('$Y_t$', rotation=0)
ax.grid()

# Add parameter values to plot


paramstr = f'$\\alpha={self.Ξ±:.2f}$ \n $\\beta={self.Ξ²:.2f}$ \n \
$\\gamma={self.Ξ³:.2f}$ \n $\\sigma={self.Οƒ:.2f}$ \n \
$\\rho_1={self.ρ1:.2f}$ \n $\\rho_2={self.ρ2:.2f}$'
props = dict(fc='white', pad=10, alpha=0.5)
ax.text(0.87, 0.05, paramstr, transform=ax.transAxes,
fontsize=12, bbox=props, va='bottom')

return fig

def param_plot(self):

# Uses the param_plot() function defined earlier (it is then able


# to be used standalone or as part of the model)

fig = param_plot()
ax = fig.gca()

# Add Ξ» values to legend


for i, root in enumerate(self.roots):
    if isinstance(root, complex):
        # Need to fill operator for positive as string is split apart
        operator = ['+', '']
        label = rf'$\lambda_{i+1} = {self.roots[i].real:.2f} \
            {operator[i]} {self.roots[i].imag:.2f}i$'
    else:
        label = rf'$\lambda_{i+1} = {self.roots[i].real:.2f}$'
    ax.scatter(0, 0, 0, label=label)   # dummy to add to legend

# Add ρ pair to plot


ax.scatter(self.ρ1, self.ρ2, 100, 'red', '+',
label=r'$(\ \rho_1, \ \rho_2 \ )$', zorder=5)

plt.legend(fontsize=12, loc=3)

return fig

13.7.1 Illustration of Samuelson Class

Now we’ll put our Samuelson class to work on an example


sam = Samuelson(Ξ±=0.8, Ξ²=0.5, Οƒ=2, g=10, g_t=20, duration='permanent')
[22]: sam.summary()

Summary
--------------------------------------------------
Root type: Complex conjugate
Solution type: Damped oscillations
Roots: [0.65+0.27838822j 0.65-0.27838822j]
Absolute value of roots is less than one
Stochastic series with Οƒ = 2
Government spending equal to 10


Permanent government spending shock at t = 20

sam.plot()
plt.show()

13.7.2 Using the Graph

We’ll use our graph to show where the roots lie and how their location is consistent with the
behavior of the path just graphed.
The red + sign shows the location of the roots
sam.param_plot()
plt.show()

13.8 Using the LinearStateSpace Class

It turns out that we can use the QuantEcon.py LinearStateSpace class to do much of the
work that we have done from scratch above.
Here is how we map the Samuelson model into an instance of a LinearStateSpace class
"""This script maps the Samuelson model into the
``LinearStateSpace`` class
"""
Ξ± = 0.8
Ξ² = 0.9
ρ1 = α + β
ρ2 = -β
Ξ³ = 10
Οƒ = 1
g = 10
n = 100

A = [[1, 0, 0],
[γ + g, ρ1, ρ2],
[0, 1, 0]]

G = [[γ + g, ρ1, ρ2], # this is Y_{t+1}


[Ξ³, Ξ±, 0], # this is C_{t+1}
[0, Ξ², -Ξ²]] # this is I_{t+1}

ΞΌ_0 = [1, 100, 100]


C = np.zeros((3,1))
C[1] = Οƒ # stochastic

sam_t = LinearStateSpace(A, C, G, mu_0=ΞΌ_0)

x, y = sam_t.simulate(ts_length=n)

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(12, 8))


titles = ['Output ($Y_t$)', 'Consumption ($C_t$)', 'Investment ($I_t$)']
colors = ['darkblue', 'red', 'purple']
for ax, series, title, color in zip(axes, y, titles, colors):
ax.plot(series, color=color)
ax.set(title=title, xlim=(0, n))
ax.grid()

axes[-1].set_xlabel('Iteration')

plt.show()

13.8.1 Other Methods in the LinearStateSpace Class

Let’s plot impulse response functions for the instance of the Samuelson model using a
method in the LinearStateSpace class
imres = sam_t.impulse_response()
imres = np.asarray(imres)
y1 = imres[:, :, 0]
y2 = imres[:, :, 1]
y1.shape

(2, 6, 1)

Now let’s compute the zeros of the characteristic polynomial by simply calculating the eigen-
values of 𝐴
196 CHAPTER 13. OOP III: SAMUELSON MULTIPLIER ACCELERATOR

A = np.asarray(A)
w, v = np.linalg.eig(A)
print(w)

[0.85+0.42130749j 0.85-0.42130749j 1. +0.j ]

13.8.2 Inheriting Methods from LinearStateSpace

We could also create a subclass of LinearStateSpace (inheriting all its methods and
attributes) to add more functions to use
class SamuelsonLSS(LinearStateSpace):
"""
This subclass creates a Samuelson multiplier-accelerator model
as a linear state space system.
"""
def __init__(self,
y_0=100,
y_1=100,
Ξ±=0.8,
Ξ²=0.9,
Ξ³=10,
Οƒ=1,
g=10):

self.Ξ±, self.Ξ² = Ξ±, Ξ²
self.y_0, self.y_1, self.g = y_0, y_1, g
self.Ξ³, self.Οƒ = Ξ³, Οƒ

# Define initial conditions


self.ΞΌ_0 = [1, y_0, y_1]

self.ρ1 = α + β
self.ρ2 = -β

# Define transition matrix


self.A = [[1, 0, 0],
[γ + g, self.ρ1, self.ρ2],
[0, 1, 0]]

# Define output matrix


self.G = [[γ + g, self.ρ1, self.ρ2], # this is Y_{t+1}
[Ξ³, Ξ±, 0], # this is C_{t+1}
[0, Ξ², -Ξ²]] # this is I_{t+1}

self.C = np.zeros((3, 1))


self.C[1] = Οƒ # stochastic

# Initialize LSS with parameters from Samuelson model


LinearStateSpace.__init__(self, self.A, self.C, self.G, mu_0=self.ΞΌ_0)

def plot_simulation(self, ts_length=100, stationary=True):

# Temporarily store original parameters


temp_ΞΌ = self.ΞΌ_0
temp_Ξ£ = self.Sigma_0

# Set distribution parameters equal to their stationary


# values for simulation
if stationary == True:
try:
self.ΞΌ_x, self.ΞΌ_y, self.Οƒ_x, self.Οƒ_y = \
self.stationary_distributions()
self.ΞΌ_0 = self.ΞΌ_y
self.Sigma_0 = self.Οƒ_y
# Exception where no convergence achieved when
# calculating stationary distributions
except ValueError:
print('Stationary distribution does not exist')

x, y = self.simulate(ts_length)

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(12, 8))


titles = ['Output ($Y_t$)', 'Consumption ($C_t$)', 'Investment ($I_t$)']
colors = ['darkblue', 'red', 'purple']
for ax, series, title, color in zip(axes, y, titles, colors):
ax.plot(series, color=color)
ax.set(title=title, xlim=(0, ts_length))
ax.grid()

axes[-1].set_xlabel('Iteration')

# Reset distribution parameters to their initial values


self.ΞΌ_0 = temp_ΞΌ
self.Sigma_0 = temp_Ξ£

return fig

def plot_irf(self, j=5):

x, y = self.impulse_response(j)

# Reshape into 3 x j matrix for plotting purposes


yimf = np.array(y).flatten().reshape(j+1, 3).T

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(12, 8))


labels = ['$Y_t$', '$C_t$', '$I_t$']
colors = ['darkblue', 'red', 'purple']
for ax, series, label, color in zip(axes, yimf, labels, colors):
ax.plot(series, color=color)
ax.set(xlim=(0, j))
ax.set_ylabel(label, rotation=0, fontsize=14, labelpad=10)
ax.grid()

axes[0].set_title('Impulse Response Functions')


axes[-1].set_xlabel('Iteration')

return fig

def multipliers(self, j=5):


x, y = self.impulse_response(j)
return np.sum(np.array(y).flatten().reshape(j+1, 3), axis=0)

13.8.3 Illustrations

Let’s show how we can use the SamuelsonLSS


samlss = SamuelsonLSS()

samlss.plot_simulation(100, stationary=False)
plt.show()

samlss.plot_simulation(100, stationary=True)
plt.show()

samlss.plot_irf(100)
plt.show()

samlss.multipliers()

array([7.414389, 6.835896, 0.578493])

13.9 Pure Multiplier Model

Let’s shut down the accelerator by setting 𝑏 = 0 to get a pure multiplier model

β€’ the absence of cycles gives an idea about why Samuelson included the accelerator

pure_multiplier = SamuelsonLSS(Ξ±=0.95, Ξ²=0)

pure_multiplier.plot_simulation()

Stationary distribution does not exist

pure_multiplier = SamuelsonLSS(Ξ±=0.8, Ξ²=0)

pure_multiplier.plot_simulation()

pure_multiplier.plot_irf(100)

13.10 Summary

In this lecture, we wrote functions and classes to represent non-stochastic and stochastic
versions of the Samuelson (1939) multiplier-accelerator model, described in [118].

We saw that different parameter values led to different output paths, which could either be
stationary, explosive, or oscillating.
We also were able to represent the model using the QuantEcon.py LinearStateSpace class.
Chapter 14

More Language Features

14.1 Contents

β€’ Overview 14.2
β€’ Iterables and Iterators 14.3
β€’ Names and Name Resolution 14.4
β€’ Handling Errors 14.5
β€’ Decorators and Descriptors 14.6
β€’ Generators 14.7
β€’ Recursive Function Calls 14.8
β€’ Exercises 14.9
β€’ Solutions 14.10

14.2 Overview

With this last lecture, our advice is to skip it on first pass, unless you have a burning desire to read it.
It’s here

1. as a reference, so we can link back to it when required, and


2. for those who have worked through a number of applications, and now want to learn
more about the Python language

A variety of topics are treated in the lecture, including generators, exceptions and descriptors.

14.3 Iterables and Iterators

We’ve already said something about iterating in Python.


Now let’s look more closely at how it all works, focusing in Python’s implementation of the
for loop.


14.3.1 Iterators

Iterators are a uniform interface to stepping through elements in a collection.


Here we’ll talk about using iteratorsβ€”later we’ll learn how to build our own.
Formally, an iterator is an object with a __next__ method.
For example, file objects are iterators.
To see this, let’s have another look at the US cities data, which is written to the present
working directory in the following cell
%%file us_cities.txt
new york: 8244910
los angeles: 3819702
chicago: 2707120
houston: 2145146
philadelphia: 1536471
phoenix: 1469471
san antonio: 1359758
san diego: 1326179
dallas: 1223229

Writing us_cities.txt

f = open('us_cities.txt')
f.__next__()

'new york: 8244910\n'

f.__next__()

'los angeles: 3819702\n'

We see that file objects do indeed have a __next__ method, and that calling this method
returns the next line in the file.
The next method can also be accessed via the builtin function next(), which directly calls
this method
next(f)

'chicago: 2707120\n'

The objects returned by enumerate() are also iterators


e = enumerate(['foo', 'bar'])
next(e)

(0, 'foo')

next(e)

(1, 'bar')

as are the reader objects from the csv module.


Let’s create a small csv file that contains data from the NIKKEI index
%%file test_table.csv
Date,Open,High,Low,Close,Volume,Adj Close
2009-05-21,9280.35,9286.35,9189.92,9264.15,133200,9264.15
2009-05-20,9372.72,9399.40,9311.61,9344.64,143200,9344.64
2009-05-19,9172.56,9326.75,9166.97,9290.29,167000,9290.29
2009-05-18,9167.05,9167.82,8997.74,9038.69,147800,9038.69
2009-05-15,9150.21,9272.08,9140.90,9265.02,172000,9265.02
2009-05-14,9212.30,9223.77,9052.41,9093.73,169400,9093.73
2009-05-13,9305.79,9379.47,9278.89,9340.49,176000,9340.49
2009-05-12,9358.25,9389.61,9298.61,9298.61,188400,9298.61
2009-05-11,9460.72,9503.91,9342.75,9451.98,230800,9451.98
2009-05-08,9351.40,9464.43,9349.57,9432.83,220200,9432.83

Writing test_table.csv

from csv import reader

f = open('test_table.csv', 'r')
nikkei_data = reader(f)
next(nikkei_data)

['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']

next(nikkei_data)

['2009-05-21', '9280.35', '9286.35', '9189.92', '9264.15', '133200', '9264.15']

14.3.2 Iterators in For Loops

All iterators can be placed to the right of the in keyword in for loop statements.
In fact this is how the for loop works: If we write

for x in iterator:
<code block>

then the interpreter

β€’ calls iterator.__next__() and binds x to the result
β€’ executes the code block
β€’ repeats until a StopIteration error occurs

So now you know how this magical looking syntax works

f = open('somefile.txt', 'r')
for line in f:
# do something

The interpreter just keeps

1. calling f.__next__() and binding line to the result


2. executing the body of the loop

This continues until a StopIteration error occurs.
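Here is a hand-rolled sketch of that protocol, using a list-based iterator in place of the file object:

```python
# A hand-rolled equivalent of `for x in some_iterator: <code block>`
some_iterator = iter(['spam', 'eggs'])

while True:
    try:
        x = next(some_iterator)   # calls some_iterator.__next__()
    except StopIteration:         # raised when the iterator is exhausted
        break
    print(x)                      # the body of the loop
```

This prints spam and then eggs, exactly as the corresponding for loop would.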

14.3.3 Iterables

You already know that we can put a Python list to the right of in in a for loop

for i in ['spam', 'eggs']:
    print(i)

spam
eggs

So does that mean that a list is an iterator?


The answer is no
x = ['foo', 'bar']
type(x)

list

next(x)

---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

<ipython-input-12-92de4e9f6b1e> in <module>
----> 1 next(x)

TypeError: 'list' object is not an iterator

So why can we iterate over a list in a for loop?


The reason is that a list is iterable (as opposed to an iterator).
Formally, an object is iterable if it can be converted to an iterator using the built-in function
iter().
Lists are one such object
x = ['foo', 'bar']
type(x)

list

y = iter(x)
type(y)

list_iterator

next(y)

'foo'

next(y)

'bar'

next(y)

---------------------------------------------------------------------------

StopIteration Traceback (most recent call last)



<ipython-input-17-81b9d2f0f16a> in <module>
----> 1 next(y)

StopIteration:

Many other objects are iterable, such as dictionaries and tuples.


Of course, not all objects are iterable

iter(42)

---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

<ipython-input-18-ef50b48e4398> in <module>
----> 1 iter(42)

TypeError: 'int' object is not iterable

To conclude our discussion of for loops

β€’ for loops work on either iterators or iterables.
β€’ In the second case, the iterable is converted into an iterator before the loop starts.
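As an illustrative sketch (the class name and values here are made up), any object becomes an iterator by implementing both __iter__ and __next__:

```python
class Countdown:
    """A small iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value

result = [n for n in Countdown(3)]   # the for machinery calls __next__
```

Here result is [3, 2, 1], and a second pass over the same instance yields nothing, since the object is its own (now exhausted) iterator.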

14.3.4 Iterators and built-ins

Some built-in functions that act on sequences also work with iterables

β€’ max(), min(), sum(), all(), any()

For example
x = [10, -10]
max(x)

10

y = iter(x)
type(y)

list_iterator

max(y)

10

One thing to remember about iterators is that they are depleted by use
x = [10, -10]
y = iter(x)
max(y)

10

max(y)

---------------------------------------------------------------------------

ValueError Traceback (most recent call last)

<ipython-input-23-062424e6ec08> in <module>
----> 1 max(y)

ValueError: max() arg is an empty sequence
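A quick sketch of the same point: once an iterator is exhausted, converting it to a list yields nothing, and a fresh iterator must be created from the original iterable:

```python
x = [10, -10]
y = iter(x)
first_pass = list(y)        # consumes every element of the iterator
second_pass = list(y)       # already exhausted, so nothing is produced
fresh_pass = list(iter(x))  # a new iterator starts from the beginning
```

first_pass and fresh_pass both equal [10, -10], while second_pass is the empty list.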

14.4 Names and Name Resolution

14.4.1 Variable Names in Python

Consider the Python statement


x = 42

We now know that when this statement is executed, Python creates an object of type int in
your computer’s memory, containing

β€’ the value 42
β€’ some associated attributes

But what is x itself?


In Python, x is called a name, and the statement x = 42 binds the name x to the integer
object we have just discussed.
Under the hood, this process of binding names to objects is implemented as a dictionaryβ€”
more about this in a moment.
There is no problem binding two or more names to the one object, regardless of what that
object is
def f(string):      # Create a function called f
    print(string)   # that prints any string it's passed

g = f
id(g) == id(f)

True

g('test')

test

In the first step, a function object is created, and the name f is bound to it.
After binding the name g to the same object, we can use it anywhere we would use f.
What happens when the number of names bound to an object goes to zero?
Here’s an example of this situation, where the name x is first bound to one object and then
rebound to another
x = 'foo'
id(x)

140341457605440

x = 'bar'  # No names bound to the first object

What happens here is that the first object is garbage collected.


In other words, the memory slot that stores that object is deallocated, and returned to the
operating system.
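One way to peek at this mechanism (a sketch only; the exact counts are implementation details of CPython) is sys.getrefcount, which reports how many references the interpreter currently holds to an object:

```python
import sys

x = ['some', 'object']             # one reference held by the name x
count_before = sys.getrefcount(x)  # the call's own argument adds one, temporarily

y = x                              # bind a second name to the same object
count_after = sys.getrefcount(x)
```

Binding the extra name y raises the count by exactly one; when the count falls to zero, CPython reclaims the object.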

14.4.2 Namespaces

Recall from the preceding discussion that the statement


x = 42

binds the name x to the integer object on the right-hand side.


We also mentioned that this process of binding x to the correct object is implemented as a
dictionary.
This dictionary is called a namespace.
Definition: A namespace is a symbol table that maps names to objects in memory.
Python uses multiple namespaces, creating them on the fly as necessary.
For example, every time we import a module, Python creates a namespace for that module.
To see this in action, suppose we write a script math2.py with a single line
%%file math2.py
pi = 'foobar'

Writing math2.py

Now we start the Python interpreter and import it


import math2

Next let’s import the math module from the standard library
import math

Both of these modules have an attribute called pi


math.pi

3.141592653589793

math2.pi

'foobar'

These two different bindings of pi exist in different namespaces, each one implemented as a
dictionary.
We can look at the dictionary directly, using module_name.__dict__
import math
math.__dict__.items()

dict_items([('__name__', 'math'), ('__doc__', 'This module is always available.
It provides access to the\nmathematical functions defined by the C standard.'),
('__package__', ''), ('__loader__',
<_frozen_importlib_external.ExtensionFileLoader object at 0x7fa3cb42ccf8>),
('__spec__', ModuleSpec(name='math',
loader=<_frozen_importlib_external.ExtensionFileLoader object at
0x7fa3cb42ccf8>, origin='/home/ubuntu/anaconda3/lib/python3.7/lib-
dynload/math.cpython-37m-x86_64-linux-gnu.so')), ('acos', <built-in function
acos>), ('acosh', <built-in function acosh>), ('asin', <built-in function
asin>), ('asinh', <built-in function asinh>), ('atan', <built-in function
atan>), ('atan2', <built-in function atan2>), ('atanh', <built-in function
atanh>), ('ceil', <built-in function ceil>), ('copysign', <built-in function
copysign>), ('cos', <built-in function cos>), ('cosh', <built-in function
cosh>), ('degrees', <built-in function degrees>), ('erf', <built-in function
erf>), ('erfc', <built-in function erfc>), ('exp', <built-in function exp>),
('expm1', <built-in function expm1>), ('fabs', <built-in function fabs>),
('factorial', <built-in function factorial>), ('floor', <built-in function
floor>), ('fmod', <built-in function fmod>), ('frexp', <built-in function
frexp>), ('fsum', <built-in function fsum>), ('gamma', <built-in function
gamma>), ('gcd', <built-in function gcd>), ('hypot', <built-in function hypot>),
('isclose', <built-in function isclose>), ('isfinite', <built-in function
isfinite>), ('isinf', <built-in function isinf>), ('isnan', <built-in function
isnan>), ('ldexp', <built-in function ldexp>), ('lgamma', <built-in function
lgamma>), ('log', <built-in function log>), ('log1p', <built-in function
log1p>), ('log10', <built-in function log10>), ('log2', <built-in function
log2>), ('modf', <built-in function modf>), ('pow', <built-in function pow>),
('radians', <built-in function radians>), ('remainder', <built-in function
remainder>), ('sin', <built-in function sin>), ('sinh', <built-in function
sinh>), ('sqrt', <built-in function sqrt>), ('tan', <built-in function tan>),
('tanh', <built-in function tanh>), ('trunc', <built-in function trunc>), ('pi',
3.141592653589793), ('e', 2.718281828459045), ('tau', 6.283185307179586),
('inf', inf), ('nan', nan), ('__file__',
'/home/ubuntu/anaconda3/lib/python3.7/lib-dynload/math.cpython-37m-x86_64-linux-
gnu.so')])

import math2
math2.__dict__.items()

dict_items([('__name__', 'math2'), ('__doc__', None), ('__package__', ''),
('__loader__', <_frozen_importlib_external.SourceFileLoader object at
0x7fa3c0e6ab38>), ('__spec__', ModuleSpec(name='math2',
loader=<_frozen_importlib_external.SourceFileLoader object at 0x7fa3c0e6ab38>,
origin='/home/ubuntu/repos/lecture-source-
py/_build/pdf/jupyter/executed/math2.py')), ('__file__',
'/home/ubuntu/repos/lecture-source-py/_build/pdf/jupyter/executed/math2.py'),
('__cached__', '/home/ubuntu/repos/lecture-source-
py/_build/pdf/jupyter/executed/__pycache__/math2.cpython-37.pyc'),
('__builtins__', {'__name__': 'builtins', '__doc__': "Built-in functions,
exceptions, and other objects.\n\nNoteworthy: None is the `nil' object; Ellipsis
represents `…' in slices.", '__package__': '', '__loader__': <class
'_frozen_importlib.BuiltinImporter'>, '__spec__': ModuleSpec(name='builtins',
loader=<class '_frozen_importlib.BuiltinImporter'>), '__build_class__': <built-
in function __build_class__>, '__import__': <built-in function __import__>,
'abs': <built-in function abs>, 'all': <built-in function all>, 'any': <built-in
function any>, 'ascii': <built-in function ascii>, 'bin': <built-in function
bin>, 'breakpoint': <built-in function breakpoint>, 'callable': <built-in
function callable>, 'chr': <built-in function chr>, 'compile': <built-in
function compile>, 'delattr': <built-in function delattr>, 'dir': <built-in
function dir>, 'divmod': <built-in function divmod>, 'eval': <built-in function
eval>, 'exec': <built-in function exec>, 'format': <built-in function format>,
'getattr': <built-in function getattr>, 'globals': <built-in function globals>,
'hasattr': <built-in function hasattr>, 'hash': <built-in function hash>, 'hex':
<built-in function hex>, 'id': <built-in function id>, 'input': <bound method
Kernel.raw_input of <ipykernel.ipkernel.IPythonKernel object at
0x7fa3c9075b70>>, 'isinstance': <built-in function isinstance>, 'issubclass':
<built-in function issubclass>, 'iter': <built-in function iter>, 'len': <built-
in function len>, 'locals': <built-in function locals>, 'max': <built-in
function max>, 'min': <built-in function min>, 'next': <built-in function next>,
'oct': <built-in function oct>, 'ord': <built-in function ord>, 'pow': <built-in
function pow>, 'print': <built-in function print>, 'repr': <built-in function
repr>, 'round': <built-in function round>, 'setattr': <built-in function
setattr>, 'sorted': <built-in function sorted>, 'sum': <built-in function sum>,
'vars': <built-in function vars>, 'None': None, 'Ellipsis': Ellipsis,
'NotImplemented': NotImplemented, 'False': False, 'True': True, 'bool': <class
'bool'>, 'memoryview': <class 'memoryview'>, 'bytearray': <class 'bytearray'>,
'bytes': <class 'bytes'>, 'classmethod': <class 'classmethod'>, 'complex':
<class 'complex'>, 'dict': <class 'dict'>, 'enumerate': <class 'enumerate'>,
'filter': <class 'filter'>, 'float': <class 'float'>, 'frozenset': <class
'frozenset'>, 'property': <class 'property'>, 'int': <class 'int'>, 'list':
<class 'list'>, 'map': <class 'map'>, 'object': <class 'object'>, 'range':
<class 'range'>, 'reversed': <class 'reversed'>, 'set': <class 'set'>, 'slice':
<class 'slice'>, 'staticmethod': <class 'staticmethod'>, 'str': <class 'str'>,
'super': <class 'super'>, 'tuple': <class 'tuple'>, 'type': <class 'type'>,
'zip': <class 'zip'>, '__debug__': True, 'BaseException': <class
'BaseException'>, 'Exception': <class 'Exception'>, 'TypeError': <class
'TypeError'>, 'StopAsyncIteration': <class 'StopAsyncIteration'>,
'StopIteration': <class 'StopIteration'>, 'GeneratorExit': <class
'GeneratorExit'>, 'SystemExit': <class 'SystemExit'>, 'KeyboardInterrupt':
<class 'KeyboardInterrupt'>, 'ImportError': <class 'ImportError'>,
'ModuleNotFoundError': <class 'ModuleNotFoundError'>, 'OSError': <class
'OSError'>, 'EnvironmentError': <class 'OSError'>, 'IOError': <class 'OSError'>,
'EOFError': <class 'EOFError'>, 'RuntimeError': <class 'RuntimeError'>,
'RecursionError': <class 'RecursionError'>, 'NotImplementedError': <class
'NotImplementedError'>, 'NameError': <class 'NameError'>, 'UnboundLocalError':
<class 'UnboundLocalError'>, 'AttributeError': <class 'AttributeError'>,
'SyntaxError': <class 'SyntaxError'>, 'IndentationError': <class
'IndentationError'>, 'TabError': <class 'TabError'>, 'LookupError': <class
'LookupError'>, 'IndexError': <class 'IndexError'>, 'KeyError': <class
'KeyError'>, 'ValueError': <class 'ValueError'>, 'UnicodeError': <class
'UnicodeError'>, 'UnicodeEncodeError': <class 'UnicodeEncodeError'>,
'UnicodeDecodeError': <class 'UnicodeDecodeError'>, 'UnicodeTranslateError':
<class 'UnicodeTranslateError'>, 'AssertionError': <class 'AssertionError'>,
'ArithmeticError': <class 'ArithmeticError'>, 'FloatingPointError': <class
'FloatingPointError'>, 'OverflowError': <class 'OverflowError'>,
'ZeroDivisionError': <class 'ZeroDivisionError'>, 'SystemError': <class
'SystemError'>, 'ReferenceError': <class 'ReferenceError'>, 'MemoryError':
<class 'MemoryError'>, 'BufferError': <class 'BufferError'>, 'Warning': <class
'Warning'>, 'UserWarning': <class 'UserWarning'>, 'DeprecationWarning': <class
'DeprecationWarning'>, 'PendingDeprecationWarning': <class
'PendingDeprecationWarning'>, 'SyntaxWarning': <class 'SyntaxWarning'>,
'RuntimeWarning': <class 'RuntimeWarning'>, 'FutureWarning': <class
'FutureWarning'>, 'ImportWarning': <class 'ImportWarning'>, 'UnicodeWarning':
<class 'UnicodeWarning'>, 'BytesWarning': <class 'BytesWarning'>,
'ResourceWarning': <class 'ResourceWarning'>, 'ConnectionError': <class
'ConnectionError'>, 'BlockingIOError': <class 'BlockingIOError'>,
'BrokenPipeError': <class 'BrokenPipeError'>, 'ChildProcessError': <class
'ChildProcessError'>, 'ConnectionAbortedError': <class
'ConnectionAbortedError'>, 'ConnectionRefusedError': <class
'ConnectionRefusedError'>, 'ConnectionResetError': <class
'ConnectionResetError'>, 'FileExistsError': <class 'FileExistsError'>,
'FileNotFoundError': <class 'FileNotFoundError'>, 'IsADirectoryError': <class
'IsADirectoryError'>, 'NotADirectoryError': <class 'NotADirectoryError'>,
'InterruptedError': <class 'InterruptedError'>, 'PermissionError': <class
'PermissionError'>, 'ProcessLookupError': <class 'ProcessLookupError'>,
'TimeoutError': <class 'TimeoutError'>, 'open': <built-in function open>,
'copyright': Copyright (c) 2001-2019 Python Software Foundation.
All Rights Reserved.

Copyright (c) 2000 BeOpen.com.


All Rights Reserved.

Copyright (c) 1995-2001 Corporation for National Research Initiatives.


All Rights Reserved.

Copyright (c) 1991-1995 Stichting Mathematisch Centrum, Amsterdam.


All Rights Reserved., 'credits': Thanks to CWI, CNRI, BeOpen.com, Zope
Corporation and a cast of thousands
for supporting Python development. See www.python.org for more
information., 'license': Type license() to see the full license text, 'help':
Type help() for interactive help, or help(object) for help about object.,
'__IPYTHON__': True, 'display': <function display at 0x7fa3cb2f5d08>,


'get_ipython': <bound method InteractiveShell.get_ipython of
<ipykernel.zmqshell.ZMQInteractiveShell object at 0x7fa3ca99ee80>>}), ('pi',
'foobar')])

As you know, we access elements of the namespace using the dotted attribute notation
math.pi

3.141592653589793

In fact this is entirely equivalent to math.__dict__['pi']

math.__dict__['pi'] == math.pi

True

14.4.3 Viewing Namespaces

As we saw above, the math namespace can be printed by typing math.__dict__.


Another way to see its contents is to type vars(math)
vars(math).items()

dict_items([('__name__', 'math'), ('__doc__', 'This module is always available.
It provides access to the\nmathematical functions defined by the C standard.'),
('__package__', ''), ('__loader__',
<_frozen_importlib_external.ExtensionFileLoader object at 0x7fa3cb42ccf8>),
('__spec__', ModuleSpec(name='math',
loader=<_frozen_importlib_external.ExtensionFileLoader object at
0x7fa3cb42ccf8>, origin='/home/ubuntu/anaconda3/lib/python3.7/lib-
dynload/math.cpython-37m-x86_64-linux-gnu.so')), ('acos', <built-in function
acos>), ('acosh', <built-in function acosh>), ('asin', <built-in function
asin>), ('asinh', <built-in function asinh>), ('atan', <built-in function
atan>), ('atan2', <built-in function atan2>), ('atanh', <built-in function
atanh>), ('ceil', <built-in function ceil>), ('copysign', <built-in function
copysign>), ('cos', <built-in function cos>), ('cosh', <built-in function
cosh>), ('degrees', <built-in function degrees>), ('erf', <built-in function
erf>), ('erfc', <built-in function erfc>), ('exp', <built-in function exp>),
('expm1', <built-in function expm1>), ('fabs', <built-in function fabs>),
('factorial', <built-in function factorial>), ('floor', <built-in function
floor>), ('fmod', <built-in function fmod>), ('frexp', <built-in function
frexp>), ('fsum', <built-in function fsum>), ('gamma', <built-in function
gamma>), ('gcd', <built-in function gcd>), ('hypot', <built-in function hypot>),
('isclose', <built-in function isclose>), ('isfinite', <built-in function
isfinite>), ('isinf', <built-in function isinf>), ('isnan', <built-in function
isnan>), ('ldexp', <built-in function ldexp>), ('lgamma', <built-in function
lgamma>), ('log', <built-in function log>), ('log1p', <built-in function
log1p>), ('log10', <built-in function log10>), ('log2', <built-in function
log2>), ('modf', <built-in function modf>), ('pow', <built-in function pow>),
('radians', <built-in function radians>), ('remainder', <built-in function
remainder>), ('sin', <built-in function sin>), ('sinh', <built-in function
sinh>), ('sqrt', <built-in function sqrt>), ('tan', <built-in function tan>),
('tanh', <built-in function tanh>), ('trunc', <built-in function trunc>), ('pi',
3.141592653589793), ('e', 2.718281828459045), ('tau', 6.283185307179586),
('inf', inf), ('nan', nan), ('__file__',
'/home/ubuntu/anaconda3/lib/python3.7/lib-dynload/math.cpython-37m-x86_64-linux-
gnu.so')])

If you just want to see the names, you can type


dir(math)[0:10]

['__doc__',
 '__file__',
'__loader__',
'__name__',
'__package__',
'__spec__',
'acos',
'acosh',
'asin',
'asinh']

Notice the special names __doc__ and __name__.


These are initialized in the namespace when any module is imported

β€’ __doc__ is the doc string of the module
β€’ __name__ is the name of the module

print(math.__doc__)

This module is always available.  It provides access to the
mathematical functions defined by the C standard.

math.__name__

'math'

14.4.4 Interactive Sessions

In Python, all code executed by the interpreter runs in some module.


What about commands typed at the prompt?
These are also regarded as being executed within a module β€” in this case, a module called
__main__.
To check this, we can look at the current module name via the value of __name__ given at
the prompt
print(__name__)

__main__

When we run a script using IPython’s run command, the contents of the file are executed as
part of __main__ too.
To see this, let’s create a file mod.py that prints its own __name__ attribute
%%file mod.py
print(__name__)

Writing mod.py

Now let’s look at two different ways of running it in IPython


import mod  # Standard import

mod

%run mod.py  # Run interactively

__main__

In the second case, the code is executed as part of __main__, so __name__ is equal to
__main__.
To see the contents of the namespace of __main__ we use vars() rather than vars(__main__).
If you do this in IPython, you will see a whole lot of variables that IPython needs, and has
initialized when you started up your session.
If you prefer to see only the variables you have initialized, use whos
x = 2
y = 3

import numpy as np

%whos

Variable      Type                         Data/Info
----------------------------------------------------
e             enumerate                    <enumerate object at 0x7fa3c0edeab0>
f             function                     <function f at 0x7fa3c0e5aea0>
g             function                     <function f at 0x7fa3c0e5aea0>
i             str                          eggs
math          module                       <module 'math' from '/hom<…>37m-x86_64-linux-gnu.so'>
math2         module                       <module 'math2' from '/ho<…>pyter/executed/math2.py'>
mod           module                       <module 'mod' from '/home<…>jupyter/executed/mod.py'>
nikkei_data   reader                       <_csv.reader object at 0x7fa3c1767f98>
np            module                       <module 'numpy' from '/ho<…>kages/numpy/__init__.py'>
reader        builtin_function_or_method   <built-in function reader>
x             int                          2
y             int                          3

14.4.5 The Global Namespace

Python documentation often makes reference to the β€œglobal namespace”.


The global namespace is the namespace of the module currently being executed.
For example, suppose that we start the interpreter and begin making assignments.
We are now working in the module __main__, and hence the namespace for __main__ is
the global namespace.
Next, we import a module called amodule

import amodule

At this point, the interpreter creates a namespace for the module amodule and starts executing commands in the module.
While this occurs, the namespace amodule.__dict__ is the global namespace.

Once execution of the module finishes, the interpreter returns to the module from where the
import statement was made.
In this case it's __main__, so the namespace of __main__ again becomes the global namespace.

14.4.6 Local Namespaces

Important fact: When we call a function, the interpreter creates a local namespace for that
function, and registers the variables in that namespace.
The reason for this will be explained in just a moment.
Variables in the local namespace are called local variables.
After the function returns, the namespace is deallocated and lost.
While the function is executing, we can view the contents of the local namespace with
locals().
For example, consider
def f(x):
    a = 2
    print(locals())
    return a * x

Now let’s call the function


f(1)

{'x': 1, 'a': 2}

2

You can see the local namespace of f before it is destroyed.

14.4.7 The __builtins__ Namespace

We have been using various built-in functions, such as max(), dir(), str(), list(),
len(), range(), type(), etc.
How does access to these names work?

β€’ These definitions are stored in a module called builtins.
β€’ They have their own namespace called __builtins__.

dir()[0:10]

['In', 'Out', '_', '_11', '_13', '_14', '_15', '_16', '_19', '_2']

dir(__builtins__)[0:10]

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
'BrokenPipeError',
'BufferError',
'BytesWarning',
'ChildProcessError',
'ConnectionAbortedError']

We can access elements of the namespace as follows


__builtins__.max

<function max>

But __builtins__ is special, because we can always access them directly as well

max

<function max>

__builtins__.max == max

True

The next section explains how this works …

14.4.8 Name Resolution

Namespaces are great because they help us organize variable names.


(Type import this at the prompt and look at the last item that’s printed)
However, we do need to understand how the Python interpreter works with multiple namespaces.
At any point of execution, there are in fact at least two namespaces that can be accessed directly.
(β€œAccessed directly” means without using a dot, as in pi rather than math.pi)
These namespaces are

β€’ The global namespace (of the module being executed)
β€’ The builtin namespace

If the interpreter is executing a function, then the directly accessible namespaces are

β€’ The local namespace of the function
β€’ The global namespace (of the module being executed)
β€’ The builtin namespace

Sometimes functions are defined within other functions, like so


def f():
    a = 2
    def g():
        b = 4
        print(a * b)
    g()

Here f is the enclosing function for g, and each function gets its own namespace.

Now we can give the rule for how namespace resolution works:
The order in which the interpreter searches for names is

1. the local namespace (if it exists)
2. the hierarchy of enclosing namespaces (if they exist)
3. the global namespace
4. the builtin namespace

If the name is not in any of these namespaces, the interpreter raises a NameError.
This is called the LEGB rule (local, enclosing, global, builtin).
Here’s an example that helps to illustrate.
Consider a script test.py that looks as follows
%%file test.py
def g(x):
    a = 1
    x = x + a
    return x

a = 0
y = g(10)
print("a = ", a, "y = ", y)

Writing test.py

What happens when we run this script?


%run test.py

a = 0 y = 11

x

2

First,

β€’ The global namespace {} is created.
β€’ The function object is created, and g is bound to it within the global namespace.
β€’ The name a is bound to 0, again in the global namespace.

Next g is called via y = g(10), leading to the following sequence of actions

β€’ The local namespace for the function is created.
β€’ Local names x and a are bound, so that the local namespace becomes {'x': 10, 'a': 1}.
β€’ Statement x = x + a uses the local a and local x to compute x + a, and binds local name x to the result.
β€’ This value is returned, and y is bound to it in the global namespace.
β€’ Local x and a are discarded (and the local namespace is deallocated).

Note that the global a was not affected by the local a.
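The enclosing level of the LEGB rule can be sketched with a nested function (the names here are made up for illustration):

```python
level = 'global'                 # shadowed inside outer below

def outer():
    level = 'enclosing'          # lives in outer's local namespace

    def inner():
        # 'level' is not local to inner, so the lookup falls back to
        # the enclosing namespace of outer, not the global namespace
        return level

    return inner()

result = outer()
```

Here result is 'enclosing': the lookup inside inner stops at outer's namespace, and the global binding of level is untouched.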



14.4.9 Mutable Versus Immutable Parameters

This is a good time to say a little more about mutable vs immutable objects.
Consider the code segment
def f(x):
    x = x + 1
    return x

x = 1
print(f(x), x)

2 1

We now understand what will happen here: The code prints 2 as the value of f(x) and 1 as
the value of x.
First f and x are registered in the global namespace.
The call f(x) creates a local namespace and adds x to it, bound to 1.
Next, this local x is rebound to the new integer object 2, and this value is returned.
None of this affects the global x.
However, it’s a different story when we use a mutable data type such as a list
def f(x):
    x[0] = x[0] + 1
    return x

x = [1]
print(f(x), x)

[2] [2]

This prints [2] as the value of f(x) and the same for x.


Here’s what happens

β€’ f is registered as a function in the global namespace
β€’ x is bound to [1] in the global namespace
β€’ The call f(x)
  – creates a local namespace
  – adds x to the local namespace, bound to [1]
  – the list [1] is modified to [2]
  – returns the list [2]
  – the local namespace is deallocated, and the local x is lost
β€’ The global x has been modified
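A quick sketch makes the distinction concrete: rebinding an immutable value points the name at a new object, while mutating a list in place keeps the same object (the ids below are only compared for equality, since their values vary by run):

```python
# Immutable case: "modifying" an int actually rebinds the name to a new object
n = 1
id_before = id(n)
n = n + 1
rebound_to_new_object = id(n) != id_before

# Mutable case: modifying a list in place keeps the same object
lst = [1]
list_id_before = id(lst)
lst[0] = lst[0] + 1
same_object_after_mutation = id(lst) == list_id_before
```

This is exactly why the local x in the example above could change the global x: both names referred to one and the same list object.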

14.5 Handling Errors

Sometimes it’s possible to anticipate errors as we’re writing code.



For example, the unbiased sample variance of sample $y_1, \ldots, y_n$ is defined as

$$
s^2 := \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2,
\qquad \bar{y} = \text{sample mean}
$$

This can be calculated in NumPy using np.var (with ddof=1 to get the unbiased version).


But if you were writing a function to handle such a calculation, you might anticipate a divide-
by-zero error when the sample size is one.
One possible action is to do nothing β€” the program will just crash, and spit out an error
message.
But sometimes it’s worth writing your code in a way that anticipates and deals with runtime
errors that you think might arise.
Why?

β€’ Because the debugging information provided by the interpreter is often less useful than
the information on possible errors you have in your head when writing code.
β€’ Because errors causing execution to stop are frustrating if you’re in the middle of a
large computation.
β€’ Because it reduces confidence in your code on the part of your users (if you are writing for others).

14.5.1 Assertions

A relatively easy way to handle checks is with the assert keyword.


For example, pretend for a moment that the np.var function doesn’t exist and we need to
write our own
def var(y):
    n = len(y)
    assert n > 1, 'Sample size must be greater than one.'
    return np.sum((y - y.mean())**2) / float(n-1)

If we run this with an array of length one, the program will terminate and print our error
message
var([1])

---------------------------------------------------------------------------

AssertionError Traceback (most recent call last)

<ipython-input-62-8419b6ab38ec> in <module>
----> 1 var([1])

<ipython-input-61-e6ffb16a7098> in var(y)
1 def var(y):
2 n = len(y)
----> 3 assert n > 1, 'Sample size must be greater than one.'
4 return np.sum((y - y.mean())**2) / float(n-1)

AssertionError: Sample size must be greater than one.



The advantage is that we can

β€’ fail early, as soon as we know there will be a problem
β€’ supply specific information on why a program is failing

14.5.2 Handling Errors During Runtime

The approach used above is a bit limited, because it always leads to termination.
Sometimes we can handle errors more gracefully, by treating special cases.
Let’s look at how this is done.
Exceptions
Here’s an example of a common error type
def f:

  File "<ipython-input-63-262a7e387ba5>", line 1
    def f:
         ^
SyntaxError: invalid syntax

Since illegal syntax cannot be executed, a syntax error terminates execution of the program.
Here’s a different kind of error, unrelated to syntax
1 / 0

---------------------------------------------------------------------------

ZeroDivisionError Traceback (most recent call last)

<ipython-input-64-bc757c3fda29> in <module>
----> 1 1 / 0

ZeroDivisionError: division by zero

Here’s another
x1 = y1

---------------------------------------------------------------------------

NameError Traceback (most recent call last)

<ipython-input-65-a7b8d65e9e45> in <module>
----> 1 x1 = y1

NameError: name 'y1' is not defined

And another

'foo' + 6

---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

<ipython-input-66-216809d6e6fe> in <module>
----> 1 'foo' + 6

TypeError: can only concatenate str (not "int") to str

And another
X = []
x = X[0]

---------------------------------------------------------------------------

IndexError Traceback (most recent call last)

<ipython-input-67-082a18d7a0aa> in <module>
1 X = []
----> 2 x = X[0]

IndexError: list index out of range

On each occasion, the interpreter informs us of the error type

β€’ NameError, TypeError, IndexError, ZeroDivisionError, etc.

In Python, these errors are called exceptions.


Catching Exceptions
We can catch and deal with exceptions using try – except blocks.
Here’s a simple example
def f(x):
    try:
        return 1.0 / x
    except ZeroDivisionError:
        print('Error: division by zero. Returned None')
    return None

When we call f we get the following output


f(2)

0.5

f(0)

Error: division by zero. Returned None

f(0.0)

Error: division by zero. Returned None

The error is caught and execution of the program is not terminated.


Note that other error types are not caught.
If we are worried the user might pass in a string, we can catch that error too
def f(x):
    try:
        return 1.0 / x
    except ZeroDivisionError:
        print('Error: Division by zero. Returned None')
    except TypeError:
        print('Error: Unsupported operation. Returned None')
    return None

Here’s what happens


f(2)

0.5

f(0)

Error: Division by zero. Returned None

f('foo')

Error: Unsupported operation. Returned None

If we feel lazy we can catch these errors together


def f(x):
    try:
        return 1.0 / x
    except (TypeError, ZeroDivisionError):
        print('Error: Unsupported operation. Returned None')
    return None

Here’s what happens


f(2)

0.5

f(0)

Error: Unsupported operation. Returned None

f('foo')

Error: Unsupported operation. Returned None

If we feel extra lazy we can catch all error types as follows



def f(x):
    try:
        return 1.0 / x
    except:
        print('Error. Returned None')
    return None

In general it’s better to be specific.
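As an aside (a sketch going slightly beyond the examples above), try blocks also support else and finally clauses, which run when no exception occurred and unconditionally, respectively:

```python
events = []   # records which branches ran, for illustration

def safe_divide(x, y):
    try:
        result = x / y
    except ZeroDivisionError:
        events.append('failed')
        result = None
    else:
        events.append('succeeded')  # runs only when no exception was raised
    finally:
        events.append('done')       # runs in every case
    return result

good = safe_divide(1.0, 2.0)
bad = safe_divide(1.0, 0.0)
```

The first call takes the else branch, the second takes the except branch, and finally runs both times, which makes it a natural place for cleanup code.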

14.6 Decorators and Descriptors

Let’s look at some special syntax elements that are routinely used by Python developers.
You might not need the following concepts immediately, but you will see them in other people’s code.
Hence you need to understand them at some stage of your Python education.

14.6.1 Decorators

Decorators are a bit of syntactic sugar that, while easily avoided, have turned out to be popular.
It’s very easy to say what decorators do.
On the other hand it takes a bit of effort to explain why you might use them.
An Example
Suppose we are working on a program that looks something like this
import numpy as np

def f(x):
    return np.log(np.log(x))

def g(x):
    return np.sqrt(42 * x)

# Program continues with various calculations using f and g

Now suppose there’s a problem: occasionally negative numbers get fed to f and g in the calculations that follow.
If you try it, you’ll see that when these functions are called with negative numbers they return a NumPy object called nan.
This stands for β€œnot a number” (and indicates that you are trying to evaluate a mathematical function at a point where it is not defined).
Perhaps this isn’t what we want, because it causes other problems that are hard to pick up later on.
Suppose that instead we want the program to terminate whenever this happens, with a sensible error message.
This change is easy enough to implement
import numpy as np

def f(x):
    assert x >= 0, "Argument must be nonnegative"
    return np.log(np.log(x))

def g(x):
    assert x >= 0, "Argument must be nonnegative"
    return np.sqrt(42 * x)

# Program continues with various calculations using f and g

Notice however that there is some repetition here, in the form of two identical lines of code.
Repetition makes our code longer and harder to maintain, and hence is something we try
hard to avoid.
Here it’s not a big deal, but imagine now that instead of just f and g, we have 20 such functions that we need to modify in exactly the same way.
This means we need to repeat the test logic (i.e., the assert line testing nonnegativity) 20
times.
The situation is still worse if the test logic is longer and more complicated.
In this kind of scenario the following approach would be neater
[83]: import numpy as np

def check_nonneg(func):
    def safe_function(x):
        assert x >= 0, "Argument must be nonnegative"
        return func(x)
    return safe_function

def f(x):
    return np.log(np.log(x))

def g(x):
    return np.sqrt(42 * x)

f = check_nonneg(f)
g = check_nonneg(g)
# Program continues with various calculations using f and g

This looks complicated so let’s work through it slowly.


To unravel the logic, consider what happens when we say f = check_nonneg(f).
This calls the function check_nonneg with parameter func set equal to f.
Now check_nonneg creates a new function called safe_function that verifies x as nonnegative and then calls func on it (which is the same as f).
Finally, the global name f is then set equal to safe_function.
Now the behavior of f is as we desire, and the same is true of g.
At the same time, the test logic is written only once.
Enter Decorators
The last version of our code is still not ideal.
For example, if someone is reading our code and wants to know how f works, they will be
looking for the function definition, which is
[84]: def f(x):
    return np.log(np.log(x))

They may well miss the line f = check_nonneg(f).


For this and other reasons, decorators were introduced to Python.
With decorators, we can replace the lines
[85]: def f(x):
    return np.log(np.log(x))

def g(x):
    return np.sqrt(42 * x)

f = check_nonneg(f)
g = check_nonneg(g)

with
[86]: @check_nonneg
def f(x):
    return np.log(np.log(x))

@check_nonneg
def g(x):
    return np.sqrt(42 * x)

These two pieces of code do exactly the same thing.


If they do the same thing, do we really need decorator syntax?
Well, notice that the decorators sit right on top of the function definitions.
Hence anyone looking at the definition of the function will see them and be aware that the
function is modified.
In the opinion of many people, this makes the decorator syntax a significant improvement to
the language.
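As an aside, one practical detail the discussion above skips: after decoration, f.__name__ is 'safe_function' rather than 'f', which can confuse debugging tools. The standard library's functools.wraps fixes this. Here is a small sketch of that pattern (using math.sqrt instead of NumPy only to keep it self-contained):

```python
import functools
import math

def check_nonneg(func):
    @functools.wraps(func)       # copy func's name and docstring onto the wrapper
    def safe_function(x):
        assert x >= 0, "Argument must be nonnegative"
        return func(x)
    return safe_function

@check_nonneg
def g(x):
    "Return the square root of 42 * x."
    return math.sqrt(42 * x)

print(g.__name__)    # 'g', not 'safe_function'
print(g(0.5))
```

Without the @functools.wraps line, g.__name__ would report 'safe_function' and the docstring would be lost.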

14.6.2 Descriptors

Descriptors solve a common problem regarding management of variables.


To understand the issue, consider a Car class, that simulates a car.
Suppose that this class defines the variables miles and kms, which give the distance traveled
in miles and kilometers respectively.
A highly simplified version of the class might look as follows

[87]: class Car:

    def __init__(self, miles=1000):
        self.miles = miles
        self.kms = miles * 1.61

    # Some other functionality, details omitted

One potential problem we might have here is that a user alters one of these variables but not the other

[88]: car = Car()
car.miles
1000

[89]: car.kms
1610.0

[90]: car.miles = 6000
car.kms
1610.0

In the last two lines we see that miles and kms are out of sync.
What we really want is some mechanism whereby each time a user sets one of these variables,
the other is automatically updated.
A Solution
In Python, this issue is solved using descriptors.
A descriptor is just a Python object that implements certain methods.
These methods are triggered when the object is accessed through dotted attribute notation.
The best way to understand this is to see it in action.
Consider this alternative version of the Car class

[91]: class Car:

    def __init__(self, miles=1000):
        self._miles = miles
        self._kms = miles * 1.61

    def set_miles(self, value):
        self._miles = value
        self._kms = value * 1.61

    def set_kms(self, value):
        self._kms = value
        self._miles = value / 1.61

    def get_miles(self):
        return self._miles

    def get_kms(self):
        return self._kms

    miles = property(get_miles, set_miles)
    kms = property(get_kms, set_kms)

First let’s check that we get the desired behavior

[92]: car = Car()
car.miles
1000

[93]: car.miles = 6000
car.kms
9660.0

Yep, that’s what we want β€” car.kms is automatically updated.


How it Works
The names _miles and _kms are arbitrary names we are using to store the values of the
variables.
The objects miles and kms are properties, a common kind of descriptor.

The methods get_miles, set_miles, get_kms and set_kms define what happens when
you get (i.e. access) or set (bind) these variables

β€’ So-called β€œgetter” and β€œsetter” methods.

The builtin Python function property takes getter and setter methods and creates a property.
For example, after car is created as an instance of Car, the object car.miles is a property.
Being a property, when we set its value via car.miles = 6000 its setter method is triggered β€” in this case set_miles.
Decorators and Properties
These days it’s very common to see the property function used via a decorator.
Here’s another version of our Car class that works as before but now uses decorators to set
up the properties
[94]: class Car:

    def __init__(self, miles=1000):
        self._miles = miles
        self._kms = miles * 1.61

    @property
    def miles(self):
        return self._miles

    @property
    def kms(self):
        return self._kms

    @miles.setter
    def miles(self, value):
        self._miles = value
        self._kms = value * 1.61

    @kms.setter
    def kms(self, value):
        self._kms = value
        self._miles = value / 1.61

We won’t go through all the details here.


For further information you can refer to the descriptor documentation.
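For a taste of what property packages up, here is a sketch of a hand-written data descriptor, a plain class implementing __get__ and __set__. The class name NonNegative is our own invention for illustration, not part of the lecture:

```python
class NonNegative:
    """A minimal data descriptor that rejects negative values."""

    def __set_name__(self, owner, name):
        # Called automatically at class creation time (Python 3.6+)
        self._name = '_' + name

    def __get__(self, instance, owner=None):
        return getattr(instance, self._name)

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError(f"{self._name[1:]} must be nonnegative")
        setattr(instance, self._name, value)


class Car:
    miles = NonNegative()

    def __init__(self, miles=1000):
        self.miles = miles     # goes through NonNegative.__set__


car = Car()
car.miles = 6000
print(car.miles)    # 6000
```

The same NonNegative instance can be reused for several attributes of several classes, which is one reason to write a descriptor rather than repeating getter/setter pairs.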

14.7 Generators

A generator is a kind of iterator (i.e., it works with a next function).


We will study two ways to build generators: generator expressions and generator functions.

14.7.1 Generator Expressions

The easiest way to build generators is using generator expressions.


Just like a list comprehension, but with round brackets.
Here is the list comprehension:

[95]: singular = ('dog', 'cat', 'bird')
type(singular)
tuple

[96]: plural = [string + 's' for string in singular]
plural
['dogs', 'cats', 'birds']

[97]: type(plural)
list

And here is the generator expression


[98]: singular = ('dog', 'cat', 'bird')
plural = (string + 's' for string in singular)
type(plural)
generator

[99]: next(plural)
'dogs'

[100]: next(plural)
'cats'

[101]: next(plural)
'birds'

Since sum() can be called on iterators, we can do this


[102]: sum((x * x for x in range(10)))
285

The function sum() calls next() to get the items and adds the successive terms.
In fact, we can omit the outer brackets in this case
[103]: sum(x * x for x in range(10))
285

14.7.2 Generator Functions

The most flexible way to create generator objects is to use generator functions.
Let’s look at some examples.
Example 1
Here’s a very simple example of a generator function
[104]: def f():
    yield 'start'
    yield 'middle'
    yield 'end'

It looks like a function, but uses a keyword yield that we haven’t met before.
Let’s see how it works after running this code
[105]: type(f)
function

[106]: gen = f()
gen
<generator object f at 0x7fa3c17322a0>

[107]: next(gen)
'start'

[108]: next(gen)
'middle'

[109]: next(gen)
'end'

[110]: next(gen)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-110-6e72e47198db> in <module>
----> 1 next(gen)

StopIteration:

The generator function f() is used to create generator objects (in this case gen).
Generators are iterators, because they support a next method.
The first call to next(gen)

β€’ Executes code in the body of f() until it meets a yield statement.


β€’ Returns that value to the caller of next(gen).

The second call to next(gen) starts executing from the next line

[111]: def f():
    yield 'start'
    yield 'middle'  # This line!
    yield 'end'

and continues until the next yield statement.


At that point it returns the value following yield to the caller of next(gen), and so on.
When the code block ends, the generator throws a StopIteration error.
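In everyday code you rarely call next() by hand; a for loop, or a constructor such as list(), calls next() for you and quietly absorbs the final StopIteration. For instance:

```python
def f():
    yield 'start'
    yield 'middle'
    yield 'end'

# The for loop stops cleanly when the generator is exhausted
for word in f():
    print(word)

# list() also drains a fresh generator in one go
print(list(f()))    # ['start', 'middle', 'end']
```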
Example 2
Our next example receives an argument x from the caller

[112]: def g(x):
    while x < 100:
        yield x
        x = x * x

Let’s see how it works


[113]: g
<function __main__.g(x)>

[114]: gen = g(2)
type(gen)
generator

[115]: next(gen)
2

[116]: next(gen)
4

[117]: next(gen)
16

[118]: next(gen)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-118-6e72e47198db> in <module>
----> 1 next(gen)

StopIteration:

The call gen = g(2) binds gen to a generator.


Inside the generator, the name x is bound to 2.
When we call next(gen)

β€’ The body of g() executes until the line yield x, and the value of x is returned.

Note that the value of x is retained inside the generator.


When we call next(gen) again, execution continues from where it left off
[119]: def g(x):
    while x < 100:
        yield x
        x = x * x  # execution continues from here

When x < 100 fails, the generator throws a StopIteration error.


Incidentally, the loop inside the generator can be infinite

[120]: def g(x):
    while 1:
        yield x
        x = x * x

14.7.3 Advantages of Iterators

What’s the advantage of using an iterator here?


Suppose we want to draw a sample from a binomial(n, 0.5) distribution.
One way to do it is as follows
[121]: import random
n = 10000000
draws = [random.uniform(0, 1) < 0.5 for i in range(n)]
sum(draws)

5000779

But we are creating two huge lists here, range(n) and draws.
This uses lots of memory and is very slow.
If we make n even bigger then this happens
[122]: n = 100000000
draws = [random.uniform(0, 1) < 0.5 for i in range(n)]

We can avoid these problems using iterators.


Here is the generator function
[123]: def f(n):
    i = 1
    while i <= n:
        yield random.uniform(0, 1) < 0.5
        i += 1

Now let’s do the sum


[124]: n = 10000000
draws = f(n)
draws
<generator object f at 0x7fa3c0dfeb10>

[125]: sum(draws)
4997378

In summary, iterables

β€’ avoid the need to create big lists/tuples, and


β€’ provide a uniform interface to iteration that can be used transparently in for loops
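The same memory savings can be had without writing a generator function at all, by passing a generator expression straight to sum(). A quick sketch (we use a smaller n here only to keep it fast):

```python
import random

n = 1_000_000
# Nothing is materialized in memory: sum() pulls one draw at a time
count = sum(random.uniform(0, 1) < 0.5 for _ in range(n))
print(count / n)    # should be close to 0.5
```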

14.8 Recursive Function Calls

This is not something that you will use every day, but it is still useful β€” you should learn it at some stage.

Basically, a recursive function is a function that calls itself.


For example, consider the problem of computing x_t for some t when

    x_{t+1} = 2 x_t,    x_0 = 1    (1)

Obviously the answer is 2^t.


We can compute this easily enough with a loop
[126]: def x_loop(t):
    x = 1
    for i in range(t):
        x = 2 * x
    return x

We can also use a recursive solution, as follows


[127]: def x(t):
    if t == 0:
        return 1
    else:
        return 2 * x(t-1)

What happens here is that each successive call uses its own frame in the stack

β€’ a frame is where the local variables of a given function call are held
β€’ stack is memory used to process function calls
– a First In Last Out (FILO) queue

This example is somewhat contrived, since the first (iterative) solution would usually be preferred to the recursive solution.
We’ll meet less contrived applications of recursion later on.

14.9 Exercises

14.9.1 Exercise 1

The Fibonacci numbers are defined by

    x_{t+1} = x_t + x_{t-1},    x_0 = 0, x_1 = 1    (2)

The first few numbers in the sequence are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55.
Write a function to recursively compute the t-th Fibonacci number for any t.

14.9.2 Exercise 2

Complete the following code, and test it using this csv file, which we assume that you’ve put
in your current working directory

def column_iterator(target_file, column_number):
    """A generator function for CSV files.

    When called with a file name target_file (string) and column number
    column_number (integer), the generator function returns a generator
    that steps through the elements of column column_number in file
    target_file.
    """
    # put your code here

dates = column_iterator('test_table.csv', 1)

for date in dates:
    print(date)

14.9.3 Exercise 3

Suppose we have a text file numbers.txt containing the following lines

prices
3
8

7
21

Using try – except, write a program to read in the contents of the file and sum the numbers, ignoring lines without numbers.

14.10 Solutions

14.10.1 Exercise 1

Here’s the standard solution


[128]: def x(t):
    if t == 0:
        return 0
    if t == 1:
        return 1
    else:
        return x(t-1) + x(t-2)

Let’s test it
[129]: print([x(i) for i in range(10)])

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
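The recursive solution recomputes the same subproblems exponentially often, so it slows to a crawl once t gets much past 30. One standard remedy, not required by the exercise, is memoization via functools.lru_cache:

```python
from functools import lru_cache

@lru_cache(maxsize=None)    # cache every result, so each x(t) is computed once
def x(t):
    if t == 0:
        return 0
    if t == 1:
        return 1
    return x(t - 1) + x(t - 2)

print(x(10))     # 55
print(x(100))    # now feasible: 354224848179261915075
```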

14.10.2 Exercise 2

One solution is as follows



[130]: def column_iterator(target_file, column_number):
    """A generator function for CSV files.
    When called with a file name target_file (string) and column number
    column_number (integer), the generator function returns a generator
    which steps through the elements of column column_number in file
    target_file.
    """
    f = open(target_file, 'r')
    for line in f:
        yield line.split(',')[column_number - 1]
    f.close()

dates = column_iterator('test_table.csv', 1)

i = 1
for date in dates:
    print(date)
    if i == 10:
        break
    i += 1

Date
2009-05-21
2009-05-20
2009-05-19
2009-05-18
2009-05-15
2009-05-14
2009-05-13
2009-05-12
2009-05-11

14.10.3 Exercise 3

Let’s save the data first


[131]: %%file numbers.txt
prices
3
8

7
21
Writing numbers.txt

[132]: f = open('numbers.txt')

total = 0.0
for line in f:
    try:
        total += float(line)
    except ValueError:
        pass

f.close()

print(total)

39.0
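A variation worth knowing: a with statement closes the file automatically, even if an exception escapes the loop. The sketch below rewrites numbers.txt first so it runs on its own:

```python
# Recreate the data file so this sketch is self-contained
with open('numbers.txt', 'w') as f:
    f.write('prices\n3\n8\n\n7\n21\n')

total = 0.0
with open('numbers.txt') as f:      # file is closed on exit from the block
    for line in f:
        try:
            total += float(line)
        except ValueError:          # header and blank lines are skipped
            pass

print(total)    # 39.0
```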
Chapter 15

Debugging

15.1 Contents

β€’ Overview 15.2
β€’ Debugging 15.3
β€’ Other Useful Magics 15.4

β€œDebugging is twice as hard as writing the code in the first place. Therefore, if
you write the code as cleverly as possible, you are, by definition, not smart enough
to debug it.” – Brian Kernighan

15.2 Overview

Are you one of those programmers who fills their code with print statements when trying to
debug their programs?
Hey, we all used to do that.
(OK, sometimes we still do that…)
But once you start writing larger programs you’ll need a better system.
Debugging tools for Python vary across platforms, IDEs and editors.
Here we’ll focus on Jupyter and leave you to explore other settings.
We’ll need the following imports
[1]: import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

15.3 Debugging

15.3.1 The debug Magic

Let’s consider a simple (and rather contrived) example


[2]: def plot_log():
    fig, ax = plt.subplots(2, 1)
    x = np.linspace(1, 2, 10)
    ax.plot(x, np.log(x))
    plt.show()

plot_log()  # Call the function, generate plot

---------------------------------------------------------------------------

AttributeError Traceback (most recent call last)

<ipython-input-2-c32a2280f47b> in <module>
5 plt.show()
6
----> 7 plot_log() # Call the function, generate plot

<ipython-input-2-c32a2280f47b> in plot_log()
2 fig, ax = plt.subplots(2, 1)
3 x = np.linspace(1, 2, 10)
----> 4 ax.plot(x, np.log(x))
5 plt.show()
6

AttributeError: 'numpy.ndarray' object has no attribute 'plot'

This code is intended to plot the log function over the interval [1, 2].
But there’s an error here: plt.subplots(2, 1) should be just plt.subplots().
(The call plt.subplots(2, 1) returns a NumPy array containing two axes objects, suitable for having two subplots on the same figure)
The traceback shows that the error occurs at the method call ax.plot(x, np.log(x)).

The error occurs because we have mistakenly made ax a NumPy array, and a NumPy array has no plot method.
But let’s pretend that we don’t understand this for the moment.
We might suspect there’s something wrong with ax but when we try to investigate this object, we get the following exception:
[3]: ax

---------------------------------------------------------------------------

NameError Traceback (most recent call last)

<ipython-input-3-b00e77935981> in <module>
----> 1 ax

NameError: name 'ax' is not defined

The problem is that ax was defined inside plot_log(), and the name is lost once that function terminates.
Let’s try doing it a different way.
We run the first cell block again, generating the same error
[4]: def plot_log():
    fig, ax = plt.subplots(2, 1)
    x = np.linspace(1, 2, 10)
    ax.plot(x, np.log(x))
    plt.show()

plot_log()  # Call the function, generate plot

---------------------------------------------------------------------------

AttributeError Traceback (most recent call last)

<ipython-input-4-c32a2280f47b> in <module>
5 plt.show()
6
----> 7 plot_log() # Call the function, generate plot

<ipython-input-4-c32a2280f47b> in plot_log()
2 fig, ax = plt.subplots(2, 1)
3 x = np.linspace(1, 2, 10)
----> 4 ax.plot(x, np.log(x))
5 plt.show()
6

AttributeError: 'numpy.ndarray' object has no attribute 'plot'



But this time we type in the following cell block

%debug

You should be dropped into a new prompt that looks something like this

ipdb>

(You might see pdb> instead)


Now we can investigate the value of our variables at this point in the program, step forward
through the code, etc.
For example, here we simply type the name ax to see what’s happening with this object:

ipdb> ax
array([<matplotlib.axes.AxesSubplot object at 0x290f5d0>,
<matplotlib.axes.AxesSubplot object at 0x2930810>], dtype=object)

It’s now very clear that ax is an array, which clarifies the source of the problem.
To find out what else you can do from inside ipdb (or pdb), use the online help

ipdb> h

Documented commands (type help <topic>):


========================================
EOF bt cont enable jump pdef r tbreak w
a c continue exit l pdoc restart u whatis
alias cl d h list pinfo return unalias where

args clear debug help n pp run unt


b commands disable ignore next q s until
break condition down j p quit step up

Miscellaneous help topics:


==========================
exec pdb

Undocumented commands:
======================
retval rv

ipdb> h c
c(ont(inue))
Continue execution, only stop when a breakpoint is encountered.

15.3.2 Setting a Break Point

The preceding approach is handy but sometimes insufficient.


Consider the following modified version of our function above
[5]: def plot_log():
    fig, ax = plt.subplots()
    x = np.logspace(1, 2, 10)
    ax.plot(x, np.log(x))
    plt.show()

plot_log()

Here the original problem is fixed, but we’ve accidentally written np.logspace(1, 2, 10) instead of np.linspace(1, 2, 10).

Now there won’t be any exception, but the plot won’t look right.
To investigate, it would be helpful if we could inspect variables like x during execution of the
function.
To this end, we add a β€œbreak point” by inserting breakpoint() inside the function code
block

def plot_log():
    breakpoint()
    fig, ax = plt.subplots()
    x = np.logspace(1, 2, 10)
    ax.plot(x, np.log(x))
    plt.show()

plot_log()

Now let’s run the script, and investigate via the debugger

> <ipython-input-6-a188074383b7>(6)plot_log()
-> fig, ax = plt.subplots()
(Pdb) n
> <ipython-input-6-a188074383b7>(7)plot_log()
-> x = np.logspace(1, 2, 10)
(Pdb) n
> <ipython-input-6-a188074383b7>(8)plot_log()
-> ax.plot(x, np.log(x))
(Pdb) x
array([ 10. , 12.91549665, 16.68100537, 21.5443469 ,
27.82559402, 35.93813664, 46.41588834, 59.94842503,
77.42636827, 100. ])

We used n twice to step forward through the code (one line at a time).
Then we printed the value of x to see what was happening with that variable.
To exit from the debugger, use q.

15.4 Other Useful Magics

In this lecture, we used the %debug IPython magic.


There are many other useful magics:

β€’ %precision 4 sets printed precision for floats to 4 decimal places
β€’ %whos gives a list of variables and their values
β€’ %quickref gives a list of magics

The full list of magics is here.


Part IV

Data and Empirics

Chapter 16

Pandas

16.1 Contents

β€’ Overview 16.2

β€’ Series 16.3

β€’ DataFrames 16.4

β€’ On-Line Data Sources 16.5

β€’ Exercises 16.6

β€’ Solutions 16.7

In addition to what’s in Anaconda, this lecture will need the following libraries:
[1]: !pip install --upgrade pandas-datareader

Requirement already up-to-date: pandas-datareader in


/home/ubuntu/anaconda3/lib/python3.7/site-packages (0.8.1)
Requirement already satisfied, skipping upgrade: pandas>=0.21 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from pandas-datareader)
(0.24.2)
Requirement already satisfied, skipping upgrade: requests>=2.3.0 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from pandas-datareader)
(2.22.0)
Requirement already satisfied, skipping upgrade: lxml in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from pandas-datareader)
(4.3.4)
Requirement already satisfied, skipping upgrade: numpy>=1.12.0 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from pandas>=0.21->pandas-
datareader) (1.16.4)
Requirement already satisfied, skipping upgrade: pytz>=2011k in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from pandas>=0.21->pandas-
datareader) (2019.1)
Requirement already satisfied, skipping upgrade: python-dateutil>=2.5.0 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from pandas>=0.21->pandas-
datareader) (2.8.0)
Requirement already satisfied, skipping upgrade:
urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from
requests>=2.3.0->pandas-datareader) (1.24.2)
Requirement already satisfied, skipping upgrade: idna<2.9,>=2.5 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from
requests>=2.3.0->pandas-datareader) (2.8)
Requirement already satisfied, skipping upgrade: chardet<3.1.0,>=3.0.2 in


/home/ubuntu/anaconda3/lib/python3.7/site-packages (from
requests>=2.3.0->pandas-datareader) (3.0.4)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from
requests>=2.3.0->pandas-datareader) (2019.6.16)
Requirement already satisfied, skipping upgrade: six>=1.5 in
/home/ubuntu/anaconda3/lib/python3.7/site-packages (from python-
dateutil>=2.5.0->pandas>=0.21->pandas-datareader) (1.12.0)

16.2 Overview

Pandas is a package of fast, efficient data analysis tools for Python.


Its popularity has surged in recent years, coincident with the rise of fields such as data science and machine learning.
Here’s a popularity comparison over time against STATA and SAS, courtesy of Stack Overflow Trends

Just as NumPy provides the basic array data type plus core array operations, pandas

1. defines fundamental structures for working with data and


2. endows them with methods that facilitate operations such as

β€’ reading in data
β€’ adjusting indices
β€’ working with dates and time series
β€’ sorting, grouping, re-ordering and general data munging
β€’ dealing with missing values, etc., etc.

More sophisticated statistical functionality is left to other packages, such as statsmodels and
scikit-learn, which are built on top of pandas.
This lecture will provide a basic introduction to pandas.
Throughout the lecture, we will assume that the following imports have taken place

[2]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import requests

16.3 Series

Two important data types defined by pandas are Series and DataFrame.
You can think of a Series as a β€œcolumn” of data, such as a collection of observations on a
single variable.
A DataFrame is an object for storing related columns of data.
Let’s start with Series
[3]: s = pd.Series(np.random.randn(4), name='daily returns')
s

0    1.520681
1    0.619485
2   -0.127222
3    0.163491
Name: daily returns, dtype: float64

Here you can imagine the indices 0, 1, 2, 3 as indexing four listed companies, and the
values being daily returns on their shares.
Pandas Series are built on top of NumPy arrays and support many similar operations
[4]: s * 100

0    152.068086
1     61.948511
2    -12.722188
3     16.349095
Name: daily returns, dtype: float64

[5]: np.abs(s)

0    1.520681
1    0.619485
2    0.127222
3    0.163491
Name: daily returns, dtype: float64

But Series provide more than NumPy arrays.


Not only do they have some additional (statistically oriented) methods
[6]: s.describe()

count    4.000000
mean     0.544109
std      0.719937
min     -0.127222
25%      0.090813
50%      0.391488
75%      0.844784
max      1.520681
Name: daily returns, dtype: float64

But their indices are more flexible



[7]: s.index = ['AMZN', 'AAPL', 'MSFT', 'GOOG']
s

AMZN    1.520681
AAPL    0.619485
MSFT   -0.127222
GOOG    0.163491
Name: daily returns, dtype: float64

Viewed in this way, Series are like fast, efficient Python dictionaries (with the restriction
that the items in the dictionary all have the same typeβ€”in this case, floats).
In fact, you can use much of the same syntax as Python dictionaries
[8]: s['AMZN']
1.520680864827497

[9]: s['AMZN'] = 0
s

AMZN    0.000000
AAPL    0.619485
MSFT   -0.127222
GOOG    0.163491
Name: daily returns, dtype: float64

[10]: 'AAPL' in s
True

16.4 DataFrames

While a Series is a single column of data, a DataFrame is several columns, one for each
variable.
In essence, a DataFrame in pandas is analogous to a (highly optimized) Excel spreadsheet.
Thus, it is a powerful tool for representing and analyzing data that are naturally organized
into rows and columns, often with descriptive indexes for individual rows and individual
columns.
Let’s look at an example that reads data from the CSV file pandas/data/test_pwt.csv
that can be downloaded here.
Here’s the content of test_pwt.csv

"country","country isocode","year","POP","XRAT","tcgdp","cc","cg"
"Argentina","ARG","2000","37335.653","0.9995","295072.21869","75.716805379","5.5
"Australia","AUS","2000","19053.186","1.72483","541804.6521","67.759025993","6.7
"India","IND","2000","1006300.297","44.9416","1728144.3748","64.575551328","14.0
"Israel","ISR","2000","6114.57","4.07733","129253.89423","64.436450847","10.2666
"Malawi","MWI","2000","11801.505","59.543808333","5026.2217836","74.707624181","
"South Africa","ZAF","2000","45064.098","6.93983","227242.36949","72.718710427",
"United States","USA","2000","282171.957","1","9898700","72.347054303","6.032453
"Uruguay","URY","2000","3219.793","12.099591667","25255.961693","78.978740282","

Supposing you have this data saved as test_pwt.csv in the present working directory (type
%pwd in Jupyter to see what this is), it can be read in as follows:

[11]: df = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas/data/test_pwt.csv')
type(df)

pandas.core.frame.DataFrame
[12]: df

         country country isocode  year          POP       XRAT         tcgdp  \
0      Argentina            ARG  2000    37335.653   0.999500  2.950722e+05
1      Australia            AUS  2000    19053.186   1.724830  5.418047e+05
2          India            IND  2000  1006300.297  44.941600  1.728144e+06
3         Israel            ISR  2000     6114.570   4.077330  1.292539e+05
4         Malawi            MWI  2000    11801.505  59.543808  5.026222e+03
5   South Africa            ZAF  2000    45064.098   6.939830  2.272424e+05
6  United States            USA  2000   282171.957   1.000000  9.898700e+06
7        Uruguay            URY  2000     3219.793  12.099592  2.525596e+04

          cc         cg
0  75.716805   5.578804
1  67.759026   6.720098
2  64.575551  14.072206
3  64.436451  10.266688
4  74.707624  11.658954
5  72.718710   5.726546
6  72.347054   6.032454
7  78.978740   5.108068

We can select particular rows using standard Python array slicing notation
[13]: df[2:5]

  country country isocode  year          POP       XRAT         tcgdp  \
2   India            IND  2000  1006300.297  44.941600  1.728144e+06
3  Israel            ISR  2000     6114.570   4.077330  1.292539e+05
4  Malawi            MWI  2000    11801.505  59.543808  5.026222e+03

          cc         cg
2  64.575551  14.072206
3  64.436451  10.266688
4  74.707624  11.658954

To select columns, we can pass a list containing the names of the desired columns represented
as strings
[14]: df[['country', 'tcgdp']]

         country         tcgdp
0      Argentina  2.950722e+05
1      Australia  5.418047e+05
2          India  1.728144e+06
3         Israel  1.292539e+05
4         Malawi  5.026222e+03
5   South Africa  2.272424e+05
6  United States  9.898700e+06
7        Uruguay  2.525596e+04

To select both rows and columns using integers, the iloc attribute should be used with the
format .iloc[rows, columns]
[15]: df.iloc[2:5, 0:4]

  country country isocode  year          POP
2   India            IND  2000  1006300.297
3  Israel            ISR  2000     6114.570
4  Malawi            MWI  2000    11801.505

To select rows and columns using a mixture of integers and labels, the loc attribute can be used in a similar way


[16]: df.loc[df.index[2:5], ['country', 'tcgdp']]

  country         tcgdp
2   India  1.728144e+06
3  Israel  1.292539e+05
4  Malawi  5.026222e+03
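Rows can also be selected with a boolean condition, which is often more natural than positional slicing. A sketch on a small stand-in frame (three rows echoing the table above, built inline so the example is self-contained):

```python
import pandas as pd

df = pd.DataFrame({'country': ['Argentina', 'India', 'Malawi'],
                   'POP': [37335.653, 1006300.297, 11801.505]})

# df['POP'] > 40000 is a boolean Series; indexing with it keeps the True rows
large = df[df['POP'] > 40000]
print(large)
```

The same pattern works with combined conditions, e.g. df[(df['POP'] > 10000) & (df['POP'] < 50000)], using & and | with parentheses around each comparison.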

Let’s imagine that we’re only interested in population and total GDP (tcgdp).
One way to strip the data frame df down to only these variables is to overwrite the
dataframe using the selection method described above
[17]: df = df[['country', 'POP', 'tcgdp']]
df

         country          POP         tcgdp
0      Argentina    37335.653  2.950722e+05
1      Australia    19053.186  5.418047e+05
2          India  1006300.297  1.728144e+06
3         Israel     6114.570  1.292539e+05
4         Malawi    11801.505  5.026222e+03
5   South Africa    45064.098  2.272424e+05
6  United States   282171.957  9.898700e+06
7        Uruguay     3219.793  2.525596e+04

Here the index 0, 1,..., 7 is redundant because we can use the country names as an index.
To do this, we set the index to be the country variable in the dataframe
[18]: df = df.set_index('country')
df

                       POP         tcgdp
country
Argentina        37335.653  2.950722e+05
Australia        19053.186  5.418047e+05
India          1006300.297  1.728144e+06
Israel            6114.570  1.292539e+05
Malawi           11801.505  5.026222e+03
South Africa     45064.098  2.272424e+05
United States   282171.957  9.898700e+06
Uruguay           3219.793  2.525596e+04

Let’s give the columns slightly better names


[19]: df.columns = 'population', 'total GDP'
df

                population     total GDP
country
Argentina        37335.653  2.950722e+05
Australia        19053.186  5.418047e+05
India          1006300.297  1.728144e+06
Israel            6114.570  1.292539e+05
Malawi           11801.505  5.026222e+03
South Africa     45064.098  2.272424e+05
United States   282171.957  9.898700e+06
Uruguay           3219.793  2.525596e+04

Population is in thousands, let’s revert to single units


[20]: df['population'] = df['population'] * 1e3
df

                 population     total GDP
country
Argentina      3.733565e+07  2.950722e+05
Australia      1.905319e+07  5.418047e+05
India          1.006300e+09  1.728144e+06
Israel         6.114570e+06  1.292539e+05
Malawi         1.180150e+07  5.026222e+03
South Africa   4.506410e+07  2.272424e+05
United States  2.821720e+08  9.898700e+06
Uruguay        3.219793e+06  2.525596e+04

Next, we’re going to add a column showing real GDP per capita, multiplying by 1,000,000 as
we go because total GDP is in millions
[21]: df['GDP percap'] = df['total GDP'] * 1e6 / df['population']
df

                 population     total GDP    GDP percap
country
Argentina      3.733565e+07  2.950722e+05   7903.229085
Australia      1.905319e+07  5.418047e+05  28436.433261
India          1.006300e+09  1.728144e+06   1717.324719
Israel         6.114570e+06  1.292539e+05  21138.672749
Malawi         1.180150e+07  5.026222e+03    425.896679
South Africa   4.506410e+07  2.272424e+05   5042.647686
United States  2.821720e+08  9.898700e+06  35080.381854
Uruguay        3.219793e+06  2.525596e+04   7843.970620

One of the nice things about pandas DataFrame and Series objects is that they have
methods for plotting and visualization that work through Matplotlib.
For example, we can easily generate a bar plot of GDP per capita
df['GDP percap'].plot(kind='bar')
[22]: plt.show()

At the moment the data frame is ordered alphabetically on the countries; let’s change it to
GDP per capita
df = df.sort_values(by='GDP percap', ascending=False)
[23]: df

population total GDP GDP percap


[23]: country
United States 2.821720e+08 9.898700e+06 35080.381854
Australia 1.905319e+07 5.418047e+05 28436.433261
Israel 6.114570e+06 1.292539e+05 21138.672749
Argentina 3.733565e+07 2.950722e+05 7903.229085
Uruguay 3.219793e+06 2.525596e+04 7843.970620
South Africa 4.506410e+07 2.272424e+05 5042.647686
India 1.006300e+09 1.728144e+06 1717.324719
Malawi 1.180150e+07 5.026222e+03 425.896679

Plotting as before now yields


df['GDP percap'].plot(kind='bar')
[24]: plt.show()

16.5 On-Line Data Sources

Python makes it straightforward to query online databases programmatically.


An important database for economists is FRED, a vast collection of time series data maintained by the St. Louis Fed.

For example, suppose that we are interested in the unemployment rate.


Via FRED, the entire series for the US civilian unemployment rate can be downloaded directly by entering this URL into your browser (note that this requires an internet connection)

https://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv

(Equivalently, click here: https://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv)
This request returns a CSV file, which will be handled by your default application for this
class of files.
Alternatively, we can access the CSV file from within a Python program.
This can be done with a variety of methods.
We start with a relatively low-level method and then return to pandas.

16.5.1 Accessing Data with requests

One option is to use requests, a popular Python library for requesting data over the Internet.
To begin, try the following code on your computer
[25]: r = requests.get('http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv')

If there’s no error message, then the call has succeeded.


If you do get an error, then there are two likely causes

1. You are not connected to the Internet; hopefully, this isn’t the case.
2. Your machine is accessing the Internet through a proxy server, and Python isn’t aware
of this.

In the second case, you can either

β€’ switch to another machine


β€’ solve your proxy problem by reading the documentation
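
In the proxy case, requests itself can also be pointed at a proxy via its proxies argument. Here is a minimal sketch (the proxy addresses below are placeholders, not real servers):

```python
# Sketch: routing requests through a proxy server.
# The addresses are hypothetical -- substitute your own proxy details.
proxies = {
    'http': 'http://10.10.1.10:3128',   # placeholder HTTP proxy
    'https': 'http://10.10.1.10:1080',  # placeholder HTTPS proxy
}

def fetch_via_proxy(url):
    # Imported here so the sketch stays inert unless actually called
    import requests
    return requests.get(url, proxies=proxies)
```

Calling `fetch_via_proxy(url)` then behaves like a plain `requests.get(url)`, but with traffic routed through the configured proxy.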

Assuming that all is working, you can now pro-


ceed to use the source object returned by the call
requests.get('http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UN
url = 'http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv'
[26]: source = requests.get(url).content.decode().split("\n")
source[0]

'DATE,VALUE\r'
[26]:
source[1]
[27]:
'1948-01-01,3.4\r'
[27]:
source[2]
[28]:

'1948-02-01,3.8\r'
[28]:

We could now write some additional code to parse this text and store it as an array.
But this is unnecessary: pandas' read_csv function can handle the task for us.
We use parse_dates=True so that pandas recognizes our dates column, allowing for simple
date filtering
data = pd.read_csv(url, index_col=0, parse_dates=True)
[29]:

The data has been read into a pandas DataFrame called data that we can now manipulate in
the usual way
type(data)
[30]:
pandas.core.frame.DataFrame
[30]:
data.head() # A useful method to get a quick look at a data frame
[31]:
VALUE
[31]: DATE
1948-01-01 3.4
1948-02-01 3.8
1948-03-01 4.0
1948-04-01 3.9
1948-05-01 3.5

pd.set_option('precision', 1)
[32]: data.describe() # Your output might differ slightly

VALUE
[32]: count 861.0
mean 5.7
std 1.6
min 2.5
25% 4.5
50% 5.6
75% 6.8
max 10.8

We can also plot the unemployment rate from 2006 to 2012 as follows
data['2006':'2012'].plot()
[33]: plt.show()

Note that pandas offers many other file type alternatives.


Pandas has a wide variety of top-level methods that we can use to read csv, excel, json, parquet
and other file types, or plug straight into a database server.
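
For instance, the same pattern works with pd.read_excel, pd.read_json, pd.read_parquet and pd.read_sql. One quick way to see which readers ship with your installation (output varies by pandas version):

```python
import pandas as pd

# List the top-level pd.read_* entry points in this pandas installation
readers = sorted(name for name in dir(pd) if name.startswith('read_'))
print(readers)
```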

16.5.2 Using pandas_datareader to Access Data

The maker of pandas has also authored a library called pandas_datareader that gives programmatic access to many data sources straight from the Jupyter notebook.
While some sources require an access key, many of the most important (e.g., FRED, OECD,
EUROSTAT and the World Bank) are free to use.
For now let’s work through one example of downloading and plotting data, this time from
the World Bank.
The World Bank collects and organizes data on a huge range of indicators.
For example, here’s some data on government debt as a ratio to GDP.
The next code example fetches the data for you and plots time series for the US and Australia
from pandas_datareader import wb
[34]:
govt_debt = wb.download(indicator='GC.DOD.TOTL.GD.ZS', country=['US', 'AU'], start=2005, end=2016).stack().unstack(0)
ind = govt_debt.index.droplevel(-1)
govt_debt.index = ind
ax = govt_debt.plot(lw=2)
plt.title("Government Debt to GDP (%)")
plt.show()

The documentation provides more details on how to access various data sources.

16.6 Exercises

16.6.1 Exercise 1

Write a program to calculate the percentage price change over 2013 for the following shares
ticker_list = {'INTC': 'Intel',
[35]: 'MSFT': 'Microsoft',
'IBM': 'IBM',
'BHP': 'BHP',
'TM': 'Toyota',
'AAPL': 'Apple',
'AMZN': 'Amazon',
'BA': 'Boeing',
'QCOM': 'Qualcomm',
'KO': 'Coca-Cola',
'GOOG': 'Google',
'SNE': 'Sony',
'PTR': 'PetroChina'}

A dataset of daily closing prices for the above firms can be found in
pandas/data/ticker_data.csv and can be downloaded here.
Plot the result as a bar graph like this one

16.7 Solutions

16.7.1 Exercise 1

[36]: ticker = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas/data/ticker_data.csv')
ticker.set_index('Date', inplace=True)

ticker_list = {'INTC': 'Intel',


'MSFT': 'Microsoft',
'IBM': 'IBM',
'BHP': 'BHP',
'TM': 'Toyota',
'AAPL': 'Apple',
'AMZN': 'Amazon',
'BA': 'Boeing',
'QCOM': 'Qualcomm',
'KO': 'Coca-Cola',
'GOOG': 'Google',
'SNE': 'Sony',
'PTR': 'PetroChina'}

price_change = pd.Series(dtype=float)

for tick in ticker_list:


change = 100 * (ticker.loc[ticker.index[-1], tick] - ticker.loc[ticker.index[0], tick]) / ticker.loc[ticker.index[0], tick]
name = ticker_list[tick]
price_change[name] = change

price_change.sort_values(inplace=True)
fig, ax = plt.subplots(figsize=(10,8))
price_change.plot(kind='bar', ax=ax)
plt.show()

Footnotes
[1] Wikipedia defines munging as cleaning data from one raw form into a structured, purged
one.
Chapter 17

Pandas for Panel Data

17.1 Contents

β€’ Overview 17.2

β€’ Slicing and Reshaping Data 17.3

β€’ Merging Dataframes and Filling NaNs 17.4

β€’ Grouping and Summarizing Data 17.5

β€’ Final Remarks 17.6

β€’ Exercises 17.7

β€’ Solutions 17.8

17.2 Overview

In an earlier lecture on pandas, we looked at working with simple data sets.


Econometricians often need to work with more complex data sets, such as panels.
Common tasks include

β€’ Importing data, cleaning it and reshaping it across several axes.


β€’ Selecting a time series or cross-section from a panel.
β€’ Grouping and summarizing data.

pandas (derived from 'panel' and 'data') contains powerful and easy-to-use tools for solving
exactly these kinds of problems.
In what follows, we will use a panel data set of real minimum wages from the OECD to create:

β€’ summary statistics over multiple dimensions of our data


β€’ a time series of the average minimum wage of countries in the dataset
β€’ kernel density estimates of wages by continent


We will begin by reading in our long format panel data from a CSV file and reshaping the
resulting DataFrame with pivot_table to build a MultiIndex.
Additional detail will be added to our DataFrame using pandas' merge function, and data
will be summarized with the groupby function.
Most of this lecture was created by Natasha Watkins.

17.3 Slicing and Reshaping Data

We will read in a dataset from the OECD of real minimum wages in 32 countries and assign
it to realwage.
The dataset pandas_panel/realwage.csv can be downloaded here.
Make sure the file is in your current working directory
import pandas as pd
[1]:
# Display 6 columns for viewing purposes
pd.set_option('display.max_columns', 6)

# Reduce decimal points to 2


pd.options.display.float_format = '{:,.2f}'.format

realwage = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas_panel/realwage.csv')

Let’s have a look at what we’ve got to work with


realwage.head() # Show first 5 rows
[2]:
Unnamed: 0 Time Country Series \
[2]: 0 0 2006-01-01 Ireland In 2015 constant prices at 2015 USD PPPs
1 1 2007-01-01 Ireland In 2015 constant prices at 2015 USD PPPs
2 2 2008-01-01 Ireland In 2015 constant prices at 2015 USD PPPs
3 3 2009-01-01 Ireland In 2015 constant prices at 2015 USD PPPs
4 4 2010-01-01 Ireland In 2015 constant prices at 2015 USD PPPs

Pay period value


0 Annual 17,132.44
1 Annual 18,100.92
2 Annual 17,747.41
3 Annual 18,580.14
4 Annual 18,755.83

The data is currently in long format, which is difficult to analyze when there are several dimensions to the data.
We will use pivot_table to create a wide format panel, with a MultiIndex to handle
higher dimensional data.
pivot_table arguments should specify the data (values), the index, and the columns we
want in our resulting dataframe.
By passing a list in columns, we can create a MultiIndex in our column axis
realwage = realwage.pivot_table(values='value',
[3]: index='Time',
columns=['Country', 'Series', 'Pay period'])
realwage.head()

Country Australia \
[3]: Series In 2015 constant prices at 2015 USD PPPs
Pay period Annual Hourly
Time
2006-01-01 20,410.65 10.33
2007-01-01 21,087.57 10.67
2008-01-01 20,718.24 10.48
2009-01-01 20,984.77 10.62
2010-01-01 20,879.33 10.57

Country … \
Series In 2015 constant prices at 2015 USD exchange rates …
Pay period Annual …
Time …
2006-01-01 23,826.64 …
2007-01-01 24,616.84 …
2008-01-01 24,185.70 …
2009-01-01 24,496.84 …
2010-01-01 24,373.76 …

Country United States \


Series In 2015 constant prices at 2015 USD PPPs
Pay period Hourly
Time
2006-01-01 6.05
2007-01-01 6.24
2008-01-01 6.78
2009-01-01 7.58
2010-01-01 7.88

Country
Series In 2015 constant prices at 2015 USD exchange rates
Pay period Annual Hourly
Time
2006-01-01 12,594.40 6.05
2007-01-01 12,974.40 6.24
2008-01-01 14,097.56 6.78
2009-01-01 15,756.42 7.58
2010-01-01 16,391.31 7.88

[5 rows x 128 columns]

To more easily filter our time series data later on, we will convert the index into a
DatetimeIndex
realwage.index = pd.to_datetime(realwage.index)
[4]: type(realwage.index)

pandas.core.indexes.datetimes.DatetimeIndex
[4]:

The columns contain multiple levels of indexing, known as a MultiIndex, with levels being
ordered hierarchically (Country > Series > Pay period).
A MultiIndex is the simplest and most flexible way to manage panel data in pandas
type(realwage.columns)
[5]:
pandas.core.indexes.multi.MultiIndex
[5]:
realwage.columns.names
[6]:
FrozenList(['Country', 'Series', 'Pay period'])
[6]:
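
For intuition, a column MultiIndex with these three levels can also be built from scratch with pd.MultiIndex.from_product. Here is a toy sketch with made-up labels and values, not the OECD data:

```python
import numpy as np
import pandas as pd

# Hypothetical level values, mirroring the Country > Series > Pay period hierarchy
countries = ['Australia', 'United States']
series = ['PPP', 'exchange rate']
periods = ['Annual', 'Hourly']

columns = pd.MultiIndex.from_product(
    [countries, series, periods],
    names=['Country', 'Series', 'Pay period'])

# One row of dummy data; 2 x 2 x 2 = 8 columns in total
toy = pd.DataFrame(np.arange(8.0).reshape(1, 8),
                   index=['2006-01-01'], columns=columns)
```

Selecting `toy['Australia']` then returns the four columns under that top level, just as with the real dataset.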

Like before, we can select the country (the top level of our MultiIndex)
realwage['United States'].head()
[7]:

Series In 2015 constant prices at 2015 USD PPPs \


[7]: Pay period Annual Hourly
Time
2006-01-01 12,594.40 6.05
2007-01-01 12,974.40 6.24
2008-01-01 14,097.56 6.78
2009-01-01 15,756.42 7.58
2010-01-01 16,391.31 7.88

Series In 2015 constant prices at 2015 USD exchange rates


Pay period Annual Hourly
Time
2006-01-01 12,594.40 6.05
2007-01-01 12,974.40 6.24
2008-01-01 14,097.56 6.78
2009-01-01 15,756.42 7.58
2010-01-01 16,391.31 7.88

Stacking and unstacking levels of the MultiIndex will be used throughout this lecture to
reshape our dataframe into a format we need.
.stack() rotates the lowest level of the column MultiIndex to the row index
(.unstack() works in the opposite direction - try it out)
realwage.stack().head()
[8]:
Country Australia \
[8]: Series In 2015 constant prices at 2015 USD PPPs
Time Pay period
2006-01-01 Annual 20,410.65
Hourly 10.33
2007-01-01 Annual 21,087.57
Hourly 10.67
2008-01-01 Annual 20,718.24

Country \
Series In 2015 constant prices at 2015 USD exchange rates
Time Pay period
2006-01-01 Annual 23,826.64
Hourly 12.06
2007-01-01 Annual 24,616.84
Hourly 12.46
2008-01-01 Annual 24,185.70

Country Belgium … \
Series In 2015 constant prices at 2015 USD PPPs …
Time Pay period …
2006-01-01 Annual 21,042.28 …
Hourly 10.09 …
2007-01-01 Annual 21,310.05 …
Hourly 10.22 …
2008-01-01 Annual 21,416.96 …

Country United Kingdom \


Series In 2015 constant prices at 2015 USD exchange rates
Time Pay period
2006-01-01 Annual 20,376.32
Hourly 9.81
2007-01-01 Annual 20,954.13
Hourly 10.07
2008-01-01 Annual 20,902.87

Country United States \


Series In 2015 constant prices at 2015 USD PPPs
Time Pay period
2006-01-01 Annual 12,594.40
Hourly 6.05
2007-01-01 Annual 12,974.40
Hourly 6.24
2008-01-01 Annual 14,097.56

Country
Series In 2015 constant prices at 2015 USD exchange rates
Time Pay period
2006-01-01 Annual 12,594.40
Hourly 6.05
2007-01-01 Annual 12,974.40
Hourly 6.24
2008-01-01 Annual 14,097.56

[5 rows x 64 columns]
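
The mechanics are easy to see on a toy frame with made-up labels and numbers: .stack() rotates the innermost column level into the row index, and .unstack() rotates it back

```python
import pandas as pd

# A 1-row frame with a two-level column MultiIndex
columns = pd.MultiIndex.from_product(
    [['A', 'B'], ['x', 'y']], names=['upper', 'lower'])
df = pd.DataFrame([[1, 2, 3, 4]], index=['row0'], columns=columns)

stacked = df.stack()           # 'lower' moves into the row index: shape (2, 2)
roundtrip = stacked.unstack()  # 'lower' moves back to the columns: shape (1, 4)
```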

We can also pass in an argument to select the level we would like to stack
realwage.stack(level='Country').head()
[9]:
Series In 2015 constant prices at 2015 USD PPPs \
[9]: Pay period Annual Hourly
Time Country
2006-01-01 Australia 20,410.65 10.33
Belgium 21,042.28 10.09
Brazil 3,310.51 1.41
Canada 13,649.69 6.56
Chile 5,201.65 2.22

Series In 2015 constant prices at 2015 USD exchange rates


Pay period Annual Hourly
Time Country
2006-01-01 Australia 23,826.64 12.06
Belgium 20,228.74 9.70
Brazil 2,032.87 0.87
Canada 14,335.12 6.89
Chile 3,333.76 1.42

Using a DatetimeIndex makes it easy to select a particular time period.


Selecting one year and stacking the two lower levels of the MultiIndex creates a cross-section of our panel data
realwage['2015'].stack(level=(1, 2)).transpose().head()
[10]:
Time 2015-01-01 \
[10]: Series In 2015 constant prices at 2015 USD PPPs
Pay period Annual Hourly
Country
Australia 21,715.53 10.99
Belgium 21,588.12 10.35
Brazil 4,628.63 2.00
Canada 16,536.83 7.95
Chile 6,633.56 2.80

Time
Series In 2015 constant prices at 2015 USD exchange rates
Pay period Annual Hourly
Country
Australia 25,349.90 12.83
Belgium 20,753.48 9.95
Brazil 2,842.28 1.21
Canada 17,367.24 8.35
Chile 4,251.49 1.81

For the rest of the lecture, we will work with a dataframe of the hourly real minimum wages
across countries and time, measured in 2015 US dollars.
To create our filtered dataframe (realwage_f), we can use the xs method to select values
at lower levels in the multiindex, while keeping the higher levels (countries in this case)
realwage_f = realwage.xs(('Hourly', 'In 2015 constant prices at 2015 USD exchange rates'),
[11]: level=('Pay period', 'Series'), axis=1)
realwage_f.head()

Country Australia Belgium Brazil … Turkey United Kingdom \


[11]: Time …
2006-01-01 12.06 9.70 0.87 … 2.27 9.81
2007-01-01 12.46 9.82 0.92 … 2.26 10.07
2008-01-01 12.24 9.87 0.96 … 2.22 10.04
2009-01-01 12.40 10.21 1.03 … 2.28 10.15
2010-01-01 12.34 10.05 1.08 … 2.30 9.96

Country United States


Time
2006-01-01 6.05
2007-01-01 6.24
2008-01-01 6.78
2009-01-01 7.58
2010-01-01 7.88

[5 rows x 32 columns]

17.4 Merging Dataframes and Filling NaNs

Similar to relational databases like SQL, pandas has built-in methods to merge datasets together.
Using country information from WorldData.info, we’ll add the continent of each country to
realwage_f with the merge function.
The CSV file can be found in pandas_panel/countries.csv and can be downloaded
here.
[12]: worlddata = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas_panel/countries.csv', sep=';')
worlddata.head()

Country (en) Country (de) Country (local) … Deathrate \


[12]: 0 Afghanistan Afghanistan Afganistan/Afqanestan … 13.70
1 Egypt Γ„gypten Misr … 4.70
2 Γ…land Islands Γ…landinseln Γ…land … 0.00
3 Albania Albanien ShqipΓ«ria … 6.70
4 Algeria Algerien Al-Jaza’ir/AlgΓ©rie … 4.30

Life expectancy Url


0 51.30 https://www.laenderdaten.info/Asien/Afghanista…
1 72.70 https://www.laenderdaten.info/Afrika/Aegypten/…
2 0.00 https://www.laenderdaten.info/Europa/Aland/ind…
3 78.30 https://www.laenderdaten.info/Europa/Albanien/…
4 76.80 https://www.laenderdaten.info/Afrika/Algerien/…

[5 rows x 17 columns]

First, we’ll select just the country and continent variables from worlddata and rename the
column to 'Country'
worlddata = worlddata[['Country (en)', 'Continent']]
[13]: worlddata = worlddata.rename(columns={'Country (en)': 'Country'})
worlddata.head()

Country Continent
[13]: 0 Afghanistan Asia
1 Egypt Africa
2 Γ…land Islands Europe
3 Albania Europe
4 Algeria Africa

We want to merge our new dataframe, worlddata, with realwage_f.



The pandas merge function allows dataframes to be joined together by rows.


Our dataframes will be merged using country names, requiring us to use the transpose of
realwage_f so that rows correspond to country names in both dataframes
realwage_f.transpose().head()
[14]:
Time 2006-01-01 2007-01-01 2008-01-01 … 2014-01-01 2015-01-01 \
[14]: Country …
Australia 12.06 12.46 12.24 … 12.67 12.83
Belgium 9.70 9.82 9.87 … 10.01 9.95
Brazil 0.87 0.92 0.96 … 1.21 1.21
Canada 6.89 6.96 7.24 … 8.22 8.35
Chile 1.42 1.45 1.44 … 1.76 1.81

Time 2016-01-01
Country
Australia 12.98
Belgium 9.76
Brazil 1.24
Canada 8.48
Chile 1.91

[5 rows x 11 columns]

We can use either left, right, inner, or outer join to merge our datasets:

β€’ left join includes only countries from the left dataset


β€’ right join includes only countries from the right dataset
β€’ outer join includes countries that are in either the left or right datasets
β€’ inner join includes only countries common to both the left and right datasets

By default, merge will use an inner join.


Here we will pass how='left' to keep all countries in realwage_f, but discard countries
in worlddata that do not have a corresponding data entry in realwage_f.
This is illustrated by the red shading in the following diagram

We will also need to specify where the country name is located in each dataframe, which will
be the key that is used to merge the dataframes 'on'.
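
The four join types can be compared on two toy dataframes (made-up rows, not the lecture's data):

```python
import pandas as pd

# Toy 'left' and 'right' tables sharing only one country
left = pd.DataFrame({'Country': ['Australia', 'Brazil'],
                     'wage': [12.1, 0.9]})
right = pd.DataFrame({'Country': ['Brazil', 'Chile'],
                      'Continent': ['South America', 'South America']})

inner = pd.merge(left, right, on='Country', how='inner')   # Brazil only
left_join = pd.merge(left, right, on='Country', how='left')  # Australia gets NaN continent
outer = pd.merge(left, right, on='Country', how='outer')   # all three countries
```

With how='left', Australia is kept but its Continent is NaN, which is exactly the situation handled with .fillna() below.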

Our 'left' dataframe (realwage_f.transpose()) contains countries in the index, so we
set left_index=True.
Our 'right' dataframe (worlddata) contains countries in the 'Country' column, so we set
right_on='Country'
merged = pd.merge(realwage_f.transpose(), worlddata,
[15]: how='left', left_index=True, right_on='Country')
merged.head()

2006-01-01 00:00:00 2007-01-01 00:00:00 2008-01-01 00:00:00 … \


[15]: 17 12.06 12.46 12.24 …
23 9.70 9.82 9.87 …
32 0.87 0.92 0.96 …
100 6.89 6.96 7.24 …
38 1.42 1.45 1.44 …

2016-01-01 00:00:00 Country Continent


17 12.98 Australia Australia
23 9.76 Belgium Europe
32 1.24 Brazil South America
100 8.48 Canada North America
38 1.91 Chile South America

[5 rows x 13 columns]

Countries that appeared in realwage_f but not in worlddata will have NaN in the Continent column.
To check whether this has occurred, we can use .isnull() on the continent column and
filter the merged dataframe
merged[merged['Continent'].isnull()]
[16]:
2006-01-01 00:00:00 2007-01-01 00:00:00 2008-01-01 00:00:00 … \
[16]: 247 3.42 3.74 3.87 …
247 0.23 0.45 0.39 …
247 1.50 1.64 1.71 …

2016-01-01 00:00:00 Country Continent


247 5.28 Korea NaN
247 0.55 Russian Federation NaN
247 2.08 Slovak Republic NaN

[3 rows x 13 columns]

We have three missing values!


One option to deal with NaN values is to create a dictionary containing these countries and
their respective continents.
.map() will match countries in merged['Country'] with their continent from the dictionary.
Notice how countries not in our dictionary are mapped with NaN
missing_continents = {'Korea': 'Asia',
[17]: 'Russian Federation': 'Europe',
'Slovak Republic': 'Europe'}

merged['Country'].map(missing_continents)

17 NaN
[17]: 23 NaN
32 NaN
100 NaN
38 NaN

108 NaN
41 NaN
225 NaN
53 NaN
58 NaN
45 NaN
68 NaN
233 NaN
86 NaN
88 NaN
91 NaN
247 Asia
117 NaN
122 NaN
123 NaN
138 NaN
153 NaN
151 NaN
174 NaN
175 NaN
247 Europe
247 Europe
198 NaN
200 NaN
227 NaN
241 NaN
240 NaN
Name: Country, dtype: object

We don’t want to overwrite the entire series with this mapping.


.fillna() only fills in NaN values in merged['Continent'] with the mapping, while
leaving other values in the column unchanged
merged['Continent'] = merged['Continent'].fillna(merged['Country'].map(missing_continents))
[18]:
# Check for whether continents were correctly mapped

merged[merged['Country'] == 'Korea']

2006-01-01 00:00:00 2007-01-01 00:00:00 2008-01-01 00:00:00 … \


[18]: 247 3.42 3.74 3.87 …

2016-01-01 00:00:00 Country Continent


247 5.28 Korea Asia

[1 rows x 13 columns]

We will also combine the Americas into a single continent - this will make our visualization
nicer later on.
To do this, we will use .replace() and loop through a list of the continent values we want
to replace
replace = ['Central America', 'North America', 'South America']
[19]:
for country in replace:
merged['Continent'].replace(to_replace=country,
value='America',
inplace=True)

Now that we have all the data we want in a single DataFrame, we will reshape it back into
panel form with a MultiIndex.
We should also sort the index using .sort_index() so that we can efficiently filter our
dataframe later on.
By default, levels will be sorted top-down

merged = merged.set_index(['Continent', 'Country']).sort_index()


[20]: merged.head()

2006-01-01 2007-01-01 2008-01-01 … 2014-01-01 \


[20]: Continent Country …
America Brazil 0.87 0.92 0.96 … 1.21
Canada 6.89 6.96 7.24 … 8.22
Chile 1.42 1.45 1.44 … 1.76
Colombia 1.01 1.02 1.01 … 1.13
Costa Rica nan nan nan … 2.41

2015-01-01 2016-01-01
Continent Country
America Brazil 1.21 1.24
Canada 8.35 8.48
Chile 1.81 1.91
Colombia 1.13 1.12
Costa Rica 2.56 2.63

[5 rows x 11 columns]

While merging, we lost our DatetimeIndex, as we merged columns that were not in datetime format
merged.columns
[21]:
Index([2006-01-01 00:00:00, 2007-01-01 00:00:00, 2008-01-01 00:00:00,
[21]: 2009-01-01 00:00:00, 2010-01-01 00:00:00, 2011-01-01 00:00:00,
2012-01-01 00:00:00, 2013-01-01 00:00:00, 2014-01-01 00:00:00,
2015-01-01 00:00:00, 2016-01-01 00:00:00],
dtype='object')

Now that we have set the merged columns as the index, we can recreate a DatetimeIndex
using .to_datetime()
merged.columns = pd.to_datetime(merged.columns)
[22]: merged.columns = merged.columns.rename('Time')
merged.columns

DatetimeIndex(['2006-01-01', '2007-01-01', '2008-01-01', '2009-01-01',


[22]: '2010-01-01', '2011-01-01', '2012-01-01', '2013-01-01',
'2014-01-01', '2015-01-01', '2016-01-01'],
dtype='datetime64[ns]', name='Time', freq=None)

The DatetimeIndex tends to work more smoothly in the row axis, so we will go ahead and
transpose merged
merged = merged.transpose()
[23]: merged.head()

Continent America … Europe


[23]: Country Brazil Canada Chile … Slovenia Spain United Kingdom
Time …
2006-01-01 0.87 6.89 1.42 … 3.92 3.99 9.81
2007-01-01 0.92 6.96 1.45 … 3.88 4.10 10.07
2008-01-01 0.96 7.24 1.44 … 3.96 4.14 10.04
2009-01-01 1.03 7.67 1.52 … 4.08 4.32 10.15
2010-01-01 1.08 7.94 1.56 … 4.81 4.30 9.96

[5 rows x 32 columns]

17.5 Grouping and Summarizing Data

Grouping and summarizing data can be particularly useful for understanding large panel
datasets.

A simple way to summarize data is to call an aggregation method on the dataframe, such as
.mean() or .max().
For example, we can calculate the average real minimum wage for each country over the period 2006 to 2016 (the default is to aggregate over rows)
merged.mean().head(10)
[24]:
Continent Country
[24]: America Brazil 1.09
Canada 7.82
Chile 1.62
Colombia 1.07
Costa Rica 2.53
Mexico 0.53
United States 7.15
Asia Israel 5.95
Japan 6.18
Korea 4.22
dtype: float64

Using this series, we can plot the average real minimum wage over the past decade for each
country in our data set
import matplotlib.pyplot as plt
[25]: %matplotlib inline
import matplotlib
matplotlib.style.use('seaborn')

merged.mean().sort_values(ascending=False).plot(kind='bar', title="Average real minimum wage 2006 - 2016")

#Set country labels


country_labels = merged.mean().sort_values(ascending=False).index.get_level_values('Country').tolist()
plt.xticks(range(0, len(country_labels)), country_labels)
plt.xlabel('Country')

plt.show()

Passing in axis=1 to .mean() will aggregate over columns (giving the average minimum
wage for all countries over time)
merged.mean(axis=1).head()
[26]:
Time
[26]: 2006-01-01 4.69
2007-01-01 4.84
2008-01-01 4.90
2009-01-01 5.08
2010-01-01 5.11
dtype: float64

We can plot this time series as a line graph


merged.mean(axis=1).plot()
[27]: plt.title('Average real minimum wage 2006 - 2016')
plt.ylabel('2015 USD')
plt.xlabel('Year')
plt.show()

We can also specify a level of the MultiIndex (in the column axis) to aggregate over
merged.mean(level='Continent', axis=1).head()
[28]:
Continent America Asia Australia Europe
[28]: Time
2006-01-01 2.80 4.29 10.25 4.80
2007-01-01 2.85 4.44 10.73 4.94
2008-01-01 2.99 4.45 10.76 4.99
2009-01-01 3.23 4.53 10.97 5.16
2010-01-01 3.34 4.53 10.95 5.17

We can plot the average minimum wages in each continent as a time series
merged.mean(level='Continent', axis=1).plot()
[29]: plt.title('Average real minimum wage')
plt.ylabel('2015 USD')
plt.xlabel('Year')
plt.show()

We will drop Australia as a continent for plotting purposes


merged = merged.drop('Australia', level='Continent', axis=1)
[30]: merged.mean(level='Continent', axis=1).plot()
plt.title('Average real minimum wage')
plt.ylabel('2015 USD')
plt.xlabel('Year')
plt.show()

.describe() is useful for quickly retrieving a number of common summary statistics


merged.stack().describe()
[31]:
Continent America Asia Europe
[31]: count 69.00 44.00 200.00
mean 3.19 4.70 5.15
std 3.02 1.56 3.82
min 0.52 2.22 0.23
25% 1.03 3.37 2.02
50% 1.44 5.48 3.54
75% 6.96 5.95 9.70
max 8.48 6.65 12.39

This is a simplified way to use groupby.


Using groupby generally follows a 'split-apply-combine' process:

β€’ split: data is grouped based on one or more keys


β€’ apply: a function is called on each group independently
β€’ combine: the results of the function calls are combined into a new data structure
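
A minimal sketch of the three steps on made-up data:

```python
import pandas as pd

# Toy long-format data: one wage observation per country
wages = pd.DataFrame({
    'Continent': ['Europe', 'Europe', 'Asia'],
    'wage': [9.7, 10.1, 6.1],
})

# split: group rows by the 'Continent' key
grouped = wages.groupby('Continent')

# apply + combine: compute the mean of each group, combined into a new Series
means = grouped['wage'].mean()
```

Here `means` is indexed by continent, with Europe's value being the average of its two rows.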

The groupby method achieves the first step of this process, creating a new
DataFrameGroupBy object with data split into groups.
Let’s split merged by continent again, this time using the groupby function, and name the
resulting object grouped
grouped = merged.groupby(level='Continent', axis=1)
[32]: grouped

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f89f91d0940>


[32]:

Calling an aggregation method on the object applies the function to each group, the results of
which are combined in a new data structure.
For example, we can return the number of countries in our dataset for each continent using
.size().
In this case, our new data structure is a Series
grouped.size()
[33]:
Continent
[33]: America 7
Asia 4
Europe 19
dtype: int64

Calling .get_group() to return just the countries in a single group, we can create a kernel
density estimate of the distribution of real minimum wages in 2016 for each continent.
grouped.groups.keys() will return the keys from the groupby object
import seaborn as sns
[34]:
continents = grouped.groups.keys()

for continent in continents:


sns.kdeplot(grouped.get_group(continent)['2015'].unstack(), label=continent, shade=True)

plt.title('Real minimum wages in 2015')


plt.xlabel('US dollars')

plt.show()

17.6 Final Remarks

This lecture has provided an introduction to some of pandas' more advanced features, including multiindices, merging, grouping and plotting.
Other tools that may be useful in panel data analysis include xarray, a Python package that
extends pandas to N-dimensional data structures.

17.7 Exercises

17.7.1 Exercise 1

In these exercises, you’ll work with a dataset of employment rates in Europe by age and sex
from Eurostat.
The dataset pandas_panel/employ.csv can be downloaded here.
Reading in the CSV file returns a panel dataset in long format. Use .pivot_table() to
construct a wide format dataframe with a MultiIndex in the columns.
Start off by exploring the dataframe and the variables available in the MultiIndex levels.
Write a program that quickly returns all values in the MultiIndex.

17.7.2 Exercise 2

Filter the above dataframe to only include employment as a percentage of 'active population'.
Create a grouped boxplot using seaborn of employment rates in 2015 by age group and sex.
Hint: GEO includes both areas and countries.

17.8 Solutions

17.8.1 Exercise 1

[35]: employ = pd.read_csv('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas_panel/employ.csv')
employ = employ.pivot_table(values='Value',
index=['DATE'],
columns=['UNIT','AGE', 'SEX', 'INDIC_EM', 'GEO'])
employ.index = pd.to_datetime(employ.index) # ensure that dates are datetime format
employ.head()

UNIT Percentage of total population … \


[35]: AGE From 15 to 24 years …
SEX Females …
INDIC_EM Active population …
GEO Austria Belgium Bulgaria …
DATE …
2007-01-01 56.00 31.60 26.00 …
2008-01-01 56.20 30.80 26.10 …
2009-01-01 56.20 29.90 24.80 …
2010-01-01 54.00 29.80 26.60 …
2011-01-01 54.80 29.80 24.80 …

UNIT Thousand persons \


AGE From 55 to 64 years
SEX Total
INDIC_EM Total employment (resident population concept - LFS)
GEO Switzerland Turkey
DATE
2007-01-01 nan 1,282.00
2008-01-01 nan 1,354.00
2009-01-01 nan 1,449.00
2010-01-01 640.00 1,583.00
2011-01-01 661.00 1,760.00

UNIT
AGE
SEX
INDIC_EM
GEO United Kingdom
DATE
2007-01-01 4,131.00
2008-01-01 4,204.00
2009-01-01 4,193.00
2010-01-01 4,186.00
2011-01-01 4,164.00

[5 rows x 1440 columns]

This is a large dataset so it is useful to explore the levels and variables available
employ.columns.names
[36]:
FrozenList(['UNIT', 'AGE', 'SEX', 'INDIC_EM', 'GEO'])
[36]:

Variables within levels can be quickly retrieved with a loop



for name in employ.columns.names:


[37]: print(name, employ.columns.get_level_values(name).unique())

UNIT Index(['Percentage of total population', 'Thousand persons'],
dtype='object', name='UNIT')
AGE Index(['From 15 to 24 years', 'From 25 to 54 years', 'From 55 to 64 years'],
dtype='object', name='AGE')
SEX Index(['Females', 'Males', 'Total'], dtype='object', name='SEX')
INDIC_EM Index(['Active population', 'Total employment (resident population
concept - LFS)'], dtype='object', name='INDIC_EM')
GEO Index(['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech
Republic',
'Denmark', 'Estonia', 'Euro area (17 countries)',
'Euro area (18 countries)', 'Euro area (19 countries)',
'European Union (15 countries)', 'European Union (27 countries)',
'European Union (28 countries)', 'Finland',
'Former Yugoslav Republic of Macedonia, the', 'France',
'France (metropolitan)',
'Germany (until 1990 former territory of the FRG)', 'Greece', 'Hungary',
'Iceland', 'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg',
'Malta', 'Netherlands', 'Norway', 'Poland', 'Portugal', 'Romania',
'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland', 'Turkey',
'United Kingdom'],
dtype='object', name='GEO')

17.8.2 Exercise 2

To easily filter by country, swap GEO to the top level and sort the MultiIndex
[38]: employ.columns = employ.columns.swaplevel(0, -1)
employ = employ.sort_index(axis=1)

We need to get rid of a few items in GEO which are not countries.
A fast way to get rid of the EU areas is to use a list comprehension to find the level values in
GEO that begin with 'Euro'
[39]: geo_list = employ.columns.get_level_values('GEO').unique().tolist()
countries = [x for x in geo_list if not x.startswith('Euro')]
employ = employ[countries]
employ.columns.get_level_values('GEO').unique()

[39]: Index(['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic',
'Denmark', 'Estonia', 'Finland',
'Former Yugoslav Republic of Macedonia, the', 'France',
'France (metropolitan)',
'Germany (until 1990 former territory of the FRG)', 'Greece', 'Hungary',
'Iceland', 'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg',
'Malta', 'Netherlands', 'Norway', 'Poland', 'Portugal', 'Romania',
'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland', 'Turkey',
'United Kingdom'],
dtype='object', name='GEO')

Select only percentage employed in the active population from the dataframe
[40]: employ_f = employ.xs(('Percentage of total population', 'Active population'),
                     level=('UNIT', 'INDIC_EM'),
                     axis=1)
employ_f.head()

GEO Austria … United Kingdom \


[40]: AGE From 15 to 24 years … From 55 to 64 years
SEX Females Males Total … Females Males
DATE …
2007-01-01 56.00 62.90 59.40 … 49.90 68.90
2008-01-01 56.20 62.90 59.50 … 50.20 69.80
2009-01-01 56.20 62.90 59.50 … 50.60 70.30
2010-01-01 54.00 62.60 58.30 … 51.10 69.20
2011-01-01 54.80 63.60 59.20 … 51.30 68.40

GEO
AGE
SEX Total
DATE
2007-01-01 59.30
2008-01-01 59.80
2009-01-01 60.30
2010-01-01 60.00
2011-01-01 59.70

[5 rows x 306 columns]

Drop the 'Total' value before creating the grouped boxplot

[41]: employ_f = employ_f.drop('Total', level='SEX', axis=1)
[42]: box = employ_f['2015'].unstack().reset_index()
sns.boxplot(x="AGE", y=0, hue="SEX", data=box, palette=("husl"), showfliers=False)
plt.xlabel('')
plt.xticks(rotation=35)
plt.ylabel('Percentage of population (%)')
plt.title('Employment in Europe (2015)')
plt.legend(bbox_to_anchor=(1, 0.5))
plt.show()
Chapter 18

Linear Regression in Python

18.1 Contents

β€’ Overview 18.2

β€’ Simple Linear Regression 18.3

β€’ Extending the Linear Regression Model 18.4

β€’ Endogeneity 18.5

β€’ Summary 18.6

β€’ Exercises 18.7

β€’ Solutions 18.8

In addition to what’s in Anaconda, this lecture will need the following libraries:
[1]: !pip install linearmodels

18.2 Overview

Linear regression is a standard tool for analyzing the relationship between two or more variables.
In this lecture, we'll use the Python package statsmodels to estimate, interpret, and visualize linear regression models.
Along the way, we'll discuss a variety of topics, including

• simple and multivariate linear regression
• visualization
• endogeneity and omitted variable bias
• two-stage least squares

As an example, we will replicate results from Acemoglu, Johnson and Robinson's seminal paper [3].


β€’ You can download a copy here.

In the paper, the authors emphasize the importance of institutions in economic development.
The main contribution is the use of settler mortality rates as a source of exogenous variation in institutional differences.
Such variation is needed to determine whether it is institutions that give rise to greater economic growth, rather than the other way around.
Let's start with some imports:
[2]: import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import statsmodels.api as sm
from statsmodels.iolib.summary2 import summary_col
from linearmodels.iv import IV2SLS

18.2.1 Prerequisites

This lecture assumes you are familiar with basic econometrics.


For an introductory text covering these topics, see, for example, [140].

18.2.2 Comments

This lecture is coauthored with Natasha Watkins.

18.3 Simple Linear Regression

[3] wish to determine whether or not differences in institutions can help to explain observed
economic outcomes.
How do we measure institutional differences and economic outcomes?
In this paper,

• economic outcomes are proxied by log GDP per capita in 1995, adjusted for exchange rates
• institutional differences are proxied by an index of protection against expropriation on average over 1985-95, constructed by the Political Risk Services Group

These variables and other data used in the paper are available for download on Daron Acemoglu's webpage.
We will use pandas' .read_stata() function to read in data contained in the .dta files to dataframes

[3]: df1 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable1.dta')
df1.head()

shortnam euro1900 excolony avexpr logpgp95 cons1 cons90 democ00a \


[3]: 0 AFG 0.000000 1.0 NaN NaN 1.0 2.0 1.0
1 AGO 8.000000 1.0 5.363636 7.770645 3.0 3.0 0.0
2 ARE 0.000000 1.0 7.181818 9.804219 NaN NaN NaN
3 ARG 60.000004 1.0 6.386364 9.133459 1.0 6.0 3.0
4 ARM 0.000000 0.0 NaN 7.682482 NaN NaN NaN

cons00a extmort4 logem4 loghjypl baseco


0 1.0 93.699997 4.540098 NaN NaN
1 1.0 280.000000 5.634789 -3.411248 1.0
2 NaN NaN NaN NaN NaN
3 3.0 68.900002 4.232656 -0.872274 1.0
4 NaN NaN NaN NaN NaN

Let's use a scatterplot to see whether any obvious relationship exists between GDP per capita and the protection against expropriation index

[4]: plt.style.use('seaborn')
df1.plot(x='avexpr', y='logpgp95', kind='scatter')
plt.show()

The plot shows a fairly strong positive relationship between protection against expropriation
and log GDP per capita.
Specifically, if higher protection against expropriation is a measure of institutional quality,
then better institutions appear to be positively correlated with better economic outcomes
(higher GDP per capita).
Given the plot, choosing a linear model to describe this relationship seems like a reasonable
assumption.
We can write our model as

π‘™π‘œπ‘”π‘π‘”π‘95𝑖 = 𝛽0 + 𝛽1 π‘Žπ‘£π‘’π‘₯π‘π‘Ÿπ‘– + 𝑒𝑖

where:

• $\beta_0$ is the intercept of the linear trend line on the y-axis
• $\beta_1$ is the slope of the linear trend line, representing the marginal effect of protection against risk on log GDP per capita
• $u_i$ is a random error term (deviations of observations from the linear trend due to factors not included in the model)

Visually, this linear model involves choosing a straight line that best fits the data, as in the
following plot (Figure 2 in [3])
[5]: # Dropping NA's is required to use numpy's polyfit
df1_subset = df1.dropna(subset=['logpgp95', 'avexpr'])

# Use only 'base sample' for plotting purposes
df1_subset = df1_subset[df1_subset['baseco'] == 1]

X = df1_subset['avexpr']
y = df1_subset['logpgp95']
labels = df1_subset['shortnam']

# Replace markers with country labels
fig, ax = plt.subplots()
ax.scatter(X, y, marker='')

for i, label in enumerate(labels):
    ax.annotate(label, (X.iloc[i], y.iloc[i]))

# Fit a linear trend line
ax.plot(np.unique(X),
        np.poly1d(np.polyfit(X, y, 1))(np.unique(X)),
        color='black')

ax.set_xlim([3.3, 10.5])
ax.set_ylim([4, 10.5])
ax.set_xlabel('Average Expropriation Risk 1985-95')
ax.set_ylabel('Log GDP per capita, PPP, 1995')
ax.set_title('Figure 2: OLS relationship between expropriation \
risk and income')
plt.show()

The most common technique to estimate the parameters (the $\beta$'s) of the linear model is Ordinary Least Squares (OLS).
As the name implies, an OLS model is solved by finding the parameters that minimize the sum of squared residuals, i.e.

$$\min_{\hat{\beta}} \sum_{i=1}^{N} \hat{u}_i^2$$

where $\hat{u}_i$ is the difference between the observation and the predicted value of the dependent variable.
To estimate the constant term $\beta_0$, we need to add a column of 1's to our dataset (consider the equation if $\beta_0$ was replaced with $\beta_0 x_i$ and $x_i = 1$)
[6]: df1['const'] = 1

Now we can construct our model in statsmodels using the OLS function.
We will use pandas dataframes with statsmodels, however standard arrays can also be used as arguments

[7]: reg1 = sm.OLS(endog=df1['logpgp95'], exog=df1[['const', 'avexpr']],
              missing='drop')
type(reg1)

statsmodels.regression.linear_model.OLS

So far we have simply constructed our model.
We need to use .fit() to obtain parameter estimates $\hat{\beta}_0$ and $\hat{\beta}_1$

[8]: results = reg1.fit()
type(results)

statsmodels.regression.linear_model.RegressionResultsWrapper

We now have the fitted regression model stored in results.
To view the OLS regression results, we can call the .summary() method.
Note that an observation was mistakenly dropped from the results in the original paper (see the note located in maketable2.do from Acemoglu's webpage), and thus the coefficients differ slightly.

[9]: print(results.summary())

OLS Regression Results


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.611
Model: OLS Adj. R-squared: 0.608
Method: Least Squares F-statistic: 171.4
Date: Sun, 20 Oct 2019 Prob (F-statistic): 4.16e-24
Time: 17:09:20 Log-Likelihood: -119.71
No. Observations: 111 AIC: 243.4
Df Residuals: 109 BIC: 248.8
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 4.6261 0.301 15.391 0.000 4.030 5.222
avexpr 0.5319 0.041 13.093 0.000 0.451 0.612
==============================================================================
Omnibus: 9.251 Durbin-Watson: 1.689
Prob(Omnibus): 0.010 Jarque-Bera (JB): 9.170
Skew: -0.680 Prob(JB): 0.0102
Kurtosis: 3.362 Cond. No. 33.2
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly
specified.

From our results, we see that

• The intercept $\hat{\beta}_0 = 4.63$.
• The slope $\hat{\beta}_1 = 0.53$.
• The positive $\hat{\beta}_1$ parameter estimate implies that institutional quality has a positive effect on economic outcomes, as we saw in the figure.
• The p-value of 0.000 for $\hat{\beta}_1$ implies that the effect of institutions on GDP is statistically significant (using p < 0.05 as a rejection rule).
• The R-squared value of 0.611 indicates that around 61% of variation in log GDP per capita is explained by protection against expropriation.

Using our parameter estimates, we can now write our estimated relationship as

$$\widehat{logpgp95}_i = 4.63 + 0.53\, avexpr_i$$

This equation describes the line that best fits our data, as shown in Figure 2.
We can use this equation to predict the level of log GDP per capita for a value of the index of expropriation protection.

For example, for a country with an index value of 7.07 (the average for the dataset), we find that its predicted level of log GDP per capita in 1995 is 8.38.
(Note that the code below computes the mean over the base sample df1_subset, which is 6.52 rather than 7.07, so the .predict() call that follows returns a slightly different value.)

[10]: mean_expr = np.mean(df1_subset['avexpr'])
mean_expr

6.515625

[11]: predicted_logpgp95 = 4.63 + 0.53 * 7.07
predicted_logpgp95

8.3771

An easier (and more accurate) way to obtain this result is to use .predict() and set $constant = 1$ and $avexpr_i = mean\_expr$

[12]: results.predict(exog=[1, mean_expr])

array([8.09156367])

We can obtain an array of predicted $logpgp95_i$ for every value of $avexpr_i$ in our dataset by calling .predict() on our results.
Plotting the predicted values against $avexpr_i$ shows that the predicted values lie along the linear line that we fitted above.
The observed values of $logpgp95_i$ are also plotted for comparison purposes
[13]: # Drop missing observations from whole sample
df1_plot = df1.dropna(subset=['logpgp95', 'avexpr'])

# Plot predicted values
fig, ax = plt.subplots()
ax.scatter(df1_plot['avexpr'], results.predict(), alpha=0.5,
           label='predicted')

# Plot observed values
ax.scatter(df1_plot['avexpr'], df1_plot['logpgp95'], alpha=0.5,
           label='observed')

ax.legend()
ax.set_title('OLS predicted values')
ax.set_xlabel('avexpr')
ax.set_ylabel('logpgp95')
plt.show()

18.4 Extending the Linear Regression Model

So far we have only accounted for institutions affecting economic performance - almost certainly there are numerous other factors affecting GDP that are not included in our model.
Leaving out variables that affect $logpgp95_i$ will result in omitted variable bias, yielding biased and inconsistent parameter estimates.
We can extend our bivariate regression model to a multivariate regression model by adding in other factors that may affect $logpgp95_i$.
[3] consider other factors such as:

• the effect of climate on economic outcomes; latitude is used to proxy this
• differences that affect both economic performance and institutions, e.g. cultural, historical, etc.; controlled for with the use of continent dummies

Let's estimate some of the extended models considered in the paper (Table 2) using data from maketable2.dta

[14]: df2 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable2.dta')

# Add constant term to dataset
df2['const'] = 1

# Create lists of variables to be used in each regression
X1 = ['const', 'avexpr']
X2 = ['const', 'avexpr', 'lat_abst']

X3 = ['const', 'avexpr', 'lat_abst', 'asia', 'africa', 'other']

# Estimate an OLS regression for each set of variables
reg1 = sm.OLS(df2['logpgp95'], df2[X1], missing='drop').fit()
reg2 = sm.OLS(df2['logpgp95'], df2[X2], missing='drop').fit()
reg3 = sm.OLS(df2['logpgp95'], df2[X3], missing='drop').fit()

Now that we have fitted our model, we will use summary_col to display the results in a single table (model numbers correspond to those in the paper)

[15]: info_dict = {'R-squared': lambda x: f"{x.rsquared:.2f}",
             'No. observations': lambda x: f"{int(x.nobs):d}"}

results_table = summary_col(results=[reg1,reg2,reg3],
float_format='%0.2f',
stars = True,
model_names=['Model 1',
'Model 3',
'Model 4'],
info_dict=info_dict,
regressor_order=['const',
'avexpr',
'lat_abst',
'asia',
'africa'])

results_table.add_title('Table 2 - OLS Regressions')

print(results_table)

Table 2 - OLS Regressions


=========================================
Model 1 Model 3 Model 4
-----------------------------------------
const 4.63*** 4.87*** 5.85***
(0.30) (0.33) (0.34)
avexpr 0.53*** 0.46*** 0.39***
(0.04) (0.06) (0.05)
lat_abst 0.87* 0.33
(0.49) (0.45)
asia -0.15
(0.15)
africa -0.92***
(0.17)
other 0.30
(0.37)
R-squared 0.61 0.62 0.72
No. observations 111 111 111
=========================================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01

18.5 Endogeneity

As [3] discuss, the OLS models likely suffer from endogeneity issues, resulting in biased and inconsistent model estimates.
Namely, there is likely a two-way relationship between institutions and economic outcomes:

• richer countries may be able to afford or prefer better institutions
• variables that affect income may also be correlated with institutional differences
• the construction of the index may be biased; analysts may be biased towards seeing countries with higher income having better institutions

To deal with endogeneity, we can use two-stage least squares (2SLS) regression, which is an extension of OLS regression.
This method requires replacing the endogenous variable $avexpr_i$ with a variable that is:

1. correlated with $avexpr_i$
2. not correlated with the error term (i.e. it should not directly affect the dependent variable, otherwise it would be correlated with $u_i$ due to omitted variable bias)

The new set of regressors is called an instrument, which aims to remove endogeneity in our proxy of institutional differences.
The main contribution of [3] is the use of settler mortality rates to instrument for institutional differences.
They hypothesize that higher mortality rates of colonizers led to the establishment of institutions that were more extractive in nature (less protection against expropriation), and these institutions still persist today.
Using a scatterplot (Figure 3 in [3]), we can see protection against expropriation is negatively correlated with settler mortality rates, coinciding with the authors' hypothesis and satisfying the first condition of a valid instrument.
[16]: # Dropping NA's is required to use numpy's polyfit
df1_subset2 = df1.dropna(subset=['logem4', 'avexpr'])

X = df1_subset2['logem4']
y = df1_subset2['avexpr']
labels = df1_subset2['shortnam']

# Replace markers with country labels
fig, ax = plt.subplots()
ax.scatter(X, y, marker='')

for i, label in enumerate(labels):
    ax.annotate(label, (X.iloc[i], y.iloc[i]))

# Fit a linear trend line
ax.plot(np.unique(X),
        np.poly1d(np.polyfit(X, y, 1))(np.unique(X)),
        color='black')

ax.set_xlim([1.8, 8.4])
ax.set_ylim([3.3, 10.4])
ax.set_xlabel('Log of Settler Mortality')
ax.set_ylabel('Average Expropriation Risk 1985-95')
ax.set_title('Figure 3: First-stage relationship between settler mortality \
and expropriation risk')
plt.show()

The second condition may not be satisfied if settler mortality rates in the 17th to 19th centuries have a direct effect on current GDP (in addition to their indirect effect through institutions).
For example, settler mortality rates may be related to the current disease environment in a country, which could affect current economic performance.
[3] argue this is unlikely because:

• The majority of settler deaths were due to malaria and yellow fever and had a limited effect on local people.
• The disease burden on local people in Africa or India, for example, did not appear to be higher than average, supported by relatively high population densities in these areas before colonization.

As we appear to have a valid instrument, we can use 2SLS regression to obtain consistent and unbiased parameter estimates.
First stage
The first stage involves regressing the endogenous variable ($avexpr_i$) on the instrument.
The instrument is the set of all exogenous variables in our model (and not just the variable we have replaced).
Using model 1 as an example, our instrument is simply a constant and settler mortality rates $logem4_i$.
Therefore, we will estimate the first-stage regression as

$$avexpr_i = \delta_0 + \delta_1 logem4_i + v_i$$

The data we need to estimate this equation is located in maketable4.dta (only complete data, indicated by baseco = 1, is used for estimation)

[17]: # Import and select the data
df4 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable4.dta')
df4 = df4[df4['baseco'] == 1]

# Add a constant variable
df4['const'] = 1

# Fit the first stage regression and print summary
results_fs = sm.OLS(df4['avexpr'],
                    df4[['const', 'logem4']],
                    missing='drop').fit()
print(results_fs.summary())

OLS Regression Results


==============================================================================
Dep. Variable: avexpr R-squared: 0.270
Model: OLS Adj. R-squared: 0.258
Method: Least Squares F-statistic: 22.95
Date: Sun, 20 Oct 2019 Prob (F-statistic): 1.08e-05
Time: 17:09:23 Log-Likelihood: -104.83
No. Observations: 64 AIC: 213.7
Df Residuals: 62 BIC: 218.0
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 9.3414 0.611 15.296 0.000 8.121 10.562
logem4 -0.6068 0.127 -4.790 0.000 -0.860 -0.354
==============================================================================
Omnibus: 0.035 Durbin-Watson: 2.003
Prob(Omnibus): 0.983 Jarque-Bera (JB): 0.172
Skew: 0.045 Prob(JB): 0.918
Kurtosis: 2.763 Cond. No. 19.4
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly
specified.

Second stage
We need to retrieve the predicted values of $avexpr_i$ using .predict().
We then replace the endogenous variable $avexpr_i$ with the predicted values $\widehat{avexpr}_i$ in the original linear model.
Our second stage regression is thus

$$logpgp95_i = \beta_0 + \beta_1 \widehat{avexpr}_i + u_i$$

[18]: df4['predicted_avexpr'] = results_fs.predict()

results_ss = sm.OLS(df4['logpgp95'],
                    df4[['const', 'predicted_avexpr']]).fit()
print(results_ss.summary())

OLS Regression Results


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.477
Model: OLS Adj. R-squared: 0.469
Method: Least Squares F-statistic: 56.60
Date: Sun, 20 Oct 2019 Prob (F-statistic): 2.66e-10
Time: 17:09:23 Log-Likelihood: -72.268
No. Observations: 64 AIC: 148.5
Df Residuals: 62 BIC: 152.9
Df Model: 1
Covariance Type: nonrobust
====================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------------
const              1.9097      0.823      2.320      0.024       0.264       3.555
predicted_avexpr   0.9443      0.126      7.523      0.000       0.693       1.195
==============================================================================
Omnibus: 10.547 Durbin-Watson: 2.137
Prob(Omnibus): 0.005 Jarque-Bera (JB): 11.010
Skew: -0.790 Prob(JB): 0.00407
Kurtosis: 4.277 Cond. No. 58.1
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly
specified.

The second-stage regression results give us an unbiased and consistent estimate of the effect of institutions on economic outcomes.
The result suggests a stronger positive relationship than what the OLS results indicated.
Note that while our parameter estimates are correct, our standard errors are not, and for this reason computing 2SLS 'manually' (in stages with OLS) is not recommended.
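Why the manual standard errors come out wrong, in brief (our sketch, using the notation above, not an argument from the lecture): the manual second stage computes residuals from the fitted values $\widehat{avexpr}_i$, but a consistent 2SLS variance estimator must use residuals formed with the original regressors,

$$\hat{u}_i = logpgp95_i - \hat{\beta}_0 - \hat{\beta}_1 avexpr_i$$

$$\widehat{Var}(\hat{\beta}) = \hat{\sigma}^2 (\hat{X}'\hat{X})^{-1}, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} \hat{u}_i^2$$

where $\hat{X}$ stacks the second-stage regressors $(1, \widehat{avexpr}_i)$. Since the manual OLS plugs the wrong residuals into $\hat{\sigma}^2$, its reported standard errors are inconsistent for the 2SLS sampling variance.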
We can correctly estimate a 2SLS regression in one step using the linearmodels package, an
extension of statsmodels
Note that when using IV2SLS, the exogenous and instrument variables are split up in the
function arguments (whereas before the instrument included exogenous variables)
[19]: iv = IV2SLS(dependent=df4['logpgp95'],
            exog=df4['const'],
            endog=df4['avexpr'],
            instruments=df4['logem4']).fit(cov_type='unadjusted')

print(iv.summary)

IV-2SLS Estimation Summary


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.1870
Estimator: IV-2SLS Adj. R-squared: 0.1739
No. Observations: 64 F-statistic: 37.568
Date: Sun, Oct 20 2019 P-value (F-stat) 0.0000
Time: 17:09:23 Distribution: chi2(1)
Cov. Estimator: unadjusted

Parameter Estimates
==============================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
------------------------------------------------------------------------------
const 1.9097 1.0106 1.8897 0.0588 -0.0710 3.8903
avexpr 0.9443 0.1541 6.1293 0.0000 0.6423 1.2462
==============================================================================

Endogenous: avexpr
Instruments: logem4

Unadjusted Covariance (Homoskedastic)


Debiased: False

Given that we now have consistent and unbiased estimates, we can infer from the model we have estimated that institutional differences (stemming from institutions set up during colonization) can help to explain differences in income levels across countries today.
[3] use a marginal effect of 0.94 to calculate that the difference in the index between Chile and Nigeria (i.e. institutional quality) implies up to a 7-fold difference in income, emphasizing the significance of institutions in economic development.
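Because the dependent variable is in logs, the coefficient converts an index gap into an income ratio via the exponential. As a quick sketch of the arithmetic (the 2.07-point gap below is a hypothetical value chosen only to reproduce the 7-fold figure, not a number taken from the paper):

```python
import numpy as np

beta = 0.94   # 2SLS marginal effect of the expropriation-risk index on log GDP per capita
gap = 2.07    # hypothetical gap in the index between two countries

# In a log-linear model, an index gap of `gap` implies an income ratio of exp(beta * gap)
income_ratio = np.exp(beta * gap)
print(f"Implied income ratio: {income_ratio:.1f}")  # roughly 7-fold
```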

18.6 Summary

We have demonstrated basic OLS and 2SLS regression in statsmodels and linearmodels.
If you are familiar with R, you may want to use the formula interface to statsmodels, or consider using rpy2 to call R from within Python.

18.7 Exercises

18.7.1 Exercise 1

In the lecture, we think the original model suffers from endogeneity bias due to the likely effect income has on institutional development.
Although endogeneity is often best identified by thinking about the data and model, we can formally test for endogeneity using the Hausman test.
We want to test for correlation between the endogenous variable, $avexpr_i$, and the errors, $u_i$

$$H_0 : Cov(avexpr_i, u_i) = 0 \quad (\text{no endogeneity})$$
$$H_1 : Cov(avexpr_i, u_i) \neq 0 \quad (\text{endogeneity})$$

This test is run in two stages.
First, we regress $avexpr_i$ on the instrument, $logem4_i$

$$avexpr_i = \pi_0 + \pi_1 logem4_i + \upsilon_i$$

Second, we retrieve the residuals $\hat{\upsilon}_i$ and include them in the original equation

$$logpgp95_i = \beta_0 + \beta_1 avexpr_i + \alpha \hat{\upsilon}_i + u_i$$

If $\alpha$ is statistically significant (with a p-value < 0.05), then we reject the null hypothesis and conclude that $avexpr_i$ is endogenous.
Using the above information, estimate a Hausman test and interpret your results.

18.7.2 Exercise 2

The OLS parameter $\beta$ can also be estimated using matrix algebra and numpy (you may need to review the numpy lecture to complete this exercise).
The linear equation we want to estimate is (written in matrix form)

$$y = X\beta + u$$

To solve for the unknown parameter $\beta$, we want to minimize the sum of squared residuals

$$\min_{\hat{\beta}} \hat{u}'\hat{u}$$

Rearranging the first equation and substituting into the second equation, we can write

$$\min_{\hat{\beta}} (y - X\hat{\beta})'(y - X\hat{\beta})$$

Solving this optimization problem gives the solution for the $\hat{\beta}$ coefficients

$$\hat{\beta} = (X'X)^{-1}X'y$$

Using the above information, compute $\hat{\beta}$ from model 1 using numpy - your results should be the same as those in the statsmodels output from earlier in the lecture.

18.8 Solutions

18.8.1 Exercise 1
[20]: # Load in data
df4 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable4.dta')

# Add a constant term
df4['const'] = 1

# Estimate the first stage regression
reg1 = sm.OLS(endog=df4['avexpr'],
              exog=df4[['const', 'logem4']],
              missing='drop').fit()

# Retrieve the residuals
df4['resid'] = reg1.resid

# Estimate the second stage regression
reg2 = sm.OLS(endog=df4['logpgp95'],
              exog=df4[['const', 'avexpr', 'resid']],
              missing='drop').fit()

print(reg2.summary())

OLS Regression Results


==============================================================================
Dep. Variable: logpgp95 R-squared: 0.689
Model: OLS Adj. R-squared: 0.679
Method: Least Squares F-statistic: 74.05
Date: Sun, 20 Oct 2019 Prob (F-statistic): 1.07e-17
Time: 17:09:23 Log-Likelihood: -62.031
No. Observations: 70 AIC: 130.1
Df Residuals: 67 BIC: 136.8
Df Model: 2
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 2.4782 0.547 4.530 0.000 1.386 3.570
avexpr 0.8564 0.082 10.406 0.000 0.692 1.021
resid -0.4951 0.099 -5.017 0.000 -0.692 -0.298
==============================================================================
Omnibus: 17.597 Durbin-Watson: 2.086
Prob(Omnibus): 0.000 Jarque-Bera (JB): 23.194
Skew: -1.054 Prob(JB): 9.19e-06
Kurtosis: 4.873 Cond. No. 53.8
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly
specified.

The output shows that the coefficient on the residuals is statistically significant, indicating $avexpr_i$ is endogenous.

18.8.2 Exercise 2
[21]: # Load in data
df1 = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/ols/maketable1.dta')
df1 = df1.dropna(subset=['logpgp95', 'avexpr'])

# Add a constant term
df1['const'] = 1

# Define the X and y variables
y = np.asarray(df1['logpgp95'])
X = np.asarray(df1[['const', 'avexpr']])

# Compute β_hat
β_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Print out the results from the 2 x 1 vector β_hat
print(f'β_0 = {β_hat[0]:.2}')
print(f'β_1 = {β_hat[1]:.2}')

β_0 = 4.6
β_1 = 0.53

It is also possible to use np.linalg.inv(X.T @ X) @ X.T @ y to solve for $\beta$, however .solve() is preferred as it involves fewer computations.
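As a quick standalone check (a sketch on made-up data, not part of the lecture), the two approaches agree numerically:

```python
import numpy as np

# A tiny made-up design matrix: data generated exactly by y = 1 + 2x
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# Preferred: solve the normal equations (X'X) b = X'y directly
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Alternative: form the inverse explicitly (more flops, less numerically stable)
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

print(beta_solve, beta_inv)  # both recover intercept 1 and slope 2
```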
Chapter 19

Maximum Likelihood Estimation

19.1 Contents

β€’ Overview 19.2

β€’ Set Up and Assumptions 19.3

β€’ Conditional Distributions 19.4

β€’ Maximum Likelihood Estimation 19.5

β€’ MLE with Numerical Methods 19.6

β€’ Maximum Likelihood Estimation 19.7

β€’ Summary 19.8

β€’ Exercises 19.9

β€’ Solutions 19.10

19.2 Overview

In a previous lecture, we estimated the relationship between dependent and explanatory variables using linear regression.
But what if a linear relationship is not an appropriate assumption for our model?
One widely used alternative is maximum likelihood estimation, which involves specifying a class of distributions, indexed by unknown parameters, and then using the data to pin down these parameter values.
The benefit relative to linear regression is that it allows more flexibility in the probabilistic relationships between variables.
Here we illustrate maximum likelihood by replicating Daniel Treisman's (2016) paper, Russia's Billionaires, which connects the number of billionaires in a country to its economic characteristics.
The paper concludes that Russia has a higher number of billionaires than economic factors such as market size and tax rate predict.

295
296 CHAPTER 19. MAXIMUM LIKELIHOOD ESTIMATION

We’ll require the following imports:


[1]: import numpy as np
from numpy import exp
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.special import factorial
import pandas as pd
from mpl_toolkits.mplot3d import Axes3D
import statsmodels.api as sm
from statsmodels.api import Poisson
from scipy import stats
from scipy.stats import norm
from statsmodels.iolib.summary2 import summary_col

19.2.1 Prerequisites

We assume familiarity with basic probability and multivariate calculus.

19.2.2 Comments

This lecture is co-authored with Natasha Watkins.

19.3 Set Up and Assumptions

Let’s consider the steps we need to go through in maximum likelihood estimation and how
they pertain to this study.

19.3.1 Flow of Ideas

The first step with maximum likelihood estimation is to choose the probability distribution
believed to be generating the data.
More precisely, we need to make an assumption as to which parametric class of distributions
is generating the data.

β€’ e.g., the class of all normal distributions, or the class of all gamma distributions.

Each such class is a family of distributions indexed by a finite number of parameters.

β€’ e.g., the class of normal distributions is a family of distributions indexed by its mean
πœ‡ ∈ (βˆ’βˆž, ∞) and standard deviation 𝜎 ∈ (0, ∞).

We’ll let the data pick out a particular element of the class by pinning down the parameters.
The parameter estimates so produced will be called maximum likelihood estimates.

19.3.2 Counting Billionaires

Treisman [134] is interested in estimating the number of billionaires in different countries.


The number of billionaires is integer-valued.

Hence we consider distributions that take values only in the nonnegative integers.
(This is one reason least squares regression is not the best tool for the present problem, since the dependent variable in linear regression is not restricted to integer values.)
One integer distribution is the Poisson distribution, the probability mass function (pmf) of which is

$$
f(y) = \frac{\mu^y}{y!} e^{-\mu}, \qquad y = 0, 1, 2, \ldots
$$

We can plot the Poisson distribution over 𝑦 for different values of πœ‡ as follows
[2]: poisson_pmf = lambda y, ΞΌ: ΞΌ**y / factorial(y) * exp(-ΞΌ)
y_values = range(0, 25)

fig, ax = plt.subplots(figsize=(12, 8))

for ΞΌ in [1, 5, 10]:
    distribution = []
    for y_i in y_values:
        distribution.append(poisson_pmf(y_i, ΞΌ))
    ax.plot(y_values,
            distribution,
            label=rf'$\mu$={ΞΌ}',
            alpha=0.5,
            marker='o',
            markersize=8)

ax.grid()
ax.set_xlabel('$y$', fontsize=14)
ax.set_ylabel(r'$f(y \mid \mu)$', fontsize=14)
ax.axis(xmin=0, ymin=0)
ax.legend(fontsize=14)

plt.show()

Notice that the Poisson distribution begins to resemble a normal distribution as the mean of
𝑦 increases.
Let’s have a look at the distribution of the data we’ll be working with in this lecture.
Treisman’s main source of data is Forbes’ annual rankings of billionaires and their estimated
net worth.
The dataset mle/fp.dta can be downloaded here or from its AER page.
[3]: pd.options.display.max_columns = 10

# Load in data and view
df = pd.read_stata('https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/mle/fp.dta')
df.head()

[3]:          country  ccode    year    cyear  numbil  …   topint08     rintr  \
0  United States    2.0  1990.0  21990.0     NaN  …  39.799999  4.988405
1  United States    2.0  1991.0  21991.0     NaN  …  39.799999  4.988405
2  United States    2.0  1992.0  21992.0     NaN  …  39.799999  4.988405
3  United States    2.0  1993.0  21993.0     NaN  …  39.799999  4.988405
4  United States    2.0  1994.0  21994.0     NaN  …  39.799999  4.988405

   noyrs  roflaw  nrrents
0   20.0    1.61      NaN
1   20.0    1.61      NaN
2   20.0    1.61      NaN
3   20.0    1.61      NaN
4   20.0    1.61      NaN

[5 rows x 36 columns]

Using a histogram, we can view the distribution of the number of billionaires per country,
numbil0, in 2008 (the United States is dropped for plotting purposes)
[4]: numbil0_2008 = df[(df['year'] == 2008) &
                  (df['country'] != 'United States')].loc[:, 'numbil0']

plt.subplots(figsize=(12, 8))
plt.hist(numbil0_2008, bins=30)
plt.xlim(left=0)
plt.grid()
plt.xlabel('Number of billionaires in 2008')
plt.ylabel('Count')
plt.show()

From the histogram, it appears that the Poisson assumption is not unreasonable (albeit with
a very low πœ‡ and some outliers).

19.4 Conditional Distributions

In Treisman’s paper, the dependent variable β€” the number of billionaires 𝑦𝑖 in country 𝑖 β€” is modeled as a function of GDP per capita, population size, and years membership in GATT and WTO.

Hence, the distribution of 𝑦𝑖 needs to be conditioned on the vector of explanatory variables x𝑖.
The standard formulation β€” the so-called Poisson regression model β€” is as follows:

$$
f(y_i \mid \mathbf{x}_i) = \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i}; \qquad y_i = 0, 1, 2, \ldots \tag{1}
$$

where $\mu_i = \exp(\mathbf{x}_i' \beta) = \exp(\beta_0 + \beta_1 x_{i1} + \ldots + \beta_k x_{ik})$

To illustrate the idea that the distribution of 𝑦𝑖 depends on x𝑖 let’s run a simple simulation.
We use our poisson_pmf function from above and arbitrary values for 𝛽 and x𝑖
[5]: y_values = range(0, 20)

# Define a parameter vector with estimates
Ξ² = np.array([0.26, 0.18, 0.25, -0.1, -0.22])

# Create some observations X
datasets = [np.array([0, 1, 1, 1, 2]),
            np.array([2, 3, 2, 4, 0]),
            np.array([3, 4, 5, 3, 2]),
            np.array([6, 5, 4, 4, 7])]

fig, ax = plt.subplots(figsize=(12, 8))

for X in datasets:
    ΞΌ = exp(X @ Ξ²)
    distribution = []
    for y_i in y_values:
        distribution.append(poisson_pmf(y_i, ΞΌ))
    ax.plot(y_values,
            distribution,
            label=rf'$\mu_i$={ΞΌ:.1}',
            marker='o',
            markersize=8,
            alpha=0.5)

ax.grid()
ax.legend()
ax.set_xlabel(r'$y \mid x_i$')
ax.set_ylabel(r'$f(y \mid x_i; \beta )$')
ax.axis(xmin=0, ymin=0)
plt.show()

We can see that the distribution of 𝑦𝑖 is conditional on x𝑖 (πœ‡π‘– is no longer constant).

19.5 Maximum Likelihood Estimation

In our model for number of billionaires, the conditional distribution contains 4 (π‘˜ = 4) pa-
rameters that we need to estimate.
We will label our entire parameter vector as 𝛽 where

$$
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}
$$

To estimate the model using MLE, we want to maximize the likelihood that our estimate 𝛽̂ is
the true parameter 𝛽.
Intuitively, we want to find the 𝛽̂ that best fits our data.
First, we need to construct the likelihood function β„’(𝛽), which is similar to a joint probabil-
ity density function.
Assume we have some data 𝑦 = {𝑦1 , 𝑦2 } and 𝑦𝑖 ∼ 𝑓(𝑦𝑖 ).
If 𝑦1 and 𝑦2 are independent, the joint pmf of these data is 𝑓(𝑦1 , 𝑦2 ) = 𝑓(𝑦1 ) β‹… 𝑓(𝑦2 ).
If 𝑦𝑖 follows a Poisson distribution with πœ† = 7, we can visualize the joint pmf like so
[6]: def plot_joint_poisson(ΞΌ=7, y_n=20):
    yi_values = np.arange(0, y_n, 1)

    # Create coordinate points of X and Y
    X, Y = np.meshgrid(yi_values, yi_values)

    # Multiply distributions together
    Z = poisson_pmf(X, ΞΌ) * poisson_pmf(Y, ΞΌ)

    fig = plt.figure(figsize=(12, 8))
    ax = fig.add_subplot(111, projection='3d')
    ax.plot_surface(X, Y, Z.T, cmap='terrain', alpha=0.6)
    ax.scatter(X, Y, Z.T, color='black', alpha=0.5, linewidths=1)
    ax.set(xlabel='$y_1$', ylabel='$y_2$')
    ax.set_zlabel('$f(y_1, y_2)$', labelpad=10)
    plt.show()

plot_joint_poisson(ΞΌ=7, y_n=20)

Similarly, the joint pmf of our data (which is distributed as a conditional Poisson distribu-
tion) can be written as

$$
f(y_1, y_2, \ldots, y_n \mid \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n; \beta)
= \prod_{i=1}^{n} \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i}
$$

𝑦𝑖 is conditional on both the values of x𝑖 and the parameters 𝛽.

The likelihood function is the same as the joint pmf, but is viewed as a function of the parameters 𝛽, taking the observations (𝑦𝑖 , x𝑖 ) as given

$$
\begin{aligned}
\mathcal{L}(\beta \mid y_1, y_2, \ldots, y_n; \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)
&= \prod_{i=1}^{n} \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i} \\
&= f(y_1, y_2, \ldots, y_n \mid \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n; \beta)
\end{aligned}
$$

Now that we have our likelihood function, we want to find the 𝛽̂ that yields the maximum
likelihood value

$$
\max_{\beta} \mathcal{L}(\beta)
$$

In doing so it is generally easier to maximize the log-likelihood (consider differentiating $f(x) = x \exp(x)$ vs. $f(x) = \log(x) + x$).
Given that taking a logarithm is a monotone increasing transformation, a maximizer of the
likelihood function will also be a maximizer of the log-likelihood function.
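To make this monotonicity point concrete, here is a small numerical check (an illustration not in the original text): for a toy Poisson sample, the likelihood and the log-likelihood attain their maxima at the same point of a parameter grid.

```python
import numpy as np
from scipy.special import factorial

y = np.array([2, 4, 3, 5, 4])        # toy observations
ΞΌ_grid = np.linspace(0.5, 10, 500)   # candidate values of ΞΌ

# Likelihood: product of Poisson pmfs; log-likelihood: sum of their logs
L = np.array([np.prod(ΞΌ**y / factorial(y) * np.exp(-ΞΌ)) for ΞΌ in ΞΌ_grid])
logL = np.array([np.sum(y * np.log(ΞΌ) - ΞΌ - np.log(factorial(y))) for ΞΌ in ΞΌ_grid])

# Both are maximized at the same grid point, near the sample mean
assert np.argmax(L) == np.argmax(logL)
assert abs(ΞΌ_grid[np.argmax(logL)] - y.mean()) < 0.05
```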
In our case the log-likelihood is

$$
\begin{aligned}
\log \mathcal{L}(\beta) &= \log \big( f(y_1; \beta) \cdot f(y_2; \beta) \cdots f(y_n; \beta) \big) \\
&= \sum_{i=1}^{n} \log f(y_i; \beta) \\
&= \sum_{i=1}^{n} \log \Big( \frac{\mu_i^{y_i}}{y_i!} e^{-\mu_i} \Big) \\
&= \sum_{i=1}^{n} y_i \log \mu_i - \sum_{i=1}^{n} \mu_i - \sum_{i=1}^{n} \log y_i!
\end{aligned}
$$

The MLE of the Poisson model for $\hat{\beta}$ can be obtained by solving

$$
\max_{\beta} \Big( \sum_{i=1}^{n} y_i \log \mu_i - \sum_{i=1}^{n} \mu_i - \sum_{i=1}^{n} \log y_i! \Big)
$$

However, no analytical solution exists to the above problem – to find the MLE we need to use
numerical methods.
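As an aside (not part of the original lecture), if there were no covariates, so that each 𝑦𝑖 were i.i.d. Poisson with a common mean πœ‡, the first-order condition would admit a closed form: differentiating $\sum_i y_i \log \mu - n\mu$ and setting it to zero yields $\hat{\mu} = \bar{y}$, the sample mean. A quick numerical confirmation:

```python
import numpy as np
from scipy.special import factorial
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
sample = rng.poisson(lam=7, size=500)

# Negative log-likelihood of an i.i.d. Poisson sample (no covariates)
def neg_logL(ΞΌ):
    return -(np.sum(sample * np.log(ΞΌ)) - len(sample) * ΞΌ
             - np.sum(np.log(factorial(sample))))

res = minimize_scalar(neg_logL, bounds=(0.1, 20), method='bounded')

# The numerical maximizer coincides with the sample mean
assert abs(res.x - sample.mean()) < 1e-3
```

It is the presence of the covariates through $\mu_i = \exp(\mathbf{x}_i' \beta)$ that removes this closed form and forces the numerical approach taken below.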

19.6 MLE with Numerical Methods

Many distributions do not have nice, analytical solutions and therefore require numerical
methods to solve for parameter estimates.
One such numerical method is the Newton-Raphson algorithm.
Our goal is to find the maximum likelihood estimate $\hat{\beta}$.

At $\hat{\beta}$, the first derivative of the log-likelihood function will be equal to 0.
Let’s illustrate this by supposing

$$
\log \mathcal{L}(\beta) = -(\beta - 10)^2 - 10
$$

[7]: Ξ² = np.linspace(1, 20)
logL = -(Ξ² - 10) ** 2 - 10
dlogL = -2 * Ξ² + 20

fig, (ax1, ax2) = plt.subplots(2, sharex=True, figsize=(12, 8))

ax1.plot(Ξ², logL, lw=2)
ax2.plot(Ξ², dlogL, lw=2)

ax1.set_ylabel(r'$log \mathcal{L(\beta)}$',
rotation=0,
labelpad=35,
fontsize=15)
ax2.set_ylabel(r'$\frac{dlog \mathcal{L(\beta)}}{d \beta}$ ',
rotation=0,
labelpad=35,
fontsize=19)
ax2.set_xlabel(r'$\beta$', fontsize=15)
ax1.grid(), ax2.grid()
plt.axhline(c='black')
plt.show()

The plot shows that the maximum likelihood value (the top plot) occurs when $\frac{d \log \mathcal{L}(\beta)}{d \beta} = 0$ (the bottom plot).
Therefore, the likelihood is maximized when 𝛽 = 10.
We can also ensure that this value is a maximum (as opposed to a minimum) by checking
that the second derivative (slope of the bottom plot) is negative.
The Newton-Raphson algorithm finds a point where the first derivative is 0.
To use the algorithm, we take an initial guess at the maximum value, 𝛽0 (the OLS parameter
estimates might be a reasonable guess), then

1. Use the updating rule to iterate the algorithm

$$
\beta^{(k+1)} = \beta^{(k)} - H^{-1}(\beta^{(k)}) G(\beta^{(k)})
$$

where:

$$
G(\beta^{(k)}) = \frac{d \log \mathcal{L}(\beta^{(k)})}{d \beta^{(k)}}
\qquad
H(\beta^{(k)}) = \frac{d^2 \log \mathcal{L}(\beta^{(k)})}{d \beta^{(k)} \, d \beta^{(k)\prime}}
$$

2. Check whether $\| \beta^{(k+1)} - \beta^{(k)} \| < tol$

β€’ If true, then stop iterating and set $\hat{\beta} = \beta^{(k+1)}$
β€’ If false, then update $\beta^{(k+1)}$

As can be seen from the updating equation, $\beta^{(k+1)} = \beta^{(k)}$ only when $G(\beta^{(k)}) = 0$, i.e., where the first derivative is equal to 0.
(In practice, we stop iterating when the difference is below a small tolerance threshold)
Let’s have a go at implementing the Newton-Raphson algorithm.
First, we’ll create a class called PoissonRegression so we can easily recompute the values
of the log likelihood, gradient and Hessian for every iteration
[8]: class PoissonRegression:

    def __init__(self, y, X, Ξ²):
        self.X = X
        self.n, self.k = X.shape
        # Reshape y as a n_by_1 column vector
        self.y = y.reshape(self.n, 1)
        # Reshape Ξ² as a k_by_1 column vector
        self.Ξ² = Ξ².reshape(self.k, 1)

    def ΞΌ(self):
        return np.exp(self.X @ self.Ξ²)

    def logL(self):
        y = self.y
        ΞΌ = self.ΞΌ()
        return np.sum(y * np.log(ΞΌ) - ΞΌ - np.log(factorial(y)))

    def G(self):
        X = self.X    # use the stored X rather than any global variable
        y = self.y
        ΞΌ = self.ΞΌ()
        return X.T @ (y - ΞΌ)

    def H(self):
        X = self.X
        ΞΌ = self.ΞΌ()
        return -(X.T @ (ΞΌ * X))

Our function newton_raphson will take a PoissonRegression object that has an initial
guess of the parameter vector 𝛽 0 .
The algorithm will update the parameter vector according to the updating rule, and recalcu-
late the gradient and Hessian matrices at the new parameter estimates.
Iteration will end when either:

β€’ The difference between the parameter and the updated parameter is below a tolerance
level.
β€’ The maximum number of iterations has been achieved (meaning convergence is not
achieved).

So we can get an idea of what’s going on while the algorithm is running, an option
display=True is added to print out values at each iteration.
[9]: def newton_raphson(model, tol=1e-3, max_iter=1000, display=True):

    i = 0
    error = 100  # Initial error value

    # Print header of output
    if display:
        header = f'{"Iteration_k":<13}{"Log-likelihood":<16}{"ΞΈ":<60}'
        print(header)
        print("-" * len(header))

    # While loop runs while any absolute error exceeds the
    # tolerance, until max iterations are reached
    while np.any(np.abs(error) > tol) and i < max_iter:
        H, G = model.H(), model.G()
        Ξ²_new = model.Ξ² - (np.linalg.inv(H) @ G)
        error = Ξ²_new - model.Ξ²
        model.Ξ² = Ξ²_new

        # Print iterations
        if display:
            Ξ²_list = [f'{t:.3}' for t in list(model.Ξ².flatten())]
            update = f'{i:<13}{model.logL():<16.8}{Ξ²_list}'
            print(update)

        i += 1

    print(f'Number of iterations: {i}')
    print(f'Ξ²_hat = {model.Ξ².flatten()}')

    # Return a flat array for Ξ² (instead of a k_by_1 column vector)
    return model.Ξ².flatten()

Let’s try out our algorithm with a small dataset of 5 observations and 3 variables in X.
[10]: X = np.array([[1, 2, 5],
              [1, 1, 3],
              [1, 4, 2],
              [1, 5, 2],
              [1, 3, 1]])

y = np.array([1, 0, 1, 1, 0])

# Take a guess at initial Ξ²s
init_Ξ² = np.array([0.1, 0.1, 0.1])

# Create an object with Poisson model values
poi = PoissonRegression(y, X, Ξ²=init_Ξ²)

# Use newton_raphson to find the MLE
Ξ²_hat = newton_raphson(poi, display=True)

Iteration_k Log-likelihood ΞΈ
-----------------------------------------------------------------------------------------
0 -4.3447622 ['-1.49', '0.265', '0.244']
1 -3.5742413 ['-3.38', '0.528', '0.474']
2 -3.3999526 ['-5.06', '0.782', '0.702']
3 -3.3788646 ['-5.92', '0.909', '0.82']
4 -3.3783559 ['-6.07', '0.933', '0.843']
5 -3.3783555 ['-6.08', '0.933', '0.843']
Number of iterations: 6
Ξ²_hat = [-6.07848205 0.93340226 0.84329625]

As this was a simple model with few observations, the algorithm achieved convergence in only
6 iterations.
You can see that with each iteration, the log-likelihood value increased.
Remember, our objective was to maximize the log-likelihood function, which the algorithm
has worked to achieve.
Also, note that the increase in log β„’(𝛽 (π‘˜) ) becomes smaller with each iteration.
This is because the gradient is approaching 0 as we reach the maximum, and therefore the
numerator in our updating equation is becoming smaller.
The gradient vector should be close to 0 at 𝛽̂
[11]: poi.G()

array([[-3.95169226e-07],
       [-1.00114804e-06],
       [-7.73114559e-07]])

The iterative process can be visualized in the following diagram, where the maximum is found
at 𝛽 = 10
[12]: logL = lambda x: -(x - 10) ** 2 - 10

def find_tangent(Ξ², a=0.01):
    y1 = logL(Ξ²)
    y2 = logL(Ξ² + a)
    x = np.array([[Ξ², 1], [Ξ² + a, 1]])
    m, c = np.linalg.lstsq(x, np.array([y1, y2]), rcond=None)[0]
    return m, c

Ξ² = np.linspace(2, 18)
fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(Ξ², logL(Ξ²), lw=2, c='black')

for Ξ² in [7, 8.5, 9.5, 10]:
    Ξ²_line = np.linspace(Ξ² - 2, Ξ² + 2)
    m, c = find_tangent(Ξ²)
    y = m * Ξ²_line + c
    ax.plot(Ξ²_line, y, '-', c='purple', alpha=0.8)
    ax.text(Ξ² + 2.05, y[-1], f'$G({Ξ²}) = {abs(m):.0f}$', fontsize=12)
    ax.vlines(Ξ², -24, logL(Ξ²), linestyles='--', alpha=0.5)
    ax.hlines(logL(Ξ²), 6, Ξ², linestyles='--', alpha=0.5)

ax.set(ylim=(-24, -4), xlim=(6, 13))
ax.set_xlabel(r'$\beta$', fontsize=15)
ax.set_ylabel(r'$log \mathcal{L(\beta)}$',
              rotation=0,
              labelpad=25,
              fontsize=15)
ax.grid(alpha=0.3)
plt.show()

Note that our implementation of the Newton-Raphson algorithm is rather basic β€” for more
robust implementations see, for example, scipy.optimize.

19.7 Maximum Likelihood Estimation with statsmodels

Now that we know what’s going on under the hood, we can apply MLE to an interesting ap-
plication.
We’ll use the Poisson regression model in statsmodels to obtain a richer output with stan-
dard errors, test values, and more.
statsmodels uses the same algorithm as above to find the maximum likelihood estimates.
Before we begin, let’s re-estimate our simple model with statsmodels to confirm we obtain
the same coefficients and log-likelihood value.
[13]: X = np.array([[1, 2, 5],
              [1, 1, 3],
              [1, 4, 2],
              [1, 5, 2],
              [1, 3, 1]])

y = np.array([1, 0, 1, 1, 0])

stats_poisson = Poisson(y, X).fit()
print(stats_poisson.summary())

Optimization terminated successfully.
         Current function value: 0.675671
         Iterations 7
Poisson Regression Results
==============================================================================
Dep. Variable: y No. Observations: 5
Model: Poisson Df Residuals: 2
Method: MLE Df Model: 2
Date: Sun, 20 Oct 2019 Pseudo R-squ.: 0.2546
Time: 17:06:51 Log-Likelihood: -3.3784
converged: True LL-Null: -4.5325
Covariance Type: nonrobust LLR p-value: 0.3153
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -6.0785 5.279 -1.151 0.250 -16.425 4.268
x1 0.9334 0.829 1.126 0.260 -0.691 2.558
x2 0.8433 0.798 1.057 0.291 -0.720 2.407
==============================================================================

Now let’s replicate results from Daniel Treisman’s paper, Russia’s Billionaires, mentioned ear-
lier in the lecture.
Treisman starts by estimating equation Eq. (1), where:

β€’ 𝑦𝑖 is the number of billionaires in country 𝑖
β€’ π‘₯𝑖1 is log GDP per capita of country 𝑖
β€’ π‘₯𝑖2 is log population of country 𝑖
β€’ π‘₯𝑖3 is years in GATT of country 𝑖 – years membership in GATT and WTO (to proxy access to international markets)

The paper only considers the year 2008 for estimation.

We will set up our variables for estimation like so (you should have the data assigned to df from earlier in the lecture)
[14]: # Keep only year 2008
df = df[df['year'] == 2008]

# Add a constant
df['const'] = 1

# Variable sets
reg1 = ['const', 'lngdppc', 'lnpop', 'gattwto08']
reg2 = ['const', 'lngdppc', 'lnpop',
'gattwto08', 'lnmcap08', 'rintr', 'topint08']
reg3 = ['const', 'lngdppc', 'lnpop', 'gattwto08', 'lnmcap08',
'rintr', 'topint08', 'nrrents', 'roflaw']

Then we can use the Poisson function from statsmodels to fit the model.
We’ll use robust standard errors as in the author’s paper
[15]: # Specify model
poisson_reg = sm.Poisson(df[['numbil0']], df[reg1],
                         missing='drop').fit(cov_type='HC0')
print(poisson_reg.summary())

Optimization terminated successfully.
         Current function value: 2.226090
         Iterations 9
Poisson Regression Results
==============================================================================
Dep. Variable: numbil0 No. Observations: 197
Model: Poisson Df Residuals: 193
Method: MLE Df Model: 3
Date: Sun, 20 Oct 2019 Pseudo R-squ.: 0.8574
Time: 17:06:51 Log-Likelihood: -438.54
converged: True LL-Null: -3074.7
Covariance Type: HC0 LLR p-value: 0.000
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const -29.0495 2.578 -11.268 0.000 -34.103 -23.997
lngdppc 1.0839 0.138 7.834 0.000 0.813 1.355
lnpop 1.1714 0.097 12.024 0.000 0.980 1.362
gattwto08 0.0060 0.007 0.868 0.386 -0.008 0.019
==============================================================================

Success! The algorithm was able to achieve convergence in 9 iterations.


Our output indicates that GDP per capita, population, and years of membership in the Gen-
eral Agreement on Tariffs and Trade (GATT) are positively related to the number of billion-
aires a country has, as expected.
Let’s also estimate the author’s more full-featured models and display them in a single table
[16]: regs = [reg1, reg2, reg3]
reg_names = ['Model 1', 'Model 2', 'Model 3']
info_dict = {'Pseudo R-squared': lambda x: f"{x.prsquared:.2f}",
'No. observations': lambda x: f"{int(x.nobs):d}"}
regressor_order = ['const',
'lngdppc',
'lnpop',
'gattwto08',
'lnmcap08',
'rintr',
'topint08',
'nrrents',
'roflaw']
results = []

for reg in regs:
    result = sm.Poisson(df[['numbil0']], df[reg],
                        missing='drop').fit(cov_type='HC0',
                                            maxiter=100, disp=0)
    results.append(result)

results_table = summary_col(results=results,
float_format='%0.3f',
stars=True,
model_names=reg_names,
info_dict=info_dict,
regressor_order=regressor_order)
results_table.add_title('Table 1 - Explaining the Number of Billionaires \
in 2008')
print(results_table)

Table 1 - Explaining the Number of Billionaires in 2008
=================================================
Model 1 Model 2 Model 3
-------------------------------------------------
const -29.050*** -19.444*** -20.858***
(2.578) (4.820) (4.255)
lngdppc 1.084*** 0.717*** 0.737***
(0.138) (0.244) (0.233)
lnpop 1.171*** 0.806*** 0.929***
(0.097) (0.213) (0.195)
gattwto08 0.006 0.007 0.004
(0.007) (0.006) (0.006)
lnmcap08 0.399** 0.286*
(0.172) (0.167)
rintr -0.010 -0.009
(0.010) (0.010)
topint08 -0.051*** -0.058***
(0.011) (0.012)
nrrents -0.005
(0.010)
roflaw 0.203
(0.372)
Pseudo R-squared 0.86 0.90 0.90
No. observations 197 131 131
=================================================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01

The output suggests that the frequency of billionaires is positively correlated with GDP
per capita, population size, stock market capitalization, and negatively correlated with top
marginal income tax rate.
To analyze our results by country, we can plot the difference between the predicted and actual values, then sort from highest to lowest and plot the first 15
[17]: data = ['const', 'lngdppc', 'lnpop', 'gattwto08', 'lnmcap08', 'rintr',
        'topint08', 'nrrents', 'roflaw', 'numbil0', 'country']
results_df = df[data].dropna()

# Use last model (model 3)
results_df['prediction'] = results[-1].predict()

# Calculate difference
results_df['difference'] = results_df['numbil0'] - results_df['prediction']

# Sort in descending order
results_df.sort_values('difference', ascending=False, inplace=True)

# Plot the first 15 data points
results_df[:15].plot('country', 'difference', kind='bar',
                     figsize=(12, 8), legend=False)
plt.ylabel('Number of billionaires above predicted level')
plt.xlabel('Country')
plt.show()

As we can see, Russia has by far the highest number of billionaires in excess of what is pre-
dicted by the model (around 50 more than expected).
Treisman uses this empirical result to discuss possible reasons for Russia’s excess of billion-
aires, including the origination of wealth in Russia, the political climate, and the history of
privatization in the years after the USSR.

19.8 Summary

In this lecture, we used Maximum Likelihood Estimation to estimate the parameters of a Poisson model.
statsmodels contains other built-in likelihood models such as Probit and Logit.
For further flexibility, statsmodels provides a way to specify the distribution manually us-
ing the GenericLikelihoodModel class - an example notebook can be found here.

19.9 Exercises

19.9.1 Exercise 1

Suppose we wanted to estimate the probability of an event 𝑦𝑖 occurring, given some observa-
tions.

We could use a probit regression model, where the pmf of 𝑦𝑖 is

$$
f(y_i; \beta) = \mu_i^{y_i} (1 - \mu_i)^{1 - y_i}, \qquad y_i = 0, 1
\quad \text{where} \quad \mu_i = \Phi(\mathbf{x}_i' \beta)
$$

Ξ¦ represents the standard normal cumulative distribution function and constrains πœ‡π‘– to lie between 0 and 1 (as required for a probability).
𝛽 is a vector of coefficients.
Following the example in the lecture, write a class to represent the Probit model.
To begin, find the log-likelihood function and derive the gradient and Hessian.
The scipy module stats.norm contains the functions needed to compute the cdf and pdf of the normal distribution.

19.9.2 Exercise 2

Use the following dataset and initial values of 𝛽 to estimate the MLE with the Newton-
Raphson algorithm developed earlier in the lecture

$$
X = \begin{bmatrix}
1 & 2 & 4 \\
1 & 1 & 1 \\
1 & 4 & 3 \\
1 & 5 & 6 \\
1 & 3 & 5
\end{bmatrix}
\qquad
y = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\qquad
\beta^{(0)} = \begin{bmatrix} 0.1 \\ 0.1 \\ 0.1 \end{bmatrix}
$$

Verify your results with statsmodels - you can import the Probit function with the follow-
ing import statement
[18]: from statsmodels.discrete.discrete_model import Probit

Note that the simple Newton-Raphson algorithm developed in this lecture is very sensitive to
initial values, and therefore you may fail to achieve convergence with different starting values.

19.10 Solutions

19.10.1 Exercise 1

The log-likelihood can be written as

$$
\log \mathcal{L} = \sum_{i=1}^{n} \big[ y_i \log \Phi(\mathbf{x}_i' \beta) + (1 - y_i) \log (1 - \Phi(\mathbf{x}_i' \beta)) \big]
$$

Using the fundamental theorem of calculus, the derivative of a cumulative distribution function is its density

$$
\frac{\partial}{\partial s} \Phi(s) = \phi(s)
$$

where $\phi$ is the standard normal density.

The gradient vector of the Probit model is

$$
\frac{\partial \log \mathcal{L}}{\partial \beta}
= \sum_{i=1}^{n} \Big[ y_i \frac{\phi(\mathbf{x}_i' \beta)}{\Phi(\mathbf{x}_i' \beta)}
- (1 - y_i) \frac{\phi(\mathbf{x}_i' \beta)}{1 - \Phi(\mathbf{x}_i' \beta)} \Big] \mathbf{x}_i
$$

The Hessian of the Probit model is

$$
\frac{\partial^2 \log \mathcal{L}}{\partial \beta \, \partial \beta'}
= -\sum_{i=1}^{n} \phi(\mathbf{x}_i' \beta)
\Big[ y_i \frac{\phi(\mathbf{x}_i' \beta) + \mathbf{x}_i' \beta \, \Phi(\mathbf{x}_i' \beta)}{[\Phi(\mathbf{x}_i' \beta)]^2}
+ (1 - y_i) \frac{\phi(\mathbf{x}_i' \beta) - \mathbf{x}_i' \beta \, (1 - \Phi(\mathbf{x}_i' \beta))}{[1 - \Phi(\mathbf{x}_i' \beta)]^2} \Big]
\mathbf{x}_i \mathbf{x}_i'
$$

Using these results, we can write a class for the Probit model as follows
[19]: class ProbitRegression:

    def __init__(self, y, X, Ξ²):
        self.X, self.y, self.Ξ² = X, y, Ξ²
        self.n, self.k = X.shape

    def ΞΌ(self):
        return norm.cdf(self.X @ self.Ξ².T)

    def Ο•(self):
        return norm.pdf(self.X @ self.Ξ².T)

    def logL(self):
        y = self.y
        ΞΌ = self.ΞΌ()
        return np.sum(y * np.log(ΞΌ) + (1 - y) * np.log(1 - ΞΌ))

    def G(self):
        X, y = self.X, self.y
        ΞΌ = self.ΞΌ()
        Ο• = self.Ο•()
        return np.sum((X.T * y * Ο• / ΞΌ - X.T * (1 - y) * Ο• / (1 - ΞΌ)),
                      axis=1)

    def H(self):
        X, y, Ξ² = self.X, self.y, self.Ξ²
        ΞΌ = self.ΞΌ()
        Ο• = self.Ο•()
        a = (Ο• + (X @ Ξ².T) * ΞΌ) / ΞΌ**2
        b = (Ο• - (X @ Ξ².T) * (1 - ΞΌ)) / (1 - ΞΌ)**2
        return -(Ο• * (y * a + (1 - y) * b) * X.T) @ X

19.10.2 Exercise 2

[20]: X = np.array([[1, 2, 4],
              [1, 1, 1],
              [1, 4, 3],
              [1, 5, 6],
              [1, 3, 5]])

y = np.array([1, 0, 1, 1, 0])

# Take a guess at initial Ξ²s
Ξ² = np.array([0.1, 0.1, 0.1])

# Create instance of Probit regression class
prob = ProbitRegression(y, X, Ξ²)

# Run Newton-Raphson algorithm
newton_raphson(prob)

Iteration_k Log-likelihood ΞΈ
-----------------------------------------------------------------------------------------
0 -2.3796884 ['-1.34', '0.775', '-0.157']
1 -2.3687526 ['-1.53', '0.775', '-0.0981']
2 -2.3687294 ['-1.55', '0.778', '-0.0971']
3 -2.3687294 ['-1.55', '0.778', '-0.0971']
Number of iterations: 4
Ξ²_hat = [-1.54625858 0.77778952 -0.09709757]

[20]: array([-1.54625858,  0.77778952, -0.09709757])

[21]: # Use statsmodels to verify results
print(Probit(y, X).fit().summary())

Optimization terminated successfully.
         Current function value: 0.473746
         Iterations 6
Probit Regression Results
======================================================================