01 Python Alt Data NLP Class Example

This document is a Python primer for data analysis, covering basic concepts such as variables, lists, dictionaries, and control structures like loops and conditionals. It also introduces the Python standard library, NumPy for numerical operations, and pandas for data manipulation, including loading and filtering time series data. Additionally, it demonstrates how to perform arithmetic operations with pandas and visualize data using plotting libraries.

Example of Python primer for data analysis

Saeed Amen / Founder of Cuemacro

https://www.cuemacro.com / [email protected] / @saeedamenfx / All material is copyright Cuemacro / 2020

In this class example, we show basic Python concepts such as variables and lists, along with some of the Python standard library.

Print hello world!


In [1]:

hello world!
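The code for this cell is not shown in the text. A minimal sketch that would produce the output above, where the variable name `message` is an assumption:

```python
# Store the greeting in a variable (hypothetical name) and print it
message = "hello world!"
print(message)
```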

Defining variables and checking types


We define several variables and check their types. Python is dynamically typed, so the type is inferred from the assigned value.

In [2]: # Define a few variables and check their types

<class 'int'>
<class 'str'>
<class 'float'>
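The cell's code is missing from the text. One possible version that prints the three types above is sketched here; the variable names and values are assumptions:

```python
# Hypothetical values: an integer, a string and a float
a = 1
b = "hello"
c = 1.5

# type() reports the class Python assigned to each value
print(type(a))
print(type(b))
print(type(c))
```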

Arithmetic and logical operators


Playing around with various operators with Python variables

In [3]: # Use some arithmetic operators, start with addition

Out[3]: 2

In [4]: # Now multiplication

Out[4]: 1.4

In [5]: # Use some comparative operators (greater than)

Out[5]: False

In [6]: # Smaller than

Out[6]: True

In [7]: # Equals

Out[7]: True

In [8]: # Not equals

Out[8]: True

In [9]: # Concatenating strings

Out[9]: 'helloworld'
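The operands used in these cells are not shown. The sketch below picks values (all assumptions) that reproduce the displayed results:

```python
# Operand values are assumptions chosen to match the outputs in the text
total = 1 + 1                 # addition
product = 2 * 0.7             # multiplication
is_greater = 1 > 2            # greater than
is_smaller = 1 < 2            # smaller than
is_equal = 1 == 1             # equals
is_not_equal = 1 != 2         # not equals
joined = "hello" + "world"    # + concatenates strings

for result in (total, product, is_greater, is_smaller,
               is_equal, is_not_equal, joined):
    print(result)
```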

Lists
A few examples of combining variables in a Python list and running operations on lists.

In [10]: # Create a list of integers

In [11]: # Append to the list

In [12]: # What does the list look like?

[1, 3, 2, 1]

In [13]: # Sort the list

In [14]: # Print the list

[1, 1, 2, 3]

In [15]: # Only print the last element

In [16]: # Print the first element

In [17]: # Print middle elements

[1, 2]
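The list operations above could be written as follows. The starting values are assumptions chosen to be consistent with the printed lists, and `numbers` is a hypothetical name:

```python
# Create a list of integers (assumed starting values)
numbers = [1, 3, 2]

# Append to the list
numbers.append(1)
print(numbers)

# Sort the list in place
numbers.sort()
print(numbers)

# Negative indices count from the end, so -1 is the last element
last = numbers[-1]
first = numbers[0]

# A slice [1:3] returns the elements at positions 1 and 2
middle = numbers[1:3]
print(last, first, middle)
```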

Dictionaries
Dictionaries can be useful to store many types of data, such as phone directories.

In [18]: # Create a dictionary of restaurants as keys and values as burgers

In [19]: # Print out a value for one of the keys

Whopper

In [20]: # Create a list of the keys and print the list

dict_keys(['Burger King', 'McDonalds'])
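A sketch of the dictionary cells, assuming a dictionary named `burgers`. The keys and the 'Whopper' value come from the outputs above; the value for 'McDonalds' is an assumption:

```python
# Restaurants as keys, burgers as values
burgers = {
    "Burger King": "Whopper",
    "McDonalds": "Big Mac",  # assumed value, not shown in the text
}

# Look up a value by its key
print(burgers["Burger King"])

# keys() returns a view of the keys; wrap it in list() for a real list
print(burgers.keys())
key_list = list(burgers.keys())
```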

If statement
In [21]: # Demonstrate an if statement

Yes 2 is greater than one
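The if statement itself is not shown. A minimal sketch that prints the message above, with hypothetical names `x` and `result`:

```python
x = 2

# The indented branch runs only when the condition is True
if x > 1:
    result = "Yes 2 is greater than one"
else:
    result = "No 2 is not greater than one"

print(result)
```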

For loops
A for loop lets us run the same operation repeatedly for different elements.

In [22]: # Create a simple for loop to print numbers from 0 to 5
# We use range to generate the sequence of numbers

0
1
2
3
4
5
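A possible version of this loop is sketched here. The list `seen` is an addition for illustration, so the visited values can be inspected afterwards:

```python
seen = []  # hypothetical helper list, not part of the original cell

# range(0, 6) yields 0, 1, 2, 3, 4, 5 (the end value 6 is excluded)
for i in range(0, 6):
    print(i)
    seen.append(i)
```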

List comprehension
This is similar to a for loop, but is more succinct.

In [23]: # Do the same thing using a list comprehension

0
1
2
3
4
5
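One way to get the same output with a list comprehension is sketched here. Building the strings first and printing them once is a choice made for this sketch; the original cell may have been written differently:

```python
# Build the list of strings in a single expression
lines = [str(i) for i in range(0, 6)]

# Print one number per line, as in the for loop version
print("\n".join(lines))
```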

Creating a function
A function lets us collect together code which will be used repeatedly.

In [24]: # Create a function to double numbers

In [25]: # Try running the function
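The function body is missing from the text. A minimal sketch, assuming the function is called `double`:

```python
def double(x):
    """Return twice the input number."""
    return 2 * x

# Try running the function
print(double(4))
```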

Iteration vs recursion
We compare the two approaches using an example that concatenates numbers into a string.

In [26]: # Let's concatenate numbers 0 to 5 by iteration

012345
In [27]: # Let's concatenate numbers 0 to 5 by recursion

# Convert numbers to strings (so can be concatenated)

012345
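Both approaches can be sketched as follows. The function names are hypothetical; each returns the string shown in the outputs above:

```python
def concat_iterative(n):
    """Concatenate 0..n by looping and appending to a running string."""
    result = ""
    for i in range(0, n + 1):
        result += str(i)  # convert to string so it can be concatenated
    return result


def concat_recursive(n):
    """Concatenate 0..n by solving the smaller problem 0..n-1 first."""
    if n == 0:
        return "0"  # base case stops the recursion
    return concat_recursive(n - 1) + str(n)


print(concat_iterative(5))
print(concat_recursive(5))
```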

Python standard library


We'll show a few common modules from the Python standard library, such as datetime and math.

In [28]: # Print the current time and date using datetime

2020-06-27 13:55:16.519604

In [29]: # Add a day to the current time

2020-06-28 13:55:16.534604

In [30]: # Round a float

In [31]: # Use the floor and ceil function from math

4
5
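The cells above could look something like this. The float `4.567` is an assumption; any value between 4 and 5 gives the same floor and ceil results:

```python
import datetime
import math

# Current date and time
now = datetime.datetime.now()
print(now)

# timedelta represents a duration, here one day
tomorrow = now + datetime.timedelta(days=1)
print(tomorrow)

value = 4.567  # assumed example value

# Round to two decimal places
rounded = round(value, 2)
print(rounded)

# floor rounds down, ceil rounds up, both return integers
print(math.floor(value))
print(math.ceil(value))
```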

NumPy
We show a few basic examples of creating NumPy arrays and some operations we can perform on them.

In [32]: # Import NumPy library to begin

# Let's create two lists of numbers and convert into NumPy arrays

# The results will be 1 x 4 ndarrays (one-dimensional, 4 elements)

In [33]: # Do basic arithmetic operations on them (don't use for loops here!)

[0 0 0 0]
[ 1 4 9 16]
[ 5 7 9 11]

In [34]: # Query the various properties of the NumPy arrays,
# such as their size and their dimensions

4
1

In [35]: # Create a 2 x 2 ndarray, i.e. a matrix

[[1 2]
 [3 4]]

In [36]: # Resize our original 1 x 4 ndarray into a 2 x 2 matrix

[[1 2]
 [3 4]]

In [37]: # Sum the columns of our matrix and then the rows

[4 6]
[3 7]
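The NumPy cells are not shown, so the following is only a sketch. The array values and the choice of operations are assumptions picked to reproduce the printed results, and `reshape` is used here for the resizing step:

```python
import numpy as np

# Assumed input lists, converted into NumPy arrays
a = np.array([1, 2, 3, 4])
b = np.array([4, 5, 6, 7])

# Element-wise arithmetic, no explicit loops needed
print(a - a)
print(a * a)
print(a + b)

# Number of elements and number of dimensions
print(a.size)
print(a.ndim)

# Build a 2 x 2 matrix directly, then by reshaping the original array
m = np.array([[1, 2], [3, 4]])
reshaped = a.reshape(2, 2)
print(reshaped)

# axis=0 sums down each column, axis=1 sums across each row
col_sums = reshaped.sum(axis=0)
row_sums = reshaped.sum(axis=1)
print(col_sums)
print(row_sums)
```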

Loading time series with pandas and filtering them


In [38]: # We'll fetch a CSV file on Cuemacro's GitHub site that has FX spot data
# which we've got from FRED earlier
# CSV file at 'https://raw.githubusercontent.com/cuemacro/teaching/master/pythoncourse/dat

In [39]: # Let's load up the CSV file and make sure to convert the index column into
# pandas dates

# Hint: use read_csv function from pandas

In [40]: # Print the first 5 rows to see what it looks like...

            EURUSD.close  USDJPY.close  GBPUSD.close  AUDUSD.close  \
Date
1989-01-04           NaN        125.05        1.8070        0.8698
1989-01-05           NaN        125.59        1.7970        0.8679
1989-01-06           NaN        126.62        1.7800        0.8625
1989-01-09           NaN        126.45        1.7636        0.8637
1989-01-10           NaN        126.30        1.7637        0.8664

            USDCAD.close  NZDUSD.close  USDCHF.close  USDNOK.close  \
Date
1989-01-04        1.1921        0.6365        1.5188         6.581
1989-01-05        1.1895        0.6380        1.5310         6.593
1989-01-06        1.1952        0.6365        1.5490         6.645
1989-01-09        1.1970        0.6350        1.5575         6.678
1989-01-10        1.2009        0.6310        1.5645         6.689

            USDSEK.close
Date
1989-01-04         6.160
1989-01-05         6.180
1989-01-06         6.230
1989-01-09         6.264
1989-01-10         6.275

In [41]: # We obviously don't have any EURUSD data before the introduction of the EUR in 1999
# Let's remove any entries before '04 Jan 1999'

# Fill forward NaN values to simplify analysis later (hint: use fillna)

In [42]: # Print the first 5 points again to check

            EURUSD.close  USDJPY.close  GBPUSD.close  AUDUSD.close  \
Date
1999-01-04        1.1812        112.15        1.6581        0.6182
1999-01-05        1.1760        111.15        1.6566        0.6217
1999-01-06        1.1636        112.78        1.6547        0.6285
1999-01-07        1.1672        111.69        1.6495        0.6340
1999-01-08        1.1554        111.52        1.6405        0.6326

            USDCAD.close  NZDUSD.close  USDCHF.close  USDNOK.close  \
Date
1999-01-04        1.5268        0.5340        1.3666        7.4960
1999-01-05        1.5213        0.5370        1.3694        7.4340
1999-01-06        1.5110        0.5385        1.3852        7.4355
1999-01-07        1.5117        0.5405        1.3863        7.4050
1999-01-08        1.5145        0.5400        1.3970        7.3970

            USDSEK.close
Date
1999-01-04        8.0200
1999-01-05        7.9720
1999-01-06        7.9360
1999-01-07        7.9150
1999-01-08        7.9285
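The loading and filtering steps can be sketched as follows. Because the CSV address is cut off in the text, this sketch reads a small in-memory CSV built from a few rows of the printed output instead of the real file. The hint mentions `fillna`; the sketch uses `ffill()`, the pandas method for forward filling. Variable names are assumptions:

```python
import io

import pandas as pd

# A few rows copied from the printed output, standing in for the real CSV file
csv_text = """Date,EURUSD.close,USDJPY.close,GBPUSD.close
1989-01-04,,125.05,1.8070
1989-01-05,,125.59,1.7970
1999-01-04,1.1812,112.15,1.6581
1999-01-05,1.1760,111.15,1.6566
1999-01-06,1.1636,112.78,1.6547
"""

# Use the first column as the index and parse it into pandas dates
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
print(df.head(5))

# Keep only rows on or after 4 Jan 1999
df = df[df.index >= pd.Timestamp("1999-01-04")]

# Forward fill any remaining NaN values with the last known value
df = df.ffill()
print(df.head(5))
```

With the real file, the `io.StringIO(csv_text)` argument would be replaced by the CSV address.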

Basic arithmetic operations with Pandas


We'll show how we can do arithmetic on pandas DataFrames. Here we make USD the base currency for each time series.

In [43]: # The next step is to make sure everything is quoted with USD as the base currency

# Go through each USD cross (hint: use a for loop)

# Invert column if USD is not base


# Otherwise append it as it is

In [44]: # Note how EUR, GBP, AUD and NZD have now been inverted
# Everything is now quoted USDabc

            USDEUR.close  USDJPY.close  USDGBP.close  USDAUD.close  \
Date
1999-01-04      0.846597        112.15      0.603100      1.617599
1999-01-05      0.850340        111.15      0.603646      1.608493
1999-01-06      0.859402        112.78      0.604339      1.591090
1999-01-07      0.856751        111.69      0.606244      1.577287
1999-01-08      0.865501        111.52      0.609570      1.580778

            USDCAD.close  USDNZD.close  USDCHF.close  USDNOK.close  \
Date
1999-01-04        1.5268      1.872659        1.3666        7.4960
1999-01-05        1.5213      1.862197        1.3694        7.4340
1999-01-06        1.5110      1.857010        1.3852        7.4355
1999-01-07        1.5117      1.850139        1.3863        7.4050
1999-01-08        1.5145      1.851852        1.3970        7.3970

            USDSEK.close
Date
1999-01-04        8.0200
1999-01-05        7.9720
1999-01-06        7.9360
1999-01-07        7.9150
1999-01-08        7.9285
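The inversion loop might look like the sketch below. It uses two columns and two rows taken from the earlier printed output; the variable names and the way the new column name is built are assumptions:

```python
import pandas as pd

# Two columns from the printed output: one quoted against USD, one with USD base
df = pd.DataFrame(
    {"EURUSD.close": [1.1812, 1.1760], "USDJPY.close": [112.15, 111.15]},
    index=pd.to_datetime(["1999-01-04", "1999-01-05"]),
)
df.index.name = "Date"

columns = {}

# Go through each USD cross
for col in df.columns:
    pair = col.split(".")[0]  # e.g. 'EURUSD'

    if pair.startswith("USD"):
        # USD is already the base currency, keep the column as it is
        columns[col] = df[col]
    else:
        # Invert the quote and rename, e.g. EURUSD becomes USDEUR
        columns["USD" + pair[:3] + ".close"] = 1.0 / df[col]

df_usd = pd.DataFrame(columns)
print(df_usd)
```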

Creating rebased indices


We rebase each FX series into an index that starts at 100, which makes the assets easier to compare with each other.

In [45]: # Calculate simple returns for every FX pair

In [46]: # Let's rebase each currency pair so it starts at 100


# Hint: use cumprod function

# Set the first values as being 100

In [47]: # Everything now starts at 100, so it's possible to compare them
# First value is undefined because the first day's returns cannot be calculated

            USDEUR.close  USDJPY.close  USDGBP.close  USDAUD.close  \
Date
1999-01-04    100.000000    100.000000    100.000000    100.000000
1999-01-05    100.442177     99.108337    100.090547     99.437028
1999-01-06    101.512547    100.561748    100.205475     98.361177
1999-01-07    101.199452     99.589835    100.521370     97.507886
1999-01-08    102.232993     99.438252    101.072844     97.723680

            USDCAD.close  USDNZD.close  USDCHF.close  USDNOK.close  \
Date
1999-01-04    100.000000    100.000000    100.000000    100.000000
1999-01-05     99.639769     99.441341    100.204888     99.172892
1999-01-06     98.965156     99.164345    101.361042     99.192903
1999-01-07     99.011003     98.797410    101.441534     98.786019
1999-01-08     99.194394     98.888889    102.224499     98.679296

            USDSEK.close
Date
1999-01-04    100.000000
1999-01-05     99.401496
1999-01-06     98.952618
1999-01-07     98.690773
1999-01-08     98.859102
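The rebasing steps can be sketched as follows, using the first three USDJPY and USDCAD values from the printed output. The variable names are assumptions, and returns are computed with `shift` here, although the original cell may have used another method:

```python
import pandas as pd

# First three rows of two columns from the printed output
df_usd = pd.DataFrame(
    {
        "USDJPY.close": [112.15, 111.15, 112.78],
        "USDCAD.close": [1.5268, 1.5213, 1.5110],
    },
    index=pd.to_datetime(["1999-01-04", "1999-01-05", "1999-01-06"]),
)

# Simple returns: today's price divided by yesterday's, minus one
returns = df_usd / df_usd.shift(1) - 1

# Compound the returns with cumprod and scale so the index is based at 100
rebased = 100 * (1 + returns).cumprod()

# The first row is NaN (no prior price), so set it to 100 explicitly
rebased.iloc[0] = 100

print(rebased)
```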

Plotting line charts


We now demonstrate how to plot time series in Pandas DataFrames.

In [48]: # Make sure to inline charts, so they appear in notebook

# Set matplotlib style

Out[48]: <matplotlib.axes._subplots.AxesSubplot at 0x20a0660df28>
In [49]: # We can also plot with chartpy (with matplotlib first...)

# Create Style object to set title etc.

# Now plot with chartpy via matplotlib

In [50]: # Now use chartpy but changing size (or can try plotly)
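The plotting code is not shown in the text. A minimal sketch of the pandas and matplotlib part, using a few rebased values from the printed output, is given here. The style name, title and figure size are assumptions, and the chartpy calls are not reproduced because their usage does not appear in the text:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# In a Jupyter notebook, the magic command '%matplotlib inline' would be
# used instead of the backend line above so charts appear in the notebook

# Set matplotlib style (assumed choice)
plt.style.use("ggplot")

# A few rebased index values from the printed output
rebased = pd.DataFrame(
    {
        "USDEUR.close": [100.000000, 100.442177, 101.512547],
        "USDJPY.close": [100.000000, 99.108337, 100.561748],
    },
    index=pd.to_datetime(["1999-01-04", "1999-01-05", "1999-01-06"]),
)

# DataFrame.plot draws one line per column and returns the matplotlib Axes
ax = rebased.plot(title="Rebased FX indices", figsize=(8, 4))
```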
