Advanced Backtesting Paper
HangukQuant*
Abstract
Testing trading strategies is an essential part of the quant tool-kit. By definition, researching and
testing ideas/strategies in the quantitative space is synonymous with prototyping: ease of expressing
our ideas and correctness of code are prioritised. As a result, quantitative researchers often perform analysis
in a high-level language such as Python, abstracting away the marginal concerns of memory management,
garbage collection, object typing and compilation, while leaving quantitative engineers to do the ‘dirty’
work in a low-level language such as C++. This comes at the cost of speed, often running up to 100 times
slower than it ought to! However, many are still reluctant to ‘leave’ the Python community - there is
excellent community support and a vast universe of useful libraries for data scientists and quants. In this
paper we see how we can write simple, readable code in Python with ‘C’ performance. The techniques
discussed herein can be extrapolated to other quantitative tasks, particularly where computationally
demanding tasks are at hand. Our focus is on single-machine (as opposed to cluster), CPU (as opposed
to GPU) optimizations for faster computation (as opposed to memory/RAM efficiency).
Contents
1 Introduction
2 Our Results
3.3.1 FX-Calculator
3.4 Time-Series Long-Biased Momentum in Equities With Ex-Post Risk Management for Crash Risk Factor
4 Benchmarking
4.3 Strat1
4.4 Strat2
4.5 Strat3
5.2 Strat1
5.3 Strat2
5.4 Strat3
6.1.2 Replacing Function Calls with More Suitable Choices
10.3 Back to Numpy - All Functions are Equal, But Some are More Equal than Others
13 Conclusions
1 Introduction
As we often do in our papers, we do not beat about the bush. We cover technical details as we go through
the examples. We ‘learn by doing’, instead of rambling on about theory. Where the theory is well known in the
literature and easily referenced, we refer readers to the source - there is no point in wasting time
re-iterating another blog post. As before, the same caveat applies: this is the result of weekly
musings, so there are bound to be mistakes. Please email all errors/issues to [email protected] or tweet
me at @HangukQuant.
For most traders (non-MM/HFT firms), ‘instantaneous’ is not a necessity - but we still require efficiency.
In smaller groups of traders (1 ∼ 5 people), members often have to be equipped with skills in both
quantitative research and trade execution. ‘Efficiency’ refers not only to trading latency - it also
refers to the efficiency of the team as a whole: the ability of the team to pipeline between data, alpha and
trade management. If you spend a week prototyping an idea and then another week implementing it in a
‘fast’ language, that is an efficiency cost. Programming in a homogeneous framework tremendously speeds
up the quant process, and is significantly easier to manage. A well-written Python script can run only fractions
of a second slower than a finely-tuned C program; in most cases this is not a concern, and the team should then
favor the more efficient solution, which is to adopt a homogeneous framework with a ‘simpler’ interface.
The downside is that many quantitative researchers do not know how to write efficient, readable Python
code. The idea they want to test takes an hour to run when it should only take a couple of minutes. Such
issues seriously hamper the efficiency of the team as a whole, and we should avoid them whenever possible.
2 Our Results
We manage to achieve speedups of one to two orders of magnitude with the techniques discussed (runtimes in seconds):
Version     Strat 1   Strat 2   Strat 3
Original    116       -         81
Improved    10        -         5
As with most of our other papers, we will introduce the sample alphas and preliminary code, then iteratively
improve upon them to arrive at a better solution. Most readers should know: ‘we can do better’ is implicit
in all iterations of our code, our work and our life. We always start with the imperfect and strive to be better.
We show the tools, processes and intuitions behind the differences between the iterations. For concreteness,
we will not work with contrived examples. Let’s begin right away.
We are not particularly concerned about the attractiveness of the alphas - although some of them appear
to have fairly attractive performance, we caution readers against taking them ‘as is’. We take no responsibility
for their correctness, viability or profitability in trading markets. Trading is a risky operation.
Figure 1: equity curve of our alpha set, with its Sharpe ratio.
Let’s create a mini-utility library that acts as a stub for our database and other utilities. Relevant data can
be obtained from YahooFinance, Oanda or other sources. Data retrieval is discussed and implemented in
previous papers by HangukQuant - we will not meddle with it here.
dirtylib is the library that we will call to retrieve our equities and FX data. Our dataset consists
of stocks and fx-pairs from 2010.01.01 ∼ 2020.01.01, for no particular reason. Our pickled data is a tuple
of an array of instruments in our universe and a dictionary dfs of the corresponding OHLCV(AC)
dataframes for each instrument. We do a simple cleanup of null dataframes and empty dataframes. Next,
we label the currency that each instrument is denominated in, using % to mean ‘denominated in’. For equity
tickers such as AAPL, we use the ticker AAPL%USD to indicate denomination in USD. For fx-pairs, we label
with the last 3 characters, such that EURJPY → EURJPY%JPY, USDJPY → USDJPY%JPY, EURUSD
→ EURUSD%USD and so on. We assume the base currency of our simulator is USD. We will see how this
nomenclature is useful later.
4 def get_equities_dict():
5     (instruments, dfs) = get_cleaned("equities_temp.pickle")
6     renamed = {}
7     for i in instruments:
8         renamed[i + "%USD"] = dfs[i]
9     return list(renamed.keys()), renamed
10
11 def get_fx_dict():
12     (instruments, dfs) = get_cleaned("fx_temp.pickle")
13     renamed = {}
14     for i in instruments:
15         if len(i) == 6:
16             renamed[i + "%" + i[3:]] = dfs[i]
17         else:
18             renamed[i + "%USD"] = dfs[i]
19     return list(renamed.keys()), renamed
20
Listing 1: dirtylib.py
3.2 Equity Skewness Idiosyncratic Premia
For a detailed explanation of the skewness premia, refer to the literature. This is one that I have lightly
touched upon many times before - it is simple and intuitive to understand. We will explain it at a very high
level: people love lotteries, even though they are negative-EV products. The return distribution of a lottery
is positively skewed. If there is a systematic preference for positively skewed return assets, then these assets
must be overpriced. Our trading strategy is therefore short positive skew and long negative skew.
The return skewness can be decomposed into idiosyncratic skew and systematic skew, the latter being the skew
of the index/benchmark itself. In order to remove the beta effect, instead of computing skew on the
absolute returns of the asset, we use a variant where we compute skew on the residuals of individual
returns regressed on market returns. We adopt a volatility targeting approach at two levels: a)
individual position sizes are taken inversely proportional to their 30d rolling volatility of close-to-close
returns relative to the other traded positions, and b) overall positions are scaled relative to the strategy's
historical volatility to achieve a target portfolio volatility of 20% annualized. Our choice of variant is
simply to demonstrate a task that is rather computationally demanding - we need to do this on a rolling
basis, such that the regressions and skew computations are rolled every day. Perhaps it is better to illustrate
with code.
Listing 2: strat1.py
We hope to run a test with some code such as the following (we picked the first 200 instruments from our equities
universe to save time...):
1 if __name__ == "__main__":
2     (instruments, dfs) = get_equities_dict()
3     alpha = Alpha(
4         instruments=instruments[:200],
5         dfs=dfs,
6         configs={
7             "start": dlib.TRADE_START,
8             "end": dlib.TRADE_END,
9             "longsize": 50,
10             "shortsize": 50
11         }
12     )
13     portfolio_df = alpha.run_simulation()
Listing 3: strat1.py
which means we need to create an Alpha class and implement the function run_simulation.
Listing 4: strat1.py
and the meat of our testing engine (bear through the 99 lines of code here, we will make it a lot cleaner
as we go...):
1 def run_simulation(self):
2     """
3     Settings
4     """
5     portfolio_vol = 0.20
6     trade_datetime_range = pd.date_range(
7         start=self.get_trade_datetime_range()[0],
8         end=self.get_trade_datetime_range()[1],
9         freq="D"
10     )
11
12     """
13     Compute Metas
14     """
15     self.compute_metas(index=trade_datetime_range)
16
17     """
18     Initialisations
19     """
20     portfolio_df = self.init_portfolio_settings(trade_range=trade_datetime_range)
21

23         date = portfolio_df.loc[i, "datetime"]
24         strat_scalar = 2
25

28         if i != 0:
29             date_prev = portfolio_df.loc[i - 1, "datetime"]
30             day_pnl, nominal_ret = backtest_utils.get_pnl_stats(
31                 date=date,
32                 prev=date_prev,
33                 portfolio_df=portfolio_df,
34                 instruments=self.instruments,
35                 idx=i,
36                 historicals=self.dfs,
37                 close_col="adj_close"
38             )
39
40         strat_scalar = self.get_strat_scaler(
41             portfolio_df=portfolio_df,
42             lookback=30,
43             portfolio_vol=portfolio_vol,
44             idx=i,
45             default=strat_scalar
46         )
47         portfolio_df.loc[i, "strat scalar"] = strat_scalar
48
49         """
50         Compute Alpha
51         """
52         alpha_scores = {}
53         for inst in eligibles:
54             from sklearn.linear_model import LinearRegression
55             from sklearn.preprocessing import PolynomialFeatures
56             from scipy.stats import skew
57

66         alpha_scores = {k: v for k, v in sorted(alpha_scores.items(), key=lambda pair: pair[1])}
67         alphalong = list(alpha_scores.keys())[-self.configs["longsize"]:]
68         alphashort = list(alpha_scores.keys())[:self.configs["shortsize"]]
69

74         nominal_tot = 0
75         for inst in eligibles:
76             forecast = 1 if inst in alphalong else 0
77             forecast = -1 if inst in alphashort else forecast
78             vol_target = 1 / (self.configs["longsize"] + self.configs["shortsize"]) \
79                 * portfolio_df.loc[i, "capital"] \
80                 * portfolio_vol \
81                 / np.sqrt(253)
82             dollar_vol = self.dfs[inst].loc[date, "vol"] * self.dfs[inst].loc[date, "adj_close"]
83             position = strat_scalar * forecast * vol_target / dollar_vol
84             portfolio_df.loc[i, inst + " units"] = position
85             nominal_tot += abs(position * self.dfs[inst].loc[date, "adj_close"])
86
Listing 5: strat1.py
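The per-instrument skew computation itself (lines 57-65 above) is elided in this extract. As a rough illustration of the rolling residual-skew idea described in the previous section, a minimal sketch might look like the following (the 60-observation window, the function name, and the use of a single trailing window rather than a full rolling loop are assumptions for illustration only):

import numpy as np
from sklearn.linear_model import LinearRegression
from scipy.stats import skew

def residual_skew(inst_rets, mkt_rets, window=60):
    # Skew of the residuals from regressing instrument returns on market returns
    # over the trailing `window` observations (window length is an assumption).
    y = np.asarray(inst_rets[-window:]).reshape(-1, 1)
    x = np.asarray(mkt_rets[-window:]).reshape(-1, 1)
    model = LinearRegression().fit(x, y)
    residuals = (y - model.predict(x)).flatten()
    return skew(residuals)

In the actual backtest something like this would be evaluated for every instrument on every trading day, which is exactly the computational burden the rest of the paper sets out to reduce.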
In order: we initialise the target portfolio volatility at 20% and create a date range spanning the start to the
end of our chosen period. We then use the compute_metas function to compute the metadata required for
the backtest, and then, for each day: a) get a list of dynamically tradable assets, b) get a position sizing scalar,
c) get the pnl, d) calculate updated positions, e) set positions inversely proportional to their historical volatility
with the capital divided among the actively traded instruments, and f) compute the target weights and number of
units to hold in the portfolio.
2     def get_trade_datetime_range(self):
3         return (self.configs["start"], self.configs["end"])
4

33         portfolio_df.loc[0, "ewma"] = 0.001
34         return portfolio_df
35
Listing 6: strat1.py
Throughout our backtest, we will need volatility statistics for risk management, close-to-close returns
to compute pnl, a list of eligible stocks to be traded on a particular date, and the ‘index/benchmark’ returns
(computed as the aggregated mean of all the returns in the universe). A ticker is considered eligible to be
traded if there are 61 continuous days of data for the skew computation AND if there is movement in the stock
price over the last 5 entries with non-negative prices.
Note that the order of the statements in compute_metas is non-trivial, in particular with respect to the
statement self.dfs[inst] = df.join(self.dfs[inst]). The statement df = pd.DataFrame(index=index)
creates a daily series of dates (including weekends, holidays and so on), while the dataframes referenced by
self.dfs[inst] have an index that is not guaranteed to be a continuous series (this depends on your
data source). Most likely, the data source will contain only rows where there were samples to take, so that
weekends and exchange holidays have no entries. The ‘join’ operation will therefore create gaps in the joined
dataframe. We want to compute volatility and active universes before we forward fill and backward fill our
data. This subtlety is important.
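To make the ordering concrete, here is a minimal, self-contained sketch (the dates and prices are made up; only the adj_close column name matches the scripts):

import numpy as np
import pandas as pd

# Vendor-style price series: business days only, no weekend rows.
prices = pd.DataFrame(
    {"adj_close": [100.0, 101.0, 99.5, 100.5, 102.0]},
    index=pd.bdate_range("2020-01-06", periods=5),
)

# Continuous daily calendar, as created by pd.DataFrame(index=index) in compute_metas.
daily = pd.DataFrame(index=pd.date_range("2020-01-04", "2020-01-12", freq="D"))

joined = daily.join(prices)  # weekend/holiday rows become NaN gaps
rets = -1 + joined["adj_close"] / joined["adj_close"].shift(1)

# Computing the rolling volatility BEFORE filling lets the NaN rows drop out of the estimate;
# filling first would inject artificial zero-return days and bias the estimate downwards.
vol_before_fill = rets.rolling(3, min_periods=2).std()
filled = joined["adj_close"].ffill().bfill()
vol_after_fill = (-1 + filled / filled.shift(1)).rolling(3, min_periods=2).std()
print(vol_before_fill.iloc[-1], vol_after_fill.iloc[-1])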
Finally, we want to implement the method that gets our portfolio scalar,
1 def get_strat_scaler(self, portfolio_df, lookback, portfolio_vol, idx, default):
2     ann_realized_vol = np.sqrt(portfolio_df.loc[idx - 1, "ewma"] * 252)
3     return portfolio_vol / ann_realized_vol
Listing 7: strat1.py
where the ewma references the forward volatility estimate of our trading strategy’s nominal returns on an
unleveraged portfolio. We discussed this volatility targeting formulation in our previous papers [1]. Recall
that the individual positions are computed as follows:
vol_target = 1 \
/ (self.configs["longsize"] + self.configs["shortsize"]) \
* portfolio_df.loc[i, "capital"] \
* portfolio_vol \
/ np.sqrt(253)
dollar_vol = self.dfs[inst].loc[date, "vol"] * self.dfs[inst].loc[date, "adj_close"]
position = strat_scalar * forecast * vol_target / dollar_vol
The value portfolio_df.loc[i, "capital"] * portfolio_vol / np.sqrt(253) is the volatility in dollar terms
rescaled to daily granularity for the entire portfolio. Hence, the variable vol_target is the volatility in dollar
terms for a single position, which is divided by the asset volatility (also in dollar terms). strat_scalar is then
the portfolio scalar that increases position sizing to make up for the volatility loss attributed to diversification.
We will improve this later to achieve more flexibility in our risk management.
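Putting the pieces together, the per-instrument sizing implied by the code above can be written as (a direct transcription of the listing, with $N$ the total number of long and short slots, $C$ the current capital, $\sigma_{\text{tgt}} = 20\%$ the annualized target, $\sigma_i$ the 30d rolling volatility and $P_i$ the adjusted close):

$$\text{position}_i \;=\; \text{strat\_scalar} \times f_i \times \underbrace{\frac{1}{N}\cdot\frac{C\,\sigma_{\text{tgt}}}{\sqrt{253}}}_{\texttt{vol\_target}} \;\Big/\; \underbrace{\sigma_i\,P_i}_{\texttt{dollar\_vol}}, \qquad f_i \in \{-1, 0, 1\}.$$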
Last but not least, we need to implement the functionality to get our pnl each day:
1 def get_pnl_stats(date, prev, portfolio_df, instruments, idx, historicals, close_col="adj_close"):
2     day_pnl = 0
3     nominal_ret = 0
4     for inst in instruments:
5         units_held = portfolio_df.loc[idx - 1, inst + " units"]
6         if units_held != 0:
7             delta_price = historicals[inst].loc[date, "adj_close"] - historicals[inst].loc[prev, "adj_close"]
8             inst_pnl = units_held * delta_price
9             day_pnl += inst_pnl
10             nominal_ret += portfolio_df.loc[idx - 1, inst + " w"] * historicals[inst].loc[date, "ret"]
11

17     portfolio_df.loc[idx, "ewma"] = 0.06 * (nominal_ret ** 2) + 0.94 * portfolio_df.loc[idx - 1, "ewma"] \
18         if nominal_ret != 0 else portfolio_df.loc[idx - 1, "ewma"]
19     return day_pnl, nominal_ret
Note the line that performs the EWMA volatility forecast, which we use when computing the volatility
scalar. It is computed on unleveraged portfolio returns, hence we need to use the variable nominal_ret. For
a brief discussion on this, refer to the paper on vol & leverage targeting [1].
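For reference, the update in the listing is a standard EWMA variance recursion with decay $\lambda = 0.94$ (matching the 0.06/0.94 weights in the code):

$$\hat{\sigma}^2_t = \lambda\,\hat{\sigma}^2_{t-1} + (1-\lambda)\,r_t^2, \qquad \lambda = 0.94,$$

where $r_t$ is the unleveraged (nominal) portfolio return; the annualized figure used in get_strat_scaler is then $\sqrt{252\,\hat{\sigma}^2_t}$.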
We are done - this is our first strategy. Most of the strategies follow this format: a risk ‘controller’ and
an alpha ‘controller’, with the two interacting to churn out signals. The biggest differences between
trading strategies are how these ‘controllers’ vary in their logic and implementation. We look to
expand on this by introducing some more variants of strategies that we might encounter in the wild.
The first strategy was a cross-sectional, inverse-volatility-risk strategy trading the skewness premia on US
equities. For the next strategy, we want to expand beyond that and trade FX/products based in other currencies. We
consider a very simple signal that goes long the largest drawdowns over a rolling 12-day period and shorts the
smallest drawdowns. In order to do this, we need to implement an FX calculator - let’s add this functionality
to our backtest_utils.
3.3.1 FX-Calculator
This function takes in an instrument (recall that we labelled each instrument with its denomination currency
using the % sign) and returns the value of a single unit of that instrument in the base currency. Corresponding to the
condition:
if instrument[-3:] == "USD":
    return dfs[instrument].loc[date, "adj_close"]
an instrument denominated in ‘USD’, such as AAPL%USD or EURUSD%USD, has a value (in base currency
USD) worth the price itself. The next condition covers an instrument such as EURAUD%AUD, which is worth
EURAUD%AUD * AUDUSD%USD, and the final condition covers the remaining quote convention.
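The implementation itself is not reproduced in this extract; a minimal sketch of a unit_base_value helper consistent with the description above might look like the following (the handling of the two non-USD branches and the exact pair lookups are assumptions):

def unit_base_value(instrument, dfs, date):
    # Sketch: value of one unit of `instrument` expressed in the base currency (USD).
    denom = instrument.split("%")[-1]  # denomination currency from our % labelling
    price = dfs[instrument].loc[date, "adj_close"]
    if denom == "USD":
        return price  # already denominated in the base currency
    if denom + "USD%USD" in dfs:  # e.g. EURAUD%AUD -> multiply by AUDUSD%USD (assumed lookup)
        return price * dfs[denom + "USD%USD"].loc[date, "adj_close"]
    # otherwise the pair is quoted the other way, e.g. EURJPY%JPY -> divide by USDJPY%JPY (assumed)
    return price / dfs["USD" + denom + "%" + denom].loc[date, "adj_close"]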
This affects our calculation of the nominal value of a single unit of a non-USD-denominated product, and
shall be reflected in our code. Accordingly, the ‘dollar volatility’ computation is also affected, and we need to
make the relevant adjustments. Before demonstrating the code right away, we shall also introduce another
variant to our risk allocator - suppose we no longer want to take relative position sizing proportional to 1/vol.
We may want to adopt a risk variant that is agnostic to relative volatilities, such that the a priori ‘risk’ estimate
is a unit constant. Or, we may want to adopt a risk variant that takes positions inversely related to
their volatilities, but not proportionately so - penalising positions that become too large (a result of volatility
compression). Either way, we want to implement position sizing that is inversely related to our estimate
of relative risk (whether that estimate is constant or variable is another issue) - anything else would be a
foolish decision.
In FX (and other) markets, there is often a compression of volatility followed by aggressive movement
(breakouts). The compression of volatility would - in an inverse volatility framework - cause us to position
aggressively in the asset in order to achieve the same amount of relative risk, in volatility terms, with respect to the
other assets. However, this can be risky, as volatility has a mean-reverting component; the state of
affairs can change and a big move in the low-volatility asset may occur. Getting caught wrong-footed in these
positions can be career-ending. These situations can occur naturally in markets, or under the influence of a
third party (such as central bank FX pegging). Let’s do a case study in a two-asset universe, using EURCHF
and USDJPY as our examples.
We see immediately that there are jumps in the EURCHF closing prices, while USDJPY prices have a
smoother path. The Swiss National Bank pegged the Swiss franc to the euro on Sept 6, 2011, and scrapped
the peg on Jan 15, 2015 - the flash crash is evident in the price chart. Let’s take a look at their 30-day rolling
volatilities, and their inverses (since we take positions inversely related to them).
We see there is some correlation, but EURCHF is significantly jumpier - and since the relative sizing scalar
is the ratio of their volatilities, let’s take a look at the ratio of the rolling volatilities:
Figure 4: EURCHF USDJPY ratio of rolling inverse volatilities
where we see that we often have a position in EURCHF that is 5 ∼ 20 times larger than the position
in USDJPY! Recalling the mean-reverting nature of volatility, this position sizing might not be astute, and
in this two-asset scenario the entire strategy pnl distribution is dominated by our EURCHF directional bets
instead of the premia targeted. There is lopsided nominal exposure, even though on a historical volatility
basis we are market neutral.
Additionally, we know that alpha is theoretical while trading costs are real. We want to reduce trading
costs as much as possible, and jumpiness introduces positional change, which is costly. Let’s take a look at
the 1-day delta of the rolling inverse volatilities:
Figure 5: EURCHF USDJPY rolling inverse volatilities deltas
which suggests that from one day to the next we might increase our position in a held asset by up to 400%.
We want to penalise extremely low volatilities, while maintaining relative position sizing based on
perceived risk - consider the order of growth of some monotonic functions that ‘punish’ larger values:
Figure 6: Sqrt and log(base) order of growth.
where we can tune the ‘punish’ factor for large positions by varying the log base. Let’s now suppose
we choose the base 1.3, and re-view the ratio of their rolling log-inverse-volatilities:
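Concretely, the transform can be computed with the change-of-base formula; the snippet below mirrors the loginvvol/invrisk columns used in the scripts later in the paper (wrapping it in a helper function is our own choice):

import numpy as np

def log_inverse_vol(vol_series, base=1.3):
    # log_base(1 / vol) via the change-of-base formula: ln(1 / vol) / ln(base)
    return np.log(1.0 / vol_series) / np.log(base)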
Figure 7: EURCHF USDJPY ratio of rolling log inverse volatilities
which might give a more palatable relative sizing between the two assets, and their delta-1 ratios:
Figure 8: EURCHF USDJPY rolling log inverse volatilities deltas
which gives significantly less position turnover during volatility jumps. This simple exercise leads us
to different variants of relative risk sizing; which behavior the system should have comes down to trader
preference - it is entirely possible that the trader wants aggressive relative allocations based on volatility,
but he should then be aware that it is a ’feature, not a bug’. Note that in real systems it is often
preferable to remove pegged currencies from the universe under consideration entirely, but the same issue
applies to illiquid names, and a punishing component of this kind is a relatively common feature, albeit with
implementation differences.
Summing up the last two sections, with the FX calculator and the custom risk variant, we may want to
adjust the code logic computing position sizing, such that it looks something like:
position = strat_scalar \
    * forecast \
    * vol_target \
    * self.dfs[inst].loc[date, "loginvvol"] \
    / backtest_utils.unit_base_value(instrument=inst, dfs=self.dfs, date=date)
where unit_base_value is the value of one unit of that instrument expressed in base currency terms, and
loginvvol is our revised estimate of risk (or whatever else it might be). However, the adjustments create a
minor complication: our portfolio scalar strat_scalar was computed on unleveraged portfolio returns
and compensates for the volatility loss from diversification. In our log-risk variant the volatility loss is
no longer purely due to diversification, since each position now carries a different scaling factor. We need to
compute the strategy scalar from realised (leveraged) portfolio returns, so we make the following adjustments:
1 def get_strat_scaler(self, portfolio_df, lookback, portfolio_vol, idx, default):
2     ann_realized_vol = np.sqrt(portfolio_df.loc[idx - 1, "ewma"] * 252)
3     return portfolio_vol / ann_realized_vol * portfolio_df.loc[idx - 1, "ewstrat"]  # <<
1 def get_pnl_stats(date, prev, portfolio_df, instruments, idx, historicals, close_col="adj_close"):
2     day_pnl = 0
3     nominal_ret = 0
4     for inst in instruments:
5         units_held = portfolio_df.loc[idx - 1, inst + " units"]
6         if units_held != 0:
7             inst_pnl = units_held * unit_base_value(instrument=inst, dfs=historicals, date=prev) * historicals[inst].loc[date, "ret"]
8             day_pnl += inst_pnl
9             nominal_ret += portfolio_df.loc[idx - 1, inst + " w"] * historicals[inst].loc[date, "ret"]
10
and we replace the relevant parts of our logic wherever the currency conversion should take place.
Note that I have also added statements to the eligibility check (the ret.shift(-1) bounds visible in the listing below).
This is an obvious look-ahead bias, but I have included it for a couple of reasons: a) my backtest is not only meant to
measure exactly how profitable the strategy has been in the past, but also to test whether the mean-reversion effect
exists in isolation of anomalies, and b) it turns out my data source was corrupted and had random numbers sampled
on some days. Obviously, if the strategy were specifically trading tails then this would affect the viability of
my hypothesis checking, but in this case it does not.
We make the relevant changes to account for FX conversions and varying risk estimates, and our script
for strategy 2 looks something like this (note how most of the other code remains the same!):
1 import numpy as np
2 import pandas as pd
3

10 class Alpha():
11

17     def get_trade_datetime_range(self):
18         return (self.configs["start"], self.configs["end"])
19

28             self.dfs[inst]["ret"] = -1 + self.dfs[inst]["adj_close"] / self.dfs[inst].shift(1)["adj_close"]
29             self.dfs[inst]["sampled"] = self.dfs[inst]["adj_close"] != self.dfs[inst].shift(1)["adj_close"]
30             self.dfs[inst]["active"] = self.dfs[inst]["sampled"].rolling(5).apply(lambda x: int(np.any(x))).fillna(0)
31
32             self.dfs[inst]["dd"] = (1 + self.dfs[inst]["ret"]).rolling(window=12).apply(np.prod) / (1 + self.dfs[inst]["ret"]).rolling(window=12).apply(np.prod).cummax() - 1
33             self.dfs[inst]["eligible"] = \
34                 (~np.isnan(self.dfs[inst]["dd"])) \
35                 & self.dfs[inst]["active"] \
36                 & (self.dfs[inst]["vol"] > 0) \
37                 & (self.dfs[inst]["adj_close"] > 0) \
38                 & (self.dfs[inst]["ret"].shift(-1) < 0.30) \
39                 & (self.dfs[inst]["ret"].shift(-1) > -0.30)
40

64             end=self.get_trade_datetime_range()[1],
65             freq="D"
66         )
67
68         """
69         Compute Metas
70         """
71         self.compute_metas(index=trade_datetime_range)
72
73         """
74         Initialisations
75         """
76         portfolio_df = self.init_portfolio_settings(trade_range=trade_datetime_range)
77

84             if i != 0:
85                 date_prev = portfolio_df.loc[i - 1, "datetime"]
86                 day_pnl, nominal_ret = backtest_utils.get_pnl_stats(
87                     date=date,
88                     prev=date_prev,
89                     portfolio_df=portfolio_df,
90                     instruments=self.instruments,
91                     idx=i,
92                     historicals=self.dfs,
93                     close_col="adj_close"
94                 )
95
96             strat_scalar = self.get_strat_scaler(
97                 portfolio_df=portfolio_df,
98                 lookback=30,
99                 portfolio_vol=portfolio_vol,
100                 idx=i,
101                 default=strat_scalar
102             )
103
104             portfolio_df.loc[i, "ewstrat"] = 0.06 * strat_scalar + 0.94 * portfolio_df.loc[i - 1, "ewstrat"] \
105                 if nominal_ret != 0 else portfolio_df.loc[i - 1, "ewstrat"]
106

109             alpha_scores = {}
110             for inst in eligibles:
111                 alpha_scores[inst] = -1 * self.dfs[inst].loc[date, "dd"]
112

121             nominal_tot = 0
122             for inst in eligibles:
123                 forecast = 1 if inst in alphalong else 0
124                 forecast = -1 if inst in alphashort else forecast
125                 vol_target = 1 / (self.configs["longsize"] + self.configs["shortsize"]) \
126                     * portfolio_df.loc[i, "capital"] \
127                     * portfolio_vol \
128                     / np.sqrt(253)
129                 position = strat_scalar * forecast * vol_target * self.dfs[inst].loc[date, "loginvvol"] / backtest_utils.unit_base_value(instrument=inst, dfs=self.dfs, date=date)
130                 portfolio_df.loc[i, inst + " units"] = position
131                 nominal_tot += abs(position * backtest_utils.unit_base_value(instrument=inst, dfs=self.dfs, date=date))
132

144         portfolio_df.to_csv("strat2.csv")
145         return portfolio_df
146
That’s great - now we have two strategies; let’s introduce the third and final one.
In this strategy, we take a time-series momentum strategy and introduce a slight change to the risk
management, to help illustrate some of the variables involved in strategy management. Unlike the other
strategies, we do not know beforehand how many instruments are long/short, or the size of our forecasts. In
essence, we have a forecast that is non-binary (not just 1/-1 or 0) and can take any value within a range. Since we
have generalised the portfolio scalar computation, the volatility targeting component still works as desired even
if we vary our forecast’s range. We define the moving averages ‘f’, ‘m’, ’s’, ’ss’ to mean fast, medium, slow
and super slow, and our alpha computation is as follows:
self.dfs[inst]["smaf"] = self.dfs[inst]["adj_close"].rolling(10).mean()
self.dfs[inst]["smam"] = self.dfs[inst]["adj_close"].rolling(30).mean()
self.dfs[inst]["smas"] = self.dfs[inst]["adj_close"].rolling(100).mean()
self.dfs[inst]["smass"] = self.dfs[inst]["adj_close"].rolling(300).mean()
self.dfs[inst]["alphas"] = 0.0 + \
    (self.dfs[inst]["smaf"] > self.dfs[inst]["smam"]) + \
    (self.dfs[inst]["smaf"] > self.dfs[inst]["smas"]) + \
    (self.dfs[inst]["smaf"] > self.dfs[inst]["smass"]) + \
    (self.dfs[inst]["smam"] > self.dfs[inst]["smas"]) + \
    (self.dfs[inst]["smam"] > self.dfs[inst]["smass"])
where the 0.0 casts the boolean pandas Series objects into float values. Our forecast values are all non-negative,
such that we only go long assets, and there are no cut-offs - we simply trade larger sizes for assets with larger
forecasts and smaller sizes for those with smaller forecasts.
We also add a new component to the risk management - a crash-risk filter - which prevents us from trading when
even ‘pigs are flying’. That is, we only want to trade our proposed positions if the market is not euphoric, and we
do this by setting a filter that says ‘only trade when less than 60% of the assets in our universe possess short-term momentum’.
num_trend = 0
for inst in eligibles:
    num_trend = num_trend + 1 \
        if self.dfs[inst].loc[date, "smaf"] > self.dfs[inst].loc[date, "smam"] \
        else num_trend

if len(eligibles) > 0 and num_trend / len(eligibles) > 0.60:
    for inst in eligibles:
        portfolio_df.loc[i, "{} units".format(inst)] = 0
        portfolio_df.loc[i, "{} w".format(inst)] = 0
    nominal_tot = 0
Note that this could be some other risk controller, such as a cap on leverage or whatever else the trader
can dream of. The test script would look something like this:
1 import numpy as np
2 import pandas as pd
3

11 class Alpha():
12
13     def __init__(self, instruments, dfs, configs):
14         self.instruments = instruments
15         self.dfs = dfs
16         self.configs = configs
17
18     def get_trade_datetime_range(self):
19         return (self.configs["start"], self.configs["end"])
20

33             self.dfs[inst]["smaf"] = self.dfs[inst]["adj_close"].rolling(10).mean()
34             self.dfs[inst]["smam"] = self.dfs[inst]["adj_close"].rolling(30).mean()
35             self.dfs[inst]["smas"] = self.dfs[inst]["adj_close"].rolling(100).mean()
36             self.dfs[inst]["smass"] = self.dfs[inst]["adj_close"].rolling(300).mean()
37             self.dfs[inst]["alphas"] = 0.0 + \
38                 (self.dfs[inst]["smaf"] > self.dfs[inst]["smam"]) + \
39                 (self.dfs[inst]["smaf"] > self.dfs[inst]["smas"]) + \
40                 (self.dfs[inst]["smaf"] > self.dfs[inst]["smass"]) + \
41                 (self.dfs[inst]["smam"] > self.dfs[inst]["smas"]) + \
42                 (self.dfs[inst]["smam"] > self.dfs[inst]["smass"])
43

52     def init_portfolio_settings(self, trade_range):
53         portfolio_df = pd.DataFrame(index=trade_range).reset_index().rename(columns={"index": "datetime"})
54         portfolio_df.loc[0, "capital"] = 10000
55         portfolio_df.loc[0, "ewma"] = 0.001
56         portfolio_df.loc[0, "ewstrat"] = 1
57         return portfolio_df
58

79         """
80         Compute Metas
81         """
82         self.compute_metas(index=trade_datetime_range)
83
84         """
85         Initialisations
86         """
87         portfolio_df = self.init_portfolio_settings(trade_range=trade_datetime_range)
88
92

95             if i != 0:
96                 date_prev = portfolio_df.loc[i - 1, "datetime"]
97                 day_pnl, nominal_ret = backtest_utils.get_pnl_stats(
98                     date=date,
99                     prev=date_prev,
100                     portfolio_df=portfolio_df,
101                     instruments=self.instruments,
102                     idx=i,
103                     historicals=self.dfs,
104                     close_col="adj_close"
105                 )
106

115             portfolio_df.loc[i, "ewstrat"] = 0.06 * strat_scalar + 0.94 * portfolio_df.loc[i - 1, "ewstrat"] \
116                 if nominal_ret != 0 else portfolio_df.loc[i - 1, "ewstrat"]
117

120             forecasts = {}
121             for inst in eligibles:
122                 forecasts[inst] = self.dfs[inst].loc[date, "alphas"]
123

128             nominal_tot = 0
129             for inst in eligibles:
130                 forecast = forecasts[inst]
131                 vol_target = 1 / (len(eligibles)) \
132                     * portfolio_df.loc[i, "capital"] \
133                     * portfolio_vol \
134                     / np.sqrt(253)
135                 position = strat_scalar * forecast * vol_target * self.dfs[inst].loc[date, "loginvvol"] / backtest_utils.unit_base_value(instrument=inst, dfs=self.dfs, date=date)
136                 portfolio_df.loc[i, inst + " units"] = position
137                 nominal_tot += abs(position * backtest_utils.unit_base_value(instrument=inst, dfs=self.dfs, date=date))
138

145             num_trend = 0
146             for inst in eligibles:
147                 num_trend = num_trend + 1 if self.dfs[inst].loc[date, "smaf"] > self.dfs[inst].loc[date, "smam"] else num_trend
148             if len(eligibles) > 0 and num_trend / len(eligibles) > 0.60:
149                 for inst in eligibles:
150                     portfolio_df.loc[i, "{} units".format(inst)] = 0
151                     portfolio_df.loc[i, "{} w".format(inst)] = 0
152                 nominal_tot = 0
153

172         }
173     )
174     portfolio_df = alpha.run_simulation()
Note that we ‘roll forward’ volatility estimates when our strategy takes no trades after filtering. This
prevents the volatility estimate from ‘decaying’ and causing unnecessarily large positions when no trades have been
taken for a long time. This is akin to risk management in event-driven trades. Notably, we would not
achieve the annualized target volatility measured over calendar days, but we would achieve it when measured over the
number of actively traded days. Accordingly, the Sharpe computations filter out inactive days, since
we assume no risk on those days.
If you have understood the three strategies, we are now done! Congratulations - we will now focus on
improving this set of grossly inefficient scripts by adopting better design choices and analysing their CPU
loads.
4 Benchmarking
Before we make any rash decisions and start rewriting chunks of code, it may first be wise to ensure that we
have a way of checking that the rewritten code performs the same logic as our initial proposal - and that it
is in fact getting better without introducing errors. At this point, you MUST have some form of version control. Using
git, committing (or tagging) a snapshot would create a version of our current code, and we can later reference that
version by checking it out (a sketch follows below)
- do remember that the tools provided by git are tremendously useful to any programmer - we urge readers
who have not yet adopted version control in their development to check it out.
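The exact commands are not reproduced in this extract; the workflow meant here is presumably along these lines (the tag name v0-baseline is an arbitrary choice):

Command>>
git add -A && git commit -m "baseline before optimization"
git tag v0-baseline
# ...later, to revisit or compare against that version:
git checkout v0-baseline
git diff v0-baseline -- strat3.py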
You should also adopt some form of testing to make sure that the system performs as expected. Tests can be
implemented in a very granular fashion, such as unit testing, or as higher-level, lower-overhead (but
reduced-safety) integration testing. Here I again urge readers to check out the literature on software testing,
and arrive at an approach that suits your profile. You can even make testing the controller of your development cycle,
as in TDD (test-driven development).
We do a super light-weight (virtually non-existent) integration test here: we verify that the
modifications performed by the iterative improvements are correct by comparing the terminal
wealth of our backtests pre and post change (note that this should agree to at least a fair number of
decimal places).
In order to compare the efficiency of their CPU usage, we also time our functions - we can create a timer in
dirtylib as follows:
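The timer itself is not shown in this extract; a minimal sketch consistent with how it is used below (the wrapper name measure_time matches the reference in the profiling section; the print format is an assumption):

import time
from functools import wraps

def timeme(func):
    # Print the wall-clock time taken by each call to the wrapped function.
    @wraps(func)
    def measure_time(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print(func.__name__, "took", time.time() - start, "seconds")
        return result
    return measure_time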
We import this into the stratx.py files and add the simple decorator as follows (here is a great writeup
on decorators: link, Real Python):
1 @timeme
2 def run_simulation(self):
3     """
4     Settings
5     """
6     portfolio_vol = 0.20
7     trade_datetime_range = pd.date_range(
8         start=self.get_trade_datetime_range()[0],
9         end=self.get_trade_datetime_range()[1],
10         freq="D"
11     )
12     ...
13     ...
and we should get the time taken by run_simulation printed every time we run our script.
The runtime depends on the size of the asset universe and the number of assets under consideration
for trading each day. Let’s go ahead and add the decorators to our three strategies, fix their configurations
and run the scripts again to get their terminal wealth.
First, we give a preview of our results and what we will be rewarded with in the remainder of our work (runtimes in seconds):

Version     Strat 1   Strat 2   Strat 3
Original    116       -         81
Improved    10        -         5
4.3 Strat1
Note that we have already changed strat1's strat_scalar to the generalised version, taking into consideration both
the exponentially weighted historical strategy scalars and the volatility of leveraged portfolio returns.
Terminal Wealth:
10269.606843
When we are profiling, to save time we will use a reduced problem set with the following configurations:
Terminal Wealth:
3179.454924
4.4 Strat2
"shortsize": 20
}
)
Terminal Wealth:
43342.458658
4.5 Strat3
As with strategy 1, to save time we use the reduced problem set in our code profiling:
}
)
Terminal Wealth:
19242.269661
We want to modify our code structure so that it is easier to maintain. A noticeable feature of our trading scripts
is that they share a lot of common functionality - they mostly differ in the computation and management
of alpha signals; the rest is the same. This is well suited to a Parent-Child relationship in Object-Oriented
Programming. We refer the details to other literature, and show a simple example for demonstration.
Here we consider a simple example: we have a parent who, when he meets someone, goes ‘yo’. His
son is cooler, so the son goes ‘yo wassup’ instead.
1 class Parent () :
2
6 def yo ( self ) :
7 print ( " yo " )
8
17 def yo ( self ) :
18 print ( " yo wassup " )
19
20 son = Son ()
21 son . yo () # yo wassup
22 son . testing () # yo wassup
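The elided lines of the listing are not shown in this extract; a minimal runnable reconstruction consistent with the printed comments is given below (defining testing on Parent so that it calls self.yo() is an assumption, but it is the usual way to obtain the ‘yo wassup’ output via method overriding):

class Parent():

    def yo(self):
        print("yo")

    def testing(self):
        # assumed: defined only on Parent; self.yo() dispatches to the child's override
        self.yo()

class Son(Parent):

    def yo(self):
        print("yo wassup")

son = Son()
son.yo()        # yo wassup
son.testing()   # yo wassup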
We see how the call trace works in this example: super() refers to the parent class Parent, and the child
Son overrides (see the literature on method overriding) the function ‘yo’, so when calling ‘yo’ on a
Son object we use Son’s implementation. For methods that are not overridden, we use the parent implementation,
found by walking up the inherited classes and searching for the implemented functionality. We shall restructure our
code to ‘collect’ similar features in a generic parent Alpha class to make our code more manageable. Our
parent class should have enough flexibility to encompass the logic of all three strategies. For instance, since
we now need to accommodate both cross-sectional strategies trading a fixed number of longs/shorts and time-series
strategies with a dynamic number of traded names, we should introduce a variable num_trading (the number of names
currently traded) to divide our portfolio capital.
Additionally, like we did in our simple example, we want to implement ‘default’ functionality in the
parent class.
1 import numpy as np
2 import pandas as pd
3

11 class Alpha():
12

19     def get_trade_datetime_range(self):
20         return (self.configs["start"], self.configs["end"])
21

29             self.dfs[inst]["vol"] = (-1 + self.dfs[inst]["adj_close"] / self.dfs[inst].shift(1)["adj_close"]).rolling(30).std()
30             self.dfs[inst] = df.join(self.dfs[inst])
31             self.dfs[inst] = self.dfs[inst].fillna(method="ffill").fillna(method="bfill")
32             self.dfs[inst]["ret"] = -1 + \
33                 self.dfs[inst]["adj_close"] / self.dfs[inst].shift(1)["adj_close"]
34             self.dfs[inst]["sampled"] = self.dfs[inst]["adj_close"] \
35                 != self.dfs[inst].shift(1)["adj_close"]
36             self.dfs[inst]["active"] = self.dfs[inst]["sampled"] \
37                 .rolling(5) \
38                 .apply(lambda x: int(np.any(x))) \
39                 .fillna(0)
40         pass
41

68             vol_target = 1 / num_trading \
69                 * self.portfolio_df.loc[idx, "capital"] \
70                 * portfolio_vol \
71                 / np.sqrt(253)
72             position = strat_scalar * \
73                 forecast * \
74                 vol_target * \
75                 self.dfs[inst].loc[date, "invrisk"] / \
76                 backtest_utils.unit_base_value(instrument=inst, dfs=self.dfs, date=date)
77             self.portfolio_df.loc[idx, inst + " units"] = position
78             nominal_tot += abs(position * backtest_utils.unit_base_value(instrument=inst, dfs=self.dfs, date=date))
79         return nominal_tot
80

88     @timeme
89     def run_simulation(self):
90         """
91         Settings
92         """
93         portfolio_vol = 0.20
94         trade_datetime_range = pd.date_range(
95             start=self.get_trade_datetime_range()[0],
96             end=self.get_trade_datetime_range()[1],
97             freq="D"
98         )
99
100         """
101         Compute Metas
102         """
103         self.compute_metas(index=trade_datetime_range)
104
105         """
106         Initialisations
107         """
108         self.portfolio_df = self.init_portfolio_settings(trade_range=trade_datetime_range)
109

116             if i != 0:
117                 date_prev = self.portfolio_df.loc[i - 1, "datetime"]
118                 day_pnl, nominal_ret = backtest_utils.get_pnl_stats(
119                     date=date,
120                     prev=date_prev,
121                     portfolio_df=self.portfolio_df,
122                     instruments=self.instruments,
123                     idx=i,
124                     historicals=self.dfs,
125                     close_col="adj_close"
126                 )
127

152                 index=i, date=date, nominal_tot=nominal_tot)
153
Essentially, we have ‘factored out’ common functionality and generalised some features. Our Alpha
class asks that any child inheriting it implements the eligibility check and the inverse
risk estimate for each asset. Additionally, the child needs to implement the compute_forecasts functionality, which
is the alpha signal generator for that strategy. If the strategy requires additional information to
calculate the alpha signals, we can do the calculation in the compute_metas method and assign it to
object variables. We implement default post-risk management, which is to do nothing; unique risk
management is achieved by overriding the parent’s method.
5.2 Strat1
Our strategy implementations are now significantly simplified. We show them as follows:
1 import numpy as np
2 import pandas as pd
3

20             .rolling(61) \
21             .apply(lambda x: ~np.any(np.isnan(x))) \
22             .fillna(0)
23         super().compute_metas(index)
24         retagg = pd.DataFrame(index=index)
25         for inst in self.instruments:
26             print(inst)
27             self.dfs[inst]["invrisk"] = 1 / self.dfs[inst]["vol"]
28             self.dfs[inst]["eligible"] = \
29                 self.dfs[inst]["alive"].astype(int) & \
30                 self.dfs[inst]["active"].astype(int) & \
31                 (self.dfs[inst]["vol"] > 0).astype(int) & \
32                 (self.dfs[inst]["adj_close"] > 0).astype(int)
33

54         alpha_scores = {
55             k: v for k, v in sorted(alpha_scores.items(), key=lambda pair: pair[1])
56         }
57         alphalong = list(alpha_scores.keys())[-self.configs["longsize"]:]
58         alphashort = list(alpha_scores.keys())[:self.configs["shortsize"]]
59
60         forecasts = {}
61         for inst in eligibles:
62             forecast = 1 if inst in alphalong else 0
63             forecast = -1 if inst in alphashort else forecast
64             forecasts[inst] = forecast
65
66         return forecasts, self.configs["longsize"] + self.configs["shortsize"]
This is much cleaner than before - if we want to create a new strategy, we do not have to repeat the same
∼ 150 or so lines of code. We just need to implement the two pieces of functionality that are characteristic of the
strategy itself.
5.3 Strat2
1 import numpy as np
2 import pandas as pd
3

29             & (self.dfs[inst]["adj_close"] > 0) \
30             & (self.dfs[inst]["ret"].shift(-1) < 0.30) \
31             & (self.dfs[inst]["ret"].shift(-1) > -0.30)
32

38         alpha_scores = {
39             k: v for k, v in sorted(alpha_scores.items(), key=lambda pair: pair[1])
40         }
41         alphalong = list(alpha_scores.keys())[-self.configs["longsize"]:]
42         alphashort = list(alpha_scores.keys())[:self.configs["shortsize"]]
43
44         forecasts = {}
45         for inst in eligibles:
46             forecast = 1 if inst in alphalong else 0
47             forecast = -1 if inst in alphashort else forecast
48             forecasts[inst] = forecast
49
50         return forecasts, self.configs["longsize"] + self.configs["shortsize"]
5.4 Strat3
And now suppose we want unique behavior in risk management (such as the crash-risk-factor filtering); we
can override the parent method, which will then be called from the parent’s run_simulation function instead
of the default implementation.
1 import numpy as np
2 import pandas as pd
3

56             self.portfolio_df.loc[index, "{} units".format(inst)] = 0
57             self.portfolio_df.loc[index, "{} w".format(inst)] = 0
58         nominal_tot = 0
59         return nominal_tot
Now, this really helps to make optimizing our code easier - optimizations that we make to the Alpha class
are applied to all of our strategies, and we do not need to go into the individual files to make the changes.
Additionally, since the functionality in the Alpha class is the most commonly called, perhaps it is wise to focus
our optimizations on making the Alpha class CPU-efficient first?
Now that we have cleaned up the code, we can verify that we have not ‘disturbed’ the logic by running
the simulations again and checking that we end up with the same terminal wealth. We did this, and we are all good to
go. Let’s introduce another toolset: suppose you have generated two output files, say ‘strat1_prechange.csv’
and ’strat1_postchange.csv’. You can test whether the output files generated by the new code differ
by running the following in the terminal (fc on Windows instead of diff):
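The exact invocation is not shown; it is presumably along the lines of:

Command>>
diff strat1_prechange.csv strat1_postchange.csv

(or fc strat1_prechange.csv strat1_postchange.csv on Windows)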
and this should give no outputs - meaning each line is the same!
There are two versions of reality - one that is fake (the one that we perceive) and one that is real (the one
that is). We may hypothesize about which lines or logic in our code cause the inefficiencies, but
we will not know for real what is actually making the code inefficient without quantitative probes.
We are, after all, systematic traders. We want to systematically improve our code efficiency too, hierarchically
from the top-level view down to the granular. We also want to ‘attack’ the biggest pain points, so
that we get the most bang for our buck in terms of effort relative to efficiency gain.
In this section, we will demonstrate some effective tools for taking a deeper look at our code. We
arbitrarily choose Strat3 to start profiling and reducing efficiency bottlenecks.
To get a general view of the run-time, we can use the unix time command:
Command>>
/usr/bin/time --verbose -o temp.txt python3 strat3.py
Out>>
Command being timed: "python3 strat3.py"
User time (seconds): 82.18
System time (seconds): 1.57
Percent of CPU this job got: 101%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:22.29
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 204588
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 54753
Voluntary context switches: 156
Involuntary context switches: 57484
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Minor page faults are caused when a program accesses new memory pages. Memory allocation by the kernel is lazy,
so the first access to a page forces the kernel to pause our logic to map it in. This can slow a
program down substantially, since we also lose the data held in the CPU caches and registers (we discuss this later!).
We see no major page faults, and the CPU was working (albeit inefficiently) at maximum. Note that the
’-o’ flag specifies the output file the statistics are written to. We can use the statistics generated by this
command to gauge how well our program is using the CPU, whether we are idling on I/O, whether we are doing
costly disk seeks, and whether we are using our cores efficiently. Readers can look up what each of these statistics means,
but we will move on to another tool that allows us to profile our function calls’ resource usage.
6.1 Profiling, Iterating, Rewriting...
Command:
python3 -m cProfile -o temp.txt strat3.py
...
import pstats
p = pstats.Stats("temp.txt")
p.strip_dirs().sort_stats("cumulative").print_stats(20)
We use the cProfile tool, a profiler for our Python programs. We write the output to the temp.txt file. After
generating the statistics, we use the pstats module to parse the statistics file and generate useful results. We
want to see the 20 functions in our script that took the most time.
ncalls   tottime  percall  cumtime  percall  filename:lineno(function)
7306     0.023    0.000    25.906   0.004    alpha.py:51(compute_eligibles)
1        0.004    0.004    25.514   25.514   strat3.py:18(compute_metas)
1        0.003    0.003    25.214   25.214   alpha.py:23(compute_metas)
180      0.001    0.000    25.016   0.139    rolling.py:529(_apply)
180      0.001    0.000    25.014   0.139    rolling.py:434(_apply_blockwise)
180      0.001    0.000    25.013   0.139    rolling.py:415(_apply_series)
The first four lines of the output can be ignored - they are just the entry point to our call on run_simulation:
the measure_time wrapper wraps our simulation function, and the rest are built-in Python functions invoked when
kicking off our process.
Interestingly, we see that a total of 101.260 seconds was spent inside the __getitem__ function in pandas'
indexing.py. Even though each call rounds to 0.000 seconds, we repeat it 1765412 times. This is
followed by the __setitem__ function - what could possibly be causing this?
p.print_callers(20, "__getitem__")
and we get
100588 0.524 5.799 strat3.py:42(compute_forecasts)
201176 1.033 11.272 strat3.py:48(post_risk_management)
We see which parts of our logic were calling these functions (ignore the line annotations; they do
not correspond to the line labelling in our code sections). We see that most of the time was spent inside the
unit_base_value function in backtest_utils.py. Let’s recall its implementation:
There should not be anything particularly taxing about the logic here, but it turns out that the
dataframe.loc[a, b] operation is expensive!
Referring to the pandas documentation [2], we should access single scalar values with the ‘.at’ accessor instead of
‘.loc’, which is meant for rows and slices. For more interested readers, we again refer to other literature; here is an
interesting discussion: (link: stackoverflow). Now, let’s go ahead and change all the ‘.loc’ operations where we
access or write scalars to the ‘.at’ accessor. Rerunning the operation, we get:
down from 81.3337390422821 seconds! That’s almost half of the runtime shaved off by a quick fix found from
profiling our code. Let’s rerun the profiling with the commands:
&&
import pstats
p = pstats.Stats("temp.txt")
p.strip_dirs().sort_stats("cumulative").print_stats(20)
we see that the cumulative time spent accessing dataframe cell values has decreased significantly. Before
we do our victory lap - there is much more work to be done.
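To make the swap concrete, here is a minimal, self-contained sketch of the scalar-access change described above (illustrative only; absolute timings will vary by machine):

import time
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3000, 5), columns=list("abcde"))

start = time.time()
for i in range(len(df)):
    _ = df.loc[i, "c"]   # label-based scalar access through .loc
loc_time = time.time() - start

start = time.time()
for i in range(len(df)):
    _ = df.at[i, "c"]    # .at is optimised for single scalar access
at_time = time.time() - start

print(".loc: %.3fs, .at: %.3fs" % (loc_time, at_time))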
6.1.3 Note on Profilers
We will note that as we add increasingly granular profilers, our timer on run_simulation will report longer
and longer runtimes. This is because it also measures the time taken for the profiler itself to
do its work - we can remove the profilers later and use the original benchmark to check that we are going in the right
direction!
For more on graphical visualisation of cProfile’s statistics file, see snakeviz. (link : snakeviz)
The previous table, if anything, demonstrated that by switching a scalar access that is called many times
to a more efficient method, we can reduce our runtimes - but the next target on the table lies in the
compute_metas function. In fact, we see that strat3's compute_metas took 18.576 seconds, but that was
only marginally greater than the time taken in the compute_metas of alpha.py, which suggests that
most of the bottleneck sits in the parent compute_metas function.
Let’s verify this. Luckily, as we mentioned, Python has an amazing community of developers; one tool,
line_profiler, written by Robert Kern, allows us to profile each individual line. Let’s install the line_profiler module with:
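(The install command itself is not shown in this extract; it is presumably the usual pip invocation.)

Command>>
pip install line_profiler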
In order to tell line_profiler which function to profile, we decorate it with ‘@profile’. It should look
as simple as this:
1 @profile
2 def compute_metas ( self , index ) :
3 super () . compute_metas ( index )
4 for inst in self . instruments :
5 print ( inst )
6 self . dfs [ inst ][ " invrisk " ] = np . log (1 / self . dfs [ inst ][ " vol " ]) / np . log (1.3)
7 self . dfs [ inst ][ " smaf " ] = self . dfs [ inst ][ " adj_close " ]. rolling (10) . mean ()
8 self . dfs [ inst ][ " smam " ] = self . dfs [ inst ][ " adj_close " ]. rolling (30) . mean ()
9 self . dfs [ inst ][ " smas " ] = self . dfs [ inst ][ " adj_close " ]. rolling (100) . mean ()
10 self . dfs [ inst ][ " smass " ] = self . dfs [ inst ][ " adj_close " ]. rolling (300) . mean ()
11 self . dfs [ inst ][ " alphas " ] = 0.0 + \
12 ( self . dfs [ inst ][ " smaf " ] > self . dfs [ inst ][ " smam " ]) + \
13 ( self . dfs [ inst ][ " smaf " ] > self . dfs [ inst ][ " smas " ]) + \
14 ( self . dfs [ inst ][ " smaf " ] > self . dfs [ inst ][ " smass " ]) + \
15 ( self . dfs [ inst ][ " smam " ] > self . dfs [ inst ][ " smas " ]) + \
16 ( self . dfs [ inst ][ " smam " ] > self . dfs [ inst ][ " smass " ])
17
18 self . dfs [ inst ][ " eligible " ] = \
19 (~ np . isnan ( self . dfs [ inst ][ " smass " ]) ) \
20 & self . dfs [ inst ][ " active " ] \
21 & ( self . dfs [ inst ][ " vol " ] > 0) \
22 & ( self . dfs [ inst ][ " adj_close " ] > 0) \
23 & ( self . dfs [ inst ][ " ret " ]. shift ( -1) < 0.30) \
24 & ( self . dfs [ inst ][ " ret " ]. shift ( -1) > -0.30)
When running the script, we invoke it through kernprof, where ‘-l’ is the line-by-line flag and ‘-v’ is a
verbosity flag for printing to console:
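(The command line is not reproduced here; given the flags described, it is presumably:)

Command>>
kernprof -l -v strat3.py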
30 30 8550.0 285.0 0.0 (self.dfs[inst]["smaf"] > self.dfs[inst
31 30 8152.0 271.7 0.0 (self.dfs[inst]["smaf"] > self.dfs[inst
32 30 6436.0 214.5 0.0 (self.dfs[inst]["smam"] > self.dfs[inst
33 30 6382.0 212.7 0.0 (self.dfs[inst]["smam"] > self.dfs[inst
34
35 30 16898.0 563.3 0.1 self.dfs[inst]["eligible"] = \
36 180 36314.0 201.7 0.2 (~np.isnan(self.dfs[inst]["smass"])) \
37 30 2552.0 85.1 0.0 & self.dfs[inst]["active"] \
38 30 6456.0 215.2 0.0 & (self.dfs[inst]["vol"] > 0) \
39 30 6031.0 201.0 0.0 & (self.dfs[inst]["adj_close"] > 0) \
40 30 12133.0 404.4 0.1 & (self.dfs[inst]["ret"].shift(-1) < 0.
41 30 10065.0 335.5 0.0 & (self.dfs[inst]["ret"].shift(-1) > -0
Note that we have intentionally let the output run off the page margins - you already know how these lines
are implemented anyway. We verify that 98% of the time was in fact spent in the parent compute_metas, so let’s
profile that instead by moving the @profile decorator to the parent file. We get the following output:
33 30 22129.0 737.6 0.1 self.dfs[inst] = self.dfs[inst].fillna(method="
34 30 40410.0 1347.0 0.2 self.dfs[inst]["ret"] = -1 + self.dfs[inst]["ad
35 30 35012.0 1167.1 0.2 self.dfs[inst]["sampled"] = self.dfs[inst]["adj
36 30 23116166.0 770538.9 99.1 self.dfs[inst]["active"] = self.dfs[inst]["samp
37 1 1.0 1.0 0.0 pass
It turns out that 99% of the time in this function was spent on the line

self.dfs[inst]["active"] = self.dfs[inst]["sampled"] \
    .rolling(5) \
    .apply(lambda x: int(np.any(x))) \
    .fillna(0)
That's interesting - we might not have guessed that just by eyeballing the code, but the quantitative probe
makes it obvious what slowed us down. Had we spent hours making all the other lines virtually 'instantaneous',
we would have reduced the run time by at most 1%!
For more advanced programmers: we can inspect the bytecode instructions the Python interpreter executes by
disassembling functions with the dis module. A good rule of thumb is that shorter bytecode corresponds to
faster runtimes. Let's take a simple look:
import numpy as np
import dis
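The function bodies being disassembled were not reproduced above; judging from the bytecode that follows, they would have looked roughly like this sketch (function names are illustrative):

def sum_inplace(arr):
    sum = 0
    for i in arr:
        sum += i          # compiles to INPLACE_ADD

def sum_binary(arr):
    sum = 0
    for i in arr:
        sum = sum + i     # compiles to BINARY_ADD

def sum_numpy(arr):
    sum = np.sum(arr)     # a single method call into numpy's compiled routine

dis.dis(sum_inplace)
dis.dis(sum_binary)
dis.dis(sum_numpy)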
13 0 LOAD_CONST 1 (0)
2 STORE_FAST 1 (sum)
14 4 LOAD_FAST 0 (arr)
6 GET_ITER
>> 8 FOR_ITER 12 (to 22)
10 STORE_FAST 2 (i)
15 12 LOAD_FAST 1 (sum)
14 LOAD_FAST 2 (i)
16 INPLACE_ADD
18 STORE_FAST 1 (sum)
20 JUMP_ABSOLUTE 8
>> 22 LOAD_CONST 0 (None)
24 RETURN_VALUE
------------------------------------------------------
18 0 LOAD_CONST 1 (0)
2 STORE_FAST 1 (sum)
19 4 LOAD_FAST 0 (arr)
6 GET_ITER
>> 8 FOR_ITER 12 (to 22)
10 STORE_FAST 2 (i)
20 12 LOAD_FAST 1 (sum)
14 LOAD_FAST 2 (i)
16 BINARY_ADD
18 STORE_FAST 1 (sum)
20 JUMP_ABSOLUTE 8
>> 22 LOAD_CONST 0 (None)
24 RETURN_VALUE
------------------------------------------------------
23 0 LOAD_GLOBAL 0 (np)
2 LOAD_METHOD 1 (sum)
4 LOAD_FAST 0 (arr)
6 CALL_METHOD 1
8 STORE_FAST 1 (sum)
10 LOAD_CONST 0 (None)
12 RETURN_VALUE
In other cases this is not as useful - the bytecodes are identical even though we know the two functions
behave differently at runtime:
import dis
import pandas as pd

def func1():
    df = pd.DataFrame()
    df.loc[0, "hi"]

def func2():
    df = pd.DataFrame()
    df.at[0, "hi"]
13 0 LOAD_GLOBAL 0 (pd)
2 LOAD_METHOD 1 (DataFrame)
4 CALL_METHOD 0
6 STORE_FAST 0 (df)
14 8 LOAD_FAST 0 (df)
10 LOAD_ATTR 2 (loc)
12 LOAD_CONST 1 ((0, ’hi’))
14 BINARY_SUBSCR
16 POP_TOP
18 LOAD_CONST 0 (None)
20 RETURN_VALUE
---------------------------------------------------------------
16 0 LOAD_GLOBAL 0 (pd)
2 LOAD_METHOD 1 (DataFrame)
4 CALL_METHOD 0
6 STORE_FAST 0 (df)
17 8 LOAD_FAST 0 (df)
10 LOAD_ATTR 2 (at)
12 LOAD_CONST 1 ((0, ’hi’))
14 BINARY_SUBSCR
16 POP_TOP
18 LOAD_CONST 0 (None)
20 RETURN_VALUE
Numba is a just-in-time (JIT) compiler designed to compile numpy-oriented code via the LLVM compiler at
runtime. Unlike 'C', this is done on the fly, so the compilation cycle is hidden from the programmer behind
simple decorators that signal to Numba that we want a function compiled. Ahead-of-time (AOT) compilers, as
opposed to JIT, compile our source code into binary machine code before runtime - a famous example is the gcc
compiler. A static binary is created to run on your machine and usually gives the best speedups, but every
code change requires recompiling to produce a new binary. This is quite a hassle! JIT compiles automatically
at runtime - meaning our code runs slower on the first pass due to compilation overhead - but requires little
to no effort on the developer's part. Note that numba decorators cannot just be slapped on for a 'quick win'!
It is important to think about where the speedup is coming from.
Numba allows us to do things like releasing the GIL and static type inference. This means that computations
with many intermediate variables (especially in loops!) benefit the most from compilation. We introduce
these concepts in the later section on computing and memory units, Section 8. Vectorized logic with few
intermediate variables does not benefit as much. The speedup of vectorized numpy code under numba can be
attributed to a number of things - memory contiguity, data bus throughput, vectorized (SIMD) instructions,
CPU caching, compilation and GIL unlocking. These are separate (but related) concepts that improve the
efficiency of our code, and we should be aware of when each kicks in.
Let's make a jitted function here, using the argument nopython=True to indicate that we want to compile in
nopython mode. Functions with this decorator throw an Exception when the code accesses Python objects such
as lists or dictionaries, as opposed to standard jitted functions, which silently fall back to Python object
mode when the types are incompatible with compilation. Note that we can use inspect_types to see the datatype
of each object if we cannot get Numba to compile our jitted code in nopython mode, as well as the error
messages thrown. We can use assert statements to check at runtime that the new functionality and the old
logic are equivalent, and we can later disable asserts at runtime using interpreter flags. Our demonstration is
just a peek into Numba's suite - we encourage readers to take a deeper dive on their own. Let's take a look:
23     # assert(self.dfs[inst]["sampled"].rolling(5).apply(numba_any, engine="numba", raw=True).equals(
24     #     self.dfs[inst]["sampled"].rolling(5).apply(lambda x: int(np.any(x)))))
25
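The jitted helper referenced in the commented assertion is not shown in full at this point; a minimal sketch consistent with its usage in the listings is:

import numpy as np

def numba_any(x):
    # x arrives as a raw numpy window because of raw=True; with engine="numba",
    # pandas JIT-compiles this function through Numba
    return int(np.any(x))

# usage, as in the listing:
# df["sampled"].rolling(5).apply(numba_any, engine="numba", raw=True).fillna(0)

For standalone use it can equivalently be decorated with @jit(nopython=True), as done in a later listing.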
Re-running the line profiler on the modified function (output truncated here), we see that the time spent in
the rolling call has dropped from 99% to 87.4% of the function's runtime, bringing us down from
42.09293055534363 seconds by making use of Numba's just-in-time compilation. Although it still takes up the
majority of the time in compute_metas, the rolling apply is an inherently expensive operation and we have
managed to speed it up considerably. Running cProfile again, we get the output
1258912 5.583 0.000 7.213 0.000 datetimes.py:700(_maybe_cast_for_get_loc)
361372 0.601 0.000 6.628 0.000 indexing.py:2277(__setitem__)
3653 0.360 0.000 6.376 0.002 alpha.py:87(set_weights)
7306 0.256 0.000 6.292 0.001 alpha.py:62(<listcomp>)
7306 0.232 0.000 5.960 0.001 alpha.py:63(<listcomp>)
361372 1.325 0.000 5.507 0.000 indexing.py:2228(__setitem__)
3652 0.339 0.000 5.338 0.001 backtest_utils.py:71(get_pnl_stats)
1261203 1.112 0.000 4.748 0.000 base.py:3585(get_loc)
The function compute_metas has now been relegated out of the most 'demanding' spots for our CPU attention -
we can shift our focus to speeding up other parts of the code logic. Let's rerun this on the original problem
set to make sure that our edits have been sound and effective: the terminal wealth matches, and the run time
compares favourably against the original 1249.9761579036713 seconds recorded in Section 4.5. We have obtained
roughly a 300% speedup.
Applying the same jit technique to our strategy 2 in the drawdown computations with:
self.dfs[inst]["dd"] = \
(1 + self.dfs[inst]["ret"]).rolling(window=12).apply(np.prod, raw=True, engine="numba") \
/ (1 + self.dfs[inst]["ret"]).rolling(window=12).apply(np.prod, raw=True, engine="numba").cummax() \
- 1
We have effectively shown how to first profile the overall CPU requirements of the different functions in a
script run, and then see which functions demand our attention. We may optimize these functions either by
making them run more efficiently or by calling them less often altogether. Next we see how we might address
the pain points by reducing function calls using vectorized operations.
8 Vector Operations, CPUs and Memory Fragmentation
Programming in Python carefully hides memory management from us. We do not need to know where the variable
x = 1 is stored in memory, how it is retrieved, or how it is edited. Is storing 1 and 1.0 any different?
Where is it stored? Let's briefly introduce the memory interactions - our script runs thanks to the CPU
performing mathematical operations on data retrieved from memory, which can be thought of as consisting of
RAM and the hard drive. There are data buses that transfer data from the memory units to the computing
units, but since transferring data over a slow bus is costly, the CPU itself has cache registers that store
data closer to the actual 'calculator' that does things like flip bits to give us mathematical outputs.
The computing unit's speed depends on how many instructions it can run per cycle (IPC) and on how many such
cycles run in a second (clock speed, measured in Hz). A single instruction can operate on a single data unit,
or on multiple data units at once (vectorization), without extra overhead! As transistor technology becomes
increasingly complex, advances in CPU design have become slower and slower (refer to the debate on Moore's
law for interested readers). These units run machine code instructions, which are assembled from assembly
code, which is in turn compiled from high-level languages such as 'C'. Languages can be interpreted and run
on virtual machines, like Python, take the full compilation life cycle, like 'C', or a mix of both, like
Java. Compilers and modern CPUs perform smart optimizations such as branch prediction, out-of-order execution
and prefetching data they predict our program will need into registers, so that it is already in cache when
needed.
There are non-trivial difficulties when multiple processes share the same memory units - cache registers
need to be mapped to the process they are serving, and two processes writing to the same memory on hardware
can corrupt each other. Some of this is handled by the kernel, and the rest needs to be handled by the code -
Python makes it 'safer' by using a global interpreter lock (GIL), which ensures that within a single Python
process only one thread executes in the interpreter at a time, regardless of the number of cores used. Each
thread competes for the GIL, and to deal with this we can either explicitly spawn more processes (so that
each process has its own GIL) or release the GIL altogether where it is safe to do so.
This is why threads in Python are only used for I/O-heavy tasks - when it is equivalently performant to do
'nothing' while we wait for a result. Threads are more lightweight concurrency tools than processes, but
threads do not have their own 'space'. Multiple threads running in the same process are GIL-bound and take
turns to execute!
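As a toy illustration of this rule of thumb (threads for I/O waits, processes for CPU-bound work) - a sketch, not taken from the backtester:

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def io_task(_):
    # I/O-bound stand-in: the GIL is released while we sleep/wait
    time.sleep(0.1)
    return 1

def cpu_task(n):
    # CPU-bound stand-in: holds the GIL for the whole computation in pure Python
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=8) as pool:    # threads: fine for I/O waits
        list(pool.map(io_task, range(8)))
    with ProcessPoolExecutor(max_workers=4) as pool:   # processes: one GIL per worker
        list(pool.map(cpu_task, [10**6] * 4))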
8.2 Memory Units
The memory units (whether cache registers, RAM or hard disks) differ in how 'close' they are to the CPU and
in their read/write throughput. The speed also depends on how close or far apart the requested data is stored
in physical memory. Reading sequentially is significantly cheaper than jumping between memory addresses, and
this can be influenced by how the programmer decides to store data (for instance through our choice of data
types). Memory is typically arranged hierarchically, such that data we think we will need more often/soon is
placed closer to the CPU. Imagine you own a hundred pairs of shoes: you would place your favorite pair by the
door and dump the ugly ones in the closet - this reduces the search time when leaving the house. Accessing
different tiers of memory and transferring data around requires a data bus, and buses have different
throughputs based on their physical properties, such as bus width and bus frequency (similar in concept to
CPU IPC and Hz). Instructions for vectorized operations need all the data to be in the CPU when they are
executed, so a larger bus width favors vectorization when data is stored in contiguous memory blocks, while
bus frequency helps when we jump between memory addresses. Using data types such as numpy arrays, which
force our data to be kept close together, makes vectorized operations quicker and increases the underlying
memory transfer throughput. Although Python makes our life simpler by abstracting away lower-level details
such as garbage collection (freeing up memory), this means we effectively lose control over where our data
is stored and freed - and some of it is therefore stored sub-optimally. Data that we often need together can
become fragmented, which means we cannot transfer it to the CPU in 'one go'. These are the inherent issues
in native Python that we keep in mind and attempt to address in our work.
Note that on Linux we can profile our scripts and probe statistics on cycles, instructions and cache
hit/miss rates using tools such as perf. We will not go into the details here - more advanced programmers
can dive into this in other literature. These statistics give us invaluable insight into low-level behaviour:
how our application interacts with the CPU and memory.
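A hedged example of such a probe (Linux only, requires the perf tooling; available event names vary by CPU):

perf stat -e cycles,instructions,cache-references,cache-misses python3 strat1.py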
Python does not natively support vectorization - its lists store pointers/references to the data instead of
the data itself, so the data bus cannot transfer the required data units from memory in one go, and Python
bytecode (similar in spirit to assembly language) is not optimized for vectorization. While store-by-reference
can be useful, especially when working with dynamic datatypes whose memory requirements we know little about,
it is unnecessary when working with homogeneous data such as batch numerical computations! The underlying
hardware does perform some tricks to mitigate cache misses - such as pre-emptively moving data from RAM into
the cache when it predicts that data related to the current logic will be accessed. Think about increment
counters in a for loop! This is why grouping similar logic together, by simply refactoring our code, can
improve performance even if we perform the same instructions. We should also group overheads such as disk
interactions and memory allocations (C does this explicitly with malloc) away from our hot logic - these
disrupt the flow of the program and 'dirty' our cache registers.
There are workarounds to Python's native data structures - using modules such as array and numpy, we can ask
the OS to store data in contiguous chunks of memory. Numpy has underlying optimizations written in 'C' to
exploit vectorized instruction sets, with better memory management that gives the CPU the data it needs when
it needs it. Each vectorized numpy instruction performs more computation per instruction, and moving chunks
of data over the data bus to the CPU together reduces cache misses! We get a double benefit: memory
contiguity (leading to higher cache hit rates), which in turn enables vectorization (leading to fewer total
instructions required).
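A small illustration of the contrast - a sketch, with timings and item sizes that will vary by machine and build:

import timeit
import numpy as np

xs = list(range(1_000_000))   # a list of pointers to boxed Python int objects
arr = np.arange(1_000_000)    # one contiguous block of fixed-size integers

print(arr.flags["C_CONTIGUOUS"], arr.itemsize, arr.nbytes)  # True, 8, 8000000 on a typical 64-bit build
print(timeit.timeit(lambda: sum(xs), number=10))            # bytecode loop, one object at a time
print(timeit.timeit(lambda: arr.sum(), number=10))          # vectorized C loop over contiguous memory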
The previous profiling still tells us that operations such as __getitem__ take up a significant amount of
the overall CPU usage. This is because we make 1644983 calls to it through per-cell lookups such as '.at'
and '.loc'. Even though '.at' is relatively more efficient, we know that we often need the same data variable
for all the assets in the universe; we could fetch it more efficiently by pre-arranging the data into formats
that anticipate the access patterns.
Most of the difficulty in vectorising our code lies in the non-linear logic. Numpy has excellent support for
linear algebra and for handling matrices of numbers, but where conditional logic is needed, some trickery is
required. For instance, we want to set the weights and units held for non-eligible assets to 0. We do this
separately in our non-vectorized code, because dealing with non-eligible assets gives us a headache: including
assets with degenerate volatility would cause a division-by-zero error, and the corresponding position size
would be infinite; division by a non-existent null value would result in a NaN. We want to avoid these
annoyances in our signal calculation by filtering them out beforehand.
Here are some quirks of the numpy array and how it deals with invalid operations (arr1 here is some float array along the lines of np.array([1., 0., np.nan, 2.])):
>>> arr1 / 0
<stdin>:1: RuntimeWarning: divide by zero encountered in divide
<stdin>:1: RuntimeWarning: invalid value encountered in divide
array([inf, nan, nan, inf])
It turns out we can do something quite clever with this - if we have an array of flags marking whether each
asset in the universe is 'eligible' or 'not eligible' for consideration, we can use the boolean array as a
bit mask by converting it to 1s and 0s, and then multiply or divide our signal vector through by the mask.
After the operations, our vector contains the desired results at the indexes corresponding to eligible
instruments and NaN (or inf) values for the ineligible ones; a final pass casts the NaN and other invalid
positions to 0, which corresponds to zero units held in ineligible assets!
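A hedged sketch of the trick on toy arrays (variable names are illustrative; np.where over np.isfinite is one of several ways to perform the final cast, np.nan_to_num with explicit posinf/neginf arguments being another):

import numpy as np

eligible = np.array([True, False, True, False])
vol      = np.array([0.02, 0.00, 0.05, np.nan])    # degenerate vol on the ineligible names
signal   = np.array([1.0, 1.0, -1.0, 1.0])

mask = eligible.astype(float)                      # 1.0 for eligible, 0.0 for ineligible
raw  = signal / vol * mask                         # ineligible entries become nan/inf
positions = np.where(np.isfinite(raw), raw, 0.0)   # cast nan/inf back to zero units held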
Perhaps it is better to illustrate the changes at this point with a concrete code implementation. Let's
make some changes to our code based on the previous discussions:
(1.) Group administrative logic computations at the beginning of run_simulation and store them in homogeneous
dataframes for contiguous access.
(3.) Unzip the data needed for each day using iterrows(), iterating through the dataframes together by
unrolling them on a shared index (more on row access later!).
(4.) Maintain state for the previous units held, FX-adjusted closing prices and weight vectors, so that we do
not have to use '.loc' to access them when computing pnl.
1 import numpy as np
2 import pandas as pd
3 from numba import jit
4
12 class Alpha () :
13
20     def get_trade_datetime_range(self):
21         return (self.configs["start"], self.configs["end"])
22
34     df = pd.DataFrame(index=index)
35     self.dfs[inst]["vol"] = (-1 + self.dfs[inst]["adj_close"] / self.dfs[inst].shift(1)["adj_close"]).rolling(30).std()
36     self.dfs[inst] = df.join(self.dfs[inst])
37     self.dfs[inst] = self.dfs[inst].fillna(method="ffill").fillna(method="bfill")
38     self.dfs[inst]["ret"] = -1 + self.dfs[inst]["adj_close"] / self.dfs[inst].shift(1)["adj_close"]
39     self.dfs[inst]["sampled"] = self.dfs[inst]["adj_close"] != self.dfs[inst].shift(1)["adj_close"]
40
41     # assert(self.dfs[inst]["sampled"].rolling(5).apply(numba_any, engine="numba", raw=True).equals(
42     #     self.dfs[inst]["sampled"].rolling(5).apply(lambda x: int(np.any(x)))))
43     self.dfs[inst]["active"] = self.dfs[inst]["sampled"].rolling(5).apply(numba_any, engine="numba", raw=True).fillna(0)
44     vols.append(self.dfs[inst]["vol"])
45     rets.append(self.dfs[inst]["ret"])
46     actives.append(self.dfs[inst]["active"])
47     closes.append(self.dfs[inst]["adj_close"])
48
71 pass
72
92 def set_positions (
93 self ,
94 portfolio_i ,
95 forecasts ,
96 eligibles ,
97 num_trading ,
98 portfolio_vol ,
99 strat_scalar ,
100 invrisk_row ,
101 baseclose_row
102 ):
103 vol_target = 1 \
104 / num_trading \
105 * self . portfolio_df . at [ portfolio_i , " capital " ] \
106 * portfolio_vol \
107 / np . sqrt (253)
108 positions = eligibles \
109 * strat_scalar \
110 * vol_target \
111 * forecasts \
112 * invrisk_row . values \
113 / baseclose_row . values
114
124 @timeme
125 def run_simulation(self):
126     """
127     Settings
128     """
129     portfolio_vol = 0.20
130     trade_datetime_range = pd.date_range(
131         start=self.get_trade_datetime_range()[0],
132         end=self.get_trade_datetime_range()[1],
133         freq="D"
134     )
135
136     """
137     Compute Metas
138     """
139     self.compute_metas(index=trade_datetime_range)
140
141     """
142     Initialisations
143     """
144     self.portfolio_df = self.init_portfolio_settings(trade_range=trade_datetime_range)
145
157 ( ret_i , ret_row ) ,\
158 ( active_i , active_row ) ,\
159 ( baseclose_i , baseclose_row ) , \
160 ( eligibles_i , eligibles_row ) ,\
161 ( invrisk_i , invrisk_row ) ,\
162 in zip (\
163 self . portfolio_df . iterrows () ,\
164 self . voldf . iterrows () ,\
165 self . retdf . iterrows () ,\
166 self . activedf . iterrows () ,\
167 self . baseclosedf . iterrows () ,\
168 self . eligiblesdf . iterrows () ,\
169 self . invriskdf . iterrows () ,\
170 ):
171
172 strat_scalar = 2
173 eligibles = eligibles_row
174
175 if portfolio_i != 0:
176
199
Let's make the adjustments in our strategy files (note that compute_forecasts is still suboptimally written!):
1 import numpy as np
2 import pandas as pd
3 from numba import jit
4
23 eligibles = []
24 for inst in self . instruments :
25 self . dfs [ inst ][ " alive " ] = self . dfs [ inst ][ " adj_close " ]\
26 . rolling (61) . apply ( numba_notanynan , engine = " numba " , raw = True ) . fillna (0)
27 super () . compute_metas ( index )
28 retagg = pd . DataFrame ( index = index )
29 for inst in self . instruments :
30 print ( inst )
31 self . dfs [ inst ][ " eligible " ] = \
32 self . dfs [ inst ][ " alive " ]. astype ( int ) &\
33 self . dfs [ inst ][ " active " ]. astype ( int ) &\
34 ( self . dfs [ inst ][ " vol " ] > 0) . astype ( int ) &\
35 ( self . dfs [ inst ][ " adj_close " ] > 0) . astype ( int )
36
44 self . eligiblesdf = pd . concat ( eligibles , axis =1)
45 self . eligiblesdf . columns = self . instruments
46
52 alpha_scores = {}
53 temp = eligibles_row . values
54 for i in range ( len ( self . instruments ) ) :
55 if temp [ i ]:
56 date_start = self . portfolio_df . at [ portfolio_i - 60 , " datetime " ]
57 x = self . dfs [ " mkt " ][ " ret " ]. loc [ date_start : date ]. values . reshape (( -1 , 1) )
58 y = self . dfs [ self . instruments [ i ]][ " ret " ]. loc [ date_start : date ]
59         model = LinearRegression().fit(x, y)
60 ypred = model . predict ( x )
61 residuals = y - ypred
62 alpha_scores [ self . instruments [ i ]] = -1 * skew ( residuals )
63
retagg[inst] = self.dfs[inst]["ret"]
where we are creating a dataframe from partial results by appending each column on every iteration of the
loop.
1 import ...
2
7
46 return forecasts , self . configs [ " longsize " ] + self . configs [ " shortsize " ]
6 # @profile
7 def compute_metas ( self , index ) :
8 super () . compute_metas ( index )
9 alphas , eligibles , trenders = [] , [] , []
10 for inst in self . instruments :
11 print ( inst )
12 self . dfs [ inst ][ " smaf " ] = self . dfs [ inst ][ " adj_close " ]. rolling (10) . mean ()
13 self . dfs [ inst ][ " smam " ] = self . dfs [ inst ][ " adj_close " ]. rolling (30) . mean ()
14 self . dfs [ inst ][ " smas " ] = self . dfs [ inst ][ " adj_close " ]. rolling (100) . mean ()
15 self . dfs [ inst ][ " smass " ] = self . dfs [ inst ][ " adj_close " ]. rolling (300) . mean ()
16 self . dfs [ inst ][ " alphas " ] = 0.0 + \
17 ( self . dfs [ inst ][ " smaf " ] > self . dfs [ inst ][ " smam " ]) + \
18 ( self . dfs [ inst ][ " smaf " ] > self . dfs [ inst ][ " smas " ]) + \
19 ( self . dfs [ inst ][ " smaf " ] > self . dfs [ inst ][ " smass " ]) + \
20 ( self . dfs [ inst ][ " smam " ] > self . dfs [ inst ][ " smas " ]) + \
21 ( self . dfs [ inst ][ " smam " ] > self . dfs [ inst ][ " smass " ])
22
41 self . trendersdf . columns = self . instruments
42 self . eligiblesdf . astype ( ’ int8 ’)
43
Let's compare their run times pre- and post-vectorization (reduced problem set for strategies 1 and 2):
PRE
Strategy 1: @timeme: run_simulation took 115.60257959365845 seconds
Strategy 2: @timeme: run_simulation took 69.975656747818 seconds
Strategy 3: @timeme: run_simulation took 31.71785879135132 seconds
POST
Strategy 1: @timeme: run_simulation took 105.40305471420288 seconds
Strategy 2: @timeme: run_simulation took 10.909979820251465 seconds
Strategy 3: @timeme: run_simulation took 11.441979169845581 seconds
We see that vectorization and each of the incremental tricks give us an immense speedup, but for Strategy 1
our CPU efficiency seems 'stuck'.
Like numpy, the pandas data store also lives in contiguous memory blocks, making manipulation (relatively)
faster than with Python list objects, which store elements by reference. While it is built on top of numpy
arrays, pandas adds its own suite of functionality and some useful data structures, such as NaN bit masks
for null-aware integer/boolean arrays and so on. This is why our dataframe function calls do not complain
as much (with Exceptions, for instance) when we operate on invalid cells. There are some quirks - for
instance, introducing null elements into integer or boolean arrays promotes the Series to a float type, and
we can cast boolean arrays to integer arrays with tricks such as adding 0 to the boolean Series. There are
also other subtleties, such as how the data is stored in memory - for instance 'C'-like row-major formats
for numpy arrays and 'Fortran'-like column-major formats for some dataframes. We will not go into the gritty
details - pandas has done an excellent job of abstracting the user away from these lower-level
implementations. Interested readers should, of course, dive deeper: there are optimizations to be found in
understanding the pandas internals!
There are many ways to perform the same operation in pandas, but their efficiency depends on how the internal
logic is implemented. Let's consider a common operation: applying some function either to a row of data or to
a rolling column, reducing it to a scalar. The naive approach iterates explicitly and, in each iteration,
accesses the row using '.loc' (which we know to be expensive!). Note that numpy arrays need homogeneous
datatypes, so taking column slices returns a view (non-copy), while accessing rows of heterogeneous datatypes
gives us a pandas copy. This is time-consuming and very inefficient. We can be more Pythonic with 'iterrows',
which makes use of generators (generators also use less space!) and is both more compute- and memory-efficient -
although we still create Series copies, we no longer have to do lookups by indices; 'iterrows' signals that
we want to access the rows in sequence. A row-wise or rolling 'apply' does the job more efficiently still,
without creating intermediate Python variables (although Series objects are still copied), while passing the
raw values (raw=True, or dropping to '.values') indicates that we want to work directly on the underlying
numpy datatypes. Since we are then manipulating numpy, we can use compilation with numba, which gives us the
ability to use further tricks such as releasing the GIL by setting nogil in nopython compilation mode on
jitted functions:
or
if we cython-ized it.
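Since the individual snippets are not reproduced here, a compressed sketch of that ladder of options (illustrative column names, a toy row-wise reduction) might read:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.normal(0, 0.01, size=(1000, 2)), columns=["a", "b"])

# 1. '.loc' lookups inside a Python loop - expensive per-row indexing
out = [df.loc[i, "a"] + df.loc[i, "b"] for i in df.index]

# 2. iterrows - a generator of (index, Series) pairs, no index lookups
out = [row["a"] + row["b"] for _, row in df.iterrows()]

# 3. apply - pandas drives the loop internally (Series copies are still made)
out = df.apply(lambda row: row["a"] + row["b"], axis=1)

# 4. raw numpy values - operate directly on the underlying ndarray
out = df.values.sum(axis=1)

# 5. numba - a compiled rolling reduction on raw windows (nogil can be enabled via engine_kwargs)
def window_mean(window):
    return window.mean()

rolled = df["a"].rolling(20).apply(window_mean, raw=True, engine="numba")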
COMMAND
python3 -m cProfile -o temp.txt strat1.py
&&
import pstats
p = pstats.Stats("temp.txt")
p.strip_dirs().sort_stats("cumulative").print_stats(20)
OUTPUT
134224183 function calls (132495513 primitive calls) in 143.278 seconds
146211 0.500 0.000 29.979 0.000 indexing.py:1169(_getitem_axis)
142558 0.564 0.000 27.751 0.000 indexing.py:1207(_get_slice_axis)
2230525/1675369 3.495 0.000 23.314 0.000 {built-in method numpy.core...
142558 0.894 0.000 22.972 0.000 base.py:495(_validate_data)
285116 3.511 0.000 22.730 0.000 validation.py:619(check_array)
97050 0.237 0.000 16.688 0.000 common.py:55(new_method)
96928 0.206 0.000 15.841 0.000 series.py:5637(_arith_method)
71279 0.237 0.000 15.778 0.000 validation.py:949(check_X_y)
96928 0.888 0.000 15.494 0.000 base.py:1286(_arith_method)
142558 0.508 0.000 15.431 0.000 datetimes.py:715(slice_indexer)
142558 0.308 0.000 14.695 0.000 base.py:6238(slice_indexer)
71279 1.238 0.000 14.345 0.000 _stats_py.py:968(skew)
142558 0.695 0.000 14.323 0.000 base.py:6441(slice_locs)
6 alpha_scores = {}
7 temp = eligibles_row . values
8 for i in range ( len ( self . instruments ) ) :
9 if temp [ i ]:
10 date_start = self . portfolio_df . at [ portfolio_i - 60 , " datetime " ]
11 x = self . dfs [ " mkt " ][ " ret " ]. loc [ date_start : date ]. values . reshape (( -1 , 1) )
12 y = self . dfs [ self . instruments [ i ]][ " ret " ]. loc [ date_start : date ]
13         model = LinearRegression().fit(x, y)
14 ypred = model . predict ( x )
15 residuals = y - ypred
16 alpha_scores [ self . instruments [ i ]] = -1 * skew ( residuals )
17
18 alpha_scores = { k : v for k , v in sorted ( alpha_scores . items () , key = lambda pair : pair [1]) }
19 alphalong = list ( alpha_scores . keys () ) [ - self . configs [ " longsize " ]:]
20 alphashort = list ( alpha_scores . keys () ) [: self . configs [ " shortsize " ]]
21 forecaster = lambda inst : 1 if inst in alphalong else ( -1 if inst in alphashort else 0)
22 forecasts = np . array ([ forecaster ( inst ) for inst in self . instruments ])
23 return forecasts , self . configs [ " longsize " ] + self . configs [ " shortsize " ]
Recall that it is better code design to separate the data-processing logic, due to CPU caches. Additionally,
'.loc' operations are inefficient compared to rolling '.apply' functions. Could we improve the design by
moving this into the meta computations? Recall also that we have made repetitive calls of the form
retagg[inst] = self.dfs[inst]["ret"]
building a dataframe from partial results. You might have received a warning from pandas that your dataframe
is 'highly fragmented'. It turns out that each such insertion writes into new memory containing the original
dataframe plus a new column! We repeatedly create new dataframe objects and ask the kernel for more memory
space - we should build up the pieces in a list and concatenate them in one go instead. We will fix that here.
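A hedged sketch of the difference, on made-up data (names are illustrative):

import numpy as np
import pandas as pd

index = pd.date_range("2015-01-01", periods=2500, freq="D")
instruments = [f"inst{i}" for i in range(50)]
rets = {inst: pd.Series(np.random.normal(0, 0.01, len(index)), index=index) for inst in instruments}

# fragmented: every column insertion may reallocate and copy the growing frame
retagg = pd.DataFrame(index=index)
for inst in instruments:
    retagg[inst] = rets[inst]

# better: collect the pieces and concatenate once
retagg = pd.concat([rets[inst].rename(inst) for inst in instruments], axis=1)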
Let's iterate and come up with successively better compute_metas functions. We need to unroll two dataframes
simultaneously to get our predictor and response time series for the regression - but this is not supported
by the pandas rolling toolset. We can use an extension to numpy called numpy_ext that allows us to roll over
multiple arrays together.
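A minimal sketch of rolling over two aligned arrays with numpy_ext (the rolling_apply call is the library's documented entry point; the regression inside the window is illustrative):

import numpy as np
from numpy_ext import rolling_apply

def window_beta(x, y):
    # slope of y on x over one rolling window, via the normal equations
    xc = x - x.mean()
    denom = (xc * xc).sum()
    return np.nan if denom == 0 else (xc * (y - y.mean())).sum() / denom

mkt_ret  = np.random.normal(0, 0.01, 500)
inst_ret = 0.8 * mkt_ret + np.random.normal(0, 0.005, 500)

betas = rolling_apply(window_beta, 60, mkt_ret, inst_ret)  # one value per 60-observation window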
21 eligibles . append ( self . dfs [ inst ][ " eligible " ])
22 returns . append ( self . dfs [ inst ][ " ret " ])
23
61
62         return strat1_compute_forecasts(
63 self . instruments ,
64 self . skewsdf . loc [ date ]. values ,
65 portfolio_i ,
66 eligibles_row . values . astype ( np . int32 ) ,
67 self . configs [ " longsize " ] ,
68 self . configs [ " shortsize " ])
Can we do better? Taking a look at its source code (link: scikit-learn Github source code), we see that
scikit-learn performs a significant amount of input validation before getting to the actual fit! If we can
guarantee that our input is correct - which we can do ourselves with lower overhead - we can skip the
validation drag.
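The paper's hand-rolled version is not reproduced in full here; a sketch of the idea - fitting the same single-factor regression with np.linalg.lstsq on inputs we have already validated ourselves - might look like:

import numpy as np

def fast_residuals(x, y):
    # x, y: 1-D numpy arrays of equal length, validated upstream by us
    A = np.vstack([x, np.ones(len(x))]).T    # design matrix [x, 1]
    (m, c), *_ = np.linalg.lstsq(A, y, rcond=None)
    yhat = m * x + c
    return y - yhat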
With scikit-learn's validation skipped, the run takes
@timeme: run_simulation took 27.103562355041504 seconds
Can we do better still? We can do the same with the skew computation - see the scipy documentation on its
skew implementation (link: scipy skew implementation):
12     m1 = np.mean(residuals)
13     m2 = 1 / len(residuals) * np.sum(np.power(residuals - m1, 2))
14     m3 = 1 / len(residuals) * np.sum(np.power(residuals - m1, 3))
15     if m2 == 0:
16         return np.nan
17     s = m3 / (m2 ** 1.5)
18
19     return s
takes
We have already done the hard work of removing 'jit'-unfriendly calls by creating our custom regression and
skew functions. There are many intermediate variables here, such as A, m, c, yhat..., which are dynamically
typed: the datatype a name refers to can change midway through the function's execution, making it difficult
for the VM to optimize at the machine level how our code runs. However, in our executions, each call of
these functions creates the same datatypes! We do not need Python's dynamic lookup machinery - we can compile
this with JIT in nopython mode, or with Cython. Note that the dynamic mechanism does a lot of work in the
background at runtime - for a simple operation such as incrementing an integer counter, it has to check
whether the add function is implemented and, if it is not, walk up the inheritance hierarchy looking for a
handler. This is stripped away with static typing, allowing us to do the computation on bits and machine
code instead of Python object (objectmode) manipulations.
3600/225 0.005 0.000 5.679 0.025 compiler_machinery.py:272(check)
59611 0.637 0.000 5.190 0.000 series.py:323(__init__)
20 0.040 0.002 4.897 0.245 numpy_ext.py:374(rolling_apply)
We see that compute_forecasts has dropped out of the top 20 most demanding tasks, replaced by compute_metas -
but the latter costs only 7.405 seconds, compared with the 124.664 seconds of our original version of
compute_forecasts. Now that our code is not (grossly) inefficient, let's revert to the full problem set on
strategy 1 and strategy 3 - our concern is how the problem scales, since that is what costs us the most
waiting time in the first place. We will optimize on the full problem set from here on...
PRE
Strategy 1: @timeme: run_simulation took 1337.7771863937378 seconds
Strategy 2: @timeme: run_simulation took 225.85177397727966 seconds
Strategy 3: @timeme: run_simulation took 1249.9761579036713 seconds
POST
Strategy 1: @timeme: run_simulation took 26.82828116416931 seconds
Strategy 2: @timeme: run_simulation took 10.909979820251465 seconds
Strategy 3: @timeme: run_simulation took 18.638883590698242 seconds
Unfortunately, when profiling after this last optimization, the run times look quite similar among the top 20
contestants demanding our attention. Additionally, most of the function calls are in code not written by us -
such as the compiler files. The high-level profiling no longer gives us enough granularity, so let's step
back into kernprof and profile run_simulation line by line.
COMMAND
python3 -m kernprof -l -v strat1.py (annotate @profile on run_simulation)
OUTPUT
Function: run_simulation at line 113
116 Settings
117 """
118 1 3.0 3.0 0.0 portfolio_vol = 0.20
119 2 448.0 224.0 0.0 trade_datetime_range = pd.date_range(
120 1 3.0 3.0 0.0 start=self.get_trade_datetime_range()[0],
121 1 2.0 2.0 0.0 end=self.get_trade_datetime_range()[1],
122 1 1.0 1.0 0.0 freq="D"
123 )
124
125 """
126 Compute Metas
127 """
128 1 22900215.0 22900215.0 49.6 self.compute_metas(index=trade_datetime_range)
129
130 """
131 Initialisations
132 """
133 1 4564.0 4564.0 0.0 self.portfolio_df = self.init_portfolio_setting
134
135 1 1.0 1.0 0.0 date_prev = None
136 1 1.0 1.0 0.0 baseclose_prev = None
137 1 1.0 1.0 0.0 units_held, weights_held = [], []
138 """
139 self.eligiblesdf = eligiblesdf
140 self.invriskdf = invriskdf]
141 self.anyinformation required by compute_forecas
142 NOTE: needs to be implemented by the child alph
143 """
144
145 3654 3516396.0 962.3 7.6 for (portfolio_i, portfolio_row),\ ------------
146 3653 12216.0 3.3 0.0 (vol_i, vol_row),\
147 3653 9511.0 2.6 0.0 (ret_i, ret_row),\
148 3653 8177.0 2.2 0.0 (active_i, active_row),\
149 3653 4924.0 1.3 0.0 (baseclose_i, baseclose_row), \
150 3653 4866.0 1.3 0.0 (eligibles_i, eligibles_row),\
151 3653 8675.0 2.4 0.0 (invrisk_i, invrisk_row),\
152 2 3.0 1.5 0.0 in zip(\
153 1 2.0 2.0 0.0 self.portfolio_df.iterrows(),\
154 1 2.0 2.0 0.0 self.voldf.iterrows(),\
155 1 2.0 2.0 0.0 self.retdf.iterrows(),\
156 1 2.0 2.0 0.0 self.activedf.iterrows(),\
157 1 3.0 3.0 0.0 self.baseclosedf.iterrows(),\
158 1 2.0 2.0 0.0 self.eligiblesdf.iterrows(),\
159 1 2.0 2.0 0.0 self.invriskdf.iterrows(),\
160 ):
161
162 3653 5464.0 1.5 0.0 strat_scalar = 2
163 3653 8118.0 2.2 0.0 eligibles = eligibles_row
164
165 3653 5381.0 1.5 0.0 if portfolio_i != 0:
166
167 7304 3935557.0 538.8 8.5 day_pnl, nominal_ret = backtest_utils.g
168 3652 6081.0 1.7 0.0 portfolio_df=self.portfolio_df,
169 3652 5737.0 1.6 0.0 last_weights=weights_held[-1],
170 3652 5219.0 1.4 0.0 last_units=units_held[-1],
171 3652 4756.0 1.3 0.0 idx=portfolio_i,
172 3652 4733.0 1.3 0.0 baseclose_row=baseclose_prev,
173 3652 4747.0 1.3 0.0 ret_row=ret_row
174 )
175
176 7304 155474.0 21.3 0.3 strat_scalar = self.get_strat_scaler(
177 3652 5068.0 1.4 0.0 lookback=30,
178 3652 4895.0 1.3 0.0 portfolio_vol=portfolio_vol,
179 3652 4737.0 1.3 0.0 idx=portfolio_i,
180 3652 4728.0 1.3 0.0 default=strat_scalar
181 )
182
183 7304 157674.0 21.6 0.3 self.portfolio_df.at[portfolio_i, "ewst
184 4849 25407.0 5.2 0.1 if nominal_ret != 0 else self.portf
185
186 3653 110092.0 30.1 0.2 self.portfolio_df.at[portfolio_i, "strat sc
187
188 3653 4832729.0 1322.9 10.5 forecasts, num_trading = self.compute_forec
189 7306 4719349.0 646.0 10.2 positions, nominal_tot = self.set_positions
190 3653 103556.0 28.3 0.2 capital=self.portfolio_df.at[portfolio_
191 3653 5523.0 1.5 0.0 portfolio_i=portfolio_i,
192 3653 4916.0 1.3 0.0 forecasts=forecasts,
193 3653 4782.0 1.3 0.0 eligibles=eligibles,
194 3653 4891.0 1.3 0.0 num_trading=num_trading,
195 3653 4831.0 1.3 0.0 portfolio_vol=portfolio_vol,
196 3653 4848.0 1.3 0.0 strat_scalar=strat_scalar,
197 3653 4866.0 1.3 0.0 invrisk_row=invrisk_row,
198 3653 4838.0 1.3 0.0 baseclose_row=baseclose_row
199 )
200
201 3653 264251.0 72.3 0.6 weights = self.set_weights(nominal_tot, pos
202 7306 18554.0 2.5 0.0 nominal_tot, positions, weights = self.post
203 3653 5057.0 1.4 0.0 index=portfolio_i,
204 3653 5071.0 1.4 0.0 date=ret_i,
205 3653 4879.0 1.3 0.0 eligibles=eligibles,
206 3653 4830.0 1.3 0.0 nominal_tot=nominal_tot,
207 3653 4894.0 1.3 0.0 positions=positions,
208 3653 4754.0 1.3 0.0 weights=weights
209 )
210
211 3653 181738.0 49.8 0.4 self.portfolio_df.at[portfolio_i, "nominal"
212 3653 121393.0 33.2 0.3 self.portfolio_df.at[portfolio_i, "leverage
213 3653 130809.0 35.8 0.3 self.portfolio_df.at[portfolio_i, "nomi
214
215 3653 4783644.0 1309.5 10.4 print(self.portfolio_df.loc[portfolio_i]) -
216 3653 8099.0 2.2 0.0 date_prev = portfolio_i
217 3653 7350.0 2.0 0.0 units_held.append(positions)
218 3653 6181.0 1.7 0.0 weights_held.append(weights)
219 3653 13016.0 3.6 0.0 baseclose_prev = baseclose_row
220
221 1 3.0 3.0 0.0 return self.portfolio_df
The pain points are compute_metas, dataframe unzipping, get_pnl_stats, compute_forecasts, set_positions -
and, surprisingly, the print statement, which takes up 10% of our runtime! We only need the last print
statement to verify that the terminal wealth is invariant after our code edits. Printing is a blocking
operation that stops our logic flow - we can improve performance by buffering and printing in chunks, or by
removing unnecessary print statements. Moving the print statement for the last date entry outside the loop
(to see only the final terminal wealth entry), we get a runtime of 23.934306859970093 seconds, down from
26.82828116416931 seconds. Not bad...
15     return strat1_compute_forecasts(
16 self . instruments ,
17 self . skewsdf . loc [ date ]. values ,
18 portfolio_i ,
19 eligibles_row . values . astype ( np . int32 ) ,
20 self . configs [ " longsize " ] ,
21 self . configs [ " shortsize " ]
Cython is a tool for compiling Python code to C, compatible with both numpy arrays and ordinary Python code.
It allows us to create code that is compiled into static binary machine code and used like regular functions
in our logic - but in order to prevent callbacks into the Python virtual machine we need to make it 'C'-like,
for instance by statically typing our variables. The binary machine code contains the optimizations run by
the 'C' compilers, for instance those performed by gcc. At run time, this has the potential to run at 'pure
C' level without being Python-aware. 'C'-like arrays are supported, and numpy supports this by implementing
the memoryview interface. While iterating over numpy arrays in native Python is slow, we can compile the loop
to make it efficient! We need to write .pyx code, build a binary extension with setup.py, and import it as
if it were any other Python module. Let's try writing a compiled version of our compute_forecasts, which we
saw to be slow.
1  # strat1.py
2  def compute_forecasts(self, portfolio_i, date, eligibles_row):
3      return cythonlib.strat1_compute_forecasts(
4          self.instruments,
5          self.skewsdf.loc[date].values,
6          portfolio_i,
7          eligibles_row.values.astype(np.int32),
8          self.configs["longsize"],
9          self.configs["shortsize"])
10
11 # cythonlib.pyx
12 def strat1_compute_forecasts(instruments, skews, portfolio_i, eligibles_row, longsize, shortsize):
13     alpha_scores = {}
14     temp = eligibles_row
15     for i in range(len(eligibles_row)):
16         if temp[i]:
17             alpha_scores[instruments[i]] = -1 * skews[i]
18     alpha_scores = {k: v for k, v in sorted(alpha_scores.items(), key=lambda pair: pair[1])}
19     alphalong = list(alpha_scores.keys())[-longsize:]
20     alphashort = list(alpha_scores.keys())[:shortsize]
21     forecaster = lambda inst: 1 if inst in alphalong else (-1 if inst in alphashort else 0)
22     forecasts = np.array([forecaster(inst) for inst in instruments])
23     return forecasts, longsize + shortsize
24
30 setup(
31     ext_modules=cythonize(
32         "cythonlib.pyx",
33         compiler_directives={"language_level": 3}
34     ),
35     include_dirs=[numpy.get_include()]
36 )
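Building the extension module then follows the standard Cython workflow, something along these lines:

python3 setup.py build_ext --inplace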
This takes roughly as long as before - it didn't really help us much; on repeated trials, the Cython runtime
is sometimes even slower! (Note that overall runtime depends on many other factors, such as CPU load from
other running processes - it is difficult to compare the efficiency of two programs whose execution times
differ only marginally.)
We can annotate our Cython code and see how 'C'-compatible it is using Cython's annotation mode.
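A typical invocation, assuming the module name from the listing above, would be:

cython -a cythonlib.pyx   # emits cythonlib.html with line-by-line annotations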
In the generated report, increasingly yellow lines indicate that at runtime we have to call back into the
Python stack. We see that we have gained no speedup - because the Cython code is essentially doing the same
thing, calling back into the virtual machine. It has no knowledge of the data types; we are still dynamically
typing and manipulating Python objects.
10.3 Back to Numpy - All Functions are Equal, But Some are More Equal than
Others
Before we try to come up with a statically typed function to prevent VM callbacks, let's see if we can first
make it efficient with some vectorized numpy trickery. It turns out this actually takes longer, at
25.925340414047241 seconds, up from 23.934306859970093 seconds!
6      shorts = np.take(eligible_args, argsort_alphas[:self.configs["shortsize"]]).astype("int32")
7      longs = np.take(eligible_args, argsort_alphas[-self.configs["longsize"]:]).astype("int32")
8      forecasts = np.zeros(len(eligibles_row))
9      for i in shorts:
10         forecasts[i] = -1
11     for i in longs:
12         forecasts[i] = 1
13     return forecasts, self.configs["longsize"] + self.configs["shortsize"]
About 40% of the time was taken accessing the skew values from our dataframes - this is a cost of maintaining
the generality of the alpha logic, so unfortunately we cannot do better than getting the values via '.loc',
since the parent alpha cannot anticipate what data the signal logic needs to access. 'np.where(eligibles_row
== 1)' also costs us a surprising amount of compute, which we address later. A surprisingly significant
amount of time, ~16%, was spent iterating over the shorts and longs to assign binary values! It turns out
that iterating over numpy values and accessing them individually is not very efficient (see the discussion
on looping over numpy arrays - link: stackoverflow).
Let's delegate this task to a cythonized function (recall that dereferencing numpy values in pure Python has
overhead in address computations, while in 'C' we can access items directly via address offsets - we know how
much space each element of a homogeneous array takes!):
def caster(int[:] shorts, int[:] longs, double[:] forecasts):
    for i in range(len(shorts)):
        forecasts[shorts[i]] = -1
    for i in range(len(longs)):
        forecasts[longs[i]] = 1
    return forecasts
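One practical detail: the typed memoryviews dictate the dtypes we must hand in from the Python side. A hedged usage sketch, reusing the shorts/longs arrays from the earlier listing, would be:

forecasts = np.zeros(len(eligibles_row), dtype=np.float64)   # double[:] expects float64
forecasts = np.asarray(cythonlib.caster(
    shorts.astype(np.int32),    # int[:] expects a C int buffer (int32 on common platforms)
    longs.astype(np.int32),
    forecasts,
))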
89 # forecasts[i] = -1
90 # for i in longs:
91 # forecasts[i] = 1
92 3653 27173.0 7.4 0.9 forecasts = cythonlib.caster(shorts, longs, for
93 3653 4582.0 1.3 0.2 return forecasts, self.configs["longsize"] + se
we now spend just 1% of the time setting up the binary-cast array. The new runtime squeezes down from
22.466408014297485 seconds, and the time spent as a percentage of the total compute in run_simulation is now
7.5%, down from 10.5% - a marginal improvement. However, across multiple experiments the runtimes are almost
identical - possibly due to the overhead of converting the Cython object back into numpy objects. Profiling
Cython code is difficult because it runs outside the Python call trace in machine-compiled code, so we should
always verify that changes have made tangible improvements. Since the performance is only marginally
different, and pure Python is more readable, we chose to stick with our original solution, which takes
22.466408014297485 seconds to run. Note that there is also an efficiency trade-off when code becomes harder
to maintain, and the compilation cycle is a developer overhead.
In the previous profiling we saw that dataframe unzipping was costly - we remove the unzipping of
self.voldf.iterrows(), self.activedf.iterrows()
since we are not referencing any data from them.
We also try optimizations that could use multiple cores in the compute_metas function:
1  import multiprocessing as mp
2  @staticmethod
3  def doinst(instdf, index):
4      @jit(nopython=True)
5      def numba_any(x):
6          return int(np.any(x))
7
17     ...
18
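The dispatch itself is elided above; a hedged sketch of how the per-instrument work might be farmed out across cores (names mirror the listing; this is not the paper's exact code) is:

# inside compute_metas: send each instrument's dataframe to a worker process
with mp.Pool(processes=mp.cpu_count()) as pool:
    results = pool.starmap(
        Alpha.doinst,
        [(self.dfs[inst], index) for inst in self.instruments],
    )
# pickling the dataframes to and from the workers is part of the overhead
# that ends up outweighing the multi-core gain on this problem size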
but when we kernprof it, it turns out that the process actually runs slower! It seems the overhead of
allocating memory and the extra kernel interactions outweigh the multi-core utilization. We come to the
grinding realization that incremental improvements are quite difficult once our code is fairly efficient -
we have to fight for scraps.
Recall that the functions get_pnl_stats and set_positions were rather costly (each costing about 10% of the
CPU time in run_simulation), so let's profile both functions:
103 3652 99693.0 27.3 2.7 portfolio_df.at[idx, "capital ret"] = capital_ret
104 7304 143375.0 19.6 3.8 portfolio_df.at[idx, "ewma"] = 0.06 * (capital_ret*
105 4849 22599.0 4.7 0.6 if nominal_ret != 0 else portfolio_df.at[idx -
106
107 3652 2590.0 0.7 0.1 return day_pnl, nominal_ret
That's interesting! It should set off alarm bells in our head - each function has one fairly innocent-looking
statement taking up most of the compute. Doing a Python object type inspection at runtime, it turns out that
in each of those scenarios at least one of the variables was a pandas.Series object instead of a numpy array.
Numpy arrays are quicker! Let's give that a quick fix by rebinding the rows to numpy arrays immediately after
unzipping, using '.values':
...
for (portfolio_i, portfolio_row), \
    (ret_i, ret_row), \
    (baseclose_i, baseclose_row), \
    (eligibles_i, eligibles_row), \
    (invrisk_i, invrisk_row), \
    in zip(
        self.portfolio_df.iterrows(),
        self.retdf.iterrows(),
        self.baseclosedf.iterrows(),
        self.eligiblesdf.iterrows(),
        self.invriskdf.iterrows(),
    ):
    portfolio_row = portfolio_row.values
    ret_row = ret_row.values
    baseclose_row = baseclose_row.values
    eligibles_row = eligibles_row.values
    invrisk_row = invrisk_row.values
    ...
This brings the runtime down from 22.466408014297485 seconds by ensuring that we are manipulating the
'efficient' datatypes. We are almost done! Let's just eke out some more performance - we return to kernprof:
==============================================================
117 @profile
118 def run_simulation(self):
119 """
120 Settings
121 """
122 1 2.0 2.0 0.0 portfolio_vol = 0.20
123 2 565.0 282.5 0.0 trade_datetime_range = pd.date_range(
124 1 4.0 4.0 0.0 start=self.get_trade_datetime_range()[0],
125 1 2.0 2.0 0.0 end=self.get_trade_datetime_range()[1],
126 1 2.0 2.0 0.0 freq="D"
127 )
128
129 """
130 Compute Metas
131 """
132 1 4123290.0 4123290.0 27.6 self.compute_metas(index=trade_datetime_range)
133
134 """
135 Initialisations
136 """
137 1 5334.0 5334.0 0.0 self.portfolio_df = self.init_portfolio_setting
138
139 1 2.0 2.0 0.0 date_prev = None
140 1 1.0 1.0 0.0 baseclose_prev = None
141 1 1.0 1.0 0.0 units_held, weights_held = [], []
142 """
143 self.eligiblesdf = eligiblesdf
144 self.invriskdf = invriskdf]
145 self.anyinformation required by compute_forecas
146 NOTE: needs to be implemented by the child alph
147 """
148
149 3654 2765613.0 756.9 18.5 for (portfolio_i, portfolio_row),\
150 3653 7320.0 2.0 0.0 (ret_i, ret_row),\
151 3653 5294.0 1.4 0.0 (baseclose_i, baseclose_row), \
152 3653 5234.0 1.4 0.0 (eligibles_i, eligibles_row),\
153 3653 5047.0 1.4 0.0 (invrisk_i, invrisk_row),\
154 2 4.0 2.0 0.0 in zip(\
155 1 3.0 3.0 0.0 self.portfolio_df.iterrows(),\
156 1 4.0 4.0 0.0 self.retdf.iterrows(),\
157 1 3.0 3.0 0.0 self.baseclosedf.iterrows(),\
158 1 2.0 2.0 0.0 self.eligiblesdf.iterrows(),\
159 1 2.0 2.0 0.0 self.invriskdf.iterrows(),\
160 ):
161
162 3653 40297.0 11.0 0.3 portfolio_row = portfolio_row.values
163 3653 19793.0 5.4 0.1 ret_row = ret_row.values
164 3653 17726.0 4.9 0.1 baseclose_row = baseclose_row.values
165 3653 17115.0 4.7 0.1 eligibles_row = eligibles_row.values
166 3653 16941.0 4.6 0.1 invrisk_row = invrisk_row.values
167
168 3653 5904.0 1.6 0.0 strat_scalar = 2
169
170 3653 5522.0 1.5 0.0 if portfolio_i != 0:
171
172 7304 1194867.0 163.6 8.0 day_pnl, nominal_ret = backtest_utils.g
173 3652 6202.0 1.7 0.0 portfolio_df=self.portfolio_df,
174 3652 5612.0 1.5 0.0 last_weights=weights_held[-1],
175 3652 5311.0 1.5 0.0 last_units=units_held[-1],
176 3652 4787.0 1.3 0.0 idx=portfolio_i,
177 3652 4805.0 1.3 0.0 baseclose_row=baseclose_prev,
178 3652 4862.0 1.3 0.0 ret_row=ret_row
179 )
180
181 7304 147291.0 20.2 1.0 strat_scalar = self.get_strat_scaler(
182 3652 5115.0 1.4 0.0 lookback=30,
183 3652 4830.0 1.3 0.0 portfolio_vol=portfolio_vol,
184 3652 5012.0 1.4 0.0 idx=portfolio_i,
185 3652 4924.0 1.3 0.0 default=strat_scalar
186 )
187
188 7304 163212.0 22.3 1.1 self.portfolio_df.at[portfolio_i, "ewst
189 4182 14071.0 3.4 0.1 if nominal_ret != 0 else self.portf
190
191 3653 111158.0 30.4 0.7 self.portfolio_df.at[portfolio_i, "strat sc
192
193 3653 4892665.0 1339.4 32.7 forecasts, num_trading = self.compute_forec
194 7306 483575.0 66.2 3.2 positions, nominal_tot = self.set_positions
195 3653 100961.0 27.6 0.7 capital=self.portfolio_df.at[portfolio_
196 3653 5455.0 1.5 0.0 portfolio_i=portfolio_i,
197 3653 5105.0 1.4 0.0 forecasts=forecasts,
198 3653 4862.0 1.3 0.0 eligibles=eligibles_row,
199 3653 4881.0 1.3 0.0 num_trading=num_trading,
200 3653 4922.0 1.3 0.0 portfolio_vol=portfolio_vol,
201 3653 4998.0 1.4 0.0 strat_scalar=strat_scalar,
202 3653 4989.0 1.4 0.0 invrisk_row=invrisk_row,
203 3653 4954.0 1.4 0.0 baseclose_row=baseclose_row
204 )
205
206 3653 225311.0 61.7 1.5 weights = self.set_weights(nominal_tot, pos
207 7306 19426.0 2.7 0.1 nominal_tot, positions, weights = self.post
208 3653 5074.0 1.4 0.0 index=portfolio_i,
209 3653 5036.0 1.4 0.0 date=ret_i,
210 3653 5066.0 1.4 0.0 eligibles=baseclose_row,
211 3653 4867.0 1.3 0.0 nominal_tot=nominal_tot,
212 3653 4954.0 1.4 0.0 positions=positions,
213 3653 5014.0 1.4 0.0 weights=weights
214 )
215
216 3653 175155.0 47.9 1.2 self.portfolio_df.at[portfolio_i, "nominal"
217 3653 118534.0 32.4 0.8 self.portfolio_df.at[portfolio_i, "leverage
218 3653 128820.0 35.3 0.9 self.portfolio_df.at[portfolio_i, "nomi
219
220 3653 5751.0 1.6 0.0 date_prev = portfolio_i
221 3653 7630.0 2.1 0.1 units_held.append(positions)
222 3653 5925.0 1.6 0.0 weights_held.append(weights)
223 3653 6466.0 1.8 0.0 baseclose_prev = baseclose_row
224
225 1 2213.0 2213.0 0.0 print(self.portfolio_df.loc[portfolio_i])
226
227 1 2.0 2.0 0.0 return self.portfolio_df
Compared to the list append operations, we see that '.at' is significantly more expensive (about 10 times).
Counting the '.at' operations that our functions make all together, we have more than 10 of them! If each
costs 1% of CPU, then cumulatively they take more than 10% of our total execution runtime. We can do better
here by maintaining stateful lists - we often only require the 'last' capital or the 'last' strategy scalar.
Recall that we already did this for the last positions held and for our weight vector for portfolio
allocations. Let's clean up all these '.at' operations by maintaining a list of each quantity and accessing
the 'last' entry with
list[-1]
4 nominal_ret = np . dot ( last_weights , ret_row )
5 capital_ret = nominal_ret * leverages [ -1]
6
Cleaning up our code everywhere takes some time, and we swap the order of a few lines to maintain logical
correctness (note the swapping of the pnl and strategy-scalar calculations):
18     """
19     Compute Metas
20     """
21     self.compute_metas(index=trade_datetime_range)
22
23     """
24     Initialisations
25     """
26     capital, ewma, ewstrat, self.portfolio_df = self.init_portfolio_settings(trade_range=trade_datetime_range)
27
28     date_prev = None
29     baseclose_prev = None
30     capitals = [capital]
31 ewmas = [ ewma ]
32 ewstrats = [ ewstrat ]
33 nominalss = []
34 leverages = []
35 strat_scalars = []
36 units_held , weights_held = [] , []
37 """
38 self . eligiblesdf = eligiblesdf
39 self . invriskdf = invriskdf ]
40 self . any informa tion required by c o m p u t e _ f o r e c a s t s
41 """
42
62 strat_scalar = 2
63
64 if portfolio_i != 0:
65         strat_scalar = self.get_strat_scaler(
66 lookback =30 ,
67 portfolio_vol = portfolio_vol ,
68 idx = portfolio_i ,
69 default = strat_scalars [ -1] ,
70 ewmas = ewmas ,
71 ewstrats = ewstrats
72 )
73
74         day_pnl, nominal_ret = backtest_utils.get_pnl_stats(
75 portfolio_df = self . portfolio_df ,
76 last_weights = weights_held [ -1] ,
77 last_units = units_held [ -1] ,
78 idx = portfolio_i ,
79 baseclose_row = baseclose_prev ,
80 ret_row = ret_row ,
81 capitals = capitals ,
82 leverages = leverages ,
83 ewmas = ewmas
84 )
85
115 nominalss . append ( nominal_tot )
116 leverages . append ( nominal_tot / capitals [ -1])
117 units_held . append ( positions )
118 weights_held . append ( weights )
119
120 if verbose :
121 capital_ser = pd . Series (
122 data = capitals , index = trade_datetime_range , name = " capital " )
123 stratscal_ser = pd . Series (
124 data = strat_scalars , index = trade_datetime_range , name = " strat_scalar " )
125 nominals_ser = pd . Series (
126 data = nominalss , index = trade_datetime_range , name = " nominal_tot " )
127 leverages_ser = pd . Series (
128 data = leverages , index = trade_datetime_range , name = " leverages " )
129     units = pd.DataFrame(
130         data=units_held, index=trade_datetime_range, columns=[inst + " units" for inst in self.instruments])
131     weights = pd.DataFrame(
132         data=weights_held, index=trade_datetime_range, columns=[inst + " w" for inst in self.instruments])
133 self . portfolio_df = pd . concat ([
134 units ,
135 weights ,
136 stratscal_ser ,
137 nominals_ser ,
138 leverages_ser ,
139 capital_ser ,
140 ] , axis =1)
141 print ( self . portfolio_df )
142 else :
143 print ( capitals )
144 return self . portfolio_df
This brings us down from 15.73388385772705 seconds. The compute time of the final section, reconstructing
our portfolio dataframe, is actually quite cheap:
232 1 984.0 984.0 0.0 stratscal_ser = pd.Series(data=strat_scalar
233 1 1219.0 1219.0 0.0 nominals_ser = pd.Series(data=nominalss, in
234 1 834.0 834.0 0.0 leverages_ser = pd.Series(data=leverages, i
235 1 66953.0 66953.0 0.5 units = pd.DataFrame(data=units_held, index
236 1 62550.0 62550.0 0.5 weights = pd.DataFrame(data=weights_held, i
237 3 3505.0 1168.3 0.0 self.portfolio_df = pd.concat([
238 1 2.0 2.0 0.0 units,
239 1 2.0 2.0 0.0 weights,
240 1 1.0 1.0 0.0 stratscal_ser,
241 1 2.0 2.0 0.0 nominals_ser,
242 1 2.0 2.0 0.0 leverages_ser,
243 1 2.0 2.0 0.0 capital_ser,
244 1 2.0 2.0 0.0 ], axis=1)
245 1 295655.0 295655.0 2.3 print(self.portfolio_df)
246 else:
247 print(capitals)
248 1 4.0 4.0 0.0 return self.portfolio_df
With all the above techniques applied, let’s run the other strategies:
We have not covered all of the tools available to the engineer for writing performant Python code. Here we
enumerate some of the more well-known/common tools that we believe are worth looking into and that fit
quantitative number crunching well (we might cover them in future papers!): OpenMP, NumExpr, Dask,
Swifter, PyPy. Some of these tools can be combined with the solutions discussed earlier to hypercharge
our code - such as Numba and OpenMP, Cython and OpenMP, Dask/Swifter and Numba, et cetera.
Other topics you might be interested in: Operating System Schedulers, Distributed Systems,
Bytecode and Assembly Language, Hyperthreading.
13 Conclusions
We have introduced some of the 'quick-and-easy' ways Python developers can speed up their programs. Note
that this is just the tip of the iceberg: there are always more ways to squeeze extra compute out of your
system, as well as specialised engineering you can do for your operating system's architecture to make your
code fly - from high-level languages down to inspecting assembly and the machine code running on silicon!
We have worked with relatively manageable datasets - different optimizations, such as distributed algorithms
and data structures, may be required when working with huge datasets.
We started with profiling, finding the quick wins and 'attacking where it hurts'. We focused on making our
code performant with a pure Python approach; then, as we hit the language's performance bounds, we
circumvented them with compilation approaches in Numpy, Numba and Cython. Lastly, our optimizations gave
diminishing returns, such that ever greater amounts of refactoring were required for comparable gains - a
good sign that our code is getting decently fast.
We have also not covered in detail the depths of concurrency in Python - most of the speedups introduced
have been runtime optimizations that unlock what Python hides away from programmers to deal with thorny
issues such as memory management. I hope that for many beginner (and some intermediate-level) programmers
this was a peek into the world of optimization in programming, and that it piques your interest in the path
towards better programming standards. In the future, we hope to introduce more on concurrency and on writing
parallelism into our systems - with discussions on inter-process communication, semaphores and memory
management!
References