Bachelor Majorin Justus 2022

This bachelor's thesis by Justus Majorin investigates the impact of daily order imbalance on stock returns in the Finnish stock market from 2010 to 2013, using fixed-effects panel regressions to control for unobservable effects. The findings confirm a positive relationship between current order imbalance and stock returns, while also revealing that lagged order imbalances can predict future returns when controlling for current imbalance. Additionally, the thesis explores the connection between order imbalance and asymmetric information, contributing to the existing literature on market microstructure and stock pricing.


ORDER IMBALANCE AND STOCK RETURNS:

Evidence from the Finnish stock market

Bachelor’s Thesis
Justus Majorin
Aalto University School of Business
Bachelor’s programme in Finance
Fall 2022
Aalto University, P.O. BOX 11000, 00076 AALTO
Abstract of bachelor’s thesis

Author Justus Majorin


Title of thesis Order imbalance and stock returns: Evidence from the Finnish stock market
Degree Bachelor of Science
Degree programme Finance
Thesis advisor(s) Matthijs Lof
Year of approval 2022 Number of pages 27+5 Language English

Abstract
In financial markets, order imbalance refers to whether there are excess buy- or sell-initiated orders
in a trading period. This thesis studies the effects of daily order imbalance on both contemporaneous
and future stock returns in the cross-section of Finnish stocks in the years 2010-2013. Contrary to
most previous research regarding order imbalance, I use fixed-effects panel regressions to control
for unobservable effects. In my main analysis, the results confirm the positive relationship between
current order imbalance and stock returns. In contrast to previous literature, when I do not control
for current order imbalance in my regression analysis, my results regarding return predictability are
less confirmatory. In addition, I also examine the relationship between order imbalance and
asymmetric information.

Keywords Order imbalance, Panel regression, Stock returns, Asymmetric information


Table of Contents

1 Introduction
2 Literature review
2.1 Theoretical models linked to order imbalance
2.2 Empirical results of the effects of daily order imbalance on stock returns
3 Data and methodology
3.1 Initial dataset and inclusion requirements
3.2 Trade classification
3.3 Measures of order imbalance
3.4 Summary statistics
3.5 Regression specifications
4 Order imbalance and stock returns
4.1 Regression results
4.2 Regression results grouped by size terciles
4.3 Results grouped by different levels of order imbalance
4.4 Results with Nasdaq’s order imbalance reports
5 Order imbalance and information asymmetry
5.1 Order imbalance and VCV
5.2 Order imbalance and future returns
6 Conclusion
Appendix
References

1 Introduction

Contrary to neoclassical finance theory, which assumes the main source of market movements to be new
information, the market microstructure literature tries to identify different sources of price effects moving
the market. These price effects are mostly unexplainable by neoclassical theory. A key issue of financial
economics, what moves market prices, has fueled a vast amount of market microstructure literature which
has studied the effects of trading volume on stock returns (e.g., Hiemstra and Jones (1994), Lo and Wang
(2000), and Chordia et al. (2001)).

Similar to trading volume, order imbalance has gained academic attention in the market microstructure
literature since it has many implications for individual stock prices and for the aggregate market. For
example, Chordia et al. (2002) argue that order imbalances are a better measure of trading activity than
volume, since volume does not distinguish the direction of the trade. Consequently, trading volume hides
many of the effects on both the liquidity and the price of the underlying instrument, such as a stock.

In the stock market, there are sell and buy orders on each stock. Order imbalance means that there are excess
buy trades compared to sell trades, or vice versa. Thus, order imbalance is a result of excess supply or
excess demand. Naturally, order imbalance only makes sense as a concept in an intermediated market, in
which market makers accommodate price pressures from traders, which arise from their buying and selling
activity (Chordia and Subrahmanyam (2004)). Should stock exchanges be non-intermediated, for every
buyer there would be a seller, and an equal number of trades containing an equal number of shares would
be executed in both directions and order imbalance would not exist.

Traditional asset pricing literature argues that assets are priced efficiently, such that all relevant information
is already reflected in the asset’s price. In financial markets, asset prices are constantly revised to reflect
the latest information; thus, the prices of assets cannot be separated from the process in which they become
efficient (Easley et al. (2002)). If observable order imbalance proxies for unobservable private-information-based
trading (Chordia et al. (2019)), it should affect asset prices.

Furthermore, the market microstructure literature argues that order imbalance has implications for stock
prices from the market maker’s side as well, since a large order imbalance forces market makers to change
bid-ask spreads and price quotations (Chordia et al. (2002)). The market maker should revise the price
upward when there are excess buy orders, and down when there are excess sell orders (Chan and Fong
(2000)). Ultimately, bid-ask spreads compensate market makers for adverse selection (private information)
risk (Kyle (1985), Chordia and Subrahmanyam (2004)) and inventory risk (Stoll (1978)).

The purpose of this bachelor’s thesis is to analyze how short-term stock returns in the Finnish stock market
are affected by order imbalance. The analysis will be conducted cross-sectionally and on a daily level. My
research will contribute to the literature in several ways. First, a large part of the previous literature uses
aggregated time-series regressions of varying frequencies, while controlling for different liquidity and other
relevant proxies, to scrutinize the effects of order imbalance. But in this study, I use daily fixed-effects
cross-sectional panel regressions to control for unobservable effects and to better capture the effects of
order imbalance on stock returns. A more thorough explanation of this methodology is in Section 3.5.

Second, in previous research, an algorithm is usually used to determine whether there is order imbalance
or not and the direction of it, i.e., excess supply or demand. These algorithms, such as the Lee and Ready
(1991) algorithm, which determines the direction of the trade, are used to calculate estimates of order
imbalance. Since they are estimates, they usually have some prediction error. Furthermore, trade
classification has become increasingly difficult in modern electronically traded high-frequency markets
(Easley et al. (2012), p.14-15). From my data, I can distinguish whether an executed order was a buy or a
sell-initiated order, so the use of algorithms to determine the direction of the trade is not necessary. Using
this approach, I can measure daily order imbalance more accurately and avoid the estimation error resulting
from using an algorithm. The trade classification is discussed in more detail in Section 3.2.

Third, like most previous research, I analyze how the effects of order imbalance differ across stocks in
different size terciles, to further investigate the return predictability linked to order imbalance in the Finnish
stock market. Lastly, I study the relation between order imbalance and asymmetric information,
providing evidence of the longer-term return effects associated with order imbalance, which have received
little scrutiny in the literature.

I show that in the Finnish stock market, order imbalance causes a contemporaneous price pressure, which
reverses on the following days. Contrary to many previous studies, when I only use lagged order imbalances
and do not control for current order imbalance, I find almost no significant predictability of future returns.
However, using Nasdaq Nordic’s own imbalance reports published at the end of each trading day, I find
that order imbalance at the closing auction causes a price pressure, which spills over into the next trading
day, affecting the next day’s return positively. Thus, I document some evidence of lagged order imbalances
predicting future returns.

When analyzing the results of my empirical analysis, I am aware that the bid-ask bounce effect (e.g., Blume
and Stambaugh (1983), and Roll (1984)), resulting from trades executed at either the bid or the ask (and
not the equilibrium price between them), might bias my results. However, the bid-ask bounce effect is
particularly pronounced for the smallest and most illiquid stocks, since their bid and ask quotes are typically
farther apart than those of more liquid stocks. A second concern is non-synchronous trading, initially
proposed by Fisher (1966), which arises when certain stocks are traded less frequently, so that their closing
prices might be set at an earlier time during the trading day than those of more liquid stocks. For example, the
market price of a very illiquid stock might not react to new information on the day it becomes public, if
there are no executed trades of that stock on that day. Thus, the closing price might not reflect the intrinsic
value of that asset, causing the return to be biased. Non-synchronous trading is another source of noise in
daily returns, the effect of which is again most notable for the smallest, most illiquid stocks. I address both
issues by removing the smallest and most illiquid stocks from my sample, as explained in Section 3.1.

The rest of this paper is organized as follows: In Section 2, previous theoretical models and empirical
findings are discussed. Section 3 presents the data and methodology used. Section 4 analyzes the regression
results. Section 5 sheds light on how order imbalance relates to asymmetric information, and Section 6
concludes.

2 Literature review

This section reviews previous literature regarding order imbalance, focusing mainly on research in
which the empirical analysis is conducted on a daily level. Section 2.1 examines theoretical models
connected to order imbalance, its causes, and the effects it has on stock returns. Section 2.2 provides an
analysis of the empirical results in previous studies.

2.1 Theoretical models linked to order imbalance

A model developed by Stoll (1978) states that market makers (dealers) have a desired portfolio with a
desired level of risk and return, just like any other investor. When supplying immediate execution to traders,
the market maker’s portfolio moves away from her desired portfolio, so that she bears risk she does not
want to bear (or forgoes risk she would want to bear). In the model, the market maker’s revision of price
quotes results in a positive relation between order imbalance and subsequent changes in asset prices.
Consider the following example. For a market maker to have a long position in a stock (more than her
desired level), there must have been a negative order imbalance, caused by more
selling than buying by traders and, conversely, more buying than selling by the market maker who executes
the traders’ orders. Since the market maker has a long position, she sets the quotes so that the prices
encourage traders to buy (her to sell) and discourage traders from selling (her from purchasing) that stock.
Thus, the bid and ask prices are both lower (more people are willing to buy at the lower price, fewer people
are willing to sell at it), causing the price of the stock to fall and the returns to be
negative. Assuming no new information emerges, after offloading the inventory, the market maker’s price
quotes revert to their original levels, which in turn causes a negative relation between order imbalance
and future stock prices and thus returns.

A similar model developed by Roll (1984) argues that the actions of market makers make stock price
changes exhibit negative autocorrelation. In a non-intermediated market, assuming that prices are informationally
efficient and that there are no trading costs, the market price of an asset contains all relevant information,
such that a price change can only happen if new information is received by the market participants.
Consequently, consecutive price changes should not be autocorrelated. In an intermediated market, the
market maker must be compensated for executing trades, and thus bid-ask spreads arise. Because trades are
executed at either the bid or the ask (and not at the equilibrium price between them), price changes become
negatively autocorrelated. The explanation is as follows: a market sell (buy) order executed at
the bid (ask) will be followed by a trade at an identical price or at a higher (lower) price, the ask (bid). Since
a sell order results in (a) a negative order imbalance (order imbalance = buy - sell) and (b) the
following trades being executed at a higher (or equal) price, order imbalances have a negative
relation with future returns.
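
The bid-ask bounce mechanism can be illustrated with a small simulation. The sketch below assumes a constant efficient price, a fixed spread, and independent, equally likely buy and sell orders; under these assumptions, Roll (1984) shows that the first-order autocovariance of price changes equals -s^2/4, where s is the spread. The function names, the midpoint of 100 and the spread of 0.10 are hypothetical choices made for illustration and are not taken from the data used in this thesis.

```python
import random


def simulate_roll_prices(n_trades, spread, seed=0):
    """Simulate transaction prices bouncing between bid and ask.

    Assumption: the efficient (mid) price is constant, i.e. no new information.
    """
    rng = random.Random(seed)
    mid = 100.0  # hypothetical constant midpoint
    half_spread = spread / 2.0
    # +1: buyer-initiated trade at the ask, -1: seller-initiated trade at the bid
    return [mid + half_spread * rng.choice((1, -1)) for _ in range(n_trades)]


def first_order_autocovariance(prices):
    """Sample autocovariance between consecutive price changes."""
    changes = [later - earlier for earlier, later in zip(prices, prices[1:])]
    mean = sum(changes) / len(changes)
    pairs = list(zip(changes[:-1], changes[1:]))
    return sum((x - mean) * (y - mean) for x, y in pairs) / len(pairs)


spread = 0.10  # hypothetical bid-ask spread
prices = simulate_roll_prices(200_000, spread)
autocov = first_order_autocovariance(prices)
print("simulated autocovariance:", autocov)
print("theoretical value -s^2/4:", -spread ** 2 / 4)
```

Even though the efficient price never moves in this simulation, consecutive price changes are negatively correlated, which is the noise that bid-ask bounce adds to returns.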

In the well-known Kyle (1985) model, informed traders, who trade on one side of the market causing an
order imbalance, consider the price impact that their orders have on future prices. In the model, these
informed traders want to gain maximum utility from their private information, i.e., maximize their profits.
Thus, with the help of the noise provided by liquidity traders, informed traders split their orders and trade
gradually on one side of the market until the price of the asset has converged to the level which their private
information would suggest. This behavior of the informed traders causes autocorrelation in trades and order
imbalances, which in turn magnifies the above-stated inventory holding effect proposed by Stoll (1978).

Chordia and Subrahmanyam (2004) build on the Kyle (1985) model described above, in which informed
traders want to minimize the price impact of their trades and thus split their orders over time. This causes
autocorrelation in order imbalances (since informed traders trade on one side of the market), which in turn
causes continuing price pressures, increasing the price of the asset. These continuing price pressures have
a positive relation with subsequent returns, which means that lagged order imbalances should exhibit a
positive relation with contemporaneous returns. Assuming that there is no new information available, when
the informed traders have traded their desired amount, which results in no new price pressures arising, the
prices of assets should revert to their original levels.

In their theoretical model, Chordia and Subrahmanyam (2004) also argue that, controlling for current order
imbalance, lagged imbalances should be negatively related to returns. Their intuition is the following:
a contemporaneous price pressure has two components, history-dependent (autocorrelated) trades and
history-independent (innovation) trades. Current order imbalance weights both components equally,
while in fact the history-dependent trades should have a lower weight than the
innovation trades. This is because the information content of the contemporaneous autocorrelated trades
has partially been revealed, and the associated price effect already incorporated, by the prior
autocorrelated trades. Thus, the lagged order imbalances compensate for this contemporaneous
over-weighting and consequently exhibit a negative relation with current returns.

More recent research links variation in order flow to the information asymmetry cost literature, arguing that
readily measurable order flow volatility can proxy for private information costs (Chordia et al. (2019)).
Chordia et al. (2019) also argue that high order flow volatility indicates higher trading activity by informed
traders, which increases adverse selection (private information) costs. In their model, market makers infer
the likelihood of trading with an informed trader from the order flow imbalances and widen bid-ask spreads
accordingly. High order imbalance volatility, proxying for non-observable asymmetric information, would
cause a short-term reduction in the price of the asset, allowing for future returns to be higher, due to
asymmetric information causing a premium in the return on equity required by investors.

Even more recently, Bogousslavsky and Collin-Dufresne (2022) develop a high-frequency inventory model
of order imbalance to study the inventory risk of market makers who operate at high frequencies. They
build a theoretical model in which trading activity does not have an unambiguous effect on spreads:
Increased volume increases the likelihood of an offsetting trade (a trade in the opposite direction) arriving,
reducing the market maker’s average holding period for inventory and consequently risk and spreads. In
contrast, in their model, increased volume can also increase the volatility of shocks to inventory, making
liquidity provision riskier, and thus increasing spreads.

2.2 Empirical results of the effects of daily order imbalance on stock returns

The early literature regarding order imbalance focused on short periods of time, for example Black Monday
(Blume et al. (1989)), specific events such as earnings announcements (Lee (1992)) or a small sample of
stocks (Brown et al. (1997)). Already at that time, the literature concluded that order imbalance has a
statistically significant effect on stock returns both contemporaneously and with a lag, but the results had
little theoretical underpinning. Order imbalance was found to affect returns on the next
trading day, and some studies even found return effects on the trading day after that.

Extending the sample period beyond specific events, Stoll (2000) studies the effects of order imbalance on
stock returns for both NYSE and NASDAQ stocks between December 1997 and February 1998. He finds
a significant positive link between contemporaneous order imbalance and stock returns. In his sample,
conditional (controlling for contemporaneous order imbalance) lagged order
imbalances have a less significant effect on stock returns.

Chan and Fong (2000) study the volume-volatility relation with order imbalance and its effects on NYSE
and NASDAQ stocks with a relatively short sample period of six months. In their empirical analysis, they
conclude that order imbalance predicts contemporaneous returns positively and significantly when
controlling for weekday-specific effects and past returns.

Chordia et al. (2002) were the first to analyze the effects of order imbalances on stock returns over a longer
sample period. In their model, the theoretical underpinning is the market maker’s inventory problem, related
to the inventory model of Stoll (1978). They find that many factors cause order imbalance, such as changes
in macroeconomic variables or weekday-specific regularities in trading activity. Furthermore, they show
that order imbalance has a strong positive contemporaneous effect on returns, which eventually exhibit
reversals. These reversals are largest and most significant after days with large negative returns.

Chordia and Subrahmanyam (2004) find their empirical results to be consistent with their model described
in Section 2.1: In their sample of NYSE stocks, order imbalances are positively autocorrelated, which
results in unconditional lagged order imbalances having a positive relation with current returns due to price
pressures caused by these imbalances. In addition, they show that, controlling for contemporaneous order
imbalance, lagged order imbalances are negatively related to current stock returns – consistent with their
theory. Their results exhibit the strongest explanatory power in the three smallest size quartiles. Shenoy and
Zhang (2007) confirm the results of Chordia and Subrahmanyam (2004) for Chinese stocks, and Hanke and
Wiegerding (2015) for German stocks.

3 Data and methodology

3.1 Initial dataset and inclusion requirements

The data I use in my empirical analysis is Nasdaq Nordic’s Nordic Equity TotalView ITCH data, which
is the raw feed of all the Nordic and Baltic Nasdaq exchanges. It reports added, executed, and cancelled
orders at millisecond precision. The sample period is 08.02.2010 – 26.04.2013, which is the full length of
my data, totaling 21 billion rows. For this analysis, only executed orders from the Finnish equities exchange
are used. Following previous research (e.g., Chordia et al. (2019), p.1522; Chan and
Fong (2000), p.252-253), to mitigate the effects of bid-ask bounce (e.g., Blume and Stambaugh (1983)) and
non-synchronous trading (e.g., Fisher (1966), and Lo and MacKinlay (1990)), I remove the most illiquid
stocks by using only large- and mid-cap stocks in the final sample. In addition, stocks for which a value of
order imbalance cannot be calculated on each day on which they are traded are considered illiquid and
thus excluded from the final sample. More information on the ITCH data is in the Appendix. I
am also aware that some large trades might be executed over the counter (OTC), in which case they do not
appear in my data. Thus, due to the unavailability of data on these OTC trades, my value of order
imbalance, which only includes trades executed on the Nasdaq exchange, might not represent the total
trading activity of a stock.

Daily stock returns and the stocks’ market capitalizations are downloaded from Refinitiv Datastream, the
accuracy of which I cannot guarantee. Daily returns are used as the dependent variable in the regression
analysis and market capitalizations are used to group stocks into three size terciles.

3.2 Trade classification

To determine the direction of the trade, I classify each executed limit order as buy- or sell-initiated using
Odders-White’s (2000) immediacy definition. This definition assigns the direction of each trade based on the
immediacy of the execution, such that the trader who demands instantaneous execution is the one who
initiates the trade. This means that the trader who places a limit order to sell or buy a stock at a given price
is in fact the non-initiator, i.e., a passive supplier of liquidity. Conversely, the trader who places a market
order, or trades at the price of a non-initiator, is the one initiating the trade, i.e., the liquidity taker. This
definition is implemented in the data by switching the direction indicator in the executed order messages,
such that an executed sell (buy) limit order is in fact a trade initiated by a trader buying (selling) that stock.
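
As a minimal sketch of this switching step, assuming each execution message carries the side of the resting limit order: the record layout, the field names and the "B"/"S" codes below are hypothetical placeholders chosen for illustration and do not reproduce the actual ITCH message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutedOrder:
    """One execution message (hypothetical layout, not the ITCH specification)."""
    stock: str
    resting_side: str  # side of the resting limit order: "B" (buy) or "S" (sell)
    shares: int


def initiator_side(execution: ExecutedOrder) -> str:
    """Side of the liquidity taker under the immediacy definition.

    The resting limit order supplies liquidity, so the trade is initiated
    by the opposite side.
    """
    opposite = {"B": "S", "S": "B"}
    return opposite[execution.resting_side]


# An executed sell limit order is a buyer-initiated trade, and vice versa.
sell_limit_hit = ExecutedOrder(stock="XYZ", resting_side="S", shares=100)
buy_limit_hit = ExecutedOrder(stock="XYZ", resting_side="B", shares=250)
print(initiator_side(sell_limit_hit), initiator_side(buy_limit_hit))
```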

3.3 Measures of order imbalance

Similar to Chordia and Subrahmanyam (2004), two measures of order imbalance (𝐼) are calculated: share-
based (𝐼_𝑆𝐻𝑅) and trade-based (𝐼_𝑁𝑈𝑀) order imbalance. 𝐼_𝑆𝐻𝑅 (𝐼_𝑁𝑈𝑀) is the number of buy-initiated
shares traded (buy-initiated trades) on day 𝑡 of stock 𝑖 less the number of sell-initiated shares traded (sell-
initiated trades) on day 𝑡 of stock 𝑖 as a fraction of the total number of shares traded (total trades) of stock
𝑖 on day 𝑡. These measures are divided by the total number of shares traded (total trades) to remove the
effect of trading volume, since frequently traded stocks have a systematically larger absolute order
imbalance. This way, order imbalance is expressed as a fraction of the total trading volume on each day.
Formally,

\[
\mathit{I\_SHR}_{i,t} = \frac{\text{Number of shares traded by buyers}_{i,t} - \text{Number of shares traded by sellers}_{i,t}}{\text{Total number of shares traded}_{i,t}} \tag{1}
\]

and

\[
\mathit{I\_NUM}_{i,t} = \frac{\text{Number of trades initiated by buyers}_{i,t} - \text{Number of trades initiated by sellers}_{i,t}}{\text{Total number of trades}_{i,t}} \tag{2}
\]

In contrast to my definition above, Li et al. (2010, p. 1243) state that the cancellation of a limit sell (buy)
order affects order imbalance in the same way as adding a limit buy (sell) order. When they use this
enhanced definition of order imbalance, which also considers cancelled orders, their model exhibits greater
explanatory power. In this bachelor’s thesis, I do not use their definition of order imbalance, which
might make the effects in my regression results less pronounced.
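
The two measures in equations (1) and (2) can be computed from classified trades as in the following sketch. It assumes that every trade has already been assigned an initiator side; the tuple layout, the "B"/"S" codes, the ticker "XYZ" and the trade sizes are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def daily_order_imbalance(trades):
    """Compute (I_SHR, I_NUM) for every stock-day.

    trades: iterable of (stock, day, initiator, shares) tuples, where
    initiator is "B" (buyer-initiated) or "S" (seller-initiated).
    Returns a dict {(stock, day): (i_shr, i_num)}.
    """
    totals = defaultdict(lambda: {"buy_shares": 0, "sell_shares": 0,
                                  "buy_trades": 0, "sell_trades": 0})
    for stock, day, initiator, shares in trades:
        bucket = totals[(stock, day)]
        if initiator == "B":
            bucket["buy_shares"] += shares
            bucket["buy_trades"] += 1
        elif initiator == "S":
            bucket["sell_shares"] += shares
            bucket["sell_trades"] += 1
        else:
            raise ValueError(f"unknown initiator code: {initiator!r}")

    result = {}
    for key, b in totals.items():
        # Equation (1): net buyer-initiated shares over total shares traded
        i_shr = (b["buy_shares"] - b["sell_shares"]) / (b["buy_shares"] + b["sell_shares"])
        # Equation (2): net buyer-initiated trades over total number of trades
        i_num = (b["buy_trades"] - b["sell_trades"]) / (b["buy_trades"] + b["sell_trades"])
        result[key] = (i_shr, i_num)
    return result


example_trades = [
    ("XYZ", "2010-02-08", "B", 300),
    ("XYZ", "2010-02-08", "B", 100),
    ("XYZ", "2010-02-08", "S", 600),
]
i_shr, i_num = daily_order_imbalance(example_trades)[("XYZ", "2010-02-08")]
print(i_shr, i_num)
```

In this example two small buyer-initiated trades face one large seller-initiated trade, so the share-based measure is negative while the trade-based measure is positive. This is one reason to report both measures.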

3.4 Summary statistics

Table 1 shows the number of stocks in the final sample in each subperiod. The number of stocks in
the final sample is distributed evenly, but the number of daily observations is not, since 2010 and 2013 are
not full years in the sample.

Table 1 The number of stocks and daily observations in the final sample
This table reports the number of stocks for each year in the final sample.

Subperiod             2010     2011     2012     2013
Number of stocks        53       56       54       56
Daily observations    6864    12521    11515     4011

Table 2 reports descriptive statistics for the whole sample period, totaling 34911 daily observations. The
shares of positive and negative values indicate that order imbalances are quite well balanced in my sample,
but the median (-0.005 and -0.004) and mean (-0.008 and -0.007) of both 𝐼_𝑆𝐻𝑅 and 𝐼_𝑁𝑈𝑀 are negative,
which indicates that seller-initiated trades and shares slightly outnumber buyer-initiated ones.

Both 𝐼_𝑆𝐻𝑅 and 𝐼_𝑁𝑈𝑀 exhibit a notable standard deviation (0.239 and 0.207), suggesting that order
imbalance is quite volatile and that 𝐼_𝑆𝐻𝑅 is more volatile than 𝐼_𝑁𝑈𝑀. Also, the minimum (-0.998 and
-0.938) and maximum (0.995 and 0.888) values show that on some days order
imbalances are extreme, i.e., almost all trades of a stock are executed in one direction. While order
imbalances tend to be negative, the average and median daily returns are slightly positive. Moreover, the
share of positive values (0.520) is slightly larger than the share of negative values (0.480) for the daily
returns.

Table 2 Descriptive statistics for the final sample
This table presents descriptive statistics of the final sample.

                             𝐼_𝑆𝐻𝑅     𝐼_𝑁𝑈𝑀     𝐷𝑎𝑖𝑙𝑦 𝑅𝑒𝑡𝑢𝑟𝑛
N                            34911     34911     34911
Mean                        -0.008    -0.007     0.000
Standard deviation           0.239     0.207     0.022
Minimum                     -0.998    -0.938    -0.260
Median                      -0.005    -0.004     0.000
Maximum                      0.995     0.888     0.349
Share of negative values     0.512     0.508     0.480
Share of positive values     0.488     0.492     0.520

Complementary to Table 2, Figure 1 shows the distribution of both share-based and trade-based order
imbalances. Order imbalances follow a normal-like distribution with a mean close to zero, but the trade-based
order imbalance (𝐼_𝑁𝑈𝑀) exhibits more observations close to the mean of the distribution. The
distributions of both measures of imbalance in the three different size terciles are in the Appendix.

Figure 1 Distribution of share-based and trade-based order imbalances


This figure shows the distribution of order imbalance in the final sample for both share-based and trade-based order
imbalances.

Table 3 reports descriptive statistics similar to those in Table 2, but with the stocks divided into three
terciles by the market capitalization of the company. The companies are assigned to size terciles based on
their market capitalizations at the end of the previous month. The shares of positive and negative values for
both share-based and trade-based order imbalance suggest that order imbalances are distributed rather evenly into
positive and negative values in all three size terciles. Interestingly, the average share-based order
imbalance (𝐼_𝑆𝐻𝑅) for large stocks is positive (0.005), while the trade-based order imbalance (𝐼_𝑁𝑈𝑀) is
negative (-0.002). Also, contrary to what Table 2 would suggest, the trade-based order imbalance is more
volatile than the share-based order imbalance for large stocks (standard deviation 0.112 versus 0.108). This
implies that the effects of order imbalance on stock returns should also be analyzed in each size tercile
separately. Furthermore, as one might expect, the standard deviation of both 𝐼_𝑆𝐻𝑅 and 𝐼_𝑁𝑈𝑀
decreases and their maximum and minimum values become less extreme as the size of the company
increases, with the change being larger for share-based order
imbalance. For the daily returns, the standard deviation is relatively similar for all sizes, but the maximum
and minimum returns are most extreme in the smallest tercile. Both the average and median daily returns
are positive in all three terciles.

Table 3 Descriptive statistics for the final sample by size
This table reports the descriptive statistics for the final sample grouped by size terciles of stocks.

                                   𝐼_𝑆𝐻𝑅                      𝐼_𝑁𝑈𝑀                   𝐷𝑎𝑖𝑙𝑦 𝑅𝑒𝑡𝑢𝑟𝑛
                            Small     Mid   Large     Small     Mid   Large     Small     Mid   Large
N                           11648   11639   11624     11648   11639   11624     11648   11639   11624
Mean                       -0.027  -0.001   0.005    -0.016  -0.005  -0.002     0.000   0.000   0.000
Standard deviation          0.334   0.217   0.108     0.275   0.199   0.112     0.022   0.023   0.021
Minimum                    -0.998  -0.972  -0.516    -0.938  -0.937  -0.789    -0.260  -0.192  -0.178
Median                     -0.033  -0.003   0.003    -0.016  -0.003  -0.001     0.000   0.000   0.000
Maximum                     0.995   0.955   0.551     0.888   0.839   0.697     0.349   0.184   0.138
Share of negative values    0.539   0.508   0.488     0.515   0.505   0.503     0.475   0.482   0.484
Share of positive values    0.461   0.492   0.512     0.485   0.495   0.497     0.525   0.518   0.516

3.5 Regression specifications

Following previous studies (e.g., Chordia and Subrahmanyam (2004), and Hanke and Wiegerding (2015)),
I try to predict stock returns (a) using both contemporaneous and lagged order imbalances and (b) using
only lagged order imbalances, to get a more comprehensive understanding of how order imbalance affects
stock returns. Similar to Hanke and Wiegerding (2015), I use fixed-effects panel regressions
to capture the effects of order imbalance. Using both day and stock fixed effects, I can control for
unobserved effects that are either common to all stocks on a given day or constant over time for a given stock.
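The role of the day and stock fixed effects can be illustrated with a minimal sketch, which is not the estimation code of this thesis. It assumes a balanced toy panel and a single regressor, in which case the two-way within transformation subtracts the stock mean and the day mean and adds back the grand mean. The function names and the toy numbers are hypothetical.

```python
from collections import defaultdict


def two_way_demean(values):
    """Remove stock and day means from a balanced panel.

    `values` maps (stock, day) -> float. For a balanced panel the two-way
    within transformation is x_it - mean_i - mean_t + grand_mean.
    """
    by_stock, by_day = defaultdict(list), defaultdict(list)
    for (stock, day), v in values.items():
        by_stock[stock].append(v)
        by_day[day].append(v)
    stock_mean = {s: sum(v) / len(v) for s, v in by_stock.items()}
    day_mean = {d: sum(v) / len(v) for d, v in by_day.items()}
    grand = sum(values.values()) / len(values)
    return {(s, d): v - stock_mean[s] - day_mean[d] + grand
            for (s, d), v in values.items()}


def within_slope(returns, imbalance):
    """OLS slope of demeaned returns on demeaned imbalance (one regressor)."""
    r = two_way_demean(returns)
    x = two_way_demean(imbalance)
    sxy = sum(x[k] * r[k] for k in x)
    sxx = sum(x[k] ** 2 for k in x)
    return sxy / sxx


# Toy balanced panel: return = 0.02 * imbalance + stock effect + day effect.
stocks = {"A": 0.001, "B": -0.002, "C": 0.0005}   # hypothetical stock effects
days = {1: 0.003, 2: -0.001, 3: 0.0, 4: 0.002}    # hypothetical day effects
imbalance = {(s, d): ((i * 7 + d * 3) % 11 - 5) / 10
             for i, s in enumerate(stocks) for d in days}
returns = {(s, d): 0.02 * imbalance[(s, d)] + stocks[s] + days[d]
           for s in stocks for d in days}

beta = within_slope(returns, imbalance)
print(round(beta, 6))
```

Because the stock and day effects are removed exactly by the transformation, the slope is recovered without estimating the effects themselves. An unbalanced panel or several regressors would require a proper panel estimator.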

Following Hanke and Wiegerding (2015), who also use fixed-effects panel regressions, I perform a
Hausman (1978) test to evaluate whether a fixed-effects or a random-effects
model fits the data better. As indicated in the Appendix (Sect. B), the fixed-effects and random-effects models produce
significantly different results (at the 0.1% level), suggesting that the former is more appropriate
for my data. In addition, I calculate cluster-robust standard errors using the two-way clustering of Cameron et al. (2011)
for both days and stocks. Two-way clustering means that the error terms are allowed to be
correlated within each stock and within each day, but are assumed to be uncorrelated between
observations that share neither the same stock nor the same day.
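For reference, the two tools mentioned above can be written in their standard textbook form. The notation below is an assumption introduced here for illustration and does not appear elsewhere in the thesis: $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are the fixed-effects and random-effects coefficient estimates, and $\hat{V}_{\text{stock}}$, $\hat{V}_{\text{day}}$ and $\hat{V}_{\text{stock} \cap \text{day}}$ are one-way cluster-robust variance estimates clustered by stock, by day, and by their intersection.

```latex
% Hausman (1978) statistic comparing fixed-effects and random-effects estimates;
% under the null hypothesis it is asymptotically chi-squared distributed.
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})^{\top}
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})

% Two-way cluster-robust variance of Cameron et al. (2011)
\hat{V}_{\text{two-way}} = \hat{V}_{\text{stock}} + \hat{V}_{\text{day}}
                           - \hat{V}_{\text{stock} \cap \text{day}}
```

The subtraction of the intersection term avoids double counting the contribution of observations that belong to the same stock and the same day.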

Contrary to the methodology used in this bachelor’s thesis, some previous studies, such as Hanke and
Wiegerding (2015), and Chordia and Subrahmanyam (2004), use mid-quote closing returns as the
dependent variable in their regression analysis, to combat the so-called bid-ask bounce effect. In this
thesis, I do not calculate mid-quote closing returns, which might bias my results. Moreover, I do not control
for past returns in my regressions to avoid possible collinearity issues between order imbalances and
returns, as suggested by Chordia and Subrahmanyam (2004, p. 498).

For the regressions studying the lagged relation conditional on contemporaneous order imbalance, the
fixed-effects regression is defined as

$$R_{i,t} = \sum_{j=0}^{J} \beta_j^{c} I_{i,t-j} + \delta_t + \varphi_i + \varepsilon_{i,t} \tag{3}$$

where $R_{i,t}$ is the daily return, $I_{i,t-j} \in \{I\_SHR_{i,t-j}, I\_NUM_{i,t-j}\}$, i.e., the share-based or trade-based order
imbalance defined in Equations 1 and 2 in Section 3.3, and $\varepsilon_{i,t}$ is the error term for stock $i$ at time $t$. $\delta_t$ and
$\varphi_i$ denote time and stock fixed effects, respectively, where subscript $t$ denotes a specific time
period and $i$ denotes a specific stock. In the sum operator, $J$ is the number of lags used in the regression.
For the t-statistics and p-values, the null hypothesis $\beta_j^{c} = 0$ is tested with a two-tailed t-test, where
superscript $c$ denotes the conditional relation.

Without controlling for contemporaneous order imbalance, the fixed-effects regression is defined as

$$R_{i,t} = \sum_{j=1}^{J} \beta_j^{u} I_{i,t-j} + \delta_t + \varphi_i + \varepsilon_{i,t}, \tag{4}$$

using the same definitions as above. Again, the null hypothesis $\beta_j^{u} = 0$ is tested with
a two-tailed t-test, where superscript $u$ denotes the unconditional relation.
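The only difference between the conditional and the unconditional specification is whether the sum starts at lag zero or at lag one. A minimal sketch of how the regressors could be assembled for one stock is given below. The helper name `lagged_rows` is hypothetical, and the sketch assumes that the observations of the stock are consecutive trading days.

```python
def lagged_rows(series, max_lag, include_contemporaneous):
    """Build regression rows for one stock.

    `series` is a list of (daily_return, imbalance) ordered by trading day.
    Returns a list of (return_t, [I_{t-j} for the lags used]), where j starts
    at 0 for the conditional and at 1 for the unconditional specification.
    """
    first_lag = 0 if include_contemporaneous else 1
    rows = []
    for t in range(max_lag, len(series)):
        regressors = [series[t - j][1] for j in range(first_lag, max_lag + 1)]
        rows.append((series[t][0], regressors))
    return rows


# Hypothetical four-day history of (return, imbalance) for one stock.
history = [(0.01, 0.1), (0.02, 0.2), (0.03, 0.3), (0.04, 0.4)]
conditional = lagged_rows(history, max_lag=2, include_contemporaneous=True)
unconditional = lagged_rows(history, max_lag=2, include_contemporaneous=False)
print(conditional)
print(unconditional)
```

The first `max_lag` days of each stock are dropped because their lagged imbalances are not available.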

Also, both the conditional and unconditional regressions are run separately for the different size terciles,
such that

$$R_{i,t,s} = \sum_{j=0}^{J} \beta_{j,s}^{c} I_{i,t-j,s} + \delta_t + \varphi_i + \varepsilon_{i,t,s} \tag{5}$$

and

$$R_{i,t,s} = \sum_{j=1}^{J} \beta_{j,s}^{u} I_{i,t-j,s} + \delta_t + \varphi_i + \varepsilon_{i,t,s}, \tag{6}$$

where, in addition to the definitions above, subscript 𝑠 ∈ {1, 2, 3} refers to the size tercile (1 = smallest, 3
= largest) of the company calculated from its market capitalization at the end of the previous month. The
market capitalizations are from Refinitiv Datastream.
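The tercile assignment can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name is hypothetical, ties are broken by stock identifier, and uneven group sizes are resolved by rank order.

```python
def assign_size_terciles(market_caps):
    """Map stock -> tercile (1 = smallest, 3 = largest).

    `market_caps` maps stock -> market capitalization at the end of the
    previous month. Stocks are ranked by market capitalization and the
    ranking is cut into three groups of (almost) equal size.
    """
    ranked = sorted(market_caps, key=lambda s: (market_caps[s], s))
    n = len(ranked)
    return {stock: 1 + (3 * rank) // n for rank, stock in enumerate(ranked)}


# Hypothetical month-end market capitalizations (in millions of euros).
caps = {"A": 50, "B": 900, "C": 120, "D": 4000, "E": 75, "F": 2500}
terciles = assign_size_terciles(caps)
print(terciles)
```

Re-running the assignment with each month-end snapshot lets a stock move between terciles over time, in line with the use of the previous month's market capitalization.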

4 Order imbalance and stock returns

In this section, I examine whether order imbalance affects the cross-section of daily stock returns in the
Finnish stock market. First, in Section 4.1, I analyze the results based on regressions without controlling for
size. Second, in Section 4.2, I expand the analysis to subgroups of different size terciles. Third, Section 4.3
tests the robustness of my results by analyzing regressions for different levels of order imbalance.
Finally, Section 4.4 analyzes results from regressions in which order imbalance is calculated from
imbalance reports published during the closing auction at the end of each day.

4.1 Regression results

In Table 4, the results from the conditional and unconditional regressions are reported. Column 2 shows the
conditional regression results (Equation 3), and column 3 shows the unconditional regression results
(Equation 4). In both columns, lags up to the fifth lag are included.

In column 2 of Table 4, the current order imbalance exhibits a significant positive relation with
contemporaneous returns, which is consistent with previous empirical findings discussed in Section 2.2.
Moreover, the conditional lagged order imbalances exhibit a negative relation with current returns, in line
with previous empirical results. These results could be explained by autocorrelation in order imbalances
caused by splitting trades (Kyle (1984)) or Stoll’s (1978) model related to the market maker’s inventory.
The results imply that order imbalance causes a temporary contemporaneous price pressure, which then
reverses during the following days. The reversal effect is most pronounced in the first lag and diminishes
with lags up to the fifth lag.

Previous theories indicate that unconditional lagged order imbalances should exhibit a positive relation with
current stock returns, because autocorrelation in order imbalances causes continuing price pressures
(Chordia and Subrahmanyam (2004)). Although the effects are not statistically significant, the
coefficients in my regression results (column 3 of Table 4) for the first and second lags are positive. In
their paper, Chordia and Subrahmanyam (2004) argue that long lags of order imbalance should be
negatively related to contemporaneous returns, because the persistent price pressures eventually have to
reverse. This is consistent with the negative third lag coefficient in Table 4. However, the significance of
the third lag is weak, and I find almost no significance for it in my later empirical analysis. Also, it is
debatable whether the third lag can be considered a long lag in modern markets.

Table 4 Results from the conditional and unconditional regressions


This table reports the results of a fixed-effects panel regression. In column 2, the conditional relation and
in column 3, the unconditional relations are examined. Dependent variable is daily return and
independent variables are order imbalances with lags up to five. Imbalance is calculated as (B-S)/(S+B),
where S is total sell-initiated trades and B is total buy-initiated trades. Standard errors are two-way
clustered by stock and day following Cameron et al. (2011). T-statistics are reported in parentheses and
(****), (***), (**), (*) denote significance at the 0.1%, 1%, 5% and 10% level, respectively.
Variable        Conditional (Eq. 3)     Unconditional (Eq. 4)
𝐼_𝑆𝐻𝑅_t         0.019 ****
                (12.03)
𝐼_𝑆𝐻𝑅_{t-1}     -0.002 ****             0.0001
                (-3.71)                 (0.19)
𝐼_𝑆𝐻𝑅_{t-2}     -0.001 *                0.0003
                (-1.67)                 (0.96)
𝐼_𝑆𝐻𝑅_{t-3}     -0.002 ***              -0.001 *
                (-3.08)                 (-1.82)
𝐼_𝑆𝐻𝑅_{t-4}     -0.001 **               -0.001
                (-2.10)                 (-1.38)
𝐼_𝑆𝐻𝑅_{t-5}     -0.0003                 -0.00003
                (-0.54)                 (-0.06)
Fixed effects
  Day           yes                     yes
  Stock         yes                     yes
Observations    34911                   34911
Adjusted R² (%) 39.74                   36.15

To study the effects of order imbalance more rigorously and to test the robustness of my results, I also run
the regressions for trade-based order imbalance (𝐼_𝑁𝑈𝑀) defined in Equation 2, since it measures the
number of traders who are trying to trade that stock rather than the volume traded. With this measure, one high-
volume institutional trader does not cause a greater order imbalance than a trader trading just a
few shares. The trade-based order imbalance regression results are shown below in Table 5.
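The difference between the two measures can be made concrete with a small example. The sketch below uses the (B-S)/(S+B) form stated in the table notes and assumes that each trade has already been classified as buyer- or seller-initiated. The function name and the trade data are hypothetical.

```python
def order_imbalances(trades):
    """Return (I_SHR, I_NUM) for one stock-day.

    `trades` is a non-empty list of (side, shares) with side "B" for
    buyer-initiated and "S" for seller-initiated trades.
    """
    buy_shares = sum(q for side, q in trades if side == "B")
    sell_shares = sum(q for side, q in trades if side == "S")
    buy_count = sum(1 for side, _ in trades if side == "B")
    sell_count = sum(1 for side, _ in trades if side == "S")
    i_shr = (buy_shares - sell_shares) / (buy_shares + sell_shares)
    i_num = (buy_count - sell_count) / (buy_count + sell_count)
    return i_shr, i_num


# One large buyer-initiated trade against three small seller-initiated trades.
trades = [("B", 10_000), ("S", 100), ("S", 150), ("S", 50)]
i_shr, i_num = order_imbalances(trades)
print(round(i_shr, 3), i_num)
```

In this example the share-based measure is strongly positive because of the single large purchase, while the trade-based measure is negative because three of the four trades are seller-initiated.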

The results in Table 5 support the results of Table 4 for the conditional relation in column 2:
contemporaneous order imbalance has a significant positive relation, and lagged order imbalances exhibit
a significant negative relation with current returns, both of which suggest a contemporaneous price pressure
caused by order imbalance and a reversal in the following days.

For the unconditional relation in column 3 of Table 5, contrary to the positive and significant first lag in most
previous empirical findings, the trade-based first lag provides little explanatory power. Moreover, with the
trade-based measure of order imbalance, the coefficient of the third lag of the unconditional relation in Table 5
has the same sign as in Table 4 but is not significant.

Table 5 Results from the conditional and unconditional regressions


This table reports the results of a fixed-effects panel regression. In column 2, the conditional
relation and in column 3, the unconditional relations are examined. Dependent variable is daily
return and independent variables are order imbalances with lags up to five. Imbalance is
calculated as (B-S)/(S+B), where S is total sell-initiated orders and B is total buy-initiated orders.
Standard errors are two-way clustered by stock and day following Cameron et al. (2011). 𝑡
statistics are reported in parentheses and (****), (***), (**), (*) denote significance at the 0.1%,
1%, 5% and 10% level, respectively.
Variable        Conditional (Eq. 3)     Unconditional (Eq. 4)
𝐼_𝑁𝑈𝑀_t         0.021 ****
                (15.51)
𝐼_𝑁𝑈𝑀_{t-1}     -0.004 ****             -0.0002
                (-6.75)                 (-0.24)
𝐼_𝑁𝑈𝑀_{t-2}     -0.001 ***              0.0005
                (-3.16)                 (1.21)
𝐼_𝑁𝑈𝑀_{t-3}     -0.002 ***              -0.001
                (-3.08)                 (-1.43)
𝐼_𝑁𝑈𝑀_{t-4}     -0.001 **               -0.001
                (-2.58)                 (-1.34)
𝐼_𝑁𝑈𝑀_{t-5}     -0.001                  0.0002
                (-0.89)                 (0.27)
Fixed effects
  Day           yes                     yes
  Stock         yes                     yes
Observations    34911                   34911
Adjusted R² (%) 39.57                   36.14

4.2 Regression results grouped by size terciles

To further analyze the effects of order imbalance in the Finnish stock market, I divide the stocks in the final
sample into three size terciles and analyze the effects of order imbalance in each tercile separately. For
these three terciles, I run the regressions specified in Equations 5 and 6, defined in Section 3.5. The results
from these regressions are shown in Tables 6 and 7.

Table 6 shows the results from the share-based order imbalance regressions. As in Table
4, in the conditional relation (Table 6, columns 2-4), contemporaneous order imbalance affects current stock
returns positively and the price pressure reverses on the following days for all three size terciles. While
the contemporaneous effect is the most significant for the smallest tercile, the largest tercile of stocks has
the largest contemporaneous positive and first-lag reversal effects of the three terciles. For small stocks, the
reversal is the longest, with lag four being more negative and more significant compared to the larger two
terciles. This could suggest that market makers cannot offload their inventory as quickly for the smaller
stocks, and thus, the reversal of the contemporaneous price pressure takes longer. Columns 5-7 of Table 6 show
the regression results for the unconditional relation. As in the previous section, lagged order imbalances
have little explanatory power in these regressions.

In Table 7, the results from the trade-based order imbalance regressions are shown. As in Table 6, the
conditional relation (columns 2-4) in Table 7 exhibits the strongest explanatory power. Again, the effects of
both the current order imbalance and the reversal of the first lag are the strongest for large stocks.
Comparing the three size terciles, the reversal is shorter the larger the market value of the company is. Also,
looking at the contemporaneous coefficients in the conditional relation, it seems that order imbalance might
have a U-shaped effect on stock returns, with the smallest and largest stocks exhibiting the largest
coefficients.

Furthermore, in the unconditional relation (column 6 in Table 7), the first lag for mid-sized stocks is
significant and negative, which is an interesting outlier and in contrast to the theory of Chordia and
Subrahmanyam (2004). One possible explanation for this is Roll’s (1984) theory that price changes should
be negatively autocorrelated, sometimes referred to as the bid-ask bounce effect, which would result in
current order imbalances having a negative relation with future stock returns. It is likely that this outlier is
attributable to not using mid-quote returns in the regressions, which might cause this bias.

4.3 Results grouped by different levels of order imbalance

This section tests the robustness of my previous results by showing that they are not driven by
extreme values of order imbalance. In addition, the analysis in this section scrutinizes the effects of
order imbalance on stock returns even further. In previous literature, many have studied the relation between
different levels of order imbalance and returns, for example Hanke and Wiegerding (2015) cross-sectionally
for German stocks, and Chordia et al. (2002) for aggregated order imbalances from the NYSE.

In the empirical analysis of this section, I divide my sample into three groups based on the level of absolute
contemporaneous order imbalance and absolute first lag of order imbalance. The three subsamples include
the smallest and two larger categories of order imbalances (|𝐼| ≤ 0.1, 0.1 < |𝐼| < 0.2, and
|𝐼| ≥ 0.2, respectively). This analysis is applied to both share-based and trade-based measures of order
imbalance. For both definitions and each subsample of order imbalance, I run the conditional (Eq. 3) and
unconditional (Eq. 4) regressions, defined in Section 3.5.
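The grouping rule described above can be sketched as a small function. The group labels are hypothetical, and the boundaries follow the thresholds stated in the text (at most 0.1, strictly between 0.1 and 0.2, and at least 0.2).

```python
def imbalance_group(imbalance):
    """Classify an order imbalance by its absolute value."""
    magnitude = abs(imbalance)
    if magnitude <= 0.1:
        return "small"
    if magnitude < 0.2:
        return "medium"
    return "large"


# Hypothetical imbalances, including values exactly on the thresholds.
examples = [0.05, -0.1, 0.15, -0.2, 0.7]
groups = [imbalance_group(x) for x in examples]
print(groups)
```

For the conditional regressions the function would be applied to the contemporaneous imbalance, and for the unconditional regressions to the first lag, so the same stock-day can fall into different groups in the two cases.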

Tables 8 and 9 report the results from these regressions. In the conditional relation (columns 2-4), both
measures of order imbalance exhibit similar results compared to those in Tables 4, 5, 6, and 7. In addition,
the contemporaneous coefficient of the conditional relation is smaller for the more extreme values of order
imbalance, which suggests that my results are not driven by these extreme values. Unsurprisingly, the more
extreme contemporaneous order imbalances exhibit the most statistical significance. Also, in Table 8, the
subsequent reversal seems to be longer for the more extreme order imbalances, but this effect is not as
pronounced in Table 9.

Again, in line with my previous results, the unconditional regressions in Tables 8 and 9 exhibit only little
explanatory power. In column 5 of Table 8, the second to fourth lags are significant, which could
be due to autocorrelation in order imbalances, as suggested by the theory of Chordia and Subrahmanyam
(2004). Moreover, the significant negative third lag in column 5, and the fourth lag in column 7 of Table 9
are in line with their theory of the autocorrelated order imbalances having to eventually reverse, but the
statistical significance is weak.

4.4 Results with Nasdaq’s order imbalance reports

In addition to the regressions in the previous sections, I also examine Nasdaq Nordic’s own imbalance
reports, which are published during the closing auction and are also included in the ITCH data discussed
in more detail in Section 3.1. The closing auction is arranged at the end of each trading day and allows
traders to execute their trades at the closing price, which is particularly popular with mutual funds and
exchange-traded funds tracking an index (Bogousslavsky and Muravyev (2022)).

Since the Finnish Nasdaq exchange provides order imbalance information throughout the closing auction
period, which Jegadeesh and Wu (2021) find to be a significant predictor of the closing price, and the
purpose of this study is not to study the closing auction itself, I only use order imbalance information from the
last order imbalance report of each trading day. Additionally, because I focus on the return predictability of order
imbalance, and not on the short-term mechanisms affecting the closing price, I only examine how imbalance
at the closing auction affects stock returns during the following trading days.

An investor can participate in the closing auction by either submitting an order to trade at the closing price
(market-on-close) or placing a limit order (limit-on-close), the amounts of which are not reported in the data.
Traders might add under- or overpriced limit-on-close orders, suggesting that they do not really want to
trade that stock, but rather supply liquidity if the closing price reaches their price. I assume that this is one
reason why there are extreme order imbalances at the closing auction. To combat this, I remove all
observations with a closing auction imbalance over 100 percent of the daily trading volume, to diminish
the effect of outliers and to avoid any bias they might cause. This also increases the robustness of my results.

As stated above, I only examine the unconditional relation with these regressions. I only use the share-
based definition of order imbalance, since the number of traders willing to trade in each direction is not
reported in the imbalance reports. For the regressions based on the imbalance reports, order imbalance is defined
as

$$I^{r}_{i,t} = \begin{cases} \dfrac{I^{CLOSE}_{i,t}}{B^{e}_{i,t} + S^{e}_{i,t}}, & \text{when } B^{CLOSE}_{i,t} > S^{CLOSE}_{i,t} \\[2ex] -1 \times \dfrac{I^{CLOSE}_{i,t}}{B^{e}_{i,t} + S^{e}_{i,t}}, & \text{when } B^{CLOSE}_{i,t} < S^{CLOSE}_{i,t} \\[2ex] 0, & \text{else} \end{cases} \tag{7}$$

where $B^{e}_{i,t}$ and $S^{e}_{i,t}$ denote the total number of buy- and sell-initiated trades calculated from executed trades
throughout the whole trading day (as in Eqs. 1 and 2), and $I^{CLOSE}_{i,t}$ denotes the reported imbalance at the
closing auction of stock $i$ on day $t$. $B^{CLOSE}_{i,t}$ and $S^{CLOSE}_{i,t}$ denote the number of added orders in each
direction ($B$ = buy, $S$ = sell) in the closing auction. To determine the direction of the order imbalance, the
reported imbalance (which is always positive in the report) is multiplied by $-1$ when there are more sell-
initiated orders in the closing auction, so that order imbalance is defined as buy-initiated less sell-
initiated orders. Also, this definition of closing auction order imbalance is comparable to the one used by
Jegadeesh and Wu (2021). The unconditional fixed-effects regression based on Nasdaq Nordic’s own
imbalance reports is defined as

$$R_{i,t} = \sum_{j=1}^{J} \beta_j^{u} I^{r}_{i,t-j} + \delta_t + \varphi_i + \varepsilon_{i,t}, \tag{8}$$

where all the definitions are the same as in Equation 4, but the order imbalance $I^{r}$ is as defined above in
Equation 7.
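The signed closing-auction imbalance of Equation 7 and the outlier filter described earlier in this section can be sketched as follows. The function and argument names are hypothetical, and the numbers are made up for illustration.

```python
def closing_imbalance(reported, buy_close, sell_close, buy_exec, sell_exec):
    """Signed closing-auction imbalance scaled by the day's executed volume.

    `reported` is the (always positive) imbalance from the last report,
    `buy_close` / `sell_close` are the orders added in the closing auction,
    and `buy_exec` / `sell_exec` are the buy- and sell-initiated amounts
    executed during the whole trading day.
    """
    scaled = reported / (buy_exec + sell_exec)
    if buy_close > sell_close:
        return scaled
    if buy_close < sell_close:
        return -scaled
    return 0.0


def keep_observation(reported, buy_exec, sell_exec):
    """Drop observations whose imbalance exceeds 100% of daily volume."""
    return reported <= buy_exec + sell_exec


buy_side = closing_imbalance(500, 1200, 700, 6000, 4000)
sell_side = closing_imbalance(500, 700, 1200, 6000, 4000)
balanced = closing_imbalance(500, 900, 900, 6000, 4000)
print(buy_side, sell_side, balanced)
```

The sign comes only from the direction of the auction orders, while the magnitude is the reported imbalance relative to the volume executed during the day.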

Table 6 Conditional and unconditional regression results by size tercile
This table reports the results of a fixed-effects panel regression. In columns 2-4, the conditional relation and in column 5-7, the unconditional relations are examined. Stocks are divided into size terciles
by their market capitalization at the end of the previous month. Dependent variable is daily return and independent variables are imbalances with lags up to five. Imbalance is calculated as (B-S)/(S+B),
where S is total sell-initiated trades and B is total buy-initiated trades. Standard errors are two-way clustered by stock and day following Cameron et al. (2011). T-statistics are reported in parentheses
and (****), (***), (**), (*) denote significance at the 0.1%, 1%, 5% and 10% level, respectively.
                Conditional (Eq. 5)                          Unconditional (Eq. 6)
Variable        Small          Mid            Large          Small          Mid            Large
𝐼_𝑆𝐻𝑅_t         0.0170 ****    0.0197 ****    0.0391 ****
                (11.21)        (6.99)         (6.02)
𝐼_𝑆𝐻𝑅_{t-1}     -0.0016 **     -0.0034 ***    -0.0073 ****   0.0003         -0.0013        -0.0016
                (-2.24)        (-3.00)        (-4.68)        (0.44)         (1.38)         (-1.30)
𝐼_𝑆𝐻𝑅_{t-2}     -0.0003        -0.0014 *      0.0001         0.0003         -0.000004      0.0002
                (-0.84)        (-1.91)        (0.05)         (0.75)         (-0.01)        (0.88)
𝐼_𝑆𝐻𝑅_{t-3}     -0.0017 **     -0.0010        -0.0038 **     -0.0011        -0.0004        -0.0027
                (-2.49)        (-1.44)        (-2.60)        (-1.59)        (-0.50)        (-1.60)
𝐼_𝑆𝐻𝑅_{t-4}     -0.0012 *      -0.0010 *      -0.0024        -0.0010        -0.0008        -0.0016
                (-1.79)        (-1.72)        (-1.56)        (-1.32)        (-1.11)        (-0.95)
𝐼_𝑆𝐻𝑅_{t-5}     -0.0007        0.0001         0.0021         -0.0005        0.0003         0.0034 *
                (-1.01)        (0.24)         (1.40)         (-0.72)        (0.41)         (2.03)
Fixed effects
  Day           yes            yes            yes            yes            yes            yes
  Stock         yes            yes            yes            yes            yes            yes
Observations    11648          11639          11624          11648          11639          11624
Adjusted R² (%) 28.85          46.66          51.91          23.07          43.53          48.56

Table 7 Conditional and unconditional regression results by size tercile
This table reports the results of a fixed-effects panel regression. In columns 2-4, the conditional relation and in column 5-7, the unconditional relations are examined. Stocks are divided into size terciles
by their market capitalization at the end of the previous month. Dependent variable is daily return and independent variables are imbalances with lags up to five. Imbalance is calculated as (B-S)/(S+B),
where S is total sell-initiated trades and B is total buy-initiated trades. Standard errors are two-way clustered by stock and day following Cameron et al. (2011). T-statistics are reported in parentheses
and (****), (***), (**), (*) denote significance at the 0.1%, 1%, 5% and 10% level, respectively.
                Conditional (Eq. 5)                          Unconditional (Eq. 6)
Variable        Small          Mid            Large          Small          Mid            Large
𝐼_𝑁𝑈𝑀_t         0.0232 ****    0.0193 ****    0.0244 ****
                (12.94)        (10.40)        (8.57)
𝐼_𝑁𝑈𝑀_{t-1}     -0.0038 ****   -0.0048 ****   -0.0052 ***    -0.00002       -0.0016 *      -0.0006
                (-5.02)        (-6.09)        (-3.73)        (-0.03)        (-1.80)        (-0.40)
𝐼_𝑁𝑈𝑀_{t-2}     -0.0013 **     -0.0018 **     -0.0007        0.0006         -0.0003        0.0008
                (-2.36)        (-2.45)        (-0.54)        (1.03)         (-0.36)        (0.57)
𝐼_𝑁𝑈𝑀_{t-3}     -0.0020 **     -0.0012        -0.0019        -0.0008        -0.0004        -0.0014
                (-2.54)        (-1.37)        (-1.09)        (-0.99)        (-0.48)        (-0.80)
𝐼_𝑁𝑈𝑀_{t-4}     -0.0017 **     -0.0009        -0.0030 *      -0.0011        -0.0003        -0.0022
                (-2.33)        (-1.21)        (-1.82)        (-1.44)        (-0.35)        (-1.30)
𝐼_𝑁𝑈𝑀_{t-5}     -0.0014 *      0.0004         0.0014         -0.0006        0.0008         0.0023 *
                (-1.70)        (0.65)         (1.17)         (-0.73)        (1.11)         (1.92)
Fixed effects
  Day           yes            yes            yes            yes            yes            yes
  Stock         yes            yes            yes            yes            yes            yes
Observations    11648          11639          11624          11648          11639          11624
Adjusted R² (%) 29.84          45.91          49.93          23.05          43.54          48.53

Table 8 Regression results grouped by the absolute value of order imbalance
This table reports the results of a fixed-effects panel regression. In columns 2-4, the conditional relation and in column 5-7, the unconditional relations are examined. Stocks are divided into three groups based
on the absolute value of order imbalance of (1) the contemporaneous order imbalance, in columns 2-4 and (2) the first lag of the order imbalance, in columns 5-7. Dependent variable is daily return and independent
variables are imbalances with lags up to five. Imbalance is calculated as (B-S)/(S+B), where S is total sell-initiated shares traded and B is total buy-initiated shares traded. Standard errors are two-way clustered by
stock and day following Cameron et al. (2011). T-statistics are reported in parentheses and (****), (***), (**), (*) denote significance at the 0.1%, 1%, 5% and 10% level, respectively.
                Conditional (Eq. 3), grouped by |𝐼_𝑆𝐻𝑅_t|               Unconditional (Eq. 4), grouped by |𝐼_𝑆𝐻𝑅_{t-1}|
Variable        |𝐼| ≤ 0.1         0.1 < |𝐼| ≤ 0.2   |𝐼| ≥ 0.2         |𝐼| ≤ 0.1         0.1 < |𝐼| ≤ 0.2   |𝐼| ≥ 0.2
𝐼_𝑆𝐻𝑅_t         0.0546 ****       0.041 ****        0.0173 ****
                (10.19)           (15.25)           (14.30)
𝐼_𝑆𝐻𝑅_{t-1}     -0.0024 ***       -0.0036 ****      -0.0015 *         -0.0019           -0.0003           -0.0001
                (-2.74)           (-3.47)           (-2.59)           (-0.79)           (-0.16)           (-0.27)
𝐼_𝑆𝐻𝑅_{t-2}     -0.0076           -0.0025 ***       0.0001            0.0018 *          -0.0009           0.0001
                (-1.05)           (-2.81)           (0.22)            (1.84)            (-1.02)           (0.12)
𝐼_𝑆𝐻𝑅_{t-3}     -0.0011           -0.0015           -0.0018 *         -0.0018 **        -0.0011           -0.0008
                (-1.54)           (-1.62)           (-2.27)           (-2.09)           (-0.76)           (-1.61)
𝐼_𝑆𝐻𝑅_{t-4}     -0.0018 *         -0.0004           -0.0014 *         -0.0022 **        -0.0006           -0.0003
                (1.89)            (-0.44)           (-2.47)           (-2.43)           (-0.61)           (-0.41)
𝐼_𝑆𝐻𝑅_{t-5}     0.0007            -0.0011           -0.0006           -0.0003           -0.0008           0.0003
                (0.75)            (-1.04)           (-0.76)           (-0.34)           (-0.77)           (0.56)
Fixed effects
  Day           yes               yes               yes               yes               yes               yes
  Stock         yes               yes               yes               yes               yes               yes
Observations    15668             8681              10562             15666             8678              10567
Adjusted R² (%) 41.68             39.82             39.14             45.14             34.05             26.64

Table 9 Regression results grouped by the absolute value of order imbalance
This table reports the results of a fixed-effects panel regression. In columns 2-4, the conditional relation and in column 5-7, the unconditional relations are examined. Stocks are divided into three groups based
on the absolute value of order imbalance of (1) the contemporaneous order imbalance, in columns 2-4 and (2) the first lag of the order imbalance, in columns 5-7. Dependent variable is daily return and
independent variables are imbalances with lags up to five. Imbalance is calculated as (B-S)/(S+B), where S is total sell-initiated trades and B is total buy-initiated trades. Standard errors are two-way clustered by
stock and day following Cameron et al. (2011). T-statistics are reported in parentheses and (****), (***), (**), (*) denote significance at the 0.1%, 1%, 5% and 10% level, respectively.
                Conditional (Eq. 3), grouped by |𝐼_𝑁𝑈𝑀_t|               Unconditional (Eq. 4), grouped by |𝐼_𝑁𝑈𝑀_{t-1}|
Variable        |𝐼| ≤ 0.1         0.1 < |𝐼| ≤ 0.2   |𝐼| ≥ 0.2         |𝐼| ≤ 0.1         0.1 < |𝐼| ≤ 0.2   |𝐼| ≥ 0.2
𝐼_𝑁𝑈𝑀_t         0.0360 ****       0.0034 ****       0.022 ****
                (8.00)            (15.31)           (17.57)
𝐼_𝑁𝑈𝑀_{t-1}     -0.0050 ****      -0.0033 ***       -0.0038 ****      -0.0022           -0.0014           -0.0008
                (-5.66)           (-3.00)           (-5.07)           (-0.88)           (-0.91)           (-1.25)
𝐼_𝑁𝑈𝑀_{t-2}     -0.0015 **        -0.0016 *         -0.0009           0.0009            0.0001            0.0005
                (-2.26)           (-1.72)           (-1.17)           (1.03)            (0.10)            (0.62)
𝐼_𝑁𝑈𝑀_{t-3}     -0.0014 *         -0.0007           -0.0027 **        -0.0014           0.0009            -0.0016 *
                (-1.79)           (-0.69)           (-2.34)           (-1.80)           (0.75)            (-1.88)
𝐼_𝑁𝑈𝑀_{t-4}     -0.0017 *         -0.0011           -0.0021 ***       -0.0015 *         -0.0003           0.0007
                (-1.68)           (-1.27)           (-2.71)           (-1.50)           (-0.33)           (-0.08)
𝐼_𝑁𝑈𝑀_{t-5}     -0.0007           -0.0019 *         -0.0006           0.0012            -0.0002           0.0008
                (0.74)            (-1.67)           (-0.66)           (1.20)            (-0.19)           (-1.02)
Fixed effects
  Day           yes               yes               yes               yes               yes               yes
  Stock         yes               yes               yes               yes               yes               yes
Observations    15953             9507              9451              15954             9515              9442
Adjusted R² (%) 40.41             41.53             41.41             43.42             35.84             26.22

Table 10 reports the results from the order imbalance report regressions as defined in Equation 8.
Contrary to the insignificant unconditional regression results in Tables 4 and 5, and consistent with
previous empirical findings in the literature, the first lag in column 2 of Table 10 is positive and
statistically significant. The significant and positive first lag is in line with the theory of Chordia and
Subrahmanyam (2004), which states that informed traders cause autocorrelated price pressures, which in
turn cause order imbalances to affect future returns positively. On the other hand, the significant first lag
might also reflect institutional trading demand spilling over into the next trading day or the market
incorporating the information revealed by the closing auction into stock prices.

Table 10 Results from imbalance report regressions


This table reports the results of a fixed-effects panel regression. In column
2, the unconditional relation is examined. Dependent variable is daily
return and independent variables are imbalances with lags up to three.
Imbalance is calculated as indicated in Equation 7. Standard errors are two-
way clustered by stock and day following Cameron et al. (2011). 𝑡 statistics
are reported in parentheses and (****), (***), (**), (*) denote significance
at the 0.1%, 1%, 5% and 10% level, respectively. Coefficients are multiplied
by 10 000.
Variable        Unconditional (Eq. 8)
𝐼_𝑆𝐻𝑅_{t-1}     0.0109 ***
                (12.00)
𝐼_𝑆𝐻𝑅_{t-2}     0.0002
                (0.245)
𝐼_𝑆𝐻𝑅_{t-3}     -0.0010
                (-1.33)
Fixed effects
  Day           yes
  Stock         yes
Observations    43069
Adjusted R² (%) 18.55

5 Order imbalance and information asymmetry

As an extension to my previous results, in this section, I first evaluate whether a measure of asymmetric
information, proposed by Lof and van Bommel (2022), measures order imbalance. Second, I try to shed
light on whether order imbalance is a proxy for asymmetric information by comparing the longer-term
returns of stocks after extreme order imbalances in both directions. Finally, following Chordia et al. (2019),
I examine the returns associated with different levels of volatility of order imbalance.

5.1 Order imbalance and VCV

In their paper, Lof and van Bommel (2022) derive the Volume Coefficient of Variation (𝑉𝐶𝑉) ratio to
measure asymmetric information, under the assumption that order flow imbalance measures informed
trading. Their model is based on the distinguished Kyle (1985) model, and their intuition is based on the
correlation of orders. If liquidity demands are uncorrelated (correlated) and thus traders are uninformed
(informed), trading volume should follow a normal-like (skewed and dispersed) distribution. If liquidity
demands are correlated, the distribution changes, since the standard deviation increases at a more rapid rate
compared to the mean, assuming that both increase linearly. In their empirical results, they find that 𝑉𝐶𝑉
correlates with other well-known measures of information asymmetry, for example, the probability of
informed trade (𝑃𝐼𝑁) by Easley et al. (1996).

Following Lof and van Bommel (2022), volume for each stock 𝑖 on each day 𝑡 is calculated as volume of
stock 𝑖 divided by total volume on that day:

$$V_{i,t} = \frac{V_{i,t}}{\sum_{i}^{I} V_{i,t}} \tag{9}$$

Applying Lof and van Bommel’s (2022) definitions to a daily level, the daily Volume Coefficient of
Variation is defined as the rolling one-month backward-looking standard deviation of volume divided by
the average of the rolling one-month backward-looking volume, formally:

$$VCV_{i,t} = \frac{\hat{\sigma}_{V(i,t+d)}}{\hat{\mu}_{V(i,t+d)}}, \tag{10}$$

where $\hat{\sigma}_V$ denotes the standard deviation of volume, $\hat{\mu}_V$ denotes the mean of volume, $d \in [-20, 0]$, and
volume is calculated as indicated above in Equation 9.
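Equations 9 and 10 can be sketched in a few lines. The function names are hypothetical. The default window of 21 observations corresponds to d running from -20 to 0, and the use of the sample standard deviation (n - 1 denominator) is an assumption of this sketch, since the estimator is not specified above.

```python
from statistics import mean, stdev


def daily_volume_shares(volumes):
    """Equation 9: each stock's volume divided by total volume that day.

    `volumes` maps stock -> traded volume on one day.
    """
    total = sum(volumes.values())
    return {stock: v / total for stock, v in volumes.items()}


def rolling_vcv(shares, window=21):
    """Equation 10: rolling standard deviation / mean of the volume share.

    `shares` is a list of daily volume shares for one stock, ordered by day.
    Days without a full backward-looking window get None.
    """
    out = []
    for t in range(len(shares)):
        if t + 1 < window:
            out.append(None)
            continue
        chunk = shares[t + 1 - window: t + 1]
        out.append(stdev(chunk) / mean(chunk))
    return out


shares_today = daily_volume_shares({"A": 30, "B": 70})
flat = rolling_vcv([1.0, 1.0, 1.0], window=3)
spread = rolling_vcv([1.0, 3.0], window=2)
print(shares_today, flat, spread)
```

A constant volume share gives a ratio of zero, and the ratio grows as the volume share becomes more dispersed relative to its mean.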

After calculating 𝑉𝐶𝑉, I sort my sample into ten deciles based on absolute 𝐼_𝑆𝐻𝑅. For this analysis, I
only use share-based order imbalance. For each of these deciles, I calculate the average 𝑉𝐶𝑉, its standard
deviation, and the correlation between 𝑉𝐶𝑉 and order imbalance (𝐼_𝑆𝐻𝑅). The results are reported below in
Table 11. The average 𝑉𝐶𝑉 increases with the absolute imbalance decile, indicating that order imbalance
and 𝑉𝐶𝑉 exhibit a positive relation. In addition, in column 5, the correlations increase with the absolute
order imbalance decile, providing additional evidence that Lof and van Bommel’s (2022) 𝑉𝐶𝑉 measures
order imbalance, which they assume is a sign of asymmetric information.
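The decile sort can be sketched as follows. The function name and the toy observations are hypothetical, and the rank-based cut used here is one of several reasonable ways to form deciles.

```python
def decile_means(observations):
    """Mean VCV per decile of |I_SHR|.

    `observations` is a list of (i_shr, vcv) pairs. Decile 1 holds the
    smallest absolute imbalances and decile 10 the largest.
    """
    ranked = sorted(observations, key=lambda o: abs(o[0]))
    n = len(ranked)
    sums, counts = [0.0] * 10, [0] * 10
    for rank, (_, vcv) in enumerate(ranked):
        d = (10 * rank) // n          # decile index 0..9 from the rank
        sums[d] += vcv
        counts[d] += 1
    return {d + 1: sums[d] / counts[d] for d in range(10) if counts[d]}


# Toy data in which VCV rises with the absolute imbalance; signs alternate.
toy = [((-1) ** k * k / 100, 0.4 + k / 100) for k in range(20)]
means = decile_means(toy)
print(means[1], means[10])
```

Sorting on the absolute value means that large buy-side and large sell-side imbalances end up in the same decile, which matches the fact that the ratio cannot be negative.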

Table 11 VCV in different order imbalance deciles
This table reports the mean, standard deviation, and correlation between VCV and I_SHR. I_SHR is calculated as
(B-S)/(B+S), where B is buy-initiated trades and S is sell-initiated trades. VCV is calculated as 20-day backward-looking rolling
𝜎̂𝑉 /𝜇̂ 𝑉 . I_SHR deciles are based on the absolute value of I_SHR, since VCV does not have negative values.

𝐼_𝑆𝐻𝑅 decile             Number of      Mean 𝑉𝐶𝑉    Standard deviation   Correlation between
(1 = small, 10 = large)  observations               of 𝑉𝐶𝑉               𝑉𝐶𝑉 and 𝐼_𝑆𝐻𝑅
1                        3452           0.453       0.220                0.000
2                        3452           0.453       0.221                -0.028
3                        3452           0.464       0.229                0.011
4                        3452           0.472       0.227                0.028
5                        3452           0.487       0.230                -0.010
6                        3452           0.512       0.239                0.045
7                        3451           0.550       0.246                0.042
8                        3451           0.603       0.267                0.056
9                        3451           0.661       0.263                0.069
10                       3451           0.721       0.254                0.139

5.2 Order imbalance and future returns

Easley and O’Hara (1992) state that informed traders have two distinct characteristics: they trade on one
side of the market, and they are willing to trade large amounts, both of which are linked to share-based
order imbalance – for order imbalance to occur there must be large amounts of trades on one side of the
market. Since order imbalance could be a result of information-based trading, it could well be a proxy for
asymmetric information.

To test the rationale above, I divide the stocks in my sample into ten deciles each day based on the previous
day’s order imbalance, such that portfolio 1 (P1) is the decile with the smallest share-based order
imbalances and portfolio 10 (P10) is the decile with the largest order imbalances. For each of these
portfolios, future returns for several periods are calculated. Under the assumption that order imbalance is a
sign of asymmetric information, after extreme positive imbalances (P10), the longer-term returns should be
greater than after extreme negative order imbalances (P1).
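
As a minimal sketch of the sorting step described above, the following Python snippet assigns stocks to ten portfolios by the previous day's order imbalance and computes the P10 minus P1 return for one formation day. The function names, the dictionary inputs, and the equal weighting within portfolios are assumptions made for illustration; the text does not specify the weighting scheme.

```python
def assign_deciles(imbalance_by_stock, n_groups=10):
    """Map each stock to a portfolio 1..n_groups by ascending order imbalance.

    Requires at least n_groups stocks so that no portfolio is empty.
    """
    ranked = sorted(imbalance_by_stock, key=imbalance_by_stock.get)
    n = len(ranked)
    return {stock: (rank * n_groups) // n + 1 for rank, stock in enumerate(ranked)}


def long_short_return(prev_imbalance, future_return, n_groups=10):
    """Return of buying P10 and shorting P1 for one formation day.

    prev_imbalance: stock -> previous day's share-based order imbalance.
    future_return:  stock -> return over the chosen holding period.
    Assumption: stocks are equal-weighted within each portfolio.
    """
    portfolio = assign_deciles(prev_imbalance, n_groups)

    def average_return(group):
        rets = [future_return[s] for s, g in portfolio.items() if g == group]
        return sum(rets) / len(rets)

    return average_return(n_groups) - average_return(1)


# Hypothetical data: 20 stocks with increasing imbalance and increasing returns.
stocks = [f"S{i:02d}" for i in range(20)]
prev_imbalance = {s: i / 20 - 0.5 for i, s in enumerate(stocks)}
future_return = {s: 0.001 * i for i, s in enumerate(stocks)}
spread = long_short_return(prev_imbalance, future_return)
```

Repeating this for every formation day gives the time series of P10 – P1 returns for a given holding period.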

The intuition is as follows: if an informed trader, who acts rationally and maximizes her returns, is willing
to buy (sell) a stock on a given day, the price of the stock cannot be at a level which incorporates her private
information. Thus, if she is willing to buy or sell, the future returns will have to be different from what the
market currently expects. Consequently, the market price will have to adjust later, when her private
information becomes public.

Next, I examine the returns from a trading strategy which buys the P10 portfolio and sells short the P1
portfolio for different longer-term holding periods. I perform t-tests to determine whether the returns for
the P10 – P1 strategy are significantly different from zero. The results, reported in Table 12, are
statistically insignificant for every holding period. Thus, this strategy does not produce returns different
from zero, providing no statistical evidence of order imbalance being a sign of asymmetric information-based
trading.

Table 12 Mean returns to an order imbalance-based trading strategy
In columns 2-6, the mean returns for a strategy that holds the P10 – P1 long-short portfolio for different holding periods are
reported. The portfolios are formed each day, based on order imbalance information from the day before. Below the mean
returns are the t-statistics and p-values, testing the null hypothesis that the mean return is equal to zero.
                One week   Two weeks   Three weeks   Four weeks   Two months
Mean P10 – P1   -0.0003     0.0004      0.0014        0.0018       0.0041
T-statistic     -0.30       0.30        0.85          0.94         1.56
P-value          0.381      0.381       0.277         0.25         0.11
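
The test of a zero mean long-short return can be sketched as a one-sample t-test. The snippet below is an illustration only: the function name and the sample returns are hypothetical, the p-value uses a normal approximation to the t distribution, and the observations are treated as independent, which overlapping holding periods would not satisfy exactly.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def t_test_zero_mean(returns):
    """Return (t-statistic, two-sided p-value) for H0: mean return = 0."""
    n = len(returns)
    standard_error = stdev(returns) / sqrt(n)  # sample std, n - 1 denominator
    t_stat = mean(returns) / standard_error
    # Assumption: normal approximation to the t distribution (large n).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Hypothetical P10 - P1 returns for five formation days.
sample_returns = [0.004, -0.002, 0.001, 0.003, -0.001]
t_stat, p_value = t_test_zero_mean(sample_returns)
```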

To further scrutinize the effect that order imbalance has on long-term returns, I implement a trading strategy
similar to that of Chordia et al. (2019) and calculate the (rolling 20-day backward-looking) volatility of order
imbalance, which they argue to be a measure of asymmetric information. In their model (Chordia et al., 2019,
p. 1548), they find that, under certain assumptions, both the volatility of order flow imbalances and adverse
selection costs are affected in the same direction by changes in certain exogenous parameters. Consequently, if
investors demand a premium for adverse selection in the required returns of assets, the variability of order
flow should also command a premium.

To test the theory above, I divide my sample into ten portfolios based on the volatility of order imbalance,
with portfolio 1 (P1) containing the decile of stocks with the smallest order imbalance volatility and
portfolio 10 (P10) containing the decile of stocks with the largest order imbalance volatility. The future
returns for these portfolios are depicted below in Figure 2.
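
The sorting variable can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the imbalance is normalized by total buy and sell volume, the window includes the current day, the sample standard deviation is used, and a day without trades is given an imbalance of zero.

```python
from statistics import stdev


def share_imbalance(buy_shares, sell_shares):
    """Share-based order imbalance for one stock-day, in [-1, 1]."""
    total = buy_shares + sell_shares
    # Assumption: a day with no classified trades gets an imbalance of 0.
    return (buy_shares - sell_shares) / total if total else 0.0


def rolling_imbalance_volatility(buys, sells, window=20):
    """Backward-looking rolling standard deviation of daily order imbalance.

    Entry t uses days t-window+1 .. t; entries without a full window are None.
    """
    imbalance = [share_imbalance(b, s) for b, s in zip(buys, sells)]
    return [
        stdev(imbalance[t + 1 - window : t + 1]) if t + 1 >= window else None
        for t in range(len(imbalance))
    ]


# Hypothetical daily buy- and sell-initiated share volumes for one stock.
vol = rolling_imbalance_volatility([3, 1, 3, 1], [1, 3, 1, 3], window=2)
```

Stocks would then be sorted into P1 to P10 on the latest available value of this measure.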

Again, I calculate returns for a strategy based on the volatility of order imbalance, which goes long in the P10
stocks and sells short the P1 stocks. Under the assumption that order flow volatility proxies for asymmetric
information, I assess the returns of this strategy with t-tests, the results of which are reported in Table 13.
All the average returns are positive, and the returns for holding periods of two weeks or longer are statistically
significant. This provides evidence of the volatility of order imbalance being a proxy for asymmetric
information, as suggested by both the theory and the empirical findings of Chordia et al. (2019).

Figure 2 Returns of the two extreme portfolios based on the volatility of order imbalance
This figure reports the returns to the P1 and P10 portfolios as defined above.

Table 13 Mean returns to a volatility of order imbalance-based trading strategy
In columns 2-6, the mean returns for a strategy that holds the P10 – P1 long-short portfolio for different holding periods are reported. The
portfolios are formed each day, based on the volatility of order imbalance from the previous day. Below the mean returns are the t-statistics
and p-values, testing the null hypothesis that the mean return is equal to zero.
                One week   Two weeks   Three weeks   Four weeks   Two months
Mean P10 – P1   0.001      0.004       0.006         0.007        0.014
T-statistic     0.67       1.88 *      2.34 **       2.69 **      4.29 ****
P-value         0.318      0.068       0.026         0.011        0.00004

6 Conclusion

In my main analysis in Section 4, I document that order imbalance causes a contemporaneous price pressure,
which reverses on the following days. The results point in the same direction with both a trade-based and a
share-based measure of order imbalance. Also, I find the reversal to be quicker for larger stocks. Without
controlling for contemporaneous order imbalance, lagged share-based and trade-based order imbalances
provide virtually no significant evidence of return predictability. This result is in contrast with previous
literature, which most often finds a significant first lag. One possible reason for this differing result is the
relatively short sample period, encompassing just two full years. Moreover, since I do not use mid-quote
returns to combat the bid-ask bounce effect, my results might be biased. However, I show that my results are
not driven by extreme observations of order imbalance. When analyzing order imbalance with Nordic Nasdaq’s own
imbalance reports at the closing auction at the end of each trading day, I show that order imbalance at the
closing auction causes a price pressure which spills over onto the next day.

In addition to the main analysis, I provide empirical evidence from the Finnish stock market supporting the
validity of the 𝑉𝐶𝑉 measure by Lof and van Bommel (2022). Finally, I assess whether order imbalance
proxies for asymmetric information through its effects on longer-term returns. I find order imbalance itself
to be insignificant, but its volatility to be a significant predictor of future returns.

An interesting topic for future research would be to assess how order imbalance affects cross-sectional
returns with higher-frequency (intraday) data, particularly in the Finnish stock market, but also in the other
Nordic countries. Furthermore, the behavior of order imbalance around corporate events in the Nordic
countries would be a fascinating topic of research. Additionally, should the data be available, the
methodology of this study could be applied to a longer sample of data from the Finnish stock market. I hope
the topics I mentioned and other similar topics prove to be fruitful subjects of future research.

Appendix
A. Data

The ITCH data features order and trade-level data of all securities traded in Nordic Nasdaq exchanges. For
a given stock, this data includes such items as trade volume, the price, and both the date and time added.
The user can track individual orders from addition to deletion or execution. To extract the executed limit
orders for Finnish equities, I select messages of type E or C with instrument code 181 and exchange code XHEL.
To calculate order imbalance, the BuySell indicator is flipped, following Odders-White (2000).

Distributions of both trade-based and share-based order imbalance for different size deciles

Distribution of share-based order imbalance

Distribution of trade-based order imbalance

B. Regressions
Results from the Hausman test.

H0: the random-effects model is consistent.

HA: only the fixed-effects model is consistent.

The p-value is well below 0.001, so I reject the null hypothesis in favor of the fixed-effects model.
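
For reference, the Hausman (1978) statistic behind this test is commonly written as below. The notation is assumed here for illustration: $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are the fixed-effects and random-effects coefficient estimates, $\widehat{\mathrm{Var}}(\cdot)$ their estimated covariance matrices, and $k$ the number of coefficients compared.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})^{\top}
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE})
         - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\overset{a}{\sim}\; \chi^{2}_{k}
    \quad \text{under } H_0
```

A large value of $H$, and hence a small p-value, indicates that the two sets of estimates differ systematically.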

Correlation matrix

References

Blume, M., MacKinlay, A., and Terker, B. (1989). Order imbalances and stock price movements on October 19
and 20, 1987. The Journal of Finance, 44(4), 827-848.

Blume, M., and Stambaugh, R. (1983). Biases in computed returns: an application to the size effect.
Journal of Financial Economics, 12(3), 387-404.

Brown, P., Walsh, D., and Yuen, A. (1997). The interaction between order imbalance and stock price.
Pacific-Basin Finance Journal, 5(5), 539-557.

Bogousslavsky, V., and Collin-Dufresne, P. (2022). Liquidity, Volume, and Order Imbalance Volatility.
Journal of Finance, (forthcoming).

Bogousslavsky, V. and Muravyev, D. (2022). Who trades at the Close? Implications for Price Discovery
and Liquidity. Unpublished working paper. Available at SSRN: [Link]

Cameron, A. C., Gelbach, J. B., and Miller, D. L. (2011). Robust inference with multiway clustering.
Journal of Business & Economic Statistics, 29(2), 238-249.

Chan, K., and Fong, W. (2000). Trade size, order imbalance, and the volatility-volume relation. Journal
of Financial Economics, 57(2), 247-273.

Chordia, T., Subrahmanyam, A., and Anshuman, V. R. (2001). Trading activity and expected stock
returns. Journal of Financial Economics, 59(1), 3-32.

Chordia, T., Roll, R., and Subrahmanyam, A. (2002). Order imbalance, liquidity, and market returns.
Journal of Financial Economics, 65(1), 111-130.

Chordia, T., and Subrahmanyam, A. (2004). Order imbalance and individual stock returns: Theory and
evidence. Journal of Financial Economics, 72(3), 485-518.

Chordia, T., Hu, J., Subrahmanyam, A., and Tong, Q. (2019). Order Flow Volatility and Equity
Costs of Capital. Management Science, 65(4), 1520-1551.

Easley, D., Hvidkjaer, S., and O’Hara, M. (2002). Is information risk a determinant of asset returns? The
Journal of Finance, 57(5), 2185-2221.

Easley, D., Kiefer, N. M., O’Hara, M., and Paperman, J. B. (1996). Liquidity, information, and
infrequently traded stocks. Journal of Finance, 51(4), 1405-1436.

Easley, D., and O’Hara, M. (1992). Adverse Selection and Large Trade Volume: The Implications for
Market Efficiency. Journal of Financial and Quantitative Analysis, 27(2), 185-208.

Grossman, S. J., and Miller, M. H. (1988). Liquidity and Market Structure. Journal of Finance, 43(3), 617-633.

Hanke, M., and Weigerding, M. (2015). Order flow imbalance effects on the German stock market.
Business Research, 8(2), 213-238.

Hausman, J. A. (1978). Specification Tests in Econometrics. Econometrica, 46, 1251-1271.

Hiemstra, C., and Jones, J. (1994). Testing for linear and nonlinear Granger causality in the stock price–
volume relation. Journal of Finance, 49, 1639-1664.

Jegadeesh, N., and Wu, Y. (2021). Closing auctions: Nasdaq versus NYSE. Journal of Financial
Economics, 143(3), 1120-1139.

Kyle, A. (1985). Continuous auctions and insider trading. Econometrica, 53, 1315-1335.

Lee, C., and Ready, M. (1991). Inferring trade direction from intraday data. Journal of Finance, 46(2),
733-747.

Lee, C. M. (1992). Earnings news and small traders: An intraday analysis. Journal of Accounting and
Economics, 15(2-3), 265-302.

Lo, A., and MacKinlay, A. (1990). An econometric analysis of nonsynchronous trading. Journal of
Econometrics, 45(1-2), 181-211.

Lo, A., and Wang, J. (2000). Trading volume: definitions, data analysis, and implications of portfolio
theory. Review of Financial Studies, 13, 257-300.

Lof, M., and van Bommel, J. (2022). Asymmetric Information and the Distribution of Trading Volume.
Available at SSRN: [Link]

Odders-White, R. (2000). On the occurrence and consequences of inaccurate trade classification. Journal
of Financial Markets, 3, 259-286.

Roll, R. (1984). A Simple Implicit Measure of the Effective Bid-Ask Spread in an Efficient Market. The
Journal of Finance, 39(4), 1127-1139.

Shenoy, C., and Zhang, Y. J. (2007). Order imbalance and stock returns: Evidence from China. The
Quarterly Review of Economics and Finance, 47(5), 637-650.

Stoll, H. R. (1978). The supply of dealer services in securities markets. The Journal of Finance, 33(4),
1133-1151.

Stoll, H. R. (2000). Friction. The Journal of Finance, 55(4), 1479-1514.
