Papers by Daniel P. A. Preve
This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

Easley et al. (2002, EHO) proposed a market microstructure model to derive a measure of asymmetric information reflecting the relative intensity of informed versus uninformed (liquidity) trades, called the probability of informed trading, PIN. As described in Figure 1, the PIN model assumes that each trading day may be classified as one with news or no news. Furthermore, a day with news can be one with good news or bad news. In the model there are two types of traders: informed traders, who trade based on relevant news or information, and uninformed traders, who trade for reasons not accounted for by relevant information, such as portfolio rebalancing and liquidity needs. Let $B_d$ and $S_d$ denote the aggregate number of buyer- and seller-initiated trades (buy and sell orders) on day $d$, respectively. In the PIN model, $B_d$ and $S_d$ are assumed to be independent Poisson random variables, with different intensities for days with bad news (B), good news (G) and no news (N). Let $\theta_E$ denote the probability of news being released on day $d$ and let $\theta_B$ denote the probability of bad news, conditional on the release of news. Thus, the daily state probabilities are $\pi_B = \theta_E \theta_B$, $\pi_G = \theta_E (1 - \theta_B)$ and $\pi_N = 1 - \theta_E$, for a day with bad news, good news and no news, respectively. The means of $B_d$ and $S_d$ (the intensity parameters) vary according to whether the trading day is one with good news, bad news or no news. In particular, for a day with no news, the means of $B_d$ and $S_d$ are $\lambda_1$ and $\lambda_{-1}$, respectively. For a day with bad news the sell intensity increases by a constant $\delta$, while the buy intensity remains the same as for a day with no news. Similarly, for a day with good news the buy intensity increases by $\delta$, while the sell intensity stays the same as for a no-news day. The PIN model assumes that orders due to informed and uninformed traders are independent.
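The mixture structure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' Matlab code: the function and argument names (`state_probs`, `day_loglik`, `pin`, `lam_buy`, `lam_sell`) are hypothetical, and the `pin` function uses the usual EHO expression for the probability of informed trading translated into the notation of this abstract, which is an assumption since the abstract does not state the formula.

```python
import math

def poisson_logpmf(k, mean):
    """Log of the Poisson probability mass function at count k (mean > 0)."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def state_probs(theta_e, theta_b):
    """Daily state probabilities for bad news (B), good news (G), no news (N)."""
    return {
        "B": theta_e * theta_b,
        "G": theta_e * (1.0 - theta_b),
        "N": 1.0 - theta_e,
    }

def day_loglik(b, s, theta_e, theta_b, lam_buy, lam_sell, delta):
    """Log-likelihood of one day's buy count b and sell count s.

    lam_buy and lam_sell are the no-news buy and sell intensities;
    delta is the extra intensity on the informed side of the market.
    """
    probs = state_probs(theta_e, theta_b)
    intensities = {
        "B": (lam_buy, lam_sell + delta),   # bad news: more selling
        "G": (lam_buy + delta, lam_sell),   # good news: more buying
        "N": (lam_buy, lam_sell),           # no news: baseline intensities
    }
    terms = []
    for state, (mean_buy, mean_sell) in intensities.items():
        if probs[state] > 0.0:  # skip impossible states to avoid log(0)
            terms.append(
                math.log(probs[state])
                + poisson_logpmf(b, mean_buy)
                + poisson_logpmf(s, mean_sell)
            )
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def pin(theta_e, lam_buy, lam_sell, delta):
    """Assumed EHO-style PIN: expected informed orders over expected total orders."""
    informed = theta_e * delta
    return informed / (informed + lam_buy + lam_sell)
```

Summing `day_loglik` over trading days gives a sample log-likelihood that could be passed to a numerical optimizer; the parameter values would have to be constrained so that all intensities stay positive and both probabilities stay in [0, 1].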
Matlab code used in 'Estimation of Time Varying Adjusted Probability of Informed Trading and Probability of Symmetric Order-Flow Shock' for the estimation of the PIN- and APIN-AACD models and for the computation of the PIN, APIN and PSOS measures. Please note that the code requires Matlab 2009b or later and the optimization toolbox.

Recently Duarte and Young (2009) extended the probability of informed trading (PIN) proposed by Easley et al. (2002) and decomposed it into two components: the adjusted PIN (APIN) as a measure of asymmetric information and the probability of symmetric order-flow shock (PSOS) as a measure of illiquidity. They provided some cross-section estimates of these measures using daily data over annual periods and argued that the APIN is not priced. In this paper we propose a method to estimate daily APIN and PSOS as an extension of Tay et al. (2009) using high-frequency transaction data. Our empirical results indicate that daily APIN is much more stable than daily PIN. In contrast to PIN, daily APIN is not positively correlated with daily variance, while daily PSOS is. Moreover, in comparison with the daily APIN, the daily PSOS exhibits clustering and sporadic bursts over time. Key words and phrases: autoregressive conditional duration, market microstructure, probability of informed trading, probability of symmetric order-flow shock, transaction data.

SSRN Electronic Journal
The standard heterogeneous autoregressive (HAR) model is perhaps the most popular benchmark model for forecasting return volatility. It is often estimated using raw realized variance (RV) and ordinary least squares (OLS). However, given the stylized facts of RV and well-known properties of OLS, this combination should be far from ideal. One goal of this paper is to investigate how the predictive accuracy of the HAR model depends on the choice of estimator, transformation, and forecasting scheme made by the market practitioner. Another goal is to examine the effect of replacing its high-frequency data based volatility proxy (RV) with a proxy based on free and publicly available low-frequency data (logarithmic range). In an out-of-sample study, covering three major stock market indices over 16 years, it is found that simple remedies systematically outperform not only standard HAR but also state-of-the-art HARQ forecasts, and that HAR models using logarithmic range can often produce forecasts of similar quality to those based on RV.
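To make the setup concrete, the following Python sketch builds the regressors of a standard HAR regression (a constant plus daily, weekly and monthly averages of the volatility proxy) for one-day-ahead forecasting. It is a sketch under stated assumptions rather than the paper's procedure: the window lengths of 5 and 22 trading days are the conventional choices, the name `har_rows` is hypothetical, and applying the transformation to each observation before averaging is only one of several possible conventions.

```python
import math

def har_rows(rv, transform=None, week=5, month=22):
    """Build HAR design rows and one-step-ahead targets from a volatility proxy.

    rv        : list of positive daily values (e.g. realized variance)
    transform : optional function applied to each value first (e.g. math.log)
    Returns (rows, targets), where each row is (1, daily, weekly, monthly).
    """
    f = transform if transform is not None else (lambda v: v)
    x = [f(v) for v in rv]
    rows, targets = [], []
    # The first usable day is the one with a full monthly window behind it;
    # the last usable day is the one with a next-day value to forecast.
    for t in range(month - 1, len(x) - 1):
        daily = x[t]
        weekly = sum(x[t - week + 1 : t + 1]) / week
        monthly = sum(x[t - month + 1 : t + 1]) / month
        rows.append((1.0, daily, weekly, monthly))
        targets.append(x[t + 1])
    return rows, targets

# Example: raw series versus log-transformed series from the same proxy.
proxy = [float(v) for v in range(1, 31)]
raw_rows, raw_targets = har_rows(proxy)
log_rows, log_targets = har_rows(proxy, transform=math.log)
```

The rows and targets can be handed to any regression routine, so the same design can be fitted by OLS or by an alternative estimator, on raw or transformed data.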
This Working Paper is brought to you for free and open access by the School of Economics at Institutional Knowledge at Singapore Management University. It has been accepted for inclusion in Research Collection School of Economics by an authorized administrator of ...

In this paper we introduce a linear programming estimator (LPE) for the slope parameter in a constrained linear regression model with a single regressor. The LPE is interesting because it can be superconsistent in the presence of an endogenous regressor and, hence, preferable to the ordinary least squares estimator (LSE). Two different cases are considered as we investigate the statistical properties of the LPE. In the first case, the regressor is assumed to be fixed in repeated samples. In the second, the regressor is stochastic and potentially endogenous. For both cases the strong consistency and exact finite-sample distribution of the LPE are established. Conditions under which the LPE is consistent in the presence of serially correlated, heteroskedastic errors are also given. Finally, we describe how the LPE can be extended to the case with multiple regressors and conjecture that the extended estimator is consistent under conditions analogous to the ones given herein. Finite-sample properties of the LPE and extended LPE in comparison to the LSE and instrumental variable estimator (IVE) are investigated in a simulation study. One advantage of the LPE is that it does not require an instrument.
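As an illustration of how such an estimator can work, the sketch below assumes a regression through the origin, y_i = beta * x_i + u_i, with strictly positive regressors and nonnegative errors. The abstract does not spell out the constraint, so this model form is an assumption, and the name `lpe_slope` is hypothetical. Under it, the linear program "maximize beta subject to y_i - beta * x_i >= 0 for all i" has the closed-form solution min(y_i / x_i).

```python
def lpe_slope(x, y):
    """Closed-form solution of: maximize beta subject to y_i - beta * x_i >= 0.

    Assumes x_i > 0 for all i (hypothetical constraint for this sketch).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    if any(xi <= 0 for xi in x):
        raise ValueError("this sketch requires strictly positive regressors")
    # Each constraint bounds beta above by y_i / x_i, so the largest
    # feasible beta is the smallest ratio.
    return min(yi / xi for xi, yi in zip(x, y))

# Example with true slope 2 and nonnegative errors 0.5, 0.0, 1.0.
x = [1.0, 2.0, 4.0]
y = [2.5, 4.0, 9.0]
beta_hat = lpe_slope(x, y)
```

Because every ratio y_i / x_i equals beta plus the nonnegative term u_i / x_i, the estimate can never fall below the true slope under these assumptions, and no instrument enters the calculation.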

This note studies robust estimation of the autoregressive (AR) parameter in a nonlinear, nonnegative AR model driven by nonnegative errors. It is shown that a linear programming estimator (LPE), considered by Nielsen and Shephard (2003) among others, remains consistent under severe model misspecification. Consequently, the LPE can be used to test for, and seek sources of, misspecification when a pure autoregression cannot satisfactorily describe the data generating process, and to isolate certain trend, seasonal or cyclical components. Simple and quite general conditions under which the LPE is strongly consistent in the presence of serially dependent, non-identically distributed or otherwise misspecified errors are given, and a brief review of the literature on LP-based estimators in nonnegative autoregression is presented. Finite-sample properties of the LPE are investigated in an extensive simulation study covering a wide range of model misspecifications. A small scale empirical study, employing a volatility proxy to model and forecast latent daily return volatility of three major stock market indexes, illustrates the potential usefulness of the LPE.
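A small simulation can illustrate the idea in the simplest linear case. The sketch assumes a first-order model y_t = beta * y_{t-1} + u_t with u_t >= 0, for which the LP-based estimator reduces to the smallest ratio of consecutive observations. The seasonally scaled exponential errors are a hypothetical form of misspecification chosen for this example, and the function names are not taken from the note.

```python
import math
import random

def simulate_nonneg_ar1(beta, n, seed=0):
    """Simulate y_t = beta * y_{t-1} + u_t with nonnegative, non-identically
    distributed errors (exponential draws with a 12-period seasonal scale)."""
    rng = random.Random(seed)
    y = [1.0]  # positive starting value keeps every later value positive
    for t in range(1, n):
        scale = 1.5 + math.sin(2.0 * math.pi * t / 12.0)  # stays in [0.5, 2.5]
        u = rng.expovariate(1.0) * scale
        y.append(beta * y[-1] + u)
    return y

def lpe_ar1(y):
    """Minimum-ratio estimator of the AR parameter for a positive series."""
    return min(y[t] / y[t - 1] for t in range(1, len(y)))

series = simulate_nonneg_ar1(beta=0.5, n=500)
estimate = lpe_ar1(series)
```

Since each ratio equals beta plus a nonnegative term, the estimate approaches the true parameter from above as the sample grows, whatever the scale pattern of the errors in this setup.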
This extended Appendix provides a technical supplement with supporting results and proofs to complement the original note.

In this note we consider certain measure of location-based estimators (MLBEs) for the slope parameter in a linear regression model with a single stochastic regressor. The median-unbiased MLBEs are interesting as they can be robust to heavy-tailed samples and, hence, preferable to the ordinary least squares estimator (LSE). Two different cases are considered as we investigate the statistical properties of the MLBEs. In the first case, the regressor and error are assumed to follow a symmetric stable distribution. In the second, other types of regressions, with potentially contaminated errors, are considered. For both cases the consistency and exact finite-sample distributions of the MLBEs are established. Some results for the corresponding limiting distributions are also provided. In addition, we illustrate how our results can be extended to include certain heteroskedastic and multiple regressions. Finite-sample properties of the MLBEs in comparison to the LSE are investigated in a simulation study.
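One way to picture an estimator of this kind is to apply a location measure to the observation-wise ratios y_i / x_i. The sketch below uses the sample median; this particular choice, the regression-through-the-origin form y_i = beta * x_i + u_i, and the name `median_ratio_slope` are assumptions made for illustration, since the abstract does not define the estimators.

```python
import statistics

def median_ratio_slope(x, y):
    """Median of the ratios y_i / x_i as a slope estimate (illustrative only).

    Requires nonzero regressor values.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    if any(xi == 0 for xi in x):
        raise ValueError("regressor values must be nonzero")
    return statistics.median(yi / xi for xi, yi in zip(x, y))

# Example: the ratios are 3, 2 and 1, so the estimate is 2; a single
# wild observation would move a least squares fit far more than the median.
slope = median_ratio_slope([1.0, 2.0, 4.0], [3.0, 4.0, 4.0])
```

Each ratio equals beta plus u_i / x_i, so if that noise term is symmetric about zero the median of the ratios is centered on beta, which is the intuition behind median-unbiasedness in this simplified setting.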

Recently Duarte and Young (2009) study the probability of informed trading (PIN) proposed by Easley et al. (2002) and decompose it into two parts: the adjusted PIN (APIN) as a measure of asymmetric information and the probability of symmetric order-flow shock (PSOS) as a measure of illiquidity. They provide some cross-section estimates of these measures using daily data over annual periods. In this paper we propose a method to estimate daily APIN and PSOS by extending the method in Tay et al. (2009) using high-frequency transaction data. Our empirical results show that while PIN is positively contemporaneously correlated with variance, APIN is not. On the other hand, PSOS is positively correlated with daily average effective spread and variance, which is consistent with the interpretation of PSOS as a measure of illiquidity. Compared to APIN, PSOS exhibits clustering and sporadic bursts over time.

This paper considers a multivariate version of the Diebold-Mariano test (Diebold & Mariano 1995, DM) for equal predictive ability (EPA). Under the null hypothesis of EPA of two or more non-nested forecasting models, the Wald-type test statistic has an asymptotic chi-squared distribution. The test statistic, S, is shown to be invariant with respect to the ordering of the models for a wide range of covariance matrix estimators. To explore whether the behavior of S in small to large-sized samples can be improved, we also show that the finite-sample correction of Harvey, Leybourne & Newbold (1997, HLN) for the DM test extends to our multivariate setting. Additional higher-order corrections are also developed for further potential improvement. Monte Carlo simulations indicate that S has reasonable size properties in large samples but tends to be oversized in moderate samples. Furthermore, the finite-sample correction of S succeeds in correcting the size of the test, but only partially. For size-adjusted tests, power is increasing in the sample size, as expected. It is speculated that further finite-sample improvements can be achieved using Hotelling’s T-square or bootstrap critical values.
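For orientation, the sketch below computes the familiar two-model DM statistic and its HLN finite-sample correction, which is the special case that the multivariate test generalizes. It is not the paper's statistic S. The squared-error loss, the rectangular-kernel long-run variance with h - 1 autocovariances, and the function name `dm_statistic` are choices made for this example.

```python
import math

def dm_statistic(e1, e2, h=1, loss=lambda e: e * e):
    """Two-model Diebold-Mariano statistic with the HLN correction.

    e1, e2 : forecast errors of the two competing models (equal length)
    h      : forecast horizon; h - 1 autocovariances enter the variance
    Returns (dm, hln_corrected_dm).
    """
    d = [loss(a) - loss(b) for a, b in zip(e1, e2)]  # loss differentials
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Rectangular-kernel estimate of the long-run variance of d; it can be
    # non-positive for h > 1 in small samples, in which case sqrt fails.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    dm = d_bar / math.sqrt(long_run_var / n)
    # HLN small-sample factor; the corrected statistic is usually compared
    # with Student-t critical values with n - 1 degrees of freedom.
    hln_factor = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return dm, hln_factor * dm

errors_a = [1.0, 0.0, 2.0, 1.0]
errors_b = [0.0, 1.0, 0.0, 1.0]
dm, dm_hln = dm_statistic(errors_a, errors_b)
```

A positive value indicates that the first model has the larger average loss. Swapping the two error series flips the sign of the statistic, a two-model counterpart of the ordering invariance shown for the multivariate statistic.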

Talks by Daniel P. A. Preve

Recently Duarte & Young (2009) extended the probability of informed trading (PIN) proposed by Easley, Hvidkjaer & O’Hara (2002) to incorporate two components: the adjusted PIN (APIN) as a measure of asymmetric information and the probability of symmetric order-flow shock (PSOS) as a measure of illiquidity. They provided some estimates of these measures using daily data over annual periods and argued that the APIN is not priced. In this paper we propose a method to estimate daily APIN and PSOS as an extension of Tay, Ting, Tse & Warachka (2009) using high-frequency transaction data. Our empirical results indicate that daily APIN is much more stable than daily PIN. In contrast to PIN, APIN is negatively correlated with integrated conditional variance (ICV), while PSOS is positively correlated with ICV. Moreover, in comparison with the daily APIN, the daily PSOS exhibits clustering and sporadic volatility over time.

This paper describes a simple multivariate version of the Diebold-Mariano test (Diebold & Mariano 1995, DM) for equal predictive accuracy (EPA). Under the null hypothesis of EPA of two or more non-nested forecasting procedures, the proposed Wald-type test statistic has an asymptotic chi-squared distribution. The test statistic is shown to be invariant with respect to the ordering of the forecasting procedures for a wide range of variance-covariance matrix estimators. To explore whether the behavior of the test in small to large-sized samples can be improved, we also show that the finite-sample correction of Harvey, Leybourne & Newbold (1997, HLN) for the DM test extends to our multivariate setting. Additional higher-order corrections are also developed for further potential improvement. Monte Carlo simulations indicate that the proposed test has reasonable size properties in large samples but tends to be oversized in moderate samples. Furthermore, the finite-sample correction succeeds in correcting the size of the test, but only partially. For size-adjusted tests, power is increasing in the sample size, as expected. It is speculated that further finite-sample improvements can be achieved using bootstrap critical values.
In this note we introduce a linear programming estimator (LPE) for the slope parameter in a constrained linear regression model with a single regressor. The LPE is interesting because it can be superconsistent in the presence of an endogenous regressor and, hence, preferable to the ordinary least squares estimator (LSE). Two different cases are considered as we investigate the statistical properties of the LPE. In the first case, the regressor is assumed to be fixed in repeated samples. In the second, the regressor is stochastic and potentially endogenous. For both cases the strong consistency and exact finite-sample distribution of the LPE are established.
Teaching Documents by Daniel P. A. Preve
Conference Presentations by Daniel P. A. Preve

This note studies robust estimation of the autoregressive (AR) parameter in a nonlinear, nonnegative AR model driven by nonnegative errors. It is shown that a linear programming estimator (LPE), considered by Nielsen and Shephard (2003) among others, remains consistent under severe model misspecification. Consequently, the LPE can be used to test for, and seek sources of, misspecification when a pure autoregression cannot satisfactorily describe the data generating process, and to isolate certain trend, seasonal or cyclical components. Simple and quite general conditions under which the LPE is strongly consistent in the presence of serially dependent, non-identically distributed or otherwise misspecified errors are given, and a brief review of the literature on LP-based estimators in nonnegative autoregression is presented. Finite-sample properties of the LPE are investigated in an extensive simulation study covering a wide range of model misspecifications. A small scale empirical study, employing a volatility proxy to model and forecast latent daily return volatility of three major stock market indexes, illustrates the potential usefulness of the LPE.