Papers by JOEL DANIEL AZCORRA PEñA
Problemas del Desarrollo. Revista Latinoamericana de Economía, 2013
In this article we introduce the threshold dynamic factor model, which makes it possible to analyze systems of time series that exhibit nonlinear behavior of the threshold type. An estimation method is proposed that combines the EM algorithm with a direct search procedure, using the Kalman filter and smoother algorithms. The procedure estimates common factors whose behavior changes regime according to a threshold variable. Keywords: nonlinear time series, factor analysis, threshold model, EM algorithm, Kalman filter.

In this paper we present a generalized dynamic factor model for a vector of time series, which seems to provide a general framework for incorporating all the common information contained in a collection of variables. The common dynamic structure is explained through a set of common factors, which may be stationary or nonstationary, as in the case of common trends. There may also be a specific structure for each variable. Identification of the nonstationary I(d) factors is made through the common eigenstructure of the generalized covariance matrices, properly normalized. The number of common trends, or in general I(d) factors, is the number of nonzero eigenvalues of the above matrices. It is also proved that these nonzero eigenvalues are strictly greater than zero almost surely. Randomness appears in the eigenvalues as well as the eigenvectors, but not in the subspace spanned by the eigenvectors.

This paper analyzes how the growing immediate accessibility of large masses of data, the greater power and lower cost of computing methods, the growing demand for statistical techniques, and the emerging need to combine different types of information will drive the advance of new statistical methods. It concludes that significant development can be expected in multivariate exploratory techniques, more general and flexible estimation methods, better assessment of uncertainty when building statistical models, multivariate forecasting methods, and methodologies for meta-analysis. Some implications of these advances for official statistics are studied, especially for the activity of a National Statistical Institute (INE), highlighting how they can help improve the quality of the information supplied by the INE and its service to Spanish society.
How to combine information from different sources is becoming an important statistical area of research under the name of Meta Analysis. This paper shows that the estimation of a parameter or the forecast of a random variable can also be seen as a process of combining information. It is shown that this approach can provide some useful insights on the robustness properties of some statistical procedures, and it also allows the comparison of statistical models within a common framework. Some general combining rules are illustrated using examples from ANOVA analysis, diagnostics in regression, time series forecasting, missing value estimation and recursive estimation using the Kalman Filter.
Proyecciones de demanda de educación en España
Sobre la robustificacion interna del algoritmo de Plackett-Kalman para la estimacion recursiva del modelo de regresion lineal
Trabajos de Estadistica y de Investigacion Operativa, 1985
We propose a periodogram-based metric for classification and clustering of time series with different sample sizes. For such cases, the Euclidean distance between the periodogram ordinates cannot be used directly, because the ordinates are computed at different frequencies. One possible way to deal with this problem is to linearly interpolate one of the periodograms in order to estimate its ordinates at the frequencies of the other.
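The interpolation idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`periodogram`, `interp_linear`, `periodogram_distance`), the choice to interpolate the longer series onto the frequencies of the shorter one, and the clamping at the ends of the frequency range are all assumptions made for illustration. Only the Python standard library is used.

```python
import cmath
import math


def periodogram(x):
    """Return (frequencies, ordinates) at the Fourier frequencies
    2*pi*j/n for j = 1..n//2, after removing the sample mean."""
    n = len(x)
    mean = sum(x) / n
    freqs, ords = [], []
    for j in range(1, n // 2 + 1):
        w = 2 * math.pi * j / n
        dft = sum((v - mean) * cmath.exp(-1j * w * t) for t, v in enumerate(x))
        freqs.append(w)
        ords.append(abs(dft) ** 2 / n)
    return freqs, ords


def interp_linear(x_new, xs, ys):
    """Linearly interpolate the points (xs, ys) at x_new.
    Assumption: values outside the range of xs are clamped to the end values."""
    if x_new <= xs[0]:
        return ys[0]
    if x_new >= xs[-1]:
        return ys[-1]
    for k in range(1, len(xs)):
        if x_new <= xs[k]:
            frac = (x_new - xs[k - 1]) / (xs[k] - xs[k - 1])
            return ys[k - 1] + frac * (ys[k] - ys[k - 1])
    return ys[-1]


def periodogram_distance(x, y):
    """Euclidean distance between periodograms of series of unequal length.
    Assumption: the longer series' periodogram is interpolated at the
    Fourier frequencies of the shorter series."""
    if len(x) < len(y):
        x, y = y, x  # make x the longer series
    fx, px = periodogram(x)
    fy, py = periodogram(y)
    px_on_fy = [interp_linear(w, fx, px) for w in fy]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(px_on_fy, py)))


# Hypothetical example data: two sinusoids with different lengths.
a = [math.sin(0.5 * t) for t in range(40)]
b = [math.sin(1.5 * t) for t in range(25)]
print(periodogram_distance(a, b))
```

Because the longer series is always the one interpolated, the result does not depend on the order of the arguments.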
Statistics & Probability Letters, 2005
In this note we analyze the relationship between one-step-ahead prediction errors and interpolation errors in time series. We obtain an expression of the prediction errors in terms of the interpolation errors, and then we show that minimizing the sum of squares of the standardized one-step-ahead prediction errors is equivalent to minimizing the sum of squares of the standardized interpolation errors.
Alatriste. Luz barroca
CAMERAMAN, 2006
... A crew of 250 people traveled across Spain to portray the life of Captain Alatriste, a soldier of fortune turned hired assassin, swordsman and adventurer. Agustín Díaz Yanes is the film's director and screenwriter, and Paco Femenía is the director of photography. ...
A Bayesian Approach for Predicting With Polynomial Regression of Unknown Degree
Technometrics, 2005
A New Statistic for Influence in Linear Regression
Technometrics, 2005

Journal of the American Statistical Association, 1999
We propose a procedure for computing a fast approximation to regression estimates based on the minimization of a robust scale. The procedure can be applied with a large number of independent variables, where the usual algorithms require an unfeasible or extremely costly amount of computer time. It can also be incorporated in any high-breakdown estimation method and may improve it with little additional computer time. The procedure minimizes the robust scale over a set of tentative parameter vectors estimated by least squares after eliminating a set of possible outliers, which are obtained as follows. We represent each observation by the vector of changes of the least squares forecasts of the observation when each of the data points is deleted. Then we obtain the sets of possible outliers as the extreme points in the principal components of these vectors, or as the set of points with large residuals. The good performance of the procedure allows identification of multiple outliers, avoiding masking effects. We investigate the procedure's efficiency for robust estimation and power as an outlier detection tool in a large real dataset and in a simulation study.
Journal of Statistical Planning and Inference, 2009
This paper deals with the problem of robustness of Bayesian regression with respect to the data. We first give a formal definition of Bayesian robustness to data contamination, then prove that robustness according to this definition cannot be obtained by using heavy-tailed error distributions in linear regression models, and finally propose a heteroscedastic approach to achieve the desired Bayesian robustness.
Journal of Statistical Planning and Inference, 2006
In this paper, we present a procedure to build a dynamic factor model for a vector of time series. We assume a model in which the common dynamic structure of the time series vector is explained through a set of common factors, which may be nonstationary, as in the case of common trends. Identification of the nonstationary I(d) factors is made through the common eigenstructure of the generalized covariance matrices, properly normalized. The number of common nonstationary factors is the number of nonzero eigenvalues of the above matrices. A chi-square statistic is proposed to test for the number of factors, stationary or not. The estimation of the model is carried out in state space form. This proposal is illustrated through several simulations and a real data set.

Journal of Econometrics, 2004
In this paper we analyze the structure and the forecasting performance of the dynamic factor model. It is shown that the forecasts obtained by the factor model imply shrinkage pooling terms, similar to the ones obtained from hierarchical Bayesian models that have been applied successfully in the econometric literature. Thus, the results obtained in this paper provide an additional justification for these and other types of pooling procedures. The expected decrease in MSE from using a factor model versus univariate ARIMA and shrinkage models is studied for the one-factor model. Monte Carlo simulations are presented to illustrate this result. A factor model is also built to forecast GNP of European countries, and it is shown that the factor model can provide a substantial improvement in forecasts with respect to both univariate and shrinkage univariate forecasts.

Journal of Chemometrics, 2009
Partial least squares (PLS) regression is a linear regression technique developed to relate many regressors to one or several response variables. Robust methods are introduced to reduce or remove the effect of outlying data points. In this paper, we show that if the sample covariance matrix is properly robustified, further robustification of the linear regression steps of the PLS algorithm becomes unnecessary. The robust estimate of the covariance matrix is computed by searching for outliers in univariate projections of the data on a combination of random directions (Stahel-Donoho) and specific directions obtained by maximizing and minimizing the kurtosis coefficient of the projected data, as proposed by Peña and Prieto [1]. It is shown that this procedure is fast to apply and provides better results than other methods proposed in the literature. Its performance is illustrated by Monte Carlo and by an example, where the algorithm is able to show features of the data which were undete...
Environmetrics, 2008
The Bayesian estimation of a dynamic factor model where the factors follow a multivariate autoregressive model is presented. We derive the posterior distributions for the parameters and the factors and use Monte Carlo methods to compute them. The model is applied to study the association between air pollution and mortality in the city of São Paulo, Brazil. The series considered were minimum temperature, relative humidity, the air pollutants PM10 and CO, mortality from circulatory disease and mortality from respiratory disease. We found a strong association between the air pollutant PM10, humidity and mortality from respiratory disease for the city of São Paulo.
Computational Statistics & Data Analysis, 2007
Quality control using continuous monitoring from images is emerging as an active research area. These applications require adaptive statistical techniques in order to detect and isolate process abnormalities. A novel approach is introduced for monitoring schemes in the setting of image data when the quality is associated with uniform pixel gray-scales. The proposed approach requires the definition of a statistic which takes into account both the spatial dependency and the changes in local variability. An application to paper surfaces demonstrates how the monitoring scheme performs in practice.
Computational Statistics & Data Analysis, 2006
The statistical discrimination and clustering literature has studied the problem of identifying similarities in time series data. Some studies use non-parametric approaches for splitting a set of time series into clusters by looking at their Euclidean distances in the space of points. A new measure of distance between time series based on the normalized periodogram is proposed. Simulation results comparing this measure with other parametric and non-parametric metrics are provided. In particular, the classification of time series as stationary or as non-stationary is discussed. The use of both hierarchical and non-hierarchical clustering algorithms is considered. An illustrative example with economic time series data is also presented.
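As a rough illustration of a distance based on the normalized periodogram, the sketch below divides each periodogram ordinate by the sample variance and takes the Euclidean distance between the results. The exact definition used in the paper is not given in the abstract, so the normalization by the sample variance, the restriction to equal-length series, and the function names (`normalized_periodogram`, `np_distance`) are assumptions for illustration only.

```python
import cmath
import math


def normalized_periodogram(x):
    """Periodogram ordinates at 2*pi*j/n, j = 1..n//2, divided by the
    sample variance (assumed normalization)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        raise ValueError("constant series has no normalized periodogram")
    ords = []
    for j in range(1, n // 2 + 1):
        w = 2 * math.pi * j / n
        dft = sum((v - mean) * cmath.exp(-1j * w * t) for t, v in enumerate(x))
        ords.append(abs(dft) ** 2 / n / var)
    return ords


def np_distance(x, y):
    """Euclidean distance between normalized periodograms.
    Assumption: both series have the same length."""
    if len(x) != len(y):
        raise ValueError("series must have equal length in this sketch")
    px = normalized_periodogram(x)
    py = normalized_periodogram(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(px, py)))


# Hypothetical example data.
x = [math.sin(0.7 * t) for t in range(40)]
rescaled = [5 * v + 2 for v in x]          # same dynamics, different level and scale
other = [math.cos(2.0 * t) for t in range(40)]
print(np_distance(x, rescaled), np_distance(x, other))
```

Under these assumptions the distance ignores the level and scale of each series, so a series and a shifted, rescaled copy of it are at distance zero up to rounding, while series with different frequency content are separated.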