Missing Data Imputation
Recent papers in Missing Data Imputation
Jan. 8, 2016: Perhaps wherever I noted "the estimated standard error of the random factors of the estimated residuals," I should have said "the estimated standard deviation of the random factors of the estimated residuals." I saw some...
Intensive agricultural practices represent a major threat to aquatic ecosystems because they impair water quality. However, this can be ameliorated by farmers improving crop management provided they are aware of their contribution to...
The principles of the iterative approach to deal with missing data are presented and the performance of this approach in multivariate methods such as PCA, PCR, PLS and N-way is studied. Matlab codes for these analyses are appended.
This article urges counseling psychology researchers to recognize and report how missing data are handled, because consumers of research cannot accurately interpret findings without knowing the amount and pattern of missing data or the...
Sales forecasting became crucial for industries in past decades with rapid globalization, widespread adoption of information technology towards e-business, understanding market fluctuations, meeting business plans, and avoiding loss of...
This paper introduces a new algorithm for gap filling in univariate time series by using SSA. In this algorithm, the data before the missing values and the data after the missing values (in reverse order) are treated as two separate time...
The present article discusses various preprocessing techniques suitable for dealing with time series data for environmental science-related studies. The errors or noises due to electronic sensor fault, fault in the communication channel,...
The book (in Bulgarian) analyses the relative advantages and disadvantages, challenges and approaches in sample optimisation in the case of missing data. Over the past 10–12 years, in empirical sociology in Bulgaria, sample surveys...
Raw data collected through surveys, experiments, coding of textual artifacts or other quantitative means may not meet the assumptions upon which statistical analyses rely. The presence of univariate or multivariate outliers, skewness or...
This paper examines the effectiveness of different methods in imputing mileage incurred by commercial motor carriers (used as exposure measures in deriving safety indices of carriers), by using an administrative dataset on motor carriers....
Branding is a strategy designed by companies to help patrons or consumers quickly identify their products or organisations and give them a reason to choose their products or organisations over other competitors. In the Old Testament, God...
The purpose of this study is to impute missing monthly maximum temperature data from the department of Valle del Cauca, located in Colombia, over a period of two years (2013 and 2014). For this, a geostatistical technique will be used,...
Nowadays applications and technological evolution have caused the production and storage of huge volumes of data. This scenario facilitated the increased occurrence of missing values in data sets. Missing data is harmful for statistical...
Missing data are often encountered in many areas of research. Complete case analysis and the indicator method can lead to serious bias. One appealing alternative is to apply imputation methods. The main purpose of this paper is...
The presence in databases of records without information (missing values) and of extreme values (outliers) is very common (in the social sciences and elsewhere), and failing to take these values into account can produce undesired situations in the...
Missing data are unavoidable in wireless sensor networks, due to issues such as network communication outage, sensor maintenance or failure, etc. Although a plethora of methods have been proposed for imputing sensor data, limitations...
In developing regions missing data are prevalent in historical hydrological datasets, owing to financial, institutional, operational and technical challenges. If not tackled, these data shortfalls result in uncertainty in flood frequency...
This study examines the prevalence and correlates of psychiatric disorders and mental health problems among undocumented Mexican immigrants using the National Latino and Asian American Study (NLAAS). Two approaches were used to obtain...
Researchers are often interested in the relationship between two variables, with no single data set containing both. A common strategy is to use proxies for the dependent variable that are common to two surveys to impute the dependent...
The purpose of this study is to investigate the psychometric properties of scales with different missing data techniques. For this purpose 100 data sets were generated under different conditions for sample sizes (250, 500 and 1000) and...
Data collection is a fundamental component in the study of energy and buildings. Errors and inconsistencies in the data collected from test environment can negatively influence the energy consumption modelling of a building and other...
Imputation is a common method for replacing a missing value with one or more fabricated values. The terminology and methodology of imputation are often confusing because no general framework exists. This paper is an attempt to develop such...
Hourly measured PM10 concentrations at eight monitoring stations within peninsular Malaysia in 2006 were used to simulate missing data. The gap lengths of the simulated missing values are limited to ≤12 hours since the actual...
The integration of XML data sources which have different schemas/DTD can originate structural and vocabular heterogeneity. In this context, it is difficult to write satisfiable queries. As a solution, many Information Systems focus on...
With large amounts of unstructured data being produced every day, organizations are trying to extract as much relevant information as possible. This massive quantity of data is collected from a variety of sources, and data analysts and...
Abstract: Rainfall is important information in transportation, agriculture, industry, and other fields. With knowledge of rainfall information, appropriate action can be taken in these fields so that no losses are incurred due to...
We evaluated alternative approaches to imputation for univariate estimates and multivariate regression analyses of physiological health measures collected in the 2003-2004 National Health and Nutrition Examination Survey (NHANES). From...
I analyze a series of techniques designed for replacing missing data. From the extensive literature on political values in postcommunist countries, I selected one of the most discussed models – the one proposed by Reisinger et al. (1994)....
Unlike many other places around the globe, Hong Kong is a small city with a high population density. Some housing units are built near the sources of an externality, such as a landfill site. As the blocks of buildings are particularly...
The purpose of this study is to examine the effect of different missing data techniques on the item parameters estimated for Classical Test Theory (CTT) and Item Response Theory (IRT) comparatively through simulated and real data sets....
This is a letter to the editor of the Journal of Official Statistics (JOS). It addresses an article in the previous issue of JOS on cutoff sampling, which referenced this author, and attempts to clarify some positions, including that with...
Huntington’s disease (HD) is a progressive neurodegenerative disorder caused by an expansion of CAG repeats in the IT15 gene. The age-at-onset (AAO) of HD is inversely related to the CAG repeat length and the minimum length thought to...
Rainfall amounts and water surface elevation are considered among the most important climatic parameters. Because these two parameters have a direct impact on water resources management decisions, such as meeting water needs and...