Faisal, Shahla and Tutz, Gerhard (2021): Multiple imputation using nearest neighbor methods. In: Information Sciences, Vol. 570: pp. 500-516

Full text not available on 'Open Access LMU'.

Abstract

Missing values are a major problem in medical research. Because complete case analysis discards useful information, estimation and inference may suffer strongly. Multiple imputation has been shown to be a useful strategy to handle missing data problems and account for the uncertainty of imputation. In the presence of high-dimensional data (p >> n), missing values raise even more serious problems, as the existing software packages tend to fail. We present multiple imputation methods based on nearest neighbors. The distances are computed using the information of correlation among the target and candidate predictors. Thus only the relevant predictors contribute to the computation of distances. The method successfully imputes missing values even in high-dimensional settings. Using a variety of simulated data with MCAR and MAR missing patterns, the proposed algorithm is compared to existing methods. Various measures are used to compare the performance of the methods, including MSE for imputation, MSE of estimated regression coefficients, their standard errors, confidence intervals, and their coverage probabilities. The simulation results, for both cases n < p and n > p, show that sequential imputation using weighted nearest neighbors can be successfully applied to a wide range of data settings and outperforms or is close to the best when compared to existing methods. (c) 2021 Elsevier Inc. All rights reserved.
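The general idea described in the abstract, nearest-neighbor imputation with distances weighted by the correlation between the target variable and the candidate predictors, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`corr_weights`, `impute_once`, `multiple_impute`), the use of absolute pairwise-complete correlations as weights, the random draw of one donor among the k nearest neighbors, and the omission of column standardization are all assumptions made purely for illustration.

```python
import numpy as np

def corr_weights(X, j):
    """Hypothetical helper: |correlation| of column j with every other column,
    computed on the rows where both columns are observed."""
    p = X.shape[1]
    w = np.zeros(p)
    for l in range(p):
        if l == j:
            continue
        ok = ~np.isnan(X[:, j]) & ~np.isnan(X[:, l])
        if ok.sum() > 2:
            a, b = X[ok, j], X[ok, l]
            if a.std() > 0 and b.std() > 0:
                w[l] = abs(np.corrcoef(a, b)[0, 1])
    return w

def impute_once(X, k=5, rng=None):
    """One imputed copy of X. Each missing entry is filled with the value of a
    donor drawn at random from the k nearest rows under a correlation-weighted
    squared distance (assumed form; columns are not standardized here)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    out = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        donors = np.flatnonzero(~miss)
        if not miss.any() or donors.size == 0:
            continue
        w = corr_weights(X, j)  # weakly correlated predictors get small weight
        for i in np.flatnonzero(miss):
            diff2 = (X[donors] - X[i]) ** 2      # NaN where either value is missing
            shared = ~np.isnan(diff2)
            wsum = (shared * w).sum(axis=1)      # total weight of usable columns
            num = np.where(shared, diff2, 0.0) @ w
            d = np.where(wsum > 0, num / np.where(wsum > 0, wsum, 1.0), np.inf)
            nearest = donors[np.argsort(d)[:k]]
            out[i, j] = X[rng.choice(nearest), j]
    return out

def multiple_impute(X, m=5, k=5, seed=0):
    """Return m imputed data sets; their variation reflects imputation uncertainty."""
    rng = np.random.default_rng(seed)
    return [impute_once(X, k=k, rng=rng) for _ in range(m)]
```

Because each predictor is weighted individually, the distance stays computable when p is much larger than n, which is the setting the abstract highlights. In a multiple-imputation analysis, a model would be fitted to each of the m completed data sets and the estimates pooled afterwards.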
