
Razzak, Humera and Heumann, Christian (2020): The Ability of Different Imputation Methods to Capture Complex Dependencies in High Dimensions. In: Romanian Statistical Review, No. 1: pp. 55-75

Full text not available on 'Open Access LMU'.

Abstract

Multiple imputation (MI) is a method for handling missing data. Various competing computational algorithms are available in the R environment to address missing data in categorical and continuous variables. When the amount of missing information is high, sample sizes are large, and the dependency structures among categorical variables are complex, the utility of the available R packages is somewhat limited. A computationally expedient, fully Bayesian joint modeling (JM) approach known as "Dirichlet process mixtures of multinomial distributions" (DPMD) automatically models complex dependencies among variables, but it is limited to categorical variables. We propose a simple, easy-to-implement combining algorithm that imputes continuous variables using various algorithms and uses the JM approach to detect complex dependency structures among categorical variables. We review, describe, and evaluate software packages commonly available in R and compare their results with those of the proposed MI method, using an artificial data set as an example. The results suggest that the MI approach that combines the JM approach with various algorithms based on generalized linear models outperforms those algorithms when each is applied on its own.
