Abstract
The integration of different data sets that share only a subset of variables will become even more relevant in the future. With the aid of data fusion techniques, existing data can be exploited to carry out new statistical analyses, avoiding the expensive collection of new data. This paper presents a new statistical matching method for categorical data based on a conditional independence assumption. The method uses undirected graphical models to visualize dependencies among variables and to obtain a powerful factorization of their joint distribution. This factorization is used to estimate the probability components of the joint distribution despite the underlying identification problem. We embed the problem of statistical matching into the theory of log-linear Markov networks and illustrate the new method with an application to data from the German General Social Survey. The results indicate that the joint distribution can be reconstructed fairly well by the proposed statistical matching method.
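To make the conditional independence idea concrete, the following minimal sketch estimates a joint distribution from two files that share only the variable X: file A observes (X, Y) and file B observes (X, Z). Assuming Y and Z are conditionally independent given X, the joint factorizes as P(x, y, z) = P(x) P(y | x) P(z | x), and each factor can be estimated from the file where it is observed. This is a plain plug-in estimator for illustration only, not the log-linear Markov network procedure of the paper; the function name `match_under_ci` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product


def match_under_ci(file_a, file_b):
    """Estimate P(x, y, z) as P(x) * P(y | x) * P(z | x).

    file_a: list of (x, y) records; file_b: list of (x, z) records.
    Assumes Y and Z are conditionally independent given X (illustrative only).
    """
    xa_counts = Counter(x for x, _ in file_a)  # X counts in file A
    xb_counts = Counter(x for x, _ in file_b)  # X counts in file B
    x_counts = xa_counts + xb_counts           # X is observed in both files
    n = sum(x_counts.values())
    xy_counts = Counter(file_a)
    xz_counts = Counter(file_b)
    ys = sorted({y for _, y in file_a})
    zs = sorted({z for _, z in file_b})

    joint = {}
    for x, y, z in product(sorted(x_counts), ys, zs):
        if xa_counts[x] == 0 or xb_counts[x] == 0:
            # A conditional cannot be estimated for this x; the cell is
            # skipped, so the result then sums to less than 1.
            continue
        p_x = x_counts[x] / n
        p_y_given_x = xy_counts[(x, y)] / xa_counts[x]
        p_z_given_x = xz_counts[(x, z)] / xb_counts[x]
        joint[(x, y, z)] = p_x * p_y_given_x * p_z_given_x
    return joint


# Toy data (hypothetical): Y and Z are never observed together.
file_a = [("a", "y1"), ("a", "y1"), ("a", "y2"), ("b", "y2")]
file_b = [("a", "z1"), ("a", "z2"), ("b", "z1"), ("b", "z1")]
joint = match_under_ci(file_a, file_b)
for cell, prob in sorted(joint.items()):
    print(cell, round(prob, 4))
```

The identification problem mentioned in the abstract shows up here directly: since Y and Z are never observed jointly, the data alone cannot confirm or refute the factorization, and the estimate is only as good as the conditional independence assumption.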
Document type: | Paper |
---|---|
Keywords: | conditional independence; log-linear model; Markov random field; probabilistic graphical model; statistical matching |
Faculty: | Mathematics, Computer Science and Statistics > Statistics > Technical Reports |
Subject areas: | 500 Natural sciences and mathematics > 510 Mathematics |
URN: | urn:nbn:de:bvb:19-epub-61678-9 |
Language: | English |
Document ID: | 61678 |
Date published on Open Access LMU: | 17 Apr 2019, 12:46 |
Last modified: | 04 Nov 2020, 13:39 |