Abstract
Graphical models can be powerful tools for statistical matching, making secondary data analysis feasible even when joint information is sought about variables that were not collected together. Without imposing any constraints on the direction of influence between variables, we develop a method that uses the graphical Ising model to merge two or more data files containing only binary data. To this end, we rely on the conditional independence assumption commonly made in statistical matching to learn a joint Markov network graph structure over all variables from the given data. Based on this joint graph, the probability distribution is estimated by an adapted version of the Ising model. The quality of our new data fusion method is assessed on the basis of a simulation study in which data are sampled from random Ising models. We investigate which parameters influence the quality of data integration and how violations of the conditional independence assumption affect the results.
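The simulation setting described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the {0, 1} coding, the parameter ranges, the Gibbs sampler, the choice of common and file-specific variables, and the names `random_ising` and `gibbs_sample` are all assumptions made for illustration.

```python
import numpy as np


def random_ising(n_vars, edge_prob, rng):
    """Draw a random Ising model (parameter ranges are illustrative assumptions)."""
    # Pick edges at random in the strict upper triangle, then symmetrise.
    upper = np.triu(rng.random((n_vars, n_vars)) < edge_prob, k=1)
    weights = np.where(upper, rng.uniform(-1.0, 1.0, (n_vars, n_vars)), 0.0)
    weights = weights + weights.T  # symmetric, zero diagonal
    thresholds = rng.uniform(-0.5, 0.5, n_vars)
    return weights, thresholds


def gibbs_sample(weights, thresholds, n_samples, burn_in, rng):
    """Sample binary vectors with p(x) proportional to
    exp(sum_i tau_i x_i + sum_{i<j} w_ij x_i x_j), x_i in {0, 1}."""
    n_vars = len(thresholds)
    x = rng.integers(0, 2, n_vars)
    samples = np.empty((n_samples, n_vars), dtype=int)
    for t in range(burn_in + n_samples):
        for i in range(n_vars):
            # Full conditional: P(x_i = 1 | rest) = sigmoid(tau_i + sum_j w_ij x_j).
            # The diagonal of `weights` is zero, so x_i itself does not contribute.
            logit = thresholds[i] + weights[i] @ x
            x[i] = rng.random() < 1.0 / (1.0 + np.exp(-logit))
        if t >= burn_in:
            # Consecutive sweeps are correlated; no thinning is applied here.
            samples[t - burn_in] = x
    return samples


rng = np.random.default_rng(0)
weights, thresholds = random_ising(n_vars=6, edge_prob=0.4, rng=rng)
data = gibbs_sample(weights, thresholds, n_samples=200, burn_in=100, rng=rng)

# Statistical matching setup (assumed split): variables 0 and 1 are observed in
# both files, variables 2-3 only in file A, variables 4-5 only in file B.
file_a = data[:100][:, [0, 1, 2, 3]]
file_b = data[100:][:, [0, 1, 4, 5]]
```

In this sketch the two files never observe variables 2-3 and 4-5 together, which is the situation a data fusion method has to handle.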
Document type: | Paper |
---|---|
Keywords: | statistical matching; data fusion; Markov network; Ising model; conditional independence |
Faculty: | Mathematics, Computer Science and Statistics > Statistics > Technical Reports |
Subject areas: | 500 Natural sciences and mathematics > 510 Mathematics |
URN: | urn:nbn:de:bvb:19-epub-61732-0 |
Document ID: | 61732 |
Date of publication on Open Access LMU: | 23 Apr 2019, 10:13 |
Last modified: | 23 Apr 2019, 10:13 |