Abstract
Graphical models can prove quite powerful for statistical matching, making secondary data analysis feasible even when joint information is sought about variables that were not collected together. Without imposing any constraints on the direction of influence between variables, we develop a method that uses the graphical Ising model to merge two or more data files containing only binary data. To this end, we rely on the conditional independence assumption commonly made in statistical matching to learn a joint Markov network graph structure over all variables from the given data. Based on this joint graph, the probability distribution is estimated by an adapted version of the Ising model. The quality of our new data fusion method is assessed on the basis of a simulation study in which data are sampled from random Ising models. We investigate which parameters influence the quality of data integration and how violations of the conditional independence assumption affect the results.
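To make the role of the conditional independence assumption concrete, the following minimal sketch enumerates a three-variable Ising model and checks whether the two file-specific variables are independent given the common one. It is an illustration only, not the adapted Ising model of the paper: the {0, 1} coding, the parameter values, and the function names `ising_distribution` and `max_cia_violation` are all assumptions made for this example.

```python
import itertools
import math


def ising_distribution(thresholds, interactions):
    """Return {state: probability} for a small Ising model over {0, 1} variables.

    thresholds: list of main-effect parameters, one per variable.
    interactions: dict mapping (i, j) with i < j to a pairwise weight;
                  a missing pair means no edge in the Markov network.
    """
    p = len(thresholds)
    weights = {}
    for state in itertools.product((0, 1), repeat=p):
        score = sum(t * s for t, s in zip(thresholds, state))
        score += sum(w * state[i] * state[j] for (i, j), w in interactions.items())
        weights[state] = math.exp(score)
    z = sum(weights.values())  # normalizing constant by brute-force enumeration
    return {state: w / z for state, w in weights.items()}


def max_cia_violation(dist):
    """Largest |P(y, z | x) - P(y | x) P(z | x)| for variables ordered (x, y, z)."""
    worst = 0.0
    for x in (0, 1):
        p_x = sum(pr for s, pr in dist.items() if s[0] == x)
        for y in (0, 1):
            for z in (0, 1):
                p_yz = dist[(x, y, z)] / p_x
                p_y = sum(dist[(x, y, k)] for k in (0, 1)) / p_x
                p_z = sum(dist[(x, k, z)] for k in (0, 1)) / p_x
                worst = max(worst, abs(p_yz - p_y * p_z))
    return worst


# Hypothetical parameters: variable 0 is observed in both files (X),
# variable 1 only in file A (Y), variable 2 only in file B (Z).
thresholds = [-0.2, 0.1, 0.3]
cia_model = ising_distribution(thresholds, {(0, 1): 0.8, (0, 2): -0.5})
violating_model = ising_distribution(
    thresholds, {(0, 1): 0.8, (0, 2): -0.5, (1, 2): 1.0}
)

print(max_cia_violation(cia_model))        # numerically zero: no Y-Z edge
print(max_cia_violation(violating_model))  # clearly positive: Y-Z edge present
```

In this toy setting, a missing edge between the two file-specific variables is exactly what the conditional independence assumption asserts, while a nonzero weight on that edge corresponds to a violation of the kind examined in the simulation study. Brute-force enumeration is only practical for a handful of variables.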
Item Type: | Paper |
---|---|
Keywords: | statistical matching; data fusion; Markov network; Ising model; conditional independence |
Faculties: | Mathematics, Computer Science and Statistics > Statistics > Technical Reports |
Subjects: | 500 Science > 510 Mathematics |
URN: | urn:nbn:de:bvb:19-epub-61732-0 |
Item ID: | 61732 |
Date Deposited: | 23. Apr 2019, 10:13 |
Last Modified: | 23. Apr 2019, 10:13 |