In: PLOS ONE, 9(10), e107955
Abstract
Dependence measures and tests for independence have recently attracted a lot of attention because they are the cornerstone of algorithms for network inference in probabilistic graphical models. Pearson's product-moment correlation coefficient is still by far the most widely used statistic, yet it is largely constrained to detecting linear relationships. In this work we provide an exact formula for the i-th nearest neighbor distance distribution of rank-transformed data. Based on that, we propose two novel tests for independence. An implementation of these tests, together with a general benchmark framework for independence testing, is freely available as a CRAN software package (http://cran.r-project.org/web/packages/knnIndep). In this paper we have benchmarked Pearson's correlation, Hoeffding's D, distance correlation (dcor), Kraskov's estimator for mutual information, the maximal information coefficient, and our two tests. We conclude that no particular method is generally superior to all other methods. However, dcor and Hoeffding's D are the most powerful tests for many different types of dependence.
Item Type: | Journal article |
---|---|
Faculties: | Medicine > Institute for Medical Information Processing, Biometry and Epidemiology |
Subjects: | 600 Technology > 610 Medicine and health |
URN: | urn:nbn:de:bvb:19-epub-33412-8 |
ISSN: | 1932-6203 |
Language: | English |
Item ID: | 33412 |
Date Deposited: | 15. Feb 2017, 14:44 |
Last Modified: | 04. Nov 2020, 13:11 |