Abstract
This paper is motivated by a question at the heart of unsupervised learning approaches. Assume we are collecting K (subjective) opinions about some event E from K different agents. Can we infer E from them? Prima facie this seems impossible, since the agents may be lying. We model this task by letting the events be distributed according to some distribution p; the goal is then to estimate p under unknown noise. Again, this is impossible without additional assumptions. We report here the finding of very natural such assumptions: the availability of multiple copies of the true data, each observed under independent and invertible (in the sense of matrices) noise, is already sufficient. If the true distribution and the observations are modeled on the same finite alphabet, then the number of such copies needed to determine p to the highest possible precision is exactly three! This result can be seen as a counterpart to independent component analysis. Therefore, we call our approach "dependent component analysis." In addition, we present generalizations of the model to different alphabet sizes at the input and the output. A second result is found: the "activation" of invertibility through multiple parallel uses.
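The abstract describes the observation model only at a high level; the following is a minimal NumPy sketch of one reading of that model, in which a hidden event drawn from p is reported through three independent, invertible, column-stochastic noise channels. The alphabet size, sample count, and the randomly drawn channels are illustrative assumptions, not values or methods from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not from the paper): alphabet size, sample count,
# and the particular random channels below only instantiate the model sketched
# in the abstract.
K = 3         # number of independent noisy copies (three suffice, per the abstract)
n = 4         # finite alphabet size, shared by the input and all outputs
N = 20_000    # number of i.i.d. events drawn from the true distribution p

# Unknown true distribution p over the alphabet {0, ..., n-1}.
p = rng.dirichlet(np.ones(n))

def random_invertible_channel(n, rng):
    """Draw a column-stochastic noise matrix and reject near-singular ones,
    so the channel is invertible in the sense of matrices."""
    while True:
        A = rng.dirichlet(np.ones(n), size=n).T  # each column sums to 1
        if abs(np.linalg.det(A)) > 1e-3:
            return A

# One unknown invertible channel per copy: column x of A[k] is the distribution
# of copy k's report given that the true event is x.
A = [random_invertible_channel(n, rng) for _ in range(K)]

# Draw the true events, then pass each event through all K channels independently.
x = rng.choice(n, size=N, p=p)
y = np.stack([
    np.array([rng.choice(n, p=A[k][:, xi]) for xi in x])
    for k in range(K)
])

# Each single copy only reveals the marginal A[k] @ p, not p itself; the claim in
# the abstract is that the *joint* law of three such copies pins down p exactly.
for k in range(K):
    empirical = np.bincount(y[k], minlength=n) / N
    print(f"copy {k}: empirical marginal {empirical}  vs  A[k] @ p = {A[k] @ p}")
```

The printout illustrates why a single copy is not enough: each marginal is p filtered through an unknown invertible matrix, so many (p, channel) pairs explain it equally well. The identifiability claimed in the abstract concerns the joint distribution of the three copies.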
| Item Type: | Journal article |
|---|---|
| Faculties: | Philosophy, Philosophy of Science and Religious Science > Munich Center for Mathematical Philosophy (MCMP) |
| ISSN: | 0018-9448 |
| Language: | English |