Multivariate classification of neuroimaging data with nested subclasses: Biased accuracy and implications for hypothesis testing

Jamalabadi, Hamidreza; Alizadeh, Sarah; Schönauer, Monika; Leibold, Christian ORCID: https://orcid.org/0000-0002-4859-8000 and Gais, Steffen (2018): Multivariate classification of neuroimaging data with nested subclasses: Biased accuracy and implications for hypothesis testing.
In: PLOS Computational Biology 14(9), e1006486

DOI: 10.1371/journal.pcbi.1006486

Abstract

Biological data sets are typically characterized by high dimensionality and low effect sizes. A powerful method for detecting systematic differences between experimental conditions in such multivariate data sets is multivariate pattern analysis (MVPA), particularly pattern classification. However, in virtually all applications, data from the classes that correspond to the conditions of interest are not homogeneous but contain subclasses. Such subclasses can arise, for example, from individual subjects that contribute multiple data points, or from correlations of items within classes. We show here that in multivariate data that have subclasses nested within their class structure, these subclasses introduce systematic information that improves classifiability beyond what is expected from the size of the class difference. We analytically prove that this subclass bias systematically inflates correct classification rates (CCRs) of linear classifiers, depending on the number of subclasses as well as on the proportion of variance induced by the subclasses. In simulations, we demonstrate that subclass bias is highest when between-class effect size is low and subclass variance is high. This bias can be reduced by increasing the total number of subclasses. Moreover, we can account for the subclass bias by using permutation tests that explicitly consider the subclass structure of the data. We illustrate our results in several experiments that recorded human EEG activity, demonstrating that parametric statistical tests as well as typical trial-wise permutation tests fail to determine the significance of classification outcomes correctly.
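
The contrast between trial-wise and subclass-aware permutation described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: the nearest-class-mean classifier, the simulation parameters, and all function names (`cv_accuracy`, `permute_trials`, `permute_subclasses`, `perm_test`) are assumptions chosen for illustration. The data contain no true class effect, only subclass offsets nested within two classes.

```python
import numpy as np

rng = np.random.default_rng(0)

def cv_accuracy(X, y, folds):
    """Trial-wise k-fold CCR of a nearest-class-mean (linear) classifier."""
    correct = 0
    for k in np.unique(folds):
        train, test = folds != k, folds == k
        mu0 = X[train & (y == 0)].mean(axis=0)
        mu1 = X[train & (y == 1)].mean(axis=0)
        w = mu1 - mu0                      # linear discriminant direction
        b = w @ (mu0 + mu1) / 2            # threshold at the midpoint
        pred = (X[test] @ w > b).astype(int)
        correct += np.sum(pred == y[test])
    return correct / len(y)

def permute_trials(y, sub_id, rng):
    """Trial-wise permutation: ignores the subclass structure."""
    return rng.permutation(y)

def permute_subclasses(y, sub_id, rng):
    """Subclass-wise permutation: all trials of a subclass keep one label."""
    subs, inv = np.unique(sub_id, return_inverse=True)
    labels = np.array([y[sub_id == s][0] for s in subs])
    return rng.permutation(labels)[inv]

def perm_test(X, y, sub_id, folds, permute, n_perm, rng):
    """Return observed CCR, null CCRs, and a one-sided permutation p-value."""
    obs = cv_accuracy(X, y, folds)
    null = np.array([cv_accuracy(X, permute(y, sub_id, rng), folds)
                     for _ in range(n_perm)])
    p = (1 + np.sum(null >= obs)) / (1 + n_perm)
    return obs, null, p

# Simulated null data (assumed parameters): 12 subclasses, e.g. subjects,
# each nested in exactly one of two classes; no class effect is added.
n_sub, n_trials, n_feat = 12, 15, 20
sub_id = np.repeat(np.arange(n_sub), n_trials)
sub_class = np.array([0, 1] * (n_sub // 2))
y = sub_class[sub_id]
offsets = rng.normal(0.0, 1.0, (n_sub, n_feat))        # subclass variance
X = offsets[sub_id] + rng.normal(0.0, 1.0, (len(y), n_feat))
folds = rng.permutation(len(y)) % 5                    # trial-wise 5-fold CV

obs, null_trial, p_trial = perm_test(X, y, sub_id, folds, permute_trials, 200, rng)
_, null_sub, p_sub = perm_test(X, y, sub_id, folds, permute_subclasses, 200, rng)

print(f"observed CCR: {obs:.2f}")
print(f"trial-wise null mean: {null_trial.mean():.2f}, p = {p_trial:.3f}")
print(f"subclass-wise null mean: {null_sub.mean():.2f}, p = {p_sub:.3f}")
```

In this sketch the trial-wise null distribution is built from labelings that destroy the subclass structure, whereas the subclass-wise null retains it, so the two null distributions can be compared directly against the same observed CCR. Note that with few subclasses the number of distinct subclass-wise relabelings is small, which limits the resolution of the resulting p-value.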

Document type:	Journal article
Faculty:	Biology > Department Biology II > Neurobiology
Subjects:	500 Natural sciences and mathematics > 570 Life sciences; biology
URN:	urn:nbn:de:bvb:19-epub-60931-5
ISSN:	1553-7358
Language:	English
Document ID:	60931
Date deposited on Open Access LMU:	11 Mar 2019 14:16
Last modified:	04 Nov 2020 13:39
