Over-optimism in bioinformatics: an illustration

Jelizarow, Monika; Guillemot, Vincent; Tenenhaus, Arthur; Strimmer, Korbinian and Boulesteix, Anne-Laure (3 May 2010): Over-optimism in bioinformatics: an illustration. Department of Statistics: Technical Reports, No. 81 [PDF, 596kB]

DOI: 10.5282/ubm/epub.11497

Abstract

In statistical bioinformatics research, different optimization mechanisms potentially lead to "over-optimism" in published papers. The present empirical study illustrates these mechanisms through a concrete example from an active research field. The investigated sources of over-optimism include the optimization of the data sets, of the settings, of the competing methods and, most importantly, of the method's characteristics. We consider a "promising" new classification algorithm that turns out to yield disappointing results in terms of error rate, namely linear discriminant analysis incorporating prior knowledge on gene functional groups through an appropriate shrinkage of the within-group covariance matrix. We quantitatively demonstrate that this disappointing method can artificially seem superior to existing approaches if we "fish for significance". We conclude that, if the improvement of a quantitative criterion such as the error rate is the main contribution of a paper, the superiority of new algorithms should be validated using "fresh" validation data sets.
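The "fishing for significance" mechanism described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the study's actual procedure: every candidate variant is assumed to be equally useless (true error rate 0.5), the variants' observed errors are drawn independently (in practice they would be correlated because they share data), and the function names `observed_error` and `fishing_experiment` are hypothetical.

```python
import random


def observed_error(true_error, n_samples, rng):
    """Observed error rate on n_samples cases, each misclassified with probability true_error."""
    n_errors = sum(rng.random() < true_error for _ in range(n_samples))
    return n_errors / n_samples


def fishing_experiment(n_variants=20, n_samples=50, true_error=0.5,
                       n_repeats=500, seed=0):
    """Compare the error of the best-looking variant with its error on fresh data.

    Assumption for illustration: all variants share the same true error rate,
    so any apparent winner is produced by selection alone.
    """
    rng = random.Random(seed)
    selected_errors = []
    fresh_errors = []
    for _ in range(n_repeats):
        # "Optimize" the method: try many variants and report the lowest error.
        errors = [observed_error(true_error, n_samples, rng)
                  for _ in range(n_variants)]
        selected_errors.append(min(errors))
        # Validate the reported variant on a fresh data set of the same size.
        fresh_errors.append(observed_error(true_error, n_samples, rng))
    mean_selected = sum(selected_errors) / n_repeats
    mean_fresh = sum(fresh_errors) / n_repeats
    return mean_selected, mean_fresh


if __name__ == "__main__":
    mean_selected, mean_fresh = fishing_experiment()
    print(f"mean error of selected variant: {mean_selected:.3f}")
    print(f"mean error on fresh data:       {mean_fresh:.3f}")
```

Under these assumptions the reported minimum falls clearly below the true error rate, while the fresh-data estimate stays close to it, which is the motivation for validating on "fresh" data sets.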

Document type:	Paper
Keywords:	Validation, Fishing for Significance, Meta-Methodology, KEGG, Discriminant Analysis, Shrinkage Covariance Estimator
Faculty:	Mathematics, Computer Science and Statistics > Statistics > Technical Reports
Subjects:	500 Natural sciences and mathematics > 510 Mathematics
URN:	urn:nbn:de:bvb:19-epub-11497-4
Language:	English
Document ID:	11497
Date published on Open Access LMU:	4 May 2010, 08:14
Last modified:	4 Nov 2020, 12:52
