Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

Boulesteix, Anne-Laure; Janitza, Silke; Kruppa, Jochen and König, Inke R. (25 July 2012): Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics. Department of Statistics: Technical Reports, No. 129 [PDF, 376kB]

DOI: 10.5282/ubm/epub.13766

Abstract

The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables, and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Special attention is given to practical aspects such as the selection of parameters, available RF implementations, and important pitfalls and biases of RF and its variable importance measures (VIMs). The paper surveys recent developments of the methodology relevant to bioinformatics as well as some representative examples of RF applications in this context and possible directions for future research.

Document type:	Paper
Keywords:	random forest, regression and classification trees, genetic association studies, variable importance, bias
Faculty:	Mathematics, Computer Science and Statistics > Statistics > Technical Reports
Subjects:	500 Natural sciences and mathematics > 510 Mathematics
URN:	urn:nbn:de:bvb:19-epub-13766-3
Language:	English
Document ID:	13766
Date published on Open Access LMU:	31 Jul 2012 18:24
Last modified:	04 Nov 2020 12:54
