Strobl, Carolin; Malley, James and Tutz, Gerhard (April 2009): An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests. Department of Statistics: Technical Reports, No. 55


Abstract

Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Random forests in particular, which can handle large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years. High-dimensional problems are common not only in genetics but also in some areas of psychological research, where only a few subjects can be measured because of time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve high prediction accuracy in such applications, and provide descriptive variable importance measures that reflect the impact of each variable in both main effects and interactions. The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their use for low- and high-dimensional data exploration, and to point out limitations of the methods and potential pitfalls in their practical application. Application of the methods is illustrated using freely available implementations in the R system for statistical computing.

Item Type: Paper (Technical Report)
Keywords: CART, C4.5, bootstrap, variable selection, variable importance
Subjects: Psychology and Education Science > Psychological Methodology and Computer Science; Mathematics, Computer Science and Statistics > Statistics > Technical Reports
Dewey Classification: 300 Social sciences > 310 General statistics
URN: urn:nbn:de:bvb:19-epub-10589-8
Language: English
ID Code: 10589
Deposited On: 28. Apr 2009 12:59
Last Modified: 12. Jan 2012 16:54