Strobl, Carolin (2005): Statistical Sources of Variable Selection Bias in Classification Tree Algorithms Based on the Gini Index. Collaborative Research Center 386, Discussion Paper 420


Abstract

Evidence from the literature for variable selection bias in classification tree algorithms based on the Gini Index is reviewed and embedded into a broader explanatory scheme: such bias can be caused not only by the statistical effect of multiple comparisons, but also by an increasing estimation bias and variance of the splitting criterion when plug-in estimates of entropy measures like the Gini Index are employed. The relevance of these sources of variable selection bias in the different simulation study designs is examined. Variable selection bias due to the explored sources applies to all classification tree algorithms based on empirical entropy measures like the Gini Index, Deviance and Information Gain, and to both binary and multiway splitting algorithms.
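
The estimation bias of a plug-in Gini estimate mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the simulation design used in the paper: it assumes a binary response with known class probability p, for which the plug-in estimate 2 * p_hat * (1 - p_hat) has expectation (n - 1) / n times the true Gini Index, so it underestimates impurity more strongly in small samples. The function names and parameter values are hypothetical and chosen for illustration.

```python
import random


def plug_in_gini(labels):
    """Plug-in (empirical) Gini Index 2 * p_hat * (1 - p_hat) for binary 0/1 labels."""
    p_hat = sum(labels) / len(labels)
    return 2.0 * p_hat * (1.0 - p_hat)


def mean_plug_in_gini(p, n, reps=20000, seed=0):
    """Average the plug-in Gini Index over `reps` simulated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(reps):
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        total += plug_in_gini(sample)
    return total / reps


p = 0.5                      # assumed true class probability (illustrative)
true_gini = 2 * p * (1 - p)  # true Gini Index for a binary response

results = {}
for n in (5, 20, 100):
    results[n] = mean_plug_in_gini(p, n)
    theoretical = (n - 1) / n * true_gini  # expectation of the plug-in estimate
    print(f"n={n}: simulated mean={results[n]:.3f}, "
          f"theoretical mean={theoretical:.3f}, true Gini={true_gini:.3f}")
```

The simulated means should sit below the true Gini Index and approach it as n grows. How this downward bias in small samples contributes to variable selection bias in a tree algorithm is the subject of the paper itself and is not reproduced here.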

Item Type: Paper (Research Paper)
Subjects: Mathematics, Computer Science and Statistics
Mathematics, Computer Science and Statistics > Statistics
Mathematics, Computer Science and Statistics > Statistics > Collaborative Research Center 386
Dewey Classification: 500 Natural sciences and mathematics
500 Natural sciences and mathematics > 510 Mathematics
URN: urn:nbn:de:bvb:19-epub-1789-0
ID Code: 1789
Deposited On: 11. Apr 2007
Last Modified: 28. Jun 2010 14:35