Hornung, Roman (23 October 2017): Ordinal Forests. Department of Statistics: Technical Reports, No. 212

Abstract

The prediction of the values of ordinal response variables using covariate data is a relatively infrequent task in many application areas. Accordingly, ordinal response variables have received comparatively little attention in the literature on statistical prediction modeling. The random forest method is one of the strongest prediction methods for binary and continuous response variables. Its basic, tree-based concept has led to several extensions, including prediction methods for other types of response variables. In this paper, the ordinal forest method is introduced, a random forest based prediction method for ordinal response variables. Ordinal forests allow prediction using both low-dimensional and high-dimensional covariate data and can additionally be used to rank covariates with respect to their importance for prediction. Using several real datasets and simulated data, the performance of ordinal forests with respect to prediction and covariate importance ranking is compared to that of competing approaches. First, these investigations reveal that ordinal forests tend to outperform competitors in terms of prediction performance. Second, the covariate importance measure currently used by ordinal forests discriminates influential covariates from noise covariates at least as well as the measures used by competitors. In an additional investigation using simulated data, several further important properties of the ordinal forest (OF) algorithm are studied. The rationale underlying ordinal forests, namely to use optimized score values in place of the class values of the ordinal response variable, is in principle applicable to any regression method for continuous outcomes, not only to the random forests considered in the ordinal forest method.
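
The rationale stated in the last sentence of the abstract can be illustrated with a small Python sketch. It is not the ordinal forest algorithm from the report, whose details the abstract does not give. It only shows the general idea of replacing ordered class labels by numeric scores, fitting a regression method to those scores, and keeping the score set that predicts best. Everything concrete here is an assumption made for illustration: a k-nearest-neighbour mean stands in for the regression method, candidate score sets are drawn at random, candidates are compared by accuracy on a held-out set, and all function names are hypothetical.

```python
import random
from statistics import mean


def knn_regress(train_x, train_s, x, k=5):
    """Stand-in regression method: mean score of the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return mean(train_s[i] for i in nearest)


def predict_class(train_x, train_y, scores, x):
    """Regress on the class scores, then return the class with the closest score."""
    train_s = [scores[y] for y in train_y]
    s_hat = knn_regress(train_x, train_s, x)
    return min(range(len(scores)), key=lambda c: abs(scores[c] - s_hat))


def accuracy(train, valid, scores):
    """Share of held-out observations whose class is predicted correctly."""
    train_x, train_y = zip(*train)
    hits = sum(predict_class(train_x, train_y, scores, x) == y for x, y in valid)
    return hits / len(valid)


def optimize_scores(train, valid, n_classes, n_candidates=50, seed=0):
    """Assumed search: draw increasing score sets at random and keep the best one."""
    rng = random.Random(seed)
    candidates = [sorted(rng.random() for _ in range(n_classes))
                  for _ in range(n_candidates)]
    return max(candidates, key=lambda s: accuracy(train, valid, s))


def make_data(n, rng):
    """Toy data: one covariate and three ordered classes of unequal width."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 0 if x < 0.2 else (1 if x < 0.5 else 2)
        data.append((x, y))
    return data


rng = random.Random(1)
train, valid = make_data(200, rng), make_data(100, rng)
best = optimize_scores(train, valid, n_classes=3)
print("selected scores:", best)
print("held-out accuracy:", accuracy(train, valid, best))
```

Swapping `knn_regress` for any other regression method for continuous outcomes leaves the rest of the sketch unchanged, which is the point the abstract makes about the approach being applicable beyond random forests.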