De Bin, Riccardo
(2 April 2015):
Boosting in Cox regression: a comparison between the likelihood-based and the model-based approaches with focus on the R-packages CoxBoost and mboost.
Department of Statistics: Technical Reports, No. 180
Despite the limitations imposed by the proportional hazards assumption, the Cox model is probably the most popular statistical tool for analyzing survival data, thanks to its flexibility and ease of interpretation. For this reason, novel statistical and machine learning techniques are usually adapted to fit it. One such technique is boosting, an iterative method originally developed in the machine learning community and later extended to the statistical field. The popularity of boosting has been further driven by the availability of user-friendly software such as the R packages mboost and CoxBoost, both of which implement boosting in conjunction with the Cox model. Despite sharing the same underlying boosting principles, the two packages use different techniques: the former is an adaptation of model-based boosting, while the latter adapts likelihood-based boosting. Here we contrast these two boosting techniques, as implemented in the R packages, from an analytic point of view, and we examine the solutions adopted there to treat mandatory variables, i.e. variables that for some reason must be included in the model. We explore the possibility of extending solutions currently implemented in only one package to the other, and we illustrate the usefulness of these extensions by applying them to two real data examples.