Drießlein, David; Küchenhoff, Helmut ORCID: 0000-0002-6372-2487; Tutz, Gerhard; Wippert, Pia Maria
(1. September 2017):
Variable Selection and Inference in a follow-up Study on Back Pain.
Department of Statistics: Technical Reports, No. 211
Abstract
The Lasso of Tibshirani (1996) uses an ℓ1-penalty for estimation and implicit selection of predictors in a linear regression model. It is a useful method when the number of observations is not markedly larger than the number of possible predictors.
We apply the Lasso to a predictive linear regression model in a study with baseline and follow-up measurements of unspecific low back pain, with a focus on the selection of psychosociological predictors. Practitioners want to report measures of uncertainty for estimated regression coefficients, i.e. p-values or confidence intervals, but classical t-tests are no longer valid after selection. In the last few years, several approaches for inference in high-dimensional data settings have been developed. We give a selective overview of methods for assigning p-values to Lasso-selected variables and analyse two of them in a simulation study that uses the structure of our data set.
We find that Multi Sample Splitting (Wasserman and Roeder, 2009; Meinshausen et al., 2009) may not be helpful for generating p-values, while the LDPE approach of Zhang and Zhang (2014) produces promising results for type I error rates and power on single hypotheses. We therefore apply the LDPE to the analysis of our back pain study.
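
For orientation, the ℓ1-penalised estimator referred to in the abstract can be written in one common form as below. The notation is an assumption made here and is not taken from the report: y is the response vector of length n, X the n × p matrix of candidate predictors, β the coefficient vector and λ ≥ 0 the tuning parameter. The 1/(2n) scaling of the squared-error term is one convention among several, and the report may use a different one.

```latex
% Lasso estimator in a linear model (notation assumed, not taken from the report).
% Requires amsmath and amssymb.
\[
  \hat{\beta}(\lambda)
  = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{p}}
    \left\{ \frac{1}{2n} \lVert y - X\beta \rVert_2^2
            + \lambda \lVert \beta \rVert_1 \right\},
  \qquad
  \lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert .
\]
```

Because the ℓ1-penalty sets some coefficients exactly to zero for sufficiently large λ, the nonzero entries of the estimate define the selected predictors. The selection is made on the same data that a subsequent classical t-test would use, which is why such post-selection tests are not valid.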