Abstract
The Lasso of Tibshirani (1996) is a useful method for estimation and implicit selection of predictors in a linear regression model. It uses an ℓ1-penalty and is particularly suited to settings in which the number of observations is not markedly larger than the number of possible predictors. We apply the Lasso to a predictive linear regression model in a study with baseline and follow-up measurements for unspecific low back pain, with a focus on the selection of psycho-sociological predictors. Practitioners want to report measures of uncertainty for estimated regression coefficients, i.e. p-values or confidence intervals, but classical t-tests are no longer valid after selection. In the last few years, several approaches for inference in high-dimensional data settings have been developed. We give a selective overview of methods for assigning p-values to Lasso-selected variables and analyse two of them in a simulation study that uses the structure of our data set. We find that Multi Sample Splitting (Wasserman and Roeder, 2009; Meinshausen et al., 2009) may not be helpful for generating p-values, while the LDPE approach of Zhang and Zhang (2014) produces promising results for type I error rates and power on single hypotheses. Therefore, we apply the LDPE in the analysis of our back pain study.
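
For orientation, the ℓ1-penalized estimator referred to above can be written in its standard form as follows. The notation is an assumption made for illustration and is not taken from the report: $y \in \mathbb{R}^n$ is the response, $X \in \mathbb{R}^{n \times p}$ the design matrix, and $\lambda \ge 0$ the tuning parameter. The scaling of the squared-error term ($1/n$ here) varies between references.

```latex
% Standard Lasso objective (notation assumed, not from the report):
% least-squares fit plus an l1-penalty on the coefficient vector.
\hat{\beta}(\lambda)
  = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{p}}
    \left\{ \frac{1}{n} \lVert y - X\beta \rVert_2^2
            + \lambda \lVert \beta \rVert_1 \right\},
\qquad
\lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert .
```

For sufficiently large $\lambda$, some components of $\hat{\beta}(\lambda)$ are exactly zero, which is the implicit selection of predictors mentioned in the abstract.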
Item Type: | Paper |
---|---|
Keywords: | Lasso; Multi Sample Splitting; LDPE; inference; post selection inference; MiSpEx Network |
Faculties: | Mathematics, Computer Science and Statistics > Statistics > Technical Reports |
Subjects: | 500 Science > 510 Mathematics; 600 Technology > 610 Medicine and health |
URN: | urn:nbn:de:bvb:19-epub-40526-0 |
Language: | English |
Item ID: | 40526 |
Date Deposited: | 24. Sep 2017, 12:24 |
Last Modified: | 04. Nov 2020, 13:17 |