Abstract
The Lasso of Tibshirani (1996) is a useful method for estimation and implicit selection of predictors in a linear regression model by means of an ℓ1-penalty, particularly when the number of observations is not markedly larger than the number of candidate predictors. We apply the Lasso to a predictive linear regression model in a study with baseline and follow-up measurements of unspecific low back pain, with a focus on the selection of psychosociological predictors. Practitioners want to report measures of uncertainty for estimated regression coefficients, i.e. p-values or confidence intervals, but classical t-tests are no longer valid after selection. In the last few years, several approaches to inference in high-dimensional settings have been developed. We give a selective overview of methods for assigning p-values to Lasso-selected variables and analyse two of them in a simulation study that uses the structure of our data set. We find that Multi Sample Splitting (Wasserman and Roeder, 2009; Meinshausen et al., 2009) may not be helpful for generating p-values, while the LDPE approach of Zhang and Zhang (2014) produces promising results for type I error and power on single hypotheses. Therefore, we apply the LDPE to the analysis of our back pain study.
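For reference, the ℓ1-penalised least-squares problem behind the Lasso can be written as follows. This is a sketch in standard notation, not taken from the abstract: the response vector y, the n × p design matrix X, the tuning parameter λ and the 1/(2n) scaling of the loss are assumptions, and other scalings are common.

```latex
\hat{\beta}(\lambda)
  = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{p}}
    \left\{ \frac{1}{2n} \lVert y - X\beta \rVert_2^2
            + \lambda \lVert \beta \rVert_1 \right\},
\qquad
\lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert .
```

Because the ℓ1-norm is not differentiable at zero, some coefficients are set exactly to zero for sufficiently large λ, which is the implicit predictor selection referred to above.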
Document type: | Paper |
---|---|
Keywords: | Lasso; Multi Sample Splitting; LDPE; inference; post selection inference; MiSpEx Network |
Faculty: | Mathematics, Computer Science and Statistics > Statistics > Technical Reports |
Subject areas: | 500 Natural sciences and mathematics > 510 Mathematics
600 Technology, medicine, applied sciences > 610 Medicine and health |
URN: | urn:nbn:de:bvb:19-epub-40526-0 |
Language: | English |
Document ID: | 40526 |
Date published on Open Access LMU: | 24 Sep 2017, 12:24 |
Last modified: | 04 Nov 2020, 13:17 |