Discriminative Power Lasso - Incorporating Discriminative Power of Genes into Regularization-Based Variable Selection

Fütterer, Cornelia; Nalenz, Malte and Augustin, Thomas (11 November 2021): Discriminative Power Lasso - Incorporating Discriminative Power of Genes into Regularization-Based Variable Selection. Department of Statistics: Technical Reports, No. 239 [PDF, 436kB]

This is the latest version of this document.

DOI: 10.5282/ubm/epub.77862

Abstract

In precision medicine, it is known that specific genes are decisive for the development of different cell types. In drug development, it is therefore highly relevant to identify biomarkers that make it possible to distinguish cell subtypes connected to a disease. The main goal is to find a sparse set of genes that can be used for prediction. For standard classification methods, the high dimensionality of gene expression data poses a severe challenge. Common approaches address this problem by excluding genes during preprocessing. As an alternative, L1-regularized regression (Lasso) can be used to identify the most impactful genes. We argue for an adaptive penalization scheme based on the biological insight that decisive genes are expressed differently among the cell types. The differences in gene expression are measured as their discriminative power (DP), which is based on the univariate compactness within classes and separation between classes. ANOVA-based measures, as well as measures from clustering theory, are applied to construct the covariate-specific DP. The resulting model, which we call Discriminative Power Lasso (DP-Lasso), incorporates the DP as a covariate-specific penalization into the Lasso. Genes with a higher DP are penalized less heavily and have a higher chance of being part of the final model. In this way, the model can be guided towards more promising and trustworthy genes, while the coefficients of uninformative genes can be shrunk to zero more reliably. We test our method on single-cell RNA-sequencing data as well as on simulated data. On average, DP-Lasso leads to significantly sparser solutions than competing Lasso-based regularization approaches, while being competitive in terms of accuracy.
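The idea of a DP-weighted penalty can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binary outcome (the report addresses several cell types), uses the one-way ANOVA F statistic as the DP measure, and maps DP to penalty weights as max(DP)/DP_j, so that the most discriminative gene receives weight 1. This mapping, the function names, the solver (plain proximal gradient descent) and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def anova_f(X, y):
    """One-way ANOVA F statistic per column: between-class separation
    divided by within-class spread (used here as a stand-in DP measure)."""
    classes = np.unique(y)
    n, k = X.shape[0], classes.size
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        ss_between += Xc.shape[0] * (mean_c - grand_mean) ** 2
        ss_within += ((Xc - mean_c) ** 2).sum(axis=0)
    return (ss_between / (k - 1)) / (ss_within / (n - k) + 1e-12)

def dp_penalty_weights(dp, eps=1e-8):
    """Assumed mapping: higher DP gives a smaller penalty weight;
    the gene with the largest DP gets a weight of about 1."""
    return dp.max() / (dp + eps)

def weighted_l1_logistic(X, y, weights, lam=0.02, n_iter=2000):
    """Proximal gradient descent for the mean logistic loss plus
    lam * sum_j weights[j] * |beta[j]| (the intercept is unpenalized)."""
    n, p = X.shape
    beta, intercept = np.zeros(p), 0.0
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta + intercept)))
        resid = prob - y
        beta = beta - step * (X.T @ resid) / n
        intercept -= step * resid.mean()
        # Soft-thresholding with a feature-specific threshold.
        beta = np.sign(beta) * np.maximum(np.abs(beta) - step * lam * weights, 0.0)
    return beta, intercept

# Simulated data (hypothetical): 50 "genes", only the first 3 differ between classes.
rng = np.random.default_rng(0)
n, p = 200, 50
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[:, :3] += 1.5 * y[:, None]
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns

dp = anova_f(X, y)
w = dp_penalty_weights(dp)
beta, b0 = weighted_l1_logistic(X, y.astype(float), w)
selected = np.flatnonzero(beta)
print("selected features:", selected)
```

In this sketch, features with a small F statistic receive a very large threshold and stay at exactly zero, while the three shifted features are penalized only lightly. In R, a comparable effect could be obtained by passing such weights to the `penalty.factor` argument of `glmnet`.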

Document type:	Paper
Faculty:	Mathematics, Computer Science and Statistics > Statistics > Technical Reports | Mathematics, Computer Science and Statistics > Statistics > Chairs/Working Groups > Method(olog)ical Foundations of Statistics and their Applications
Subjects:	000 Computer science, information and general works > 000 Computer science, knowledge, and systems | 300 Social sciences > 310 Statistics | 500 Natural sciences and mathematics > 570 Life sciences; biology
URN:	urn:nbn:de:bvb:19-epub-91666-7
Language:	English
Document ID:	91666
Date deposited on Open Access LMU:	30 Mar 2022 06:44
Last modified:	30 Mar 2022 07:02

All versions of this document

- Discriminative Power Lasso -- Incorporating Discriminative Power of Genes into Regularization-Based Variable Selection. (deposited 12 Nov 2021 10:36)
- Discriminative Power Lasso - Incorporating Discriminative Power of Genes into Regularization-Based Variable Selection. (deposited 30 Mar 2022 06:44) [currently displayed]
