Selection and fusion of categorical predictors with L₀-type penalties

Oelker, Magret-Ruth; Pößnecker, Wolfgang and Tutz, Gerhard (2015): Selection and fusion of categorical predictors with L₀-type penalties. In: Statistical Modelling, Vol. 15, No. 5: pp. 389-410 [PDF, 504kB]

DOI: 10.1177/1471082X14553366

Abstract

In regression modelling, categorical covariates have to be coded. Depending on the number of categorical covariates and on the number of levels they have, the number of coefficients can become huge. To reduce the model complexity, coefficients of similar categories should be fused and coefficients of non-influential categories should be set to zero. To this end, Lasso-type penalties on the differences of coefficients are a standard approach. However, the clustering/selection performance of this approach is sometimes poor, especially when the adaptive weights are badly conditioned or do not exist. In some situations, there is no incentive to cluster similar categories. To overcome this, an L₀ penalty on the differences of coefficients is proposed, whereby the L₀ ‘norm’ is defined as the number of non-zero entries in a vector. The proposed penalty favours finding clusters of categories that share the same effect on the response variable, while the estimation accuracy is comparable to that of Lasso-type penalties. Numerical experiments within the framework of generalized linear models are promising. For illustration, data on unemployment rates in Germany are analyzed.
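
To make the definition in the abstract concrete, the following minimal Python sketch counts the non-zero pairwise differences among the coefficients of a single categorical predictor and compares that count with a Lasso-type (L1) penalty on the same differences. It is an illustration only, not the authors' estimation procedure: the function names, the tolerance `tol`, the unweighted form of both penalties, and the convention of including the reference category as a coefficient of 0.0 are assumptions made for this example.

```python
from itertools import combinations


def l0_norm(values, tol=1e-12):
    """Count the entries whose absolute value exceeds tol (the L0 'norm')."""
    return sum(1 for v in values if abs(v) > tol)


def l0_fusion_penalty(coefs, tol=1e-12):
    """Number of non-zero pairwise differences between category coefficients.

    Assumption of this sketch: `coefs` holds the coefficients of one
    categorical predictor, with the reference category included as 0.0,
    so a difference to the reference is the coefficient itself.
    """
    diffs = [a - b for a, b in combinations(coefs, 2)]
    return l0_norm(diffs, tol)


def l1_fusion_penalty(coefs):
    """Unweighted Lasso-type counterpart: sum of absolute pairwise differences."""
    return sum(abs(a - b) for a, b in combinations(coefs, 2))


# Five categories forming three clusters: {0.0, 0.0}, {1.5, 1.5}, {3.0}.
coefs = [0.0, 0.0, 1.5, 1.5, 3.0]
print(l0_fusion_penalty(coefs))  # 8 of the 10 pairwise differences are non-zero
print(l1_fusion_penalty(coefs))  # 15.0
```

In this toy example the L0 count depends only on which categories share a coefficient, not on how far apart the clusters are, whereas the L1 value grows with the size of the differences.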

Document type:	Journal article
Faculty:	Mathematics, Computer Science and Statistics > Statistics; Mathematics, Computer Science and Statistics > Statistics > Chairs/Working Groups > Seminar for Applied Stochastics
Subjects:	500 Natural sciences and mathematics > 510 Mathematics
URN:	urn:nbn:de:bvb:19-epub-31568-3
Alliance/National Licence:	This contribution is freely accessible with the consent of the rights holder under a (DFG-funded) Alliance or National Licence.
Language:	English
Document ID:	31568
Date deposited on Open Access LMU:	19 Dec 2016 14:05
Last modified:	04 Nov 2020 13:08
