Clustering compositional data using Dirichlet mixture model

Pal, Samyajoy and Heumann, Christian (18 May 2022): Clustering compositional data using Dirichlet mixture model.
In: PLOS ONE 17(5) [PDF, 3MB]


Creative Commons: Attribution 4.0 (CC-BY)

DOI: 10.1371/journal.pone.0268438

External full text: https://doi.org/10.1371/journal.pone.0268438

Abstract

This article explores a model-based clustering method for compositional data. Most methods for compositional data analysis require some kind of transformation. The proposed method instead builds a mixture model using the Dirichlet distribution, which works with the unit-sum constraint. The mixture model is fitted with a hard EM algorithm, modified to overcome the problem of fast convergence with empty clusters. The work includes a rigorous simulation study that evaluates the performance of the proposed method over varied dimensions, numbers of clusters, and degrees of overlap. The model is also compared with other popular clustering algorithms often used for compositional data analysis (e.g. KMeans, Gaussian mixture model (GMM), Gaussian mixture model with hard EM (Hard GMM), partitioning around medoids (PAM), Clustering Large Applications based on Randomized Search (CLARANS), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), etc.) on simulated data as well as on two real data problems, one from the business and marketing domain and one from the physical science domain. The study shows promising results exploiting different distributional patterns of compositional data.
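
As a rough illustration of the idea in the abstract, the following is a minimal sketch of hard EM for a Dirichlet mixture, written in plain Python. It is not the authors' implementation. The function names (`dirichlet_logpdf`, `moment_estimate`, `hard_em`) are hypothetical, the parameters are estimated by the method of moments rather than by maximum likelihood, and the handling of near-empty clusters (keep the previous parameters, smooth the weights) is an assumption made for this sketch, not the modification proposed in the article. All compositions are assumed to have strictly positive parts that sum to one.

```python
import math
import random


def dirichlet_logpdf(x, alpha):
    """Log density of Dirichlet(alpha) at composition x (all parts > 0, sum 1)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def moment_estimate(points):
    """Method-of-moments estimate of alpha (a simple stand-in for an ML fit).

    Uses sum_j Var(X_j) = (1 - sum_j m_j^2) / (alpha_0 + 1), where m is the
    mean composition and alpha_0 is the sum of the alpha entries.
    """
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    total_var = sum((p[j] - mean[j]) ** 2 for p in points for j in range(d)) / n
    total_var = max(total_var, 1e-12)  # guard against identical points
    alpha0 = (1.0 - sum(m * m for m in mean)) / total_var - 1.0
    alpha0 = max(alpha0, 1e-3)  # keep the estimate strictly positive
    return [m * alpha0 for m in mean]


def hard_em(data, k, n_iter=50, seed=0):
    """Hard EM: alternate parameter fitting and hard cluster assignment."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    labels = [rng.randrange(k) for _ in range(n)]  # random initial partition
    alphas = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # M-step: refit each component from the points currently assigned to it.
        for c in range(k):
            members = [x for x, z in zip(data, labels) if z == c]
            if len(members) >= 2:
                alphas[c] = moment_estimate(members)
            # Assumption of this sketch: a cluster with fewer than two points
            # keeps its previous alpha, and the weights are smoothed so that
            # no component ever gets weight zero.
            weights[c] = (len(members) + 1.0) / (n + k)
        # Hard E-step: assign each point to its most probable component.
        new_labels = []
        for x in data:
            scores = [math.log(weights[c]) + dirichlet_logpdf(x, alphas[c])
                      for c in range(k)]
            new_labels.append(scores.index(max(scores)))
        if new_labels == labels:  # the partition no longer changes
            break
        labels = new_labels
    return labels, alphas, weights
```

Calling `hard_em(data, k=2)` on a list of compositions returns the hard labels, the fitted `alpha` vectors, and the mixing weights. Because each point is assigned to exactly one component, a component can lose all of its points within a few iterations, which is the empty-cluster problem the abstract refers to.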

Document type:	Journal article
Faculty:	Mathematics, Computer Science and Statistics > Statistics > Chairs/Working Groups > Methods for Missing Data, Model Selection and Model Averaging
Subjects:	300 Social sciences > 310 Statistics
URN:	urn:nbn:de:bvb:19-epub-104643-4
ISSN:	1932-6203
Language:	English
Document ID:	104643
Date published on Open Access LMU:	13 Jul 2023 13:43
Last modified:	04 Jan 2024 12:05
DFG:	Funded by the Deutsche Forschungsgemeinschaft (DFG) - 491502892
