ORCID: https://orcid.org/0000-0001-6002-6980 and Kira, Zsolt
(2023):
ConstraintMatch for Semi-constrained Clustering.
International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18-23 June 2023.
In: IJCNN 2023 Conference Proceedings: 2023 International Joint Conference on Neural Networks (IJCNN),
Piscataway: IEEE.
Abstract
Constrained clustering allows the training of classification models using pairwise constraints only, which are weak and relatively easy to mine, while still yielding full-supervision-level model performance. While they perform well even in the absence of the true underlying class labels, constrained clustering models still require large amounts of binary constraint annotations for training. In this paper, we propose a semi-supervised context whereby a large amount of unconstrained data is available alongside a smaller set of constraints, and propose ConstraintMatch to leverage such unconstrained data. While a great deal of progress has been made in semi-supervised learning using full labels, there are a number of challenges that prevent a naive application of the resulting methods in the constraint-based label setting. Therefore, we reason about and analyze these challenges, specifically 1) proposing a pseudo-constraining mechanism to overcome the confirmation bias, a major weakness of pseudo-labeling, 2) developing new methods for pseudo-labeling towards the selection of informative unconstrained samples, 3) showing that this also allows the use of pairwise loss functions for the initial and auxiliary losses which facilitates semi-constrained model training. In extensive experiments, we demonstrate the effectiveness of ConstraintMatch over relevant baselines in both the regular clustering and overclustering scenarios on five challenging benchmarks and provide analyses of its several components.
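As a rough illustration of the terms used in the abstract, the following sketch shows one common way pairwise constraints are turned into a training signal in constrained clustering: the inner product of two samples' cluster distributions is read as the probability that they share a cluster, and a binary cross-entropy is taken against the must-link or cannot-link annotation. This is not taken from the paper. The function names, the `threshold` value, and the thresholding rule in `pseudo_constraint` are assumptions made for illustration only, and ConstraintMatch's actual pseudo-constraining mechanism may differ.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution over clusters."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_similarity(p, q):
    """Inner product of two cluster distributions, read as P(same cluster)."""
    return sum(a * b for a, b in zip(p, q))


def pairwise_bce(p, q, must_link, eps=1e-7):
    """Binary cross-entropy between P(same cluster) and a pairwise constraint."""
    s = min(max(pairwise_similarity(p, q), eps), 1.0 - eps)
    return -math.log(s) if must_link else -math.log(1.0 - s)


def pseudo_constraint(p, q, threshold=0.95):
    """Hypothetical rule for unconstrained pairs (an assumption, not the paper's).

    Returns True (pseudo must-link), False (pseudo cannot-link),
    or None when the model is not confident enough to use the pair.
    """
    s = pairwise_similarity(p, q)
    if s >= threshold:
        return True
    if s <= 1.0 - threshold:
        return False
    return None


# Two confident predictions for cluster 0, one for cluster 1, one undecided.
a = softmax([10.0, 0.0, 0.0])
b = softmax([10.0, 0.0, 0.0])
c = softmax([0.0, 10.0, 0.0])
u = softmax([0.0, 0.0, 0.0])

loss_agree = pairwise_bce(a, b, must_link=True)      # small: prediction matches constraint
loss_disagree = pairwise_bce(a, c, must_link=True)   # large: prediction contradicts constraint
```

In this sketch, a pair only contributes a pseudo-constraint when the predicted similarity is close to 0 or 1, which is one simple way to limit the influence of uncertain predictions.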
Document type: | Conference contribution (Paper) |
---|---|
Faculty: | Mathematics, Computer Science and Statistics > Statistics |
Subjects: | 000 Computer science, information and general works > 004 Computer science
500 Natural sciences and mathematics > 510 Mathematics |
ISBN: | 978-1-6654-8867-9 ; 978-1-6654-8868-6 |
Place: | Piscataway |
Language: | English |
Document ID: | 123747 |
Date of publication on Open Access LMU: | 15 Feb 2025 15:38 |
Last modified: | 15 Feb 2025 15:38 |