DiffusAL: Coupling Active Learning with Graph Diffusion for Label-Efficient Node Classification

Gilhuber, Sandra; Busch, Julian; Rotthues, Daniel; Frey, Christian M. M. ORCID: https://orcid.org/0000-0003-2458-6651 and Seidl, Thomas ORCID: https://orcid.org/0000-0002-4861-1412 (2023): DiffusAL: Coupling Active Learning with Graph Diffusion for Label-Efficient Node Classification. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Turin, Italy, 18-22 September 2023. In: Koutra, Danai; Plant, Claudia; Gomez Rodriguez, Manuel; Baralis, Elena and Bonchi, Francesco (eds.): Machine Learning and Knowledge Discovery in Databases: Research Track, Lecture Notes in Computer Science, vol. 14169. Cham: Springer. pp. 75-91

Full text not available on 'Open Access LMU'.

DOI: 10.1007/978-3-031-43412-9_5

Abstract

Node classification is one of the core tasks on attributed graphs, but successful graph learning solutions require sufficiently labeled data. To keep annotation costs low, active graph learning focuses on selecting the most qualitative subset of nodes that maximizes label efficiency. However, deciding which heuristic is best suited for an unlabeled graph to increase label efficiency is a persistent challenge. Existing solutions either neglect aligning the learned model and the sampling method or focus only on limited selection aspects. As a result, they sometimes perform worse than, or only as well as, random sampling. In this work, we introduce a novel active graph learning approach called DiffusAL, showing significant robustness in diverse settings. Toward better transferability between different graph structures, we combine three independent scoring functions to identify the most informative node samples for labeling in a parameter-free way: i) Model Uncertainty, ii) Diversity Component, and iii) Node Importance computed via graph diffusion heuristics. Most of our calculations for acquisition and training can be pre-processed, making DiffusAL more efficient than approaches that combine diverse selection criteria and about as fast as simpler heuristics. Our experiments on various benchmark datasets show that, unlike previous methods, our approach significantly outperforms random selection on 100% of the datasets and labeling budgets tested.
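
The abstract names three scoring functions but this record does not give their definitions. The following is a minimal sketch of how three such scores might be multiplied into one acquisition score without weighting terms. Everything in it is an assumption made for illustration and not the authors' formulation: prediction entropy stands in for model uncertainty, a PageRank-style power iteration stands in for diffusion-based node importance, and a per-cluster count penalty stands in for the diversity component. The function names, the `alpha` teleport value, and the cluster assignments are all hypothetical.

```python
import numpy as np

def entropy_uncertainty(probs):
    # Assumed uncertainty score: Shannon entropy of each node's class distribution.
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def pagerank_importance(adj, alpha=0.15, iters=100):
    # Assumed importance score: PageRank by power iteration on a dense adjacency matrix.
    # alpha is the teleport probability (a hypothetical default for this sketch).
    n = adj.shape[0]
    deg = np.maximum(adj.sum(axis=1), 1.0)  # avoid division by zero for isolated nodes
    trans = adj / deg[:, None]              # row-stochastic transition matrix
    score = np.full(n, 1.0 / n)
    for _ in range(iters):
        score = alpha / n + (1.0 - alpha) * (trans.T @ score)
    return score

def cluster_diversity(clusters, labeled):
    # Assumed diversity score: down-weight nodes whose cluster already holds labeled nodes.
    labeled = np.asarray(labeled, dtype=int)
    counts = np.bincount(clusters[labeled], minlength=int(clusters.max()) + 1)
    return 1.0 / (1.0 + counts[clusters])

def select_node(probs, adj, clusters, labeled):
    # Multiply the three scores (no weighting terms) and pick the best unlabeled node.
    labeled = np.asarray(labeled, dtype=int)
    score = (entropy_uncertainty(probs)
             * pagerank_importance(adj)
             * cluster_diversity(clusters, labeled))
    score[labeled] = -np.inf                # never re-select an already labeled node
    return int(np.argmax(score))

# Toy star graph: node 0 is the hub, nodes 1..4 are leaves.
adj = np.zeros((5, 5))
adj[0, 1:] = 1.0
adj[1:, 0] = 1.0
probs = np.full((5, 2), 0.5)                # every node is maximally uncertain
clusters = np.array([0, 0, 0, 1, 1])        # hypothetical cluster assignment

first = select_node(probs, adj, clusters, labeled=[])
second = select_node(probs, adj, clusters, labeled=[first])
```

In this toy setup uncertainty is identical for all nodes, so the first pick is driven by importance and the second by diversity. A product is used here only because it needs no mixing weights; other weight-free combinations would serve the same illustrative purpose.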

Document type:	Conference contribution (Paper)
Faculty:	Mathematics, Computer Science and Statistics > Computer Science
Subjects:	000 Computer science, information and general works > 004 Computer science
ISBN:	978-3-031-43411-2; 978-3-031-43412-9
ISSN:	0302-9743
Place:	Cham
Language:	English
Document ID:	123686
Date of publication on Open Access LMU:	04 Feb 2025 15:46
Last modified:	04 Feb 2025 15:46
