ORCID: https://orcid.org/0000-0002-8893-2683; Tichy, Antonin
ORCID: https://orcid.org/0000-0002-6260-9992; Paris, Sebastian and Schwendicke, Falk
ORCID: https://orcid.org/0000-0003-1223-1669
(May 2025):
Improving machine learning-based bitewing segmentation with synthetic data.
In: Journal of Dentistry, Vol. 156, 105679
Abstract
Objectives
Class imbalance in datasets is one of the challenges of machine learning (ML) in medical image analysis. We employed synthetic data to overcome class imbalance, using the segmentation of bitewing radiographs as an exemplary ML task.
Methods
After the bitewings were segmented into classes (dental structures, restorations, and background), implants accounted for 0.03 % of pixels in the training set (1543 bitewings) and 0.07 % in the testing set (177 bitewings). A diffusion model and a generative adversarial network (pix2pix) were used to generate a dataset synthetically enriched in implants. A U-Net segmentation model was trained on (1) the original dataset, (2) the synthetic dataset, (3) the synthetic dataset followed by fine-tuning on the original dataset, or (4) a dataset naïvely oversampled with images containing implants.
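The two quantities named in the Methods, the pixel-level share of a class and naïve oversampling, can be illustrated with a minimal sketch. It assumes masks stored as nested lists of integer labels; the label id `IMPLANT`, the helper names, and the repetition `factor` are hypothetical and are not taken from the study.

```python
import random

IMPLANT = 3  # hypothetical label id for the implant class


def class_pixel_fraction(masks, class_id):
    """Share of all pixels, pooled over every mask, that carry class_id."""
    total = sum(len(row) for mask in masks for row in mask)
    hits = sum(row.count(class_id) for mask in masks for row in mask)
    return hits / total if total else 0.0


def naive_oversample(masks, class_id, factor, seed=0):
    """Return shuffled image indices in which every image containing
    class_id appears `factor` times and every other image once."""
    indices = []
    for i, mask in enumerate(masks):
        contains = any(class_id in row for row in mask)
        indices.extend([i] * (factor if contains else 1))
    random.Random(seed).shuffle(indices)
    return indices


# Toy data: three 2x2 masks, only the second one contains an implant pixel.
toy_masks = [
    [[0, 0], [0, 1]],
    [[0, IMPLANT], [0, 0]],
    [[1, 1], [0, 0]],
]
before = class_pixel_fraction(toy_masks, IMPLANT)
resampled = [toy_masks[i] for i in naive_oversample(toy_masks, IMPLANT, factor=3)]
after = class_pixel_fraction(resampled, IMPLANT)
```

Repeating implant images raises the implant pixel share of the training set without adding any new image content, which is the limitation that synthetic enrichment is meant to address.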
Results
A U-Net trained on the original dataset was unable to segment implants in the testing set. Naïve oversampling significantly improved model performance and achieved the highest precision. The model trained only on synthetic data performed worse than naïve oversampling in all metrics, but when fine-tuned on original data it achieved the highest Dice score, recall, F1 score, and ROC AUC. Performance on classes other than implants was similar for all strategies except training only on synthetic data, which tended to perform worse.
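For orientation, per-class precision, recall, and Dice can be computed from pixel labels as in the sketch below. It assumes flattened lists of integer labels and pools all pixels together; the function name and the label id 3 are hypothetical. With this pooled, pixel-wise definition Dice and F1 coincide, so the separate Dice and F1 values reported here presumably rest on a different aggregation that the abstract does not specify.

```python
def class_metrics(y_true, y_pred, class_id):
    """Pixel-wise precision, recall, and Dice for a single class.

    y_true and y_pred are equal-length sequences of integer labels.
    Undefined ratios (zero denominator) are reported as 0.0 by convention.
    """
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == class_id and t == class_id:
            tp += 1  # predicted as the class and truly the class
        elif p == class_id:
            fp += 1  # predicted as the class, but truly something else
        elif t == class_id:
            fn += 1  # truly the class, but missed by the prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "dice": dice}


# Toy example: the label 3 stands in for the implant class.
truth = [0, 3, 3, 0, 1, 3]
partial = class_metrics(truth, [0, 3, 0, 3, 1, 3], class_id=3)
missed = class_metrics(truth, [0, 0, 0, 0, 1, 0], class_id=3)
```

The second call mimics a model that never predicts the rare class: recall and Dice are both zero even though most pixels are labeled correctly, which is why overall accuracy says little about a heavily underrepresented class.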
Conclusions
Training on synthetic data alone may degrade the performance of segmentation models. However, subsequent fine-tuning on original data could significantly enhance model performance, especially for heavily underrepresented classes.
Clinical significance
This study explored the use of synthetic data to enhance segmentation of bitewing radiographs, focusing on underrepresented classes like implants. Pre-training on synthetic data followed by fine-tuning on original data yielded the best results, highlighting the potential of synthetic data to advance AI-driven dental imaging and ultimately support clinical decision-making.
Document type: | Journal article |
---|---|
Faculty: | Medicine > Klinikum der LMU München > Poliklinik für Zahnerhaltung und Parodontologie |
Subjects: | 600 Technology, Medicine, Applied Sciences > 610 Medicine and Health |
ISSN: | 0300-5712 |
Language: | English |
Document ID: | 126137 |
Date published on Open Access LMU: | 20 May 2025 13:17 |
Last modified: | 20 May 2025 13:17 |
DFG: | Funded by the Deutsche Forschungsgemeinschaft (DFG) - 445925495 |