Schneider, Lisa; Krasowski, Aleksander (ORCID: https://orcid.org/0000-0003-0192-788X); Pitchika, Vinay (ORCID: https://orcid.org/0000-0001-6947-2602); Bombeck, Lisa; Schwendicke, Falk (ORCID: https://orcid.org/0000-0003-1223-1669) and Büttner, Martha (ORCID: https://orcid.org/0000-0001-9004-213X) (May 2025): Assessment of CNNs, transformers, and hybrid architectures in dental image segmentation. In: Journal of Dentistry, Vol. 156, 105668

Abstract

Objectives

Convolutional Neural Networks (CNNs) have long dominated image analysis in dentistry, achieving strong results across a range of tasks. However, Transformer-based architectures, originally proposed for Natural Language Processing, are also promising for dental image analysis. The present study aimed to compare CNNs with Transformers on different image analysis tasks in dentistry.

Methods

Two CNNs (U-Net, DeepLabV3+), two Hybrids (SwinUNETR, UNETR) and two Transformer-based architectures (TransDeepLab, SwinUnet) were compared on three dental segmentation tasks across different image modalities. The datasets consisted of (1) 1881 panoramic radiographs for tooth segmentation, (2) 1625 bitewings for tooth structure segmentation, and (3) 2689 bitewings for caries lesion segmentation. All models were trained and evaluated using 5-fold cross-validation.
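
The 5-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `k_fold_indices`, the shuffling seed, and the plain random (unstratified, ungrouped) split are assumptions. Only the dataset size of 1881 panoramic radiographs is taken from the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, val_indices) pairs covering every sample once.

    Assumption: a plain shuffled split; the study may have used a different
    scheme (for example, grouping by patient or stratification).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        folds.append(indices[start:start + size])
        start += size

    # Each fold is used once for validation; the other folds form the training set.
    splits = []
    for fold in range(k):
        val = folds[fold]
        train = [i for j, f in enumerate(folds) if j != fold for i in f]
        splits.append((train, val))
    return splits


if __name__ == "__main__":
    # 1881 is the number of panoramic radiographs reported in the abstract.
    for fold, (train, val) in enumerate(k_fold_indices(1881, k=5)):
        print(f"fold {fold}: {len(train)} training, {len(val)} validation images")
```

With this scheme every image is used for validation exactly once, so the five per-fold scores can be summarised as a mean and standard deviation.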

Results

CNNs were found to be significantly superior to Hybrids and Transformer-based architectures on all three tasks. (1) Tooth segmentation showed a mean±SD F1-score of 0.89±0.009 for CNNs, 0.86±0.015 for Hybrids and 0.83±0.22 for Transformer-based architectures. (2) In tooth structure segmentation, CNNs also performed best, with 0.85±0.008 compared with 0.84±0.005 for Hybrids and 0.83±0.011 for Transformer-based architectures. (3) The differences were even more pronounced for caries lesion segmentation: 0.49±0.031 for CNNs, 0.39±0.072 for Hybrids and 0.32±0.039 for Transformer-based architectures.
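
For readers unfamiliar with the metric, the sketch below shows one common way to compute a pixel-wise F1-score for a binary segmentation mask and to summarise per-fold scores as mean±SD. It rests on several assumptions: the abstract does not state how F1 was aggregated (per image, per class, or per fold), nor whether the sample or population standard deviation was reported. The names `f1_score` and `mean_sd` are hypothetical, and the per-fold values in the example are placeholders, not results from the study.

```python
import statistics


def f1_score(pred, target):
    """Pixel-wise F1 for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # predicted foreground, truly foreground
            elif p and not t:
                fp += 1  # predicted foreground, truly background
            elif t and not p:
                fn += 1  # missed foreground pixel
    denominator = 2 * tp + fp + fn
    # Convention (assumption): two empty masks count as a perfect match.
    return 1.0 if denominator == 0 else 2 * tp / denominator


def mean_sd(scores):
    """Mean and sample standard deviation (assumption) of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    # 2 true positives, 1 false positive, 1 false negative -> 4 / 6
    print(f"F1 = {f1_score(pred, target):.3f}")

    # Placeholder per-fold scores for illustration only, not study results.
    placeholder_scores = [0.5, 0.6, 0.7, 0.8, 0.9]
    mean, sd = mean_sd(placeholder_scores)
    print(f"{mean:.2f}±{sd:.3f}")
```

For binary masks this F1 is identical to the Dice coefficient, which is why the two names are often used interchangeably in segmentation work.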

Conclusion

CNNs significantly outperformed Transformer-based architectures and their Hybrids on three segmentation tasks (teeth, tooth structures, caries lesions) on varying dental data modalities (panoramic and bitewing radiographs).

Clinical significance

As deep-learning-based image analysis is part of modern dentistry, practitioners and dental researchers should be aware of the strengths and limitations of modern model architectures for dental image analysis. Models that perform best in other domains are not necessarily the best choice for dental imaging.