ORCID: https://orcid.org/0009-0003-9304-2954; Tolstaya, Ekaterina
ORCID: https://orcid.org/0000-0002-8893-2683; Tichy, Antonin
ORCID: https://orcid.org/0000-0002-6260-9992; Paris, Sebastian
ORCID: https://orcid.org/0000-0002-1302-8761; Aarabi, Ghazal
ORCID: https://orcid.org/0000-0001-5484-2594; Chaurasia, Akhilanand
ORCID: https://orcid.org/0000-0002-8356-9512; Malenova, Yoana; Steybe, David and Schwendicke, Falk
ORCID: https://orcid.org/0000-0003-1223-1669
(2025):
Machine learning versus clinicians for detection and classification of oral mucosal lesions.
In: Journal of Dentistry, Vol. 161, 105992
Abstract
Objectives
The detection and classification of oral mucosal lesions is a challenging task due to high heterogeneity and overlap in clinical appearance. Nevertheless, differentiating benign from potentially malignant lesions is essential for appropriate management. This study evaluated whether a deep learning model trained to discriminate 11 classes of oral mucosal lesions could exceed the performance of general dentists.
Methods
A total of 4079 intraoral photographs of benign, potentially malignant and malignant oral lesions were labeled using bounding boxes and classified into 11 classes. The data were split 80:20 for training (n = 3031) and validation (n = 766), keeping an independent test set (n = 282). The YOLOv8 computer vision model was implemented for image classification and object detection. Model performance was evaluated on the test set, which was also assessed by six general dentists and three specialists in oral surgery. Evaluation metrics included sensitivity, specificity, F1-score, precision, area under the receiver operating characteristic curve (AUROC), and average precision (AP) at multiple thresholds of intersection over union.
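The threshold-based classification metrics named above can be illustrated with a minimal sketch. It assumes a one-vs-rest evaluation, in which each class is scored in turn against all others; the abstract does not state how the metrics were computed or aggregated. The `Counts` class, the `one_vs_rest_metrics` function, and the counts in the example are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for one class scored against all other classes."""
    tp: int  # images of this class predicted as this class
    fp: int  # images of other classes predicted as this class
    fn: int  # images of this class predicted as another class
    tn: int  # images of other classes predicted as another class


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def one_vs_rest_metrics(c: Counts) -> dict:
    """Compute sensitivity, specificity, precision and F1 from the counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts for a single class, chosen only for illustration.
    print(one_vs_rest_metrics(Counts(tp=8, fp=2, fn=2, tn=88)))
```

AUROC is omitted from the sketch because it is computed from ranked prediction scores rather than from a single set of confusion counts.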
Results
In terms of classification, the highest F1-score (0.80) and AUROC (0.96) were observed for human papillomavirus (HPV)-related lesions, whereas the lowest F1-score (0.43) and AUROC (0.78) were obtained for keratosis. In terms of object detection, the best results were achieved for HPV-related lesions (AP25 = 0.82) and proliferative verrucous leukoplakia (AP25 = 0.80; AP50 = 0.76), while the lowest values were noted for leukoplakia (AP25 = 0.36; AP50 = 0.20). Overall, the model performed comparably to specialists (p = 0.93) and significantly better than general dentists (p < 0.01).
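To make the detection figures easier to read, the following sketch shows how intersection over union (IoU) decides whether a predicted bounding box counts as a hit. It assumes that AP25 and AP50 denote average precision at IoU thresholds of 0.25 and 0.50, which is the common convention but is not spelled out in the abstract. The function names, the `(x1, y1, x2, y2)` box format, the use of `>=` at the threshold, and the example boxes are all hypothetical.

```python
from typing import Tuple

# A box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2, in pixels (assumed format).
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped to zero if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted: Box, annotated: Box, threshold: float) -> bool:
    """Treat a prediction as a hit when its IoU reaches the threshold."""
    return iou(predicted, annotated) >= threshold


if __name__ == "__main__":
    annotated = (0.0, 0.0, 100.0, 100.0)   # hypothetical labeled lesion box
    predicted = (50.0, 0.0, 150.0, 100.0)  # hypothetical model output
    print(iou(predicted, annotated))             # IoU = 5000 / 15000
    print(is_match(predicted, annotated, 0.25))  # hit at the looser threshold
    print(is_match(predicted, annotated, 0.50))  # miss at the stricter threshold
```

The same prediction is a hit at 0.25 and a miss at 0.50, which is why AP50 can never exceed AP25 for a given class. Average precision itself, the area under the precision-recall curve built from such matches, is not computed here.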
Conclusion
The developed model performed as well as specialists in oral surgery, highlighting its potential as a valuable tool for oral lesion assessment.
Clinical significance
By providing performance comparable to oral surgeons and superior to general dentists, the developed multi-class model could support the clinical evaluation of oral lesions, possibly enabling earlier diagnosis of potentially malignant disorders, enhancing patient management and improving patient prognosis.
| Document type: | Journal article |
|---|---|
| Faculty: | Medicine > Klinikum der LMU München > Poliklinik für Zahnerhaltung und Parodontologie |
| Subjects: | 600 Technology, Medicine, Applied Sciences > 610 Medicine and Health |
| URN: | urn:nbn:de:bvb:19-epub-128968-8 |
| ISSN: | 03005712 |
| Language: | English |
| Document ID: | 128968 |
| Date published on Open Access LMU: | 22 Oct 2025 12:45 |
| Last modified: | 22 Oct 2025 12:45 |
