journal article Aug 14, 2025

Performance Evaluation of Deep Learning for the Detection and Segmentation of Thyroid Nodules: Systematic Review and Meta-Analysis

Abstract

Background
Thyroid cancer is one of the most common endocrine malignancies, and its incidence has steadily increased in recent years. Distinguishing between benign and malignant thyroid nodules (TNs) is challenging because their imaging features overlap. The rapid advancement of artificial intelligence (AI) in medical image analysis, particularly deep learning (DL) algorithms, has provided novel solutions for automated TN detection. However, existing studies exhibit substantial heterogeneity in diagnostic performance, and no systematic, evidence-based study has comprehensively assessed the diagnostic performance of DL models in this field.


Objective
This study aimed to conduct a systematic review and meta-analysis to evaluate the performance of DL algorithms in diagnosing TN malignancy, identify key factors influencing their diagnostic efficacy, and compare their accuracy with that of clinicians in image-based diagnosis.


Methods
We systematically searched multiple databases, including PubMed, Cochrane, Embase, Web of Science, and IEEE, and identified 41 eligible studies for systematic review and meta-analysis. Based on the task type, studies were categorized into segmentation (n=14) and detection (n=27) tasks. The pooled sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC) were calculated for each group. Subgroup analyses were performed to examine the impact of transfer learning and compare model performance against clinicians.
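To make the pooling step concrete, the following is a minimal sketch of how per-study sensitivity and specificity can be derived from 2x2 counts and combined into a pooled estimate with a 95% CI. It is an illustration only: the abstract does not state which pooling model was used, and diagnostic accuracy meta-analyses commonly rely on bivariate random-effects models rather than the simple fixed-effect, logit-scale inverse-variance pooling assumed here. The function names and the study counts are hypothetical.

```python
import math

def sens_spec(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from one study's 2x2 counts."""
    return tp / (tp + fn), tn / (tn + fp)

def pool_logit(events, totals):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale.

    Assumption: a simplified stand-in for the model used in the review.
    A 0.5 continuity correction is applied to every study so that zero
    cells do not break the logit. Returns (pooled, ci_low, ci_high).
    """
    num = 0.0
    den = 0.0
    for e, n in zip(events, totals):
        e_c = e + 0.5            # corrected event count
        f_c = (n - e) + 0.5      # corrected non-event count
        logit = math.log(e_c / f_c)
        var = 1.0 / e_c + 1.0 / f_c   # approximate variance of the logit
        weight = 1.0 / var
        num += weight * logit
        den += weight
    mean = num / den
    se = math.sqrt(1.0 / den)

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return inv_logit(mean), inv_logit(mean - 1.96 * se), inv_logit(mean + 1.96 * se)

# Hypothetical (tp, fp, fn, tn) counts for three studies; not data from the review.
studies = [(90, 12, 10, 88), (45, 5, 5, 45), (180, 30, 20, 170)]

pooled_sens = pool_logit([s[0] for s in studies], [s[0] + s[2] for s in studies])
pooled_spec = pool_logit([s[3] for s in studies], [s[3] + s[1] for s in studies])

print("pooled sensitivity (estimate, low, high):", pooled_sens)
print("pooled specificity (estimate, low, high):", pooled_spec)
```

Pooling on the logit scale keeps the back-transformed estimate and its interval inside the 0 to 1 range, which is why it is preferred here over averaging raw proportions.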


Results
For segmentation tasks, the pooled sensitivity, specificity, and AUC were 82% (95% CI 79%‐84%), 95% (95% CI 92%‐96%), and 0.91 (95% CI 0.89‐0.94), respectively. For detection tasks, the pooled sensitivity, specificity, and AUC were 91% (95% CI 89%‐93%), 89% (95% CI 86%‐91%), and 0.96 (95% CI 0.93‐0.97), respectively. Some studies demonstrated that DL models could achieve diagnostic performance comparable with, or even exceeding, that of clinicians in certain scenarios. The application of transfer learning contributed to improved model performance.


Conclusions
DL algorithms exhibit promising diagnostic accuracy in TN imaging, highlighting their potential as auxiliary diagnostic tools. However, current studies are limited by suboptimal methodological design, inconsistent image quality across datasets, and insufficient external validation, which may introduce bias. Future research should enhance methodological standardization, improve model interpretability, and promote transparent reporting to facilitate the sustainable clinical translation of DL-based solutions.

Details
Published: Aug 14, 2025
Vol/Issue: 27
Pages: e73516-e73516
Cite This Article
Jiayu Ni, Yue You, Xiaohe Wu, et al. (2025). Performance Evaluation of Deep Learning for the Detection and Segmentation of Thyroid Nodules: Systematic Review and Meta-Analysis. Journal of Medical Internet Research, 27, e73516-e73516. https://doi.org/10.2196/73516