journal article Open Access May 01, 2025

EM-DeepSD: A Deep Neural Network Model Based on Cell-Free DNA End-Motif Signal Decomposition for Cancer Diagnosis

Diagnostics Vol. 15 No. 9 pp. 1156 · MDPI AG
DOI: 10.3390/diagnostics15091156
Abstract
Background and Objectives: Accurately discriminating between patients with and without cancer using their cell-free DNA (cfDNA) is crucial for early cancer diagnosis. The end-motifs of cfDNA are significant cancer biomarkers with strong diagnostic potential. This study proposes EM-DeepSD, a signal decomposition deep learning framework based on cfDNA end-motifs, which aims to improve the accuracy of cancer diagnosis and to adapt to different sequencing modalities.

Materials and Methods: This study included 146 patients diagnosed with cancer and 122 non-cancer controls. EM-DeepSD comprises three core modules. First, a signal decomposition module decomposes and reconstructs the input end-motif profiles, generating multiple regular subsequences that optimize the subsequent modeling process. Next, a machine learning module and a deep learning module are both applied to improve the accuracy of cancer diagnosis. To evaluate the framework, we developed the EM-DeepSSA model based on EM-DeepSD and compared it with two benchmark methods across different cfDNA sequencing datasets.

Results: In the internal validation set, EM-DeepSSA outperformed the two benchmark methods for cancer diagnosis (area under the curve (AUC), 0.920; adjusted p value < 0.05). EM-DeepSSA also performed best on two independent external testing sets, which were generated by 5-hydroxymethylcytosine sequencing (5hmCS) and broad-range cell-free DNA sequencing (BR-cfDNA-Seq), respectively (test set-1: AUC = 0.933; test set-2: AUC = 0.956; adjusted p value < 0.05).

Conclusions: We present a new framework that achieves high classification performance in cancer diagnosis and is applicable to different sequencing modalities.
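The decompose-and-reconstruct step described in the abstract can be illustrated with a minimal sketch. It assumes that the "SSA" in EM-DeepSSA refers to singular spectrum analysis, and that an end-motif profile is a 256-element vector of 4-mer end-motif frequencies; neither detail is stated in the abstract. The function name `ssa_decompose`, the window length, and the synthetic input are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def ssa_decompose(profile, window, n_components):
    """Split a 1-D profile into additive subsequences with basic SSA.

    Hypothetical helper for illustration; not the EM-DeepSD code.
    """
    x = np.asarray(profile, dtype=float)
    n = x.size
    k = n - window + 1

    # Embedding: build the trajectory (Hankel) matrix, shape (window, k),
    # where entry [r, c] equals x[r + c].
    traj = np.column_stack([x[i:i + window] for i in range(k)])

    # Decomposition: singular value decomposition of the trajectory matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    n_components = min(n_components, s.size)

    subsequences = []
    for j in range(n_components):
        # Rank-one elementary matrix for the j-th singular triple.
        elem = s[j] * np.outer(u[:, j], vt[j])
        # Reconstruction: average each anti-diagonal (diagonal averaging)
        # to map the matrix back to a series of length n.
        flipped = elem[::-1]
        series = np.array(
            [flipped.diagonal(d).mean() for d in range(-(window - 1), k)]
        )
        subsequences.append(series)
    return np.vstack(subsequences)


if __name__ == "__main__":
    # Synthetic stand-in for a 256-motif frequency profile (assumption).
    rng = np.random.default_rng(0)
    profile = rng.random(256)
    profile /= profile.sum()

    parts = ssa_decompose(profile, window=16, n_components=4)
    print(parts.shape)  # one row per reconstructed subsequence
```

Each returned row is one reconstructed subsequence; keeping all `window` components and summing them recovers the original profile. In a pipeline like the one the abstract outlines, such subsequences would serve as inputs to the downstream machine learning and deep learning modules.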
Details
Published: May 01, 2025
Vol/Issue: 15(9)
Pages: 1156
Funding
National Key Research and Development Program of China Award: 2021YFC2500400
Cancer Innovative Research Program of Sun Yat-sen University Cancer Center Award: 2021YFC2500400
Cite This Article
Zhi-Yang Zhao, Chang-Ling Huang, Tong-min Wang, et al. (2025). EM-DeepSD: A Deep Neural Network Model Based on Cell-Free DNA End-Motif Signal Decomposition for Cancer Diagnosis. Diagnostics, 15(9), 1156. https://doi.org/10.3390/diagnostics15091156