journal article Mar 23, 2026

Explainable Machine Learning for Assessing Digital Health Literacy in Older Adults: Validation and Development of a Two-Stage Model Integrating Performance-Based and Self-Assessed Indicators

Abstract
Background
Digital health literacy (DHL) is the ability to locate, understand, evaluate, and apply health information in digital environments. It is essential for older adults to effectively engage with contemporary health care. However, existing DHL assessments primarily rely on self-reported measures, which are susceptible to subjective bias and often fail to capture actual performance. There is a need for a comprehensive, data-driven approach that integrates objective performance indicators with self-assessments to accurately predict and explain DHL levels in older adults.


Objective
This study develops and validates a machine learning approach to predict DHL levels in older adults by integrating performance-based and self-assessed evaluations.


Methods
We applied a 2-stage methodological framework using 2 independent datasets. In the first stage, to identify performance-based determinants, we assessed actual digital and information comprehension in a separate pilot cohort of 30 older adults (aged 60-74 years). In parallel, to measure self-reported DHL, we conducted an online survey with a distinct group of 1000 older adults (aged 55-74 years) using the Digital Health Literacy Scale and the Korean version of the eHealth Literacy Scale (KeHEALS). Bayesian linear regression was applied to both datasets to identify significant explanatory variables. In the second stage, we trained and validated a binary classification model to predict KeHEALS levels using the survey dataset (n=1000), leveraging the features identified in the first stage. Five machine learning algorithms were evaluated, and the best-performing model was interpreted using Shapley Additive Explanations (SHAP) analysis.
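
As a rough illustration of the attribution idea behind SHAP, the sketch below computes exact Shapley values for a small scoring function by averaging each feature's marginal contribution over all feature orderings. The scoring function, its weights, the feature names, and the baseline values are hypothetical and are not taken from the study; the study applied SHAP to its best-performing classifier, and this brute-force enumeration only illustrates the underlying definition.

```python
from itertools import permutations


def toy_score(x):
    """Hypothetical score with one interaction term; NOT the study's model."""
    return (
        0.5 * x["self_care_confidence"]
        + 0.3 * x["n_devices"]
        - 0.2 * x["age_z"]
        + 0.1 * x["self_care_confidence"] * x["n_devices"]
    )


def shapley_values(model, instance, baseline):
    """Exact Shapley values via enumeration of all feature orderings.

    For each ordering, features are switched one at a time from their
    baseline value to the instance value, and the resulting change in the
    model output is credited to the feature that was switched.
    """
    features = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in order:
            current[name] = instance[name]
            new_output = model(current)
            totals[name] += new_output - previous_output
            previous_output = new_output
    return {name: totals[name] / len(orderings) for name in features}


# Made-up respondent and an all-zero reference point (both assumptions).
instance = {"self_care_confidence": 2.0, "n_devices": 3.0, "age_z": 1.0}
baseline = {"self_care_confidence": 0.0, "n_devices": 0.0, "age_z": 0.0}

phi = shapley_values(toy_score, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the score for the instance and the score for the baseline, which is the property that lets SHAP values be read as signed contributions to a single prediction. Enumerating orderings scales factorially with the number of features, so practical SHAP implementations rely on model-specific or sampling-based approximations.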


Results
In the pilot performance-based assessment, using a greater number of electronic devices and having higher educational attainment were positively associated with comprehension, whereas alcohol intake showed a negative association. In the self-assessed survey data, key correlates included interest in health-related apps, self-care confidence, age, smoking, alcohol intake, number of devices used, and exercise frequency. Among the machine learning models, categorical boosting demonstrated the most balanced performance (accuracy 0.785, precision 0.769, F1-score 0.765, area under the receiver operating characteristic curve 0.835), outperforming the dummy classifier (accuracy 0.540). SHAP analysis indicated that self-care confidence, health information search, interest in health-related apps, number of electronic devices used, and exercise frequency were the strongest positive contributors to high-DHL predictions, whereas older age and lifestyle factors (alcohol intake, smoking) contributed negatively.
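
To make the reported metrics concrete, the sketch below computes accuracy, precision, recall, and F1-score from binary predictions and compares them with a majority-class baseline. The labels and predictions are made up, and the majority-class strategy is an assumption: the abstract does not state which strategy the dummy classifier used. The area under the receiver operating characteristic curve is omitted because it requires predicted probabilities rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def majority_baseline(y_true):
    """Predict the most frequent label for every case (ties go to 1)."""
    majority = 1 if 2 * sum(y_true) >= len(y_true) else 0
    return [majority] * len(y_true)


# Made-up example: 1 = high DHL, 0 = low DHL.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

model_scores = binary_metrics(y_true, y_pred)
dummy_scores = binary_metrics(y_true, majority_baseline(y_true))

print("model:", model_scores)
print("dummy:", dummy_scores)
```

Under the majority-class assumption, the baseline's accuracy equals the share of the larger class, which is why a model's accuracy is only informative relative to that floor.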


Conclusions
By explicitly integrating performance-based and self-assessed indicators within an explainable machine learning framework, this study demonstrates that DHL in older adults is influenced by both digital engagement and health management factors. These findings suggest that the proposed framework can serve as a structured approach for evaluating DHL in older adults and inform the design of personalized digital health interventions in clinical and community settings.


Details
Published: Mar 23, 2026
Vol/Issue: 14
Pages: e86171
Cite This Article
Choonghee Park, Jiyeon Park, Seora Kim, et al. (2026). Explainable Machine Learning for Assessing Digital Health Literacy in Older Adults: Validation and Development of a Two-Stage Model Integrating Performance-Based and Self-Assessed Indicators. JMIR Medical Informatics, 14, e86171. https://doi.org/10.2196/86171