Abstract

Objectives
Type 2 diabetes (T2D) is a growing public health burden with persistent racial and ethnic disparities. This study assessed the completeness of social determinants of health (SDoH) data for patients with T2D in Epic Cosmos, a nationwide, cross-institutional electronic health record (EHR) database.


Materials and Methods
The study included adults with T2D (ICD-10: E11.*) with encounters between 2022 and 2024. We analyzed 11 individual-level SDoH data elements across 5 domains—financial strain, food insecurity, housing instability, intimate partner violence, and transportation needs—and 4 components of the Social Vulnerability Index (SVI), representing neighborhood-level SDoH. Data completeness for each data element (ie, the proportion of individuals with non-missing values) was evaluated using generalized linear models, adjusting for source healthcare organization, sex, and age.
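
As an illustration of the completeness measure defined above, the following minimal sketch computes the unadjusted proportion of individuals with a non-missing value for one data element, split by group. It is not the study's analysis: the reported estimates come from generalized linear models adjusted for source healthcare organization, sex, and age, which this sketch does not reproduce. The function name, the field names (`race_ethnicity`, `food_insecurity`), and the toy records are all hypothetical.

```python
from collections import defaultdict


def completeness_by_group(records, element, group_key="race_ethnicity"):
    """Return {group: share of records with a non-missing value for `element`}.

    A value counts as missing when the key is absent or its value is None.
    This is a crude, unadjusted proportion (no covariate adjustment).
    """
    totals = defaultdict(int)   # number of individuals per group
    present = defaultdict(int)  # number with a non-missing value per group
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec.get(element) is not None:
            present[group] += 1
    return {group: present[group] / totals[group] for group in totals}


# Toy, made-up records for illustration only (not study data).
records = [
    {"race_ethnicity": "A", "food_insecurity": "no"},
    {"race_ethnicity": "A", "food_insecurity": None},
    {"race_ethnicity": "B", "food_insecurity": "yes"},
    {"race_ethnicity": "B", "food_insecurity": None},
    {"race_ethnicity": "B", "food_insecurity": None},
    {"race_ethnicity": "B", "food_insecurity": None},
]

result = completeness_by_group(records, "food_insecurity")
print(result)
```

In this toy data, group A has one documented value out of two records and group B has one out of four, so the unadjusted completeness differs by group. An adjusted comparison would additionally need to account for covariates such as organization, sex, and age.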


Results
Among 12 031 927 individuals with T2D, adjusted completeness for individual-level SDoH data elements ranged from 11.2% to 31.5%, varying by data element and racial/ethnic group. American Indian or Alaska Native, Asian, Hispanic, and Native Hawaiian or Other Pacific Islander individuals had lower completeness for all individual-level SDoH compared to White individuals. In contrast, SVI data elements were available for nearly all patients since they are derived from patient addresses routinely collected in EHRs.


Discussion
While SVI data elements were widely available, individual-level SDoH data elements had significant missingness, limiting their usability for secondary analyses. Racial/ethnic disparities in SDoH completeness further complicate their use.


Conclusion
Standardized, equitable SDoH collection is critical to close documentation gaps, reduce disparities, and enable accurate, bias-resistant analyses in T2D care.