Journal article, Open Access, Oct 28, 2024

Practical guide to SHAP analysis: Explaining supervised machine learning model predictions in drug development

DOI: 10.1111/cts.70056
Abstract
Despite increasing interest in using Artificial Intelligence (AI) and Machine Learning (ML) models for drug development, effectively interpreting their predictions remains a challenge, which limits their impact on clinical decisions. We address this issue by providing a practical guide to SHapley Additive exPlanations (SHAP), a popular feature‐based interpretability method that can be integrated into supervised ML models to gain a deeper understanding of their predictions, thereby enhancing their transparency and trustworthiness. This tutorial focuses on the application of SHAP analysis to standard ML black‐box models for regression and classification problems. We give an overview of visualization plots and their interpretation, describe available software for implementing SHAP, and highlight best practices as well as special considerations for binary endpoints and time‐series models. To enhance the reader's understanding of the method, we also apply it to inherently explainable regression models. Finally, we discuss the limitations of the method and ongoing advancements aimed at tackling its current drawbacks.
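As an illustration of why inherently explainable regression models are a useful sanity check for SHAP, the following minimal sketch computes exact Shapley values by enumerating all feature coalitions. It is not the authors' code and does not use any SHAP library. The model coefficients, the input `x`, and the `baseline` are hypothetical, and replacing absent features by a single baseline value is a simplifying assumption (SHAP implementations typically average over a background dataset). Under that assumption, the Shapley value of feature i in a linear model should reduce to coefficient_i * (x_i - baseline_i).

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by brute-force enumeration of coalitions.

    Assumption: a feature that is absent from a coalition is replaced by
    its baseline value (e.g. the feature mean in the training data).
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take their
        # observed values and all others are held at the baseline.
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear regression model: f(z) = 1.0 + 2.0*z0 - 0.5*z1 + 0.25*z2
INTERCEPT = 1.0
COEFS = [2.0, -0.5, 0.25]


def linear_model(z):
    return INTERCEPT + sum(b * v for b, v in zip(COEFS, z))


x = [3.0, 4.0, 8.0]          # hypothetical observation to explain
baseline = [1.0, 2.0, 4.0]   # hypothetical feature means

phi = shapley_values(linear_model, x, baseline)
closed_form = [b * (xi - mi) for b, xi, mi in zip(COEFS, x, baseline)]

print("Shapley values:", phi)
print("coef * (x - baseline):", closed_form)
print("sum of Shapley values:", sum(phi))
print("f(x) - f(baseline):", linear_model(x) - linear_model(baseline))
```

In this sketch the brute-force values agree with the closed form up to floating-point rounding, and they sum to the difference between the prediction and the baseline prediction (the additivity property that gives SHAP its name). For black-box models no such closed form exists, which is why approximation algorithms are used in practice.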

Details
Published: Oct 28, 2024
Vol/Issue: 17(11)
Cite This Article
Ana Victoria Ponce‐Bobadilla, Vanessa Schmitt, Corinna S. Maier, et al. (2024). Practical guide to SHAP analysis: Explaining supervised machine learning model predictions in drug development. Clinical and Translational Science, 17(11). https://doi.org/10.1111/cts.70056