Reinforcement Learning in Healthcare: A Survey

Chao Yu; Jiming Liu; Shamim Nemati; Guosheng Yin

doi:10.1145/3477600

Journal article · Nov 23, 2021


ACM Computing Surveys Vol. 55 No. 1 pp. 1-36 · Association for Computing Machinery (ACM)


Abstract

As a subfield of machine learning, reinforcement learning (RL) aims at optimizing decision making by using interaction samples of an agent with its environment and the potentially delayed feedback. In contrast to traditional supervised learning, which typically relies on one-shot, exhaustive, and supervised reward signals, RL tackles sequential decision-making problems with sampled, evaluative, and delayed feedback simultaneously. Such a distinctive feature makes RL techniques a suitable candidate for developing powerful solutions in various healthcare domains, where diagnostic decisions or treatment regimes are usually characterized by a prolonged period with delayed feedback. By first briefly examining theoretical foundations and key methods in RL research, this survey provides an extensive overview of RL applications in a variety of healthcare domains, ranging from dynamic treatment regimes in chronic diseases and critical care to automated medical diagnosis and many other control or scheduling problems that have infiltrated every aspect of the healthcare system. In addition, we discuss the challenges and open issues in the current research and highlight some potential solutions and directions for future research.




Metrics

Citations: 404
References: 257

Details

Published: Nov 23, 2021
Vol/Issue: 55(1)
Pages: 1-36

Authors

Chao Yu
Sun Yat-sen University, Guangzhou, China

Jiming Liu
Hong Kong Baptist University, Kowloon Tong, Hong Kong, China

Shamim Nemati
UC San Diego, La Jolla, CA, USA

Guosheng Yin
The University of Hong Kong, Pokfulam, Hong Kong, China

Funding

Hongkong Scholar Program Award: XJ2017028

Cite This Article

Chao Yu, Jiming Liu, Shamim Nemati, et al. (2021). Reinforcement Learning in Healthcare: A Survey. ACM Computing Surveys, 55(1), 1-36. https://doi.org/10.1145/3477600
