journal article Nov 23, 2021

Reinforcement Learning in Healthcare: A Survey

Abstract
As a subfield of machine learning, reinforcement learning (RL) aims to optimize decision making by using samples of an agent's interaction with its environment together with potentially delayed feedback. In contrast to traditional supervised learning, which typically relies on one-shot, exhaustive, and supervised reward signals, RL tackles sequential decision-making problems with sampled, evaluative, and delayed feedback simultaneously. This distinctive feature makes RL techniques a suitable candidate for developing powerful solutions in various healthcare domains, where diagnostic decisions or treatment regimes usually unfold over a prolonged period with delayed feedback. After briefly examining the theoretical foundations and key methods in RL research, this survey provides an extensive overview of RL applications in a variety of healthcare domains, ranging from dynamic treatment regimes in chronic diseases and critical care to automated medical diagnosis and many other control or scheduling problems that have infiltrated every aspect of the healthcare system. In addition, we discuss the challenges and open issues in current research and highlight some potential solutions and directions for future research.

Details
Published: Nov 23, 2021
Vol/Issue: 55(1)
Pages: 1-36
Funding: Hongkong Scholar Program (Award XJ2017028)
Cite This Article
Chao Yu, Jiming Liu, Shamim Nemati, et al. (2021). Reinforcement Learning in Healthcare: A Survey. ACM Computing Surveys, 55(1), 1-36. https://doi.org/10.1145/3477600