Actor-Critic Reinforcement Learning for Personalized Diabetes Management: A Matrix-Game Approach with Simulation and Visualization
Reinforcement learning (RL) offers a flexible framework for optimizing personalized treatment in healthcare. In this work, we formulate diabetes management as a matrix game and solve it with actor-critic reinforcement learning, with the goal of maximizing long-term health outcomes. We simulated a diabetic patient over 20 weeks: the actor recommends treatment plans (e.g., insulin and dietary interventions), while the critic estimates the patient's expected benefit. The learned patient-specific policy maintained normal blood sugar levels more frequently (75% of weeks) than a fixed baseline policy (50%). We also developed a Python application for simulating diabetes models, with visualizations of policy evolution, health outcomes, and value estimates. These results are a promising sign for integrating reinforcement learning into precision medicine, while also highlighting implementation challenges ahead, including safety constraints and appropriate data use.
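To make the actor-critic setup described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation: a toy two-state glucose environment (normal vs. high), a softmax actor over two treatment actions, and a tabular TD critic, simulated over 20-week episodes. All transition probabilities, rewards, and hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy environment (illustrative assumptions, not the paper's model):
# states: 0 = normal glucose, 1 = high glucose
# actions: 0 = standard care, 1 = intensified treatment (insulin + diet)
N_STATES, N_ACTIONS, N_WEEKS = 2, 2, 20

def step(state, action):
    # Intensified treatment makes a return to normal glucose more likely.
    p_normal = 0.8 if action == 1 else 0.5
    next_state = 0 if rng.random() < p_normal else 1
    reward = 1.0 if next_state == 0 else -1.0  # reward healthy weeks
    return next_state, reward

theta = np.zeros((N_STATES, N_ACTIONS))  # actor: policy logits
V = np.zeros(N_STATES)                   # critic: state-value estimates
alpha_actor, alpha_critic, gamma = 0.1, 0.2, 0.9

def policy(state):
    # Softmax over the actor's preferences for this state.
    z = np.exp(theta[state] - theta[state].max())
    return z / z.sum()

for episode in range(500):
    state = 1  # each simulated patient starts with high glucose
    for week in range(N_WEEKS):
        probs = policy(state)
        action = rng.choice(N_ACTIONS, p=probs)
        next_state, reward = step(state, action)
        # TD error drives both the critic and the actor updates.
        td_error = reward + gamma * V[next_state] - V[state]
        V[state] += alpha_critic * td_error
        # Policy-gradient update for the softmax actor.
        grad = -probs
        grad[action] += 1.0
        theta[state] += alpha_actor * td_error * grad
        state = next_state

print(policy(1))  # learned action probabilities in the high-glucose state
```

With these assumed dynamics, the actor learns to prefer intensified treatment in the high-glucose state, mirroring the qualitative behavior reported in the abstract.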

This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in our journal are licensed under CC-BY 4.0, which permits authors to retain copyright of their work. This license allows for unrestricted use, sharing, and reproduction of the articles, provided that proper credit is given to the original authors and the source.