

Poster

Position Paper: A Critical Evaluation of Reinforcement Learning in Dynamic Treatment Regimes

Zhiyao Luo · Yangchen Pan · Peter Watkinson · Tingting Zhu


Abstract:

In the rapidly evolving healthcare landscape, the application of offline reinforcement learning (RL) to dynamic treatment regimes (DTRs) presents both unprecedented opportunities and significant challenges. This position paper offers a critical examination of the current state of offline RL in the context of DTRs. We argue for a reassessment of the necessity and efficacy of applying RL to DTRs, citing concerns such as inconsistent and potentially inconclusive evaluation metrics, the absence of naive and supervised-learning baselines, and the diversity of RL formulations across existing research. Through a case study comprising more than 17,000 evaluation experiments on a publicly available Sepsis dataset, we demonstrate that the relative performance of RL algorithms can vary significantly with changes in evaluation metrics and Markov Decision Process (MDP) formulations. Surprisingly, in some instances RL algorithms perform worse than random baselines, depending on the policy evaluation method and reward design. This casts doubt on the effectiveness, and indeed the necessity, of employing RL algorithms in DTRs. Finally, we discuss potential improvements toward more reliable development of RL-based dynamic treatment regimes (RL-DTRs) and invite further discussion within the community.
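To make the evaluation concern concrete, the sketch below shows weighted importance sampling (WIS), a self-normalized off-policy evaluation (OPE) estimator commonly used in offline RL for DTRs, alongside a uniform-random baseline policy of the kind the paper argues should always be reported. This is a minimal illustration only; the exact estimators, data format, and Sepsis experimental setup in the paper may differ, and all function and variable names here are hypothetical.

```python
# A minimal sketch of weighted importance sampling (WIS) for off-policy
# evaluation. Names are illustrative and do not reflect the paper's code.
import numpy as np

def wis_estimate(trajectories, target_policy, gamma=0.99):
    """Estimate the value of `target_policy` from logged trajectories.

    Each trajectory is a list of (state, action, reward, behavior_prob)
    tuples, where `behavior_prob` is the probability the logging
    (clinician) policy assigned to the action actually taken.
    """
    weights, returns = [], []
    for traj in trajectories:
        rho, ret = 1.0, 0.0
        for t, (s, a, r, b_prob) in enumerate(traj):
            # Cumulative importance ratio between target and behavior policy.
            rho *= target_policy(s, a) / max(b_prob, 1e-8)
            # Discounted return of the logged trajectory.
            ret += (gamma ** t) * r
        weights.append(rho)
        returns.append(ret)
    weights = np.asarray(weights)
    returns = np.asarray(returns)
    # Self-normalized (weighted) importance sampling estimate.
    return float(np.sum(weights * returns) / max(np.sum(weights), 1e-8))

def random_policy(n_actions):
    """Uniform-random baseline over a discrete action space."""
    return lambda s, a: 1.0 / n_actions
```

Comparing `wis_estimate(data, learned_policy)` against `wis_estimate(data, random_policy(n_actions))` illustrates the kind of naive-baseline check discussed in the paper: because WIS depends on the behavior-policy estimates and the reward design, the ranking of policies can shift across estimators and MDP formulations.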
