Publications

Is Risk-Sensitive Reinforcement Learning Properly Resolved?

In arXiv 2023

This paper provides an in-depth analysis of the biased optimization issue in existing risk-sensitive reinforcement learning (RSRL) methods and proposes Trajectory Q-Learning (TQL), a novel RSRL framework that is proven to learn the optimal policy w.r.t. various risk measures.