

Poster

Reward Shaping for Reinforcement Learning with An Assistant Reward Agent

Haozhe Ma · Kuankuan Sima · Thanh Vinh Vo · Di Fu · Tze-Yun Leong


Abstract:

Reward shaping is a promising approach to tackling the sparse-reward challenge in reinforcement learning by reconstructing sparse rewards into more informative, dense ones. This paper introduces a novel dual-agent reward shaping framework composed of two synergistic agents: a policy agent that learns the optimal behavior and a reward agent that generates auxiliary reward signals. The proposed method is self-learning, relying on neither expert knowledge nor hand-crafted reward functions. By restructuring rewards to capture future-oriented information, our framework effectively improves sample efficiency and convergence stability. Furthermore, the auxiliary reward signals encourage exploration of the environment in the early stages of training and exploitation by the policy agent in the later stages, achieving a self-adaptive balance between the two. We evaluate our framework on continuous control tasks with sparse and delayed rewards, demonstrating its robustness and superiority over existing methods.
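The paper details the actual framework; purely as intuition for the dual-agent idea, the toy sketch below shows one way a learned reward agent can densify a sparse signal for a policy agent. Everything here is an illustrative assumption rather than the authors' algorithm: a tabular chain MDP with a single terminal reward, a Q-learning policy agent, and a reward agent that emits potential-based shaping from a learned value estimate (the paper targets continuous control and does not prescribe these choices).

```python
import numpy as np

rng = np.random.default_rng(0)
N, GAMMA = 20, 0.99  # chain length; the only reward is at the final state

class RewardAgent:
    """Learns state values from the sparse environment reward and emits a
    dense, potential-based auxiliary signal gamma * V(s') - V(s)."""
    def __init__(self, n_states, lr=0.1):
        self.v = np.zeros(n_states)
        self.lr = lr

    def shape(self, s, s_next):
        return GAMMA * self.v[s_next] - self.v[s]

    def update(self, s, env_r, s_next, done):
        # TD(0) on the true (sparse) reward keeps the shaping signal aligned.
        target = env_r + (0.0 if done else GAMMA * self.v[s_next])
        self.v[s] += self.lr * (target - self.v[s])

class PolicyAgent:
    """Tabular epsilon-greedy Q-learning over two actions (left/right)."""
    def __init__(self, n_states, lr=0.1, eps=0.1):
        self.q = np.zeros((n_states, 2))
        self.lr, self.eps = lr, eps

    def act(self, s):
        if rng.random() < self.eps:
            return int(rng.integers(2))
        return int(self.q[s].argmax())

    def update(self, s, a, r, s_next, done):
        target = r + (0.0 if done else GAMMA * self.q[s_next].max())
        self.q[s, a] += self.lr * (target - self.q[s, a])

def step(s, a):
    """Deterministic chain dynamics with a sparse terminal reward."""
    s_next = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    done = s_next == N - 1
    return s_next, (1.0 if done else 0.0), done

policy, reward = PolicyAgent(N), RewardAgent(N)
for episode in range(500):
    s, done, t = 0, False, 0
    while not done and t < 200:
        a = policy.act(s)
        s_next, env_r, done = step(s, a)
        shaped = env_r + reward.shape(s, s_next)  # sparse + dense auxiliary
        policy.update(s, a, shaped, s_next, done)
        reward.update(s, env_r, s_next, done)     # reward agent sees true reward
        s, t = s_next, t + 1
```

Because the learned potential starts at zero, the auxiliary signal is initially neutral and only becomes informative as the reward agent's value estimates improve, loosely echoing the early-exploration/late-exploitation balance the abstract describes.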
