Poster

Planning with Theory of Mind for Few-Shot Adaptation in Mixed-motive Environments

Yizhe Huang · Anji Liu · Fanqi Kong · Yaodong Yang · Song-Chun Zhu · Xue Feng


Abstract:

Despite the recent successes of multi-agent reinforcement learning (MARL) algorithms, efficiently adapting to other agents in mixed-motive environments remains a significant challenge. One feasible approach is to use Theory of Mind (ToM) to reason about the mental states of other agents and model their behavior. However, these methods often encounter difficulties in efficient reasoning and utilization of inferred information. To address these issues, we propose Planning with Theory of Mind (PToM), a novel multi-agent algorithm that enables few-shot adaptation to unseen policies in mixed-motive environments. PToM is hierarchically composed of two modules: an opponent modeling module that utilizes ToM to infer others' goals and learn corresponding goal-conditioned policies, and a planning module that employs Monte Carlo Tree Search (MCTS) to identify the best response. Our approach improves efficiency by updating beliefs about others' goals both between and within episodes and by using information from the opponent modeling module to guide planning. Experimental results demonstrate that in mixed-motive environments, PToM exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios. Furthermore, the emergence of social intelligence during our experiments underscores the potential of our approach in complex multi-agent environments.
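The abstract does not spell out how beliefs about others' goals are updated. As a rough illustration only, the within-episode step could resemble a standard Bayesian update, in which the prior over goals is reweighted by how likely each goal-conditioned policy makes the observed action. The function name, the goal labels, and the likelihood values below are hypothetical and are not taken from the paper.

```python
def update_goal_belief(belief, action_likelihoods):
    """Generic Bayes-rule sketch (an assumption, not the authors' code).

    belief: dict mapping each candidate goal to its prior probability.
    action_likelihoods: dict mapping each goal to the probability that a
        goal-conditioned policy assigns to the action just observed.
    Returns the normalized posterior over goals.
    """
    unnormalized = {g: belief[g] * action_likelihoods[g] for g in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observed action is impossible under every goal: keep the prior.
        return dict(belief)
    return {g: p / total for g, p in unnormalized.items()}


# Hypothetical goals and likelihoods, chosen only to show the mechanics.
prior = {"cooperate": 0.5, "defect": 0.5}
likelihoods = {"cooperate": 0.8, "defect": 0.2}
posterior = update_goal_belief(prior, likelihoods)
print(posterior)
```

In this toy case the observed action is four times more likely under the "cooperate" goal, so the posterior shifts toward it. A planner could then sample opponents' goals from such a posterior when simulating their actions, which is one plausible reading of how the opponent modeling module guides MCTS.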
