Poster
Environment Design for Inverse Reinforcement Learning
Thomas Kleine Buening · Victor Villin · Christos Dimitrakakis
Learning a reward function from demonstrations suffers from low sample-efficiency. Even with abundant data, current inverse reinforcement learning methods that focus on learning from a single environment can fail to handle slight changes in the environment dynamics. We tackle these challenges through adaptive environment design. In our framework, the learner repeatedly interacts with the expert, with the former selecting environments to identify the reward function as quickly as possible from the expert's demonstrations in said environments. This results in improvements in both sample-efficiency and robustness, as we show experimentally, for both exact and approximate inference.
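The interaction loop described above can be sketched in a toy form. This is a minimal illustration, not the paper's algorithm: it assumes a finite set of candidate reward functions and a finite pool of environments, uses a simple count of distinct expert responses as a stand-in for the environment-selection criterion, and does hard elimination rather than Bayesian inference. All names (`candidates`, `environments`, `info_score`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 4 candidate reward vectors over 5 abstract
# outcomes; the expert's true reward is one of the candidates.
candidates = rng.normal(size=(4, 5))
true_idx = 2

# Each "environment" maps its 3 available actions to outcomes, so the
# choice of environment determines which outcomes the expert can reach.
environments = [rng.integers(0, 5, size=3) for _ in range(6)]

def expert_action(env, reward):
    # The expert acts greedily with respect to its (hidden) reward.
    return int(np.argmax(reward[env]))

def info_score(env, alive):
    # Crude proxy for informativeness: the number of distinct actions
    # the surviving reward candidates would take in this environment.
    return len({expert_action(env, candidates[i]) for i in alive})

alive = set(range(len(candidates)))
for _ in range(10):
    if len(alive) == 1:
        break
    # Adaptive environment design: query the most discriminative one.
    env = max(environments, key=lambda e: info_score(e, alive))
    a = expert_action(env, candidates[true_idx])
    # Keep only candidates consistent with the observed demonstration.
    alive = {i for i in alive if expert_action(env, candidates[i]) == a}

print(sorted(alive))  # surviving candidate indices; always contains true_idx
```

Because the true reward is always consistent with its own demonstrations, it is never eliminated; how quickly the set shrinks depends on how discriminative the selected environments are, which is the quantity an adaptive design tries to maximize.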