

Poster

A Distributional Analogue to the Successor Representation

Harley Wiltzer · Jesse Farebrother · Arthur Gretton · Yunhao Tang · Andre Barreto · Will Dabney · Marc Bellemare · Mark Rowland


Abstract:

This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
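For intuition on the "two-level maximum mean discrepancy" mentioned above, here is a minimal sketch, not the authors' implementation: it assumes Gaussian kernels at both levels, where the inner MMD compares samples from two distributions and the outer MMD compares two collections of distributions (each represented by a sample set). All function names and the kernel choices are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)), computed pairwise.
    d = x[:, None, :] - y[None, :, :]
    return np.exp(-np.sum(d**2, axis=-1) / (2 * sigma**2))

def mmd2(X, Y, kernel=gaussian_kernel):
    # Biased (V-statistic) estimator of squared MMD between sample sets X, Y.
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2 * kernel(X, Y).mean()

def two_level_mmd2(sets_a, sets_b, bandwidth=1.0):
    # Outer level: each "point" is a distribution, represented by a set of
    # samples. The outer kernel K(P, Q) = exp(-MMD^2(P, Q) / (2 * h^2)) is a
    # kernel on distributions, and we take an MMD with respect to it.
    def outer(A, B):
        return np.array([[np.exp(-mmd2(P, Q) / (2 * bandwidth**2))
                          for Q in B] for P in A])
    return (outer(sets_a, sets_a).mean() + outer(sets_b, sets_b).mean()
            - 2 * outer(sets_a, sets_b).mean())
```

With the biased estimator, the two-level MMD of a collection of sample sets against itself is exactly zero, which makes it a convenient training loss between an empirical target and a generative model of distributions over states.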
