Poster

Temporal Distances in Stochastic Settings: Theoretical Properties and Application to Reinforcement Learning

Vivek Myers · Chongyi Zheng · Anca Dragan · Sergey Levine · Benjamin Eysenbach


Abstract:

Temporal distances lie at the heart of many algorithms for planning, control, and reinforcement learning, allowing one to estimate the transit time between two states. However, prior attempts to define such temporal distances in stochastic settings have been stymied by an important limitation: these approaches do not satisfy the triangle inequality. This is not merely a definitional concern, but translates to an inability to generalize and find shortest paths. In this paper, we build on prior work in contrastive learning and quasimetrics to show how successor features learned by contrastive learning (after a change of variables) form a temporal distance that does satisfy the triangle inequality, even in stochastic settings. Importantly, this temporal distance is computationally efficient to estimate, even in high-dimensional and stochastic settings. Experiments in controlled settings and benchmark suites demonstrate that an RL algorithm based on these new temporal distances has intriguing generalization properties and outperforms prior methods, including those based on quasimetrics.
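
To make the abstract's key idea concrete, the sketch below shows one way contrastive successor features could be turned into a temporal distance via a change of variables. It is a minimal, hedged illustration, not the paper's exact construction: the critic architecture, the InfoNCE objective, and the specific change of variables d(x, g) = f(g, g) - f(x, g) are assumptions chosen for illustration, and all names (ContrastiveCritic, infonce_loss, temporal_distance) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveCritic(nn.Module):
    """Inner-product critic f(x, g) = phi(x) . psi(g), intended to approximate
    a log-ratio involving the discounted future-state distribution."""
    def __init__(self, obs_dim, hidden=256, embed=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, embed))  # state encoder
        self.psi = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, embed))  # goal encoder

    def forward(self, x, g):
        return (self.phi(x) * self.psi(g)).sum(-1)

def infonce_loss(critic, states, futures):
    """InfoNCE over a batch where (states[i], futures[i]) is a positive pair
    (futures[i] sampled from the discounted future of states[i]); the other
    rows in the batch serve as negatives."""
    logits = critic.phi(states) @ critic.psi(futures).T  # (B, B) all-pairs scores
    labels = torch.arange(len(states))
    return F.cross_entropy(logits, labels)

def temporal_distance(critic, x, g):
    """Illustrative change of variables turning the critic into a distance:
    d(x, g) = f(g, g) - f(x, g). If f approximates a log density ratio of the
    discounted future-state distribution, a quantity of this form can satisfy
    the triangle inequality even under stochastic dynamics (see the paper for
    the precise statement and conditions)."""
    with torch.no_grad():
        return critic(g, g) - critic(x, g)
```

In such a setup, the critic would be trained on (state, future state) pairs sampled from trajectories, and the resulting distance could then be used as a goal-conditioned value estimate or planning cost; how the paper actually trains and deploys the distance is described in the full text.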
