

Poster

Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning

Kyle Hsu · Jubayer Ibn Hamid · Kaylee Burns · Chelsea Finn · Jiajun Wu


Abstract:

In disentangled representation learning, inductive biases are crucial for narrowing down an underspecified solution set. In this work, we endow a neural network autoencoder with three select inductive biases from the literature: latent quantization, latent multiinformation regularization, and the Hessian (off-diagonal) penalty. In principle, these inductive biases are deeply complementary: they most directly specify properties of the latent space, encoder, and decoder, respectively. In practice, however, naively combining these techniques fails to yield significant benefits. To address this, we propose innovations to the three techniques that simplify the learning problem, equip key regularization terms with stabilizing invariances, and quash degenerate incentives. The resulting model, Tripod, achieves state-of-the-art results on a comprehensive suite of four image disentanglement benchmarks. We also verify that Tripod improves significantly on its naive incarnation and that all three of its "legs" are necessary for consistent performance.
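To make the three "legs" concrete, below is a minimal PyTorch sketch of an autoencoder loss that combines the three inductive biases named in the abstract. This is an illustration, not the paper's implementation: the architecture, hyperparameters, and loss weights are assumptions, the off-diagonal covariance penalty is a simple stand-in for the paper's multiinformation regularizer, and the Hessian penalty estimator follows the Rademacher finite-difference style of Peebles et al. (2020).

```python
# A minimal sketch of the three "legs"; names, hyperparameters, and the
# covariance-based multiinformation surrogate are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuantizedAutoencoder(nn.Module):
    def __init__(self, x_dim=784, z_dim=10, n_values=12):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
        self.decoder = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))
        # Leg 1 (latent space): a learned per-dimension codebook of scalar values.
        self.codebook = nn.Parameter(torch.linspace(-1, 1, n_values).repeat(z_dim, 1))

    def quantize(self, z):
        # Snap each latent dimension to its nearest codebook value, with a
        # straight-through gradient to the encoder. (Commitment-style losses
        # that train the codebook are omitted here for brevity.)
        dists = (z.unsqueeze(-1) - self.codebook.unsqueeze(0)) ** 2  # (B, z_dim, n_values)
        zq = torch.gather(self.codebook.expand(z.size(0), -1, -1), 2,
                          dists.argmin(-1, keepdim=True)).squeeze(-1)
        return z + (zq - z).detach()

def multiinformation_surrogate(z):
    # Leg 2 (encoder): discourage statistical dependence amongst latents.
    # An off-diagonal covariance penalty stands in for the paper's
    # multiinformation regularizer (an assumption of this sketch).
    zc = z - z.mean(0)
    cov = zc.T @ zc / (z.size(0) - 1)
    off_diag = cov - torch.diag(torch.diagonal(cov))
    return off_diag.pow(2).sum()

def hessian_offdiag_penalty(decoder, z, eps=0.1, n_dirs=2):
    # Leg 3 (decoder): Hessian (off-diagonal) penalty estimated with
    # Rademacher finite differences, in the style of Peebles et al. (2020):
    # the variance of second-order directional differences across random
    # directions grows with the off-diagonal Hessian entries.
    center = decoder(z)
    diffs = []
    for _ in range(n_dirs):
        v = eps * (torch.randint(0, 2, z.shape, device=z.device).float() * 2 - 1)
        diffs.append((decoder(z + v) - 2 * center + decoder(z - v)) / eps ** 2)
    return torch.stack(diffs).var(dim=0, unbiased=True).mean()

model = QuantizedAutoencoder()
x = torch.rand(64, 784)
z = model.quantize(model.encoder(x))
loss = (F.mse_loss(model.decoder(z), x)
        + 1e-2 * multiinformation_surrogate(z)
        + 1e-2 * hessian_offdiag_penalty(model.decoder, z))
loss.backward()
```

The division of labor mirrors the abstract's framing: quantization constrains the latent space itself, the multiinformation term constrains what the encoder writes into the latents, and the Hessian penalty constrains how the decoder reads latents out, so each regularizer targets a different component of the autoencoder.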
