

Poster

A Tensor Decomposition Perspective on Second-order RNNs

Maude Lizaire · Michael Rizvi-Martel · Marawan Gamal · Guillaume Rabusseau


Abstract:

Second-order Recurrent Neural Networks (2RNNs) are a generalization of RNNs which leverage the expressivity of second-order interactions for sequence modelling. These models are provably more expressive than their linear counterparts and have connections to well-studied models from formal language theory. However, as they are parameterized by large tensors, performing computation with such models quickly becomes intractable. Different approaches have been proposed to circumvent this issue. One, known as MIRNN, consists of limiting the type of interactions used by the model. Another leverages tensor decomposition to reduce the parameter count. In this work, we study the model resulting from parameterizing 2RNNs with the CP decomposition, which we call CPRNN. Intuitively, reducing the rank of the decomposition should reduce expressivity. We analyze the interaction between rank and hidden size and how these parameters affect the model capacity. Moreover, we formally show how RNNs, 2RNNs and MIRNNs relate to CPRNNs as a function of rank and hidden dimension. We support these results empirically by performing experiments on the Penn Treebank dataset. Our experimental results show that, given a fixed parameter budget, one can always find a choice of rank and hidden size such that CPRNNs outperform RNNs, 2RNNs and MIRNNs.
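The abstract does not spell out the recurrence, so the following is only a rough illustration: assuming the standard second-order update h_t = tanh(A ×₁ h_{t-1} ×₂ x_t) and a rank-R CP factorization of the transition tensor A, a minimal NumPy sketch of the resulting CPRNN-style step could look like this (the names U, V, C and cprnn_step are hypothetical, not taken from the paper).

```python
import numpy as np

# Hedged sketch (not the authors' code): a second-order RNN update contracts a
# third-order tensor A with the previous hidden state and the current input,
#   h_t[i] = tanh( sum_{j,k} A[i, j, k] * h_{t-1}[j] * x_t[k] ),
# which needs O(d_h^2 * d_x) parameters. Replacing A by a rank-R CP
# decomposition A[i, j, k] = sum_r C[i, r] * U[j, r] * V[k, r] gives the
# CPRNN-style update below with only R * (2 * d_h + d_x) parameters.

rng = np.random.default_rng(0)
d_h, d_x, R = 64, 32, 16          # hidden size, input size, CP rank (illustrative values)

U = rng.standard_normal((d_h, R)) * 0.1   # factor acting on h_{t-1}
V = rng.standard_normal((d_x, R)) * 0.1   # factor acting on x_t
C = rng.standard_normal((d_h, R)) * 0.1   # factor producing the new hidden state

def cprnn_step(h_prev, x_t):
    """One CPRNN-style update: project h_{t-1} and x_t onto the rank-R factors,
    multiply them elementwise (the second-order interaction), then map back."""
    z = (h_prev @ U) * (x_t @ V)          # shape (R,): bilinear interaction per rank component
    return np.tanh(z @ C.T)               # shape (d_h,): next hidden state

# Usage: run the recurrence over a toy sequence of 10 random input vectors.
h = np.zeros(d_h)
for x_t in rng.standard_normal((10, d_x)):
    h = cprnn_step(h, x_t)
print(h.shape)  # (64,)
```

Under this assumed formulation, the rank R and the hidden size d_h trade off directly within a fixed parameter budget, which is the interplay the abstract refers to.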
