
Poster

Is Kernel Prediction More Powerful than Gating in Convolutional Neural Networks?

Lorenz K. Muller


Abstract:

Neural networks whose weights are the output of a predictor (HyperNetworks) achieve excellent performance on many tasks. In ConvNets, kernel prediction layers are a popular type of HyperNetwork. Previous theoretical work has argued that there exists a hierarchy of multiplicative interactions, in which gating sits at the bottom and full weight prediction, as in HyperNetworks, sits at the top. In this paper we constructively demonstrate an equivalence between gating combined with fixed-weight layers and weight prediction, relativising the notion of a hierarchy of multiplicative interactions. We further derive an equivalence between a restricted type of HyperNetwork and factorization machines. Finally, we find empirically that gating layers can learn to imitate weight prediction layers with an SGD variant, and we show a novel practical application of kernel prediction networks in image denoising. Our reformulation of predicted kernels as a combination of fixed layers and gating reduces memory requirements.
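To make the central equivalence concrete, here is a minimal sketch (not the authors' implementation) of one standard construction it covers: a predicted kernel formed as an input-dependent mixture of fixed kernels can, by linearity of convolution, be rewritten as fixed convolution branches whose outputs are gated. The module name, branch count, and the pooling-based gate predictor below are illustrative assumptions.

```python
import torch
import torch.nn as nn


class GatedFixedConv(nn.Module):
    """Sketch of the kernel-prediction <-> gating equivalence.

    Kernel-prediction view:  y = conv(x, W(x)),  W(x) = sum_i a_i(x) * W_i
    Gating view (used here): y = sum_i a_i(x) * conv(x, W_i)

    The two are equal because convolution is linear in its kernel. The
    gating form never materializes a per-sample kernel of shape
    (B, C_out, C_in, k, k), which is the source of the memory saving
    the abstract alludes to (traded against storing branch outputs).
    """

    def __init__(self, in_ch: int, out_ch: int, k: int, n_branches: int):
        super().__init__()
        # Fixed (ordinary, learned) convolution branches W_1 ... W_n.
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False)
            for _ in range(n_branches)
        )
        # Gate predictor a(x): global average pooling + linear + softmax
        # (an assumed, common choice; any per-sample gate would do).
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_ch, n_branches),
            nn.Softmax(dim=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.gate(x)                                      # (B, n)
        ys = torch.stack([b(x) for b in self.branches], dim=1)  # (B, n, C_out, H, W)
        # Gate the fixed-branch outputs instead of mixing kernels.
        return (a[:, :, None, None, None] * ys).sum(dim=1)    # (B, C_out, H, W)


if __name__ == "__main__":
    layer = GatedFixedConv(in_ch=16, out_ch=32, k=3, n_branches=4)
    y = layer(torch.randn(2, 16, 28, 28))
    print(y.shape)  # torch.Size([2, 32, 28, 28])
```

Under these assumptions the gating form computes exactly the same function as applying the per-sample mixed kernel, which is the direction of the constructive argument the abstract describes.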
