

Poster

Memorization Through the Lens of Curvature of Loss Function Around Samples

Isha Garg · Deepak Ravikumar · Kaushik Roy


Abstract:

Deep neural networks are over-parameterized, and easily overfit and memorize the datasets they train on; in the extreme case, networks have been shown to memorize fully randomly labeled datasets. Recent work has shown that the curvature of the loss around a sample can be used as a metric of its importance when subsampling for coresets. Inspired by this, we show that a modified curvature calculation, specifically one averaged across all training epochs, serves as a reliable metric of sample memorization. We show that this curvature metric effectively captures memorization statistics, both qualitatively and quantitatively, on popular image datasets. We validate the proposed metric quantitatively against the memorization scores released by Feldman & Zhang (2020). Further, experiments on mislabeled-data detection show that corrupted samples are learned with high curvature, and that using curvature to identify mislabeled examples outperforms existing approaches. Qualitatively, we find that high-curvature samples correspond to long-tailed, mislabeled, or conflicting instances, indicating a likelihood of memorization. Notably, this analysis helps us find what is, to the best of our knowledge, a novel failure mode in the CIFAR100 and ImageNet datasets: duplicated images with differing labels.
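The per-sample curvature the abstract relies on can be estimated without materializing a Hessian. Below is a minimal PyTorch sketch, assuming a Hutchinson-style finite-difference estimator of the trace of the loss Hessian with respect to the input; the helper names (grad_wrt_input, sample_curvature) and the hyperparameters h and n_probes are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F


def grad_wrt_input(model, x, y):
    # Gradient of the per-sample loss with respect to the input x.
    x = x.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (g,) = torch.autograd.grad(loss, x)
    return g


def sample_curvature(model, x, y, h=1e-3, n_probes=10):
    # Hutchinson-style estimate of tr(H), where H is the Hessian of the
    # loss with respect to the input around one sample (x, y):
    # tr(H) = E_v[v^T H v] for Rademacher probes v, with v^T H v
    # approximated by a finite difference of input gradients.
    # h and n_probes are illustrative choices, not the paper's settings.
    x, y = x.unsqueeze(0), y.unsqueeze(0)  # add batch dimension
    g0 = grad_wrt_input(model, x, y)
    est = 0.0
    for _ in range(n_probes):
        v = torch.randint_like(x, 2) * 2.0 - 1.0  # Rademacher probe in {-1, +1}
        gh = grad_wrt_input(model, x + h * v, y)
        # v^T H v ~= v^T (grad(x + h v) - grad(x)) / h
        est += ((gh - g0) * v).sum().item() / h
    return est / n_probes
```

To reproduce the epoch-averaged metric described above, one would score every training sample with sample_curvature at the end of each epoch and average the scores over epochs; samples whose averaged curvature stays high are the memorization candidates (long-tailed, mislabeled, or conflicting instances) that the abstract describes.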
