

Poster

Delving into Differentially Private Transformer

Youlong Ding · Xueyang Wu · Yining Meng · Yonggang Luo · Hao Wang · Pan Weike


Abstract:

Deep learning with differential privacy (DP) has garnered significant attention in recent years, leading to the development of numerous methods aimed at enhancing model accuracy and training efficiency. This paper delves into the problem of training Transformer models with differential privacy. Our treatment is modular: the idea is to reduce the harder, more specific problem of training DP Transformers to the more basic problem of training DP (vanilla) neural nets, which is better understood and amenable to a range of model-agnostic methods. This reduction is achieved by first identifying the hardness unique to DP Transformer training: the "attention distraction" phenomenon and the lack of compatibility with existing techniques for efficient gradient clipping. To deal with these two issues, we propose the Re-Attention Mechanism and Phantom Clipping, respectively. We believe that our work not only casts new light on training DP Transformers but also promotes a modular treatment to advance research in the field of differentially private deep learning.
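For context, the "more basic problem" the abstract refers to is standard DP-SGD, where each example's gradient is clipped to a fixed norm before noise is added. The sketch below is a minimal, naive illustration of that baseline in PyTorch (one backward pass per example); it is not the paper's Phantom Clipping or Re-Attention Mechanism, and the model, clip_norm, and noise_mult values are illustrative assumptions. Efficient-clipping techniques aim to avoid exactly this per-example loop.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup, for illustration only.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(8, 16)          # batch of 8 examples
y = torch.randint(0, 2, (8,))   # toy labels

clip_norm = 1.0     # per-example clipping threshold C (assumed value)
noise_mult = 1.0    # noise multiplier sigma (assumed value)

params = [p for p in model.parameters() if p.requires_grad]
summed_grads = [torch.zeros_like(p) for p in params]

# Naive per-example clipping: one backward pass per example.
for i in range(x.shape[0]):
    model.zero_grad()
    loss = loss_fn(model(x[i:i + 1]), y[i:i + 1])
    loss.backward()
    # Total gradient norm of this single example across all parameters.
    total_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in params))
    # Scale down if the norm exceeds the clipping threshold.
    scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
    for g, p in zip(summed_grads, params):
        g += p.grad * scale

# Add Gaussian noise calibrated to the clipping norm, then average over the batch.
batch_size = x.shape[0]
with torch.no_grad():
    for p, g in zip(params, summed_grads):
        noise = torch.randn_like(g) * noise_mult * clip_norm
        p.grad = (g + noise) / batch_size

# A standard optimizer step can now be applied to the noised gradients.
torch.optim.SGD(params, lr=0.1).step()
```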
