

Poster

Privacy Backdoors: Stealing Data with Corrupted Pretrained Models

Shanglun Feng · Florian Tramer


Abstract:

Practitioners commonly download pretrained machine learning models from open repositories and fine-tune them to fit specific applications. We show that this practice introduces a new risk of privacy backdoors. By tampering with a pretrained model's weights, an attacker can fully compromise the privacy of the finetuning data. We show how to build privacy backdoors for a variety of models, including transformers, which enable an attacker to reconstruct individual finetuning samples, with a guaranteed success! We further show that backdoored models allow for tight privacy attacks on models trained with differential privacy (DP). The common optimistic practice of training DP models with loose privacy guarantees is thus insecure if the model is not trusted. Overall, our work highlights a crucial and overlooked supply chain attack on machine learning privacy.
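For context, below is a minimal sketch of the download-and-finetune workflow that this threat model targets, written against the Hugging Face transformers Trainer API. The checkpoint name, dataset, and hyperparameters are illustrative placeholders, not the paper's experimental setup, and the snippet shows only the benign practice being attacked, not the backdoor construction itself.

```python
# Illustrative sketch of the common "download a pretrained model, then
# finetune on private data" practice that the paper identifies as an
# attack surface. All names below are placeholders for illustration.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import load_dataset

# Pretrained checkpoint pulled from an open repository. A tampered
# checkpoint served at this step is exactly the supply-chain threat
# described in the abstract.
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Private finetuning data (public IMDB subset used here as a stand-in).
dataset = load_dataset("imdb", split="train[:1000]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True,
                         padding="max_length", max_length=128),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
)
# Finetuning on the private data; a backdoored checkpoint can leak
# individual training samples through the finetuned weights.
trainer.train()
```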
