

Poster

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

Haonan Wang · Qianli Shen · Yao Tong · Yang Zhang · Kenji Kawaguchi


Abstract:

The commercialization of text-to-image diffusion models (DMs) brings forth potential copyright concerns. Although several efforts aim to protect DMs from copyright issues by impeding unauthorized access to copyrighted material, the vulnerabilities of these solutions are underexplored. In this study, we propose a backdoor attack method (SilentBadDiffusion) that induces copyright infringement without requiring access to, or control over, the training process. Our method strategically disperses the copyrighted information across the poisoning data, so that it raises no suspicion when inserted into a clean dataset. During training, the poisoning data embeds connections between the copyrighted information and text references into the DM. By leveraging the model's ability to compose multiple elements from a textual prompt, the DM can then be triggered to generate copyright-infringing images. Our experiments demonstrate the efficacy and stealth of the poisoning data, the specificity of the trigger prompts, and the preservation of the DM's image generation performance. Moreover, the results reveal that the more sophisticated the DM, the more easily the attack succeeds. These findings highlight potential pitfalls in prevailing copyright protection strategies and underscore the need for increased scrutiny to prevent the misuse of DMs.
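To make the dispersal idea concrete, below is a minimal Python sketch of how copyrighted content might be split into innocuous-looking (image, caption) pairs and later recomposed by a trigger prompt. The grid-based decomposition, the file name `copyrighted.png`, and the example phrases are hypothetical stand-ins for illustration only, not the paper's actual SilentBadDiffusion pipeline.

```python
# Illustrative sketch only: disperses one image into tiles, each paired
# with a caption carrying a distinct phrase, so no single sample exposes
# the full copyrighted content. All names/values here are hypothetical.
from PIL import Image

def make_poison_samples(image_path, phrases, grid=(2, 2)):
    """Split one image into grid tiles and pair each tile with a caption
    containing one descriptive phrase. Each (tile, caption) pair looks
    like an ordinary training sample on its own."""
    img = Image.open(image_path)
    w, h = img.size
    tw, th = w // grid[0], h // grid[1]
    samples = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            tile = img.crop((i * tw, j * th, (i + 1) * tw, (j + 1) * th))
            phrase = phrases[(i * grid[1] + j) % len(phrases)]
            samples.append((tile, f"a photo showing {phrase}"))
    return samples

def trigger_prompt(phrases):
    """Compose the dispersed phrases into one prompt, relying on the
    model's compositional ability to reassemble the protected content."""
    return "an image with " + ", ".join(phrases)

if __name__ == "__main__":
    phrases = ["a red cap with a white logo", "round black ears",
               "white gloves", "yellow shoes"]
    samples = make_poison_samples("copyrighted.png", phrases)
    print(len(samples), "poison samples")
    print("trigger:", trigger_prompt(phrases))
```

The key property this sketch tries to capture is that each poisoned sample is individually unsuspicious, while the trigger prompt exploits exactly the compositional strength that improves with model scale, consistent with the abstract's finding that stronger DMs are easier to attack.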
