

Poster

Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation

Boheng Li · Yishuo Cai · Jisong Cai · Yiming Li · Han Qiu · Run Wang · Tianwei Zhang


Abstract:

Model quantization is a compression technique that converts a full-precision model into a more compact low-precision version for more efficient storage and deployment. Despite the great success of quantization, recent studies revealed that model quantization can be maliciously exploited to implant quantization-conditioned backdoors (QCBs). These special backdoors remain dormant in full-precision models but are exposed upon quantization. Unfortunately, existing defenses have limited effect on mitigating QCBs. In this paper, we conduct the first in-depth analysis of QCBs. We reveal an intriguing characteristic of QCBs: the activations of backdoor-related neurons exhibit a distribution drift after quantization even on benign samples, and this drift is more pronounced on poisoned samples. Motivated by this finding, we propose to purify the backdoor-exposed quantized model by aligning its layer-wise activations with those of its full-precision version. To further exploit the more pronounced activation drift on poisoned samples, we design an additional module that approximates the poisoned activation distribution layer by layer, based on the batch normalization statistics of the full-precision model. Extensive experiments verify the effectiveness of our defense and its resistance to adaptive attacks.
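The two objectives described in the abstract (layer-wise activation alignment and BN-statistics-based distribution approximation) can be illustrated with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the authors' released implementation: it assumes a differentiable (fake-)quantized model, a frozen full-precision reference, and 4-D convolutional activations; the function names, layer lists, and the weighting term `lam` are all hypothetical.

```python
# Minimal sketch (assumptions: 4-D conv activations, a differentiable
# fake-quantized model `quant_model`, a frozen full-precision `fp_model`).
from itertools import cycle

import torch


def collect_activations(model, x, layers):
    """Forward x through model and record each listed layer's output."""
    acts, hooks = [], []
    for layer in layers:
        hooks.append(layer.register_forward_hook(
            lambda _m, _i, out, store=acts: store.append(out)))
    model(x)
    for h in hooks:
        h.remove()
    return acts


def activation_alignment_loss(quant_model, fp_model, x, q_layers, fp_layers):
    """Layer-wise L2 alignment between the backdoor-exposed quantized
    model and its full-precision version on benign inputs."""
    with torch.no_grad():
        fp_acts = collect_activations(fp_model, x, fp_layers)
    q_acts = collect_activations(quant_model, x, q_layers)
    return sum(((q - f) ** 2).mean() for q, f in zip(q_acts, fp_acts))


def bn_distribution_loss(quant_model, x, q_bn_layers, fp_bn_layers):
    """Pull the quantized model's per-layer batch statistics toward the
    full-precision BN running statistics, as a proxy for approximating
    the (drifted) poisoned activation distribution layer by layer."""
    stats, hooks = [], []
    for bn in q_bn_layers:
        hooks.append(bn.register_forward_hook(
            lambda _m, inp, _o, store=stats: store.append(
                (inp[0].mean(dim=(0, 2, 3)), inp[0].var(dim=(0, 2, 3))))))
    quant_model(x)
    for h in hooks:
        h.remove()
    loss = x.new_zeros(())
    for (mu, var), fp_bn in zip(stats, fp_bn_layers):
        loss = loss + (mu - fp_bn.running_mean).norm() \
                    + (var - fp_bn.running_var).norm()
    return loss


def purify(quant_model, fp_model, loader, q_layers, fp_layers,
           q_bns, fp_bns, lam=0.1, lr=1e-4, steps=100):
    """Hypothetical purification loop: fine-tune the quantized model on a
    small benign set so both terms shrink; lam trades off the objectives."""
    opt = torch.optim.Adam(quant_model.parameters(), lr=lr)
    for step, (x, _) in enumerate(cycle(loader)):
        if step >= steps:
            break
        loss = activation_alignment_loss(quant_model, fp_model, x,
                                         q_layers, fp_layers)
        loss = loss + lam * bn_distribution_loss(quant_model, x, q_bns, fp_bns)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

One design note on this sketch: aligning activations on benign data alone already shrinks the quantization-induced drift, while the BN-statistics term exploits the abstract's observation that poisoned activations drift further, without requiring access to any poisoned samples.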
