

Poster

Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination

Zixuan Hu · Yongxian Wei · Li Shen · Zhenyi Wang · Lei Li · Chun Yuan · Dacheng Tao


Abstract: Model inversion, which aims to reconstruct inputs from pre-trained discriminative models, is especially useful when original training data is unavailable due to privacy or size constraints. However, existing inversion methods (i) are mainly based on convolutional neural networks (CNNs), (ii) suffer from redundant computation, and (iii) neglect the unintended inversion of spurious correlations (a phenomenon we term ``hallucination'' in model inversion). For the first time, we provide a thorough critique of existing methods, including their limitations and underlying causes. To address these limitations simultaneously, we propose a novel sparse model inversion approach, which enables (i) efficient inversion (ii) from large-scale Vision Transformers (ViTs) (iii) with less hallucination. Specifically, it selectively inverts semantic foregrounds while progressively stopping the inversion of uninformative backgrounds, thereby reducing redundant computation and preventing potential hallucination. Notably, this is achieved without any extra computational or informational requirements. Through a combination of analytical and empirical studies, we validate the effectiveness of our approach in significantly boosting inversion speed (up to $3.79\times$) while maintaining, or even improving, the performance of downstream applications such as model quantization and knowledge transfer.
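To make the idea concrete, below is a minimal, hypothetical sketch of sparse model inversion in PyTorch. It is not the authors' released code: the use of torchvision's `vit_b_16`, the gradient-magnitude proxy for patch informativeness, and the quantile-based stopping schedule are all illustrative assumptions. The sketch only shows the general pattern of optimizing a synthetic input toward a target class while progressively freezing low-information patches.

```python
# Hypothetical sketch of sparse model inversion; patch scoring and
# thresholds are illustrative assumptions, not the paper's exact method.
import torch
import torch.nn.functional as F
from torchvision.models import vit_b_16, ViT_B_16_Weights

model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1).eval()
for p in model.parameters():
    p.requires_grad_(False)  # only the synthetic input is optimized

patch = 16
x = torch.randn(1, 3, 224, 224, requires_grad=True)  # synthetic input
target = torch.tensor([207])                          # class to invert (assumed)
active = torch.ones(224 // patch, 224 // patch)       # 14x14 patch-level mask
opt = torch.optim.Adam([x], lr=0.1)

for step in range(200):
    opt.zero_grad()
    loss = F.cross_entropy(model(x), target)
    loss.backward()
    # Per-patch gradient energy as an (assumed) informativeness proxy.
    g = x.grad.abs().mean(1, keepdim=True)            # (1, 1, 224, 224)
    g = F.avg_pool2d(g, patch).squeeze()              # (14, 14) patch scores
    if step > 0 and step % 50 == 0:
        # Progressively stop inverting the least informative active patches.
        thresh = torch.quantile(g[active.bool()], 0.25)
        active[g < thresh] = 0.0
    # Zero out gradients of stopped patches so they are no longer updated.
    mask = active.repeat_interleave(patch, 0).repeat_interleave(patch, 1)
    x.grad *= mask
    opt.step()
```

In this toy version, stopped patches simply receive zero gradient; the computational savings claimed in the paper would come from actually pruning those tokens from the ViT's forward and backward passes rather than merely masking their updates.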
