

Poster

Feature Attribution with Necessity and Sufficiency via Dual-stage Perturbation Test for Causal Explanation

Xuexin Chen · Ruichu Cai · Zhengting Huang · Yuxuan Zhu · Julien Horwood · Zhifeng Hao · Zijian Li · Jose Miguel Hernandez-Lobato


Abstract:

We investigate the problem of explainability in machine learning. To address this problem, Feature Attribution Methods (FAMs) measure the contribution of each feature through a perturbation test, in which the differences in prediction under different perturbations are compared. However, such perturbation tests may fail to distinguish the contributions of different features when perturbing them produces the same change in prediction. In order to enhance the ability of FAMs to distinguish different features' contributions in this challenging situation, we first propose using the probability of necessity and sufficiency (PNS), i.e., the probability that perturbing a feature is a necessary and sufficient cause for the prediction to change, as a measure of feature importance. We then present Feature Attribution with Necessity and Sufficiency (FANS), a method that computes PNS through a perturbation test with two stages (factual and interventional). In practice, to generate counterfactual samples, we use a resampling-based approach on the observed samples to approximate the required conditional distribution. Finally, we combine FANS with gradient-based optimization to extract the feature subset with the largest PNS. We demonstrate that FANS outperforms existing FAMs on six benchmarks.
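
For readers unfamiliar with PNS, the following block recalls the standard textbook definitions of the probabilities of necessity (PN), sufficiency (PS), and necessity and sufficiency (PNS) for a binary cause and a binary outcome. The notation is an assumption made for illustration and is not taken from the abstract: $x$ and $x'$ denote the cause occurring and not occurring, $y$ and $y'$ denote the outcome occurring and not occurring, and $Y_x$ is the outcome that would be observed if the cause were set to $x$. Read loosely against the abstract, $x$ would correspond to perturbing a feature and $y$ to the prediction changing, but the exact formulation used by FANS may differ.

```latex
\begin{align}
\mathrm{PN}  &= P\left(Y_{x'} = y' \mid X = x,\; Y = y\right) \\
\mathrm{PS}  &= P\left(Y_{x} = y \mid X = x',\; Y = y'\right) \\
\mathrm{PNS} &= P\left(Y_{x} = y,\; Y_{x'} = y'\right)
\end{align}
```

Under these definitions, PNS is the probability that the outcome responds to the cause in both directions: it occurs when the cause is applied and does not occur when the cause is withheld.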
