This repo contains the code and data for our ICML 2024 paper, PACE:
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models
Hengyi Wang*, Shiwei Tan*, Hao Wang
[Paper] [ICML Website]
and our EMNLP 2024 Findings paper, VALC:
Variational Language Concepts for Interpreting Foundation Language Models
Hengyi Wang, Shiwei Tan, Zhiqing Hong, Desheng Zhang, Hao Wang
[Paper] [ACL Website]
We propose five desiderata for explaining vision foundation models such as ViTs (faithfulness, stability, sparsity, multi-level structure, and parsimony) and show that current methods fail to meet all of these criteria comprehensively. Rather than relying on sparse autoencoders (SAEs), we introduce a variational Bayesian explanation framework, dubbed ProbAbilistic Concept Explainers (PACE), which models the distributions of patch embeddings to provide trustworthy post-hoc conceptual explanations. PACE provides dataset-, image-, and patch-level explanations for ViTs and satisfies all five desiderata in a unified framework.
PACE is compatible with arbitrary vision transformers.
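For intuition, below is a minimal sketch of the underlying idea, not the official PACE implementation: extract per-patch embeddings from any pretrained ViT, then fit a probabilistic model over them whose components play the role of concepts. The `google/vit-base-patch16-224` checkpoint, the `GaussianMixture` stand-in for PACE's variational posterior, and the input file `example.jpg` are all illustrative assumptions.

```python
# Minimal sketch, NOT the official PACE implementation: a Gaussian mixture
# stands in for PACE's variational posterior over patch embeddings, and its
# mixture components play the role of concepts.
import torch
from PIL import Image
from sklearn.mixture import GaussianMixture
from transformers import ViTImageProcessor, ViTModel

# Any ViT works; this checkpoint is an illustrative choice.
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
vit = ViTModel.from_pretrained("google/vit-base-patch16-224").eval()

image = Image.open("example.jpg")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    # last_hidden_state: (1, 1 + num_patches, hidden_dim); drop the [CLS] token
    patch_embeddings = vit(**inputs).last_hidden_state[0, 1:].numpy()

# Fit a mixture whose components act as "concepts": the posterior
# responsibilities give patch-level concept weights, and averaging them
# yields a rough image-level concept summary.
gmm = GaussianMixture(n_components=5, covariance_type="diag", random_state=0)
gmm.fit(patch_embeddings)
patch_concepts = gmm.predict_proba(patch_embeddings)  # (num_patches, 5)
image_concepts = patch_concepts.mean(axis=0)          # (5,)
print("image-level concept weights:", image_concepts)
```

In PACE itself the posterior is learned variationally and tied to the ViT's predictions, which is what yields faithfulness and stability; the mixture above only illustrates the multi-level (patch- and image-level) structure of the explanations.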
Below are some sample concepts automatically discovered by PACE, without the need for concept annotations during training.
Figure 1. Above are some sample concepts discovered by PACE in the COLOR dataset. See Figure 3 of our paper for details on the COLOR dataset.
Figure 2. Above are some sample concepts discovered by PACE in the Oxford Flower dataset.
To set up the environment:

```bash
conda env create -f environment_PACE.yml
conda activate PACE
```

To generate the data, train the ViT backbone, then train and evaluate PACE:

```bash
cd src
python generate_data.py
bash ./train_ViT.sh
bash ./train_PACE.sh
bash ./eval_PACE.sh
```

Code for VALC: Coming Soon!
If you find our work helpful, please cite:

```bibtex
@inproceedings{PACE,
  title={Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models},
  author={Hengyi Wang and
          Shiwei Tan and
          Hao Wang},
  booktitle={International Conference on Machine Learning},
  year={2024}
}
```
```bibtex
@inproceedings{VALC,
  title={Variational Language Concepts for Interpreting Foundation Language Models},
  author={Hengyi Wang and
          Shiwei Tan and
          Zhiqing Hong and
          Desheng Zhang and
          Hao Wang},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
  year={2024}
}
```
