sLIME (semantic LIME) provides a generic interface to the Local Interpretable Model-Agnostic Explanations (LIME) package, allowing the construction of arbitrary transformers that remove features or concepts from data instances. For example, with images the original package only implements superpixels as features; with sLIME it is possible to use human-level concepts, such as eyes or ears, as the features in the local models.
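To make the idea concrete, below is a minimal sketch of what concept-level removal and a LIME-style local model could look like. Everything here is illustrative rather than sLIME's actual API: `remove_concepts`, `explain_locally`, and the similarity weighting are simplified placeholders, and the concept masks are assumed to be supplied by the user (e.g. from a segmentation model or hand annotation).

```python
import numpy as np
from sklearn.linear_model import Ridge

def remove_concepts(image, concept_masks, active, fill_value=0.0):
    """Return a copy of `image` with every inactive concept blanked out.

    `concept_masks` is a list of boolean masks, one per human-level
    concept (e.g. "left eye", "mouth"). `active` is a binary vector:
    active[i] == 1 keeps concept i, active[i] == 0 replaces its pixels
    with `fill_value`.
    """
    out = image.copy()
    for keep, mask in zip(active, concept_masks):
        if not keep:
            out[mask] = fill_value
    return out

def explain_locally(image, concept_masks, predict_fn, num_samples=500, seed=0):
    """Fit a LIME-style local linear model over concept on/off vectors."""
    rng = np.random.default_rng(seed)
    n = len(concept_masks)
    z = rng.integers(0, 2, size=(num_samples, n))  # random concept subsets
    z[0] = 1                                       # include the full image
    preds = np.array([
        predict_fn(remove_concepts(image, concept_masks, row)) for row in z
    ])
    # Simplified proximity weighting: fraction of concepts kept.
    # (LIME proper uses an exponential kernel over a distance measure.)
    weights = z.mean(axis=1)
    local_model = Ridge(alpha=1.0)
    local_model.fit(z, preds, sample_weight=weights)
    return local_model.coef_                       # importance per concept
```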
The dissertation associated with this project is available here; I probably won't be condensing it into a shorter paper.
The following tutorials are available (or planned) in the repository:
- Superpixel segmentation: Recreates the superpixel segmentation tutorial from the LIME repo as a basic introduction to transformers and perturbers (a sketch of this style of perturber follows the list).
- Generated datasets: How to explain classifications on a generated dataset where the user can create arbitrary in-distribution images through feature perturbation.
- Training transformers from generated datasets: How to train transformers on a dataset where the user has access to examples of images with and without features (e.g. the same background with and without a foreground object); a paired-data sketch follows the list.
- Training transformers from feature detectors (not yet available): How to train transformers on a dataset where the user only has access to examples labelled according to whether or not they contain a feature.
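As a reference point for the first tutorial, here is a hedged sketch of the classic LIME-style superpixel perturber it builds on: each superpixel becomes a binary feature, and "removing" a feature means greying out its pixels. The segmentation call uses scikit-image's `slic`; the helper function and its name are illustrative, not taken from the sLIME code.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_perturb(image, active_segments=None, n_segments=50, fill_value=0):
    """Hide every superpixel whose label is not in `active_segments`.

    Returns the perturbed image and the segment label map, so the same
    segmentation can be reused across many perturbations.
    """
    segments = slic(image, n_segments=n_segments, compactness=10)
    if active_segments is None:
        active_segments = np.unique(segments)  # keep everything by default
    out = image.copy()
    hidden = ~np.isin(segments, list(active_segments))
    out[hidden] = fill_value
    return out, segments
```

In practice you would segment once, then call the masking step repeatedly with different subsets of active segments to generate the perturbed samples for the local model.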
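For the paired-data tutorial, one plausible approach (an assumption on my part, not necessarily what the tutorial does) is to learn a mapping from images with the feature to the same scenes without it, and use the fitted model as the transformer. The toy sketch below fits a linear ridge map on flattened images; it is only workable for small images and small datasets, but shows the shape of the idea.

```python
import numpy as np
from sklearn.linear_model import Ridge

def train_removal_transformer(with_feature, without_feature):
    """Fit a linear map from "feature present" to "feature absent" images.

    `with_feature` and `without_feature` are arrays of shape (n, h, w, c)
    containing the same scenes with and without the foreground object.
    """
    n = with_feature.shape[0]
    X = with_feature.reshape(n, -1)
    Y = without_feature.reshape(n, -1)
    model = Ridge(alpha=1.0).fit(X, Y)  # multi-output ridge regression

    def transformer(image):
        # Predict the feature-removed version of a single image.
        flat = model.predict(image.reshape(1, -1))
        return flat.reshape(image.shape)

    return transformer
```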