Stable and Uncertainty-Aware Local Post-hoc Explanations Using Active Learning

With the increasing deployment of opaque machine learning models, there is a critical need for locally faithful, stable, and reliable model explanations. Existing model-agnostic approaches typically rely on random perturbation sampling, which introduces variance across runs and motivates perturbation strategies that explicitly prioritize informativeness. To alleviate this instability, perturbations should be queried that are maximally informative about the parameters of the local surrogate. We propose the EAGLE framework, which formulates perturbation selection as an active learning problem driven by expected information gain and Bayesian disagreement, and produces feature importance scores with associated confidence estimates, thereby quantifying the reliability of explanations. Experiments on tabular and image datasets show that EAGLE improves reproducibility across runs, balances the exploration-exploitation tradeoff in perturbation selection, and achieves high neighborhood stability relative to state-of-the-art baselines such as Focus Sampling-based BayesLIME and UnRAvEL.
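To make the active-learning view of perturbation selection concrete, the sketch below shows one plausible instantiation, not the authors' actual EAGLE implementation: a Bayesian linear local surrogate whose next perturbation is chosen by expected information gain about the surrogate weights, with posterior standard deviations serving as confidence estimates. The function names (`black_box`, `explain_locally`) and hyperparameters (`alpha`, `beta`, neighborhood width) are assumptions introduced for illustration only.

```python
# Illustrative sketch (assumed, not the paper's code): actively selecting
# perturbations for a Bayesian linear local surrogate by expected information
# gain, then reporting posterior means (importances) and stds (uncertainty).
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    # Hypothetical opaque model being explained (stand-in for the real model).
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

def explain_locally(x0, n_queries=30, n_candidates=500, alpha=1.0, beta=25.0):
    d = x0.shape[0]
    # Gaussian prior over surrogate weights: N(0, alpha^{-1} I).
    precision = alpha * np.eye(d)
    Xq, yq = np.empty((0, d)), np.empty(0)
    for _ in range(n_queries):
        # Candidate perturbations drawn around the instance being explained.
        cand = x0 + 0.5 * rng.standard_normal((n_candidates, d))
        S = np.linalg.inv(precision)  # current posterior covariance
        # Expected information gain about the weights from querying x:
        # 0.5 * log(1 + beta * x^T S x) under Bayesian linear regression.
        eig = 0.5 * np.log1p(beta * np.einsum("ij,jk,ik->i", cand, S, cand))
        x_new = cand[np.argmax(eig)]            # most informative perturbation
        y_new = black_box(x_new[None, :])[0]    # label it with the opaque model
        Xq = np.vstack([Xq, x_new[None, :]])
        yq = np.append(yq, y_new)
        precision = alpha * np.eye(d) + beta * Xq.T @ Xq  # posterior update
    S = np.linalg.inv(precision)
    mean = beta * S @ Xq.T @ yq        # posterior mean -> feature importances
    std = np.sqrt(np.diag(S))          # posterior std -> per-feature confidence
    return mean, std

importance, uncertainty = explain_locally(np.array([0.3, -1.2]))
print(importance, uncertainty)
```

The same loop structure would accommodate other acquisition functions, such as disagreement across posterior samples, in place of the information-gain score used here.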