ConTrust: Robust Contrastive Explanations for Deep Neural Networks

(started Aug 2022, duration 4 years)


The area of Explainable Artificial Intelligence (XAI) is concerned with providing methods and tools to improve the interpretability of learned models, such as Deep Neural Networks (DNNs). A widely recognised factor contributing to this end is the availability of contrastive explanations, i.e., arguments supporting or contrasting the decisions taken by a DNN. While several approaches exist to generate such explanations, they often lack robustness: they may produce completely different explanations for similar inputs. This phenomenon has troubling implications, as a lack of robustness may indicate that explanations do not capture the underlying decision-making process of a DNN and thus cannot be trusted.
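To illustrate the robustness issue, consider the following minimal sketch (the toy 1-D classifier and all numbers are hypothetical, not from the project): two nearby inputs can receive nearest-counterfactual explanations that point in opposite directions.

```python
# Toy illustration (hypothetical model): counterfactual explanations
# can change drastically for similar inputs.

def classify(x):
    """Toy 1-D classifier: class 1 on [0, 1] or [2, 3], class 0 elsewhere."""
    return 1 if (0 <= x <= 1) or (2 <= x <= 3) else 0

def nearest_counterfactual(x, step=0.01, max_dist=5.0):
    """Grid search for the closest input receiving the opposite class."""
    target = 1 - classify(x)
    for k in range(1, int(max_dist / step) + 1):
        d = k * step
        for cand in (x - d, x + d):
            if classify(cand) == target:
                return cand
    return None

# Two similar inputs, very different explanations:
print(nearest_counterfactual(1.4))  # approx 1.0 ("decrease x")
print(nearest_counterfactual(1.6))  # approx 2.0 ("increase x")
```

Here inputs 1.4 and 1.6 are close, yet the explanation for one says "decrease x" while the other says "increase x", which is exactly the kind of instability the project targets.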


My proposal is to tackle these problems using an approach based on Verification of Neural Networks (VNN). VNN is concerned with providing methods to certify that DNNs satisfy a given property, or to generate counterexamples showing the circumstances under which violations may occur. Crucially, a large body of VNN research focuses on certifying the robustness of predictions made by DNNs, and efficient tools have been developed for this purpose.
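As a rough sense of what robustness certification involves, the sketch below implements interval bound propagation (IBP), one common VNN technique; the tiny two-layer network and weights are hypothetical, not tools or models from the project.

```python
# Hedged sketch of interval bound propagation (IBP): propagate an input
# box through a tiny ReLU network and check whether the predicted class
# is stable over the whole box (sound but incomplete check).
import numpy as np

def interval_affine(l, u, W, b):
    """Elementwise bounds of W @ x + b for x in the box [l, u]."""
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    return Wp @ l + Wn @ u + b, Wp @ u + Wn @ l + b

def certify_robust(x, eps, W1, b1, W2, b2):
    """True if every input in the L-infinity ball of radius eps around x
    provably gets the same argmax class as x itself."""
    l, u = x - eps, x + eps
    l, u = interval_affine(l, u, W1, b1)
    l, u = np.maximum(l, 0), np.maximum(u, 0)          # ReLU bounds
    l, u = interval_affine(l, u, W2, b2)
    pred = int(np.argmax(W2 @ np.maximum(W1 @ x + b1, 0) + b2))
    # Robust if the predicted logit's lower bound beats every other
    # logit's upper bound across the entire input box.
    return bool(l[pred] > np.delete(u, pred).max())

# Hypothetical toy network: identity layers, class-0 logit biased by +1.
W1, b1 = np.eye(2), np.zeros(2)
W2, b2 = np.eye(2), np.array([1.0, 0.0])
x = np.array([0.5, 0.5])
print(certify_robust(x, 0.1, W1, b1, W2, b2))  # small ball: certified
print(certify_robust(x, 0.6, W1, b1, W2, b2))  # large ball: not certified
```

The project's idea is to transfer machinery of this kind from certifying predictions to certifying explanations.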

In this project, I will extend techniques for the verification of DNNs and develop new explainability methods to generate robust contrastive explanations. This approach is motivated by several similarities between the two research areas, as well as by the lack of effective solutions within XAI for generating robust explanations.


  1. Provably Robust and Plausible Counterfactual Explanations for Neural Networks via Robust Optimisation. J. Jiang, J. Lan, F. Leofante, A. Rago, F. Toni. The 15th Asian Conference on Machine Learning (ACML 2023).
  2. Robust Explanations for Human-Neural Multi-agent Systems with Formal Verification. F. Leofante, A. Lomuscio. The 20th European Conference on Multi-Agent Systems (EUMAS 2023).
  3. Counterfactual Explanations and Model Multiplicity: a Relational Verification View. F. Leofante, E. Botoeva, V. Rajani. The 20th International Conference on Principles of Knowledge Representation and Reasoning (KR 2023).
  4. Towards Robust Contrastive Explanations for Human-Neural Multi-agent Systems. F. Leofante, A. Lomuscio. The 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023).
  5. Formalising the Robustness of Counterfactual Explanations for Neural Networks. J. Jiang*, F. Leofante*, A. Rago, F. Toni. The 37th AAAI Conference on Artificial Intelligence (AAAI 2023). * Equal contribution.


  1. Robust Explainable AI: the Case of Counterfactual Explanations. F. Leofante. The 26th European Conference on Artificial Intelligence (ECAI 2023).


This research is funded by Imperial College London under the ICRF fellowship scheme.