[PhD position] Explainability and Interpretability in Probabilistic Planning

  • Topic: Artificial intelligence and probabilistic planning.
  • Laboratory: LORIA (CNRS / Inria / Université de Lorraine)
  • Location: Nancy (France)
  • Team: LARSEN
  • Supervision: Olivier Buffet & Vincent Thomas
  • Keywords: Artificial Intelligence, partially observable Markov decision processes (POMDPs), information-oriented control, explainability.
  • Apply before: May 2nd, 2022 (2022-05-02)
  • To apply: https://recrutement.inria.fr/public/classic/en/offres/2022-04720


In artificial intelligence, automated planning consists in finding which actions an agent should perform to reach a given objective. This PhD thesis focuses more specifically on probabilistic planning, where action outcomes may be uncertain, and the current state of the system is only partially known, with possibly noisy observations [6]. When a human interacts with a planning system, he may have various expectations regarding the strategy obtained through planning or have specific requirements. For instance,

  1. if this system provides a plan that the human should implement, the human may wish to understand the choices made by the planning system or to detail his preferences, or
  2. if the human and the planning system act within the same environment, for instance as part of a human-robot collaboration, the human may wish to be able to anticipate the robot’s actions and understand its objective.

In both cases, the planning system should be as transparent as possible for the human, either by providing elements explaining the proposed strategy (case 1), or by proposing to the robot a strategy leaving as little uncertainty as possible regarding its interpretation (case 2). In all cases, to allow deriving the best possible response, it may be important to account for the human’s viewpoint: what he knows or may know about the dynamics, the current situation, or the objectives.

More generally, various questions can be raised regarding the information available either to the human, or even to the agent. These questions touch on several topics, including explainability (giving some answers to possibly explain the resulting strategy), interpretability (deriving a strategy whose execution is as readable as possible for the human), or confidentiality and privacy (deriving a strategy that hides the robot’s intentions from an external observer or reveals as little as possible of the personal data that the human would like to keep confidential).


The scientific literature has typically addressed these questions independently from each other. Recently, Chakraborti et al. [2, 3] have proposed a survey and formal definitions of these various problems in the framework of general automated planning and information theory. Similarly, we would like, in this PhD thesis, to adopt a unified point of view, making the choice of a Bayesian approach to quantifying uncertainties, and to see which tools to propose to answer these questions in the setting of Markov decision models [6]. Specific models already allow reasoning, for instance, on the information available to the agent itself (such as ρ-POMDPs [1, 5], which we have proposed in the past), or on collaborative or competitive interaction with other agents (POSG [8, 4] and I-POMDP [7]).
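
As a purely illustrative sketch (not taken from the cited works), the Bayesian treatment of uncertainty and the belief-dependent rewards of ρ-POMDPs can be pictured as follows in Python. The function names, the table layouts `T[s][a][s']` and `O[a][s'][o]`, and the toy numbers are all assumptions made for this example; negative entropy is used here as one possible information-oriented reward among others.

```python
import math

def belief_update(belief, action, observation, T, O):
    """Bayes rule: b'(s') is proportional to O[a][s'][o] * sum_s T[s][a][s'] * b(s)."""
    n = len(belief)
    unnormalized = [
        O[action][s2][observation]
        * sum(T[s][action][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]

def negative_entropy(belief):
    """Belief-dependent reward (in bits): higher when the belief is more certain."""
    return sum(p * math.log2(p) for p in belief if p > 0.0)

# Hypothetical toy problem: 2 hidden states, 1 action that leaves the state
# unchanged, and a sensor that reports the true state with probability 0.8.
T = [[[1.0, 0.0]], [[0.0, 1.0]]]   # T[s][a][s']
O = [[[0.8, 0.2], [0.2, 0.8]]]     # O[a][s'][o]

prior = [0.5, 0.5]
posterior = belief_update(prior, action=0, observation=0, T=T, O=O)
```

In this toy case the observation sharpens the belief, so the negative-entropy reward of `posterior` is higher than that of `prior`; an information-oriented controller would prefer actions expected to produce such beliefs.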

The objective of this PhD thesis is to propose a systematic method to describe, formalize and solve any problem combining a planning task and a willingness to control or optimize some information held by one actor or the other, human or agent.

Desired qualifications

We are looking for candidates with a strong interest in artificial intelligence and planning.
The candidate should be familiar with probability theory and have very good programming skills.


[1] M. Araya-López, O. Buffet, V. Thomas, and F. Charpillet. “A POMDP Extension with Belief-dependent Rewards”. In: NIPS-10. 2010.
[2] T. Chakraborti, A. Kulkarni, S. Sreedharan, D. E. Smith, and S. Kambhampati. “Explicability? Legibility? Predictability? Transparency? Privacy? Security? The Emerging Landscape of Interpretable Agent Behavior”. In: ICAPS-19. 2019. URL: https://ojs.aaai.org/index.php/ICAPS/article/view/3463.
[3] T. Chakraborti, S. Sreedharan, and S. Kambhampati. “The Emerging Landscape of Explainable Automated Planning & Decision Making”. In: IJCAI-20. 2020. DOI: 10.24963/ijcai.2020/669.
[4] A. Delage, O. Buffet, and J. Dibangoye. “HSVI fo zs-POSGs using Concavity, Convexity and Lipschitz Properties”. In: CoRR/arXiv (2021). URL: https://arxiv.org/abs/2110.14529.
[5] M. Fehr, O. Buffet, V. Thomas, and J. Dibangoye. “rho-POMDPs have Lipschitz-Continuous epsilon-Optimal Value Functions”. In: NIPS-18. 2018.
[6] F. Garcia et al. Markov Decision Processes and Artificial Intelligence. Ed. by O. Sigaud and O. Buffet. ISBN: 978-1-84821-167-4. ISTE – Wiley, 2010, p. 480.
[7] P. Gmytrasiewicz and P. Doshi. “Interactive POMDPs: Properties and Preliminary Results”. In: AAMAS-04. 2004.
[8] E. A. Hansen, D. Bernstein, and S. Zilberstein. “Dynamic Programming for Partially Observable Stochastic Games”. In: AAAI-04. San Jose, CA, 2004.