[PhD 2019] Formal and statistical modeling of dialogue

Supervisor : Maxime Amblard, Sémagramme (maxime.amblard@univ-lorraine.fr)
Co-supervisor : Chloé Braud, Synalp (chloe.braud@loria.fr)

Title : Formal and statistical modeling of dialogue

Application : https://www.adum.fr/script/candidature/index.pl?site=Lorraine&matricule_prop=25850

Subject :
Modeling interaction is a crucial step for Natural Language Processing (NLP), and it requires the development of automatic tools able to simulate these exchanges.
Chatbots, and all the services based on them, are a typical example. However, dialogue models face two types of difficulty.

(I) The first issue concerns the availability of resources and models that can analyze and process dialogues. Modeling dialogue is very hard, in particular because conversations involve specific phenomena such as the relationship between questions and answers.
Recent models show that this issue goes beyond semantic understanding, and even more so beyond discourse. However, it shares with the latter the involvement of formal methods (based on the principle of compositionality), of distributional semantics [7], and of machine learning.

Most formalisms for natural language are based, in one way or another, on a notion of state change that is used to model dynamic phenomena. As a consequence, it is difficult to use the standard tools of mathematical logic at the level of discourse interpretation. A type-theoretic way of rebuilding DRT (and its variants [8,9]) and dynamic logic has been proposed [5]. This proposal, which is based on Church's simple theory of types [3], takes advantage of the notion of continuation in order to allow quantifiers to dynamically extend their scope. In particular, these continuations could be adapted to the dynamics of dialogue, covering both its semantics and its pragmatics.
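
As an illustration of the continuation idea, here is a minimal sketch in Python. It is only a toy under stated assumptions, not the formalism of [5] itself: a sentence is modeled as a function taking a left context (the referents introduced so far) and a continuation (the rest of the discourse), and the toy model, the naive pronoun resolution, and all names (DOMAIN, a_man_walks, he_smiles, sequence, truth_value) are hypothetical.

```python
from typing import Callable, List

Entity = str
Context = List[Entity]                      # left context: referents introduced so far
Continuation = Callable[[Context], bool]    # right context: the rest of the discourse
Sentence = Callable[[Context, Continuation], bool]

# Toy model (hypothetical, for illustration only).
DOMAIN = ["john", "mary"]
MAN = {"john"}
WALKS = {"john", "mary"}
SMILES = {"john"}


def a_man_walks(ctx: Context, k: Continuation) -> bool:
    # The existential quantifier takes the continuation k inside its scope,
    # so the referent x remains available to whatever follows.
    return any(x in MAN and x in WALKS and k(ctx + [x]) for x in DOMAIN)


def he_smiles(ctx: Context, k: Continuation) -> bool:
    # Naive pronoun resolution (an assumption): the most recent referent.
    return bool(ctx) and ctx[-1] in SMILES and k(ctx)


def sequence(s1: Sentence, s2: Sentence) -> Sentence:
    # Discourse update: s2 is evaluated inside the continuation of s1.
    return lambda ctx, k: s1(ctx, lambda ctx1: s2(ctx1, k))


def truth_value(s: Sentence) -> bool:
    # Close the discourse: empty left context, trivially true continuation.
    return s([], lambda _ctx: True)


discourse = sequence(a_man_walks, he_smiles)   # "A man walks. He smiles."
print(truth_value(discourse))
```

In this sketch the pronoun in the second sentence is resolved against a referent introduced by the quantifier of the first sentence, without any explicit state-change operation: the scope extension comes only from passing the continuation. Reversing the order of the two sentences leaves the pronoun without an antecedent, so the discourse fails.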

Another perspective is to use machine learning approaches to identify dialogical relations and dialogical interactions. Generally speaking, discourse analysis aims at building a structure representing the semantic links between sentences.

Automatically deriving this structure is a challenge, especially when the links are implicit, as shown for monologues [11]. In dialogue, an additional difficulty is that we need semantic links between speech acts. Annotated data are very scarce for dialogues, with one corpus developed for English [2]. Consequently, only a few automatic systems exist [1,10], and their performance is probably limited by the small amount of data. Semi-supervised or transfer learning methods could help to identify this kind of relation, based on data existing for monologues and on constraints identified through the formal modeling. More annotation is still needed, at least for evaluating the models.

(II) The second issue is that dialogue models must be coordinated with pragmatic inferences at a higher level. In this case, we can refer to linguistic models of dialogue such as [6], or to models that capture conceptual links, such as TTR [4]. While speech models provide important information, dialogue makes it possible to share information in a more sophisticated way. The solution must take into account the background of all speakers, as well as the common ground they share.

To this end, the candidate will propose a dialogue model that structures the different kinds of linguistic information necessary for interaction. This model will be implemented in a tool that finely manages interaction through formal and learning strategies. As a result, the development of multilingual annotated resources will be integrated into the thesis work.

[1] Afantenos, Stergos and Kow, Eric and Asher, Nicholas and Perret, Jérémy (2015). Discourse parsing for multi-party chat dialogues, Proceedings of ACL.
[2] Asher, N and Lascarides, A and Lemon, O and Guhe, M and Rieser, V and Muller, P and Afantenos, S and Benamara, F and Vieu, L and Denis, P (2012). Modelling strategic conversation: The STAC project, Proceedings of the Workshop on the Semantics and Pragmatics of Dialogue.
[3] Church, A. (1940). A formulation of the simple theory of types. Journal of Symbolic Logic, 5:56–68.
[4] Cooper, R., & Ginzburg, J. (2015). TTR for Natural Language Semantics 2.
[5] de Groote, P. (2006). Towards a Montagovian account of dynamics. In M. Gibson and J. Howell, editors, Proceedings of Semantics and Linguistic Theory XVI. Cornell University, Ithaca, NY.
[6] Ginzburg, J. (2016). Semantics of dialogue. In M. Aloni & P. Dekker (Eds.), The Cambridge Handbook of Formal Semantics. Cambridge University Press.
[7] Harris, Z. S. (1954). Distributional structure. Word, 10(2-3), 146-162.
[8] Kamp, H. and U. Reyle (1993). From Discourse to Logic. Kluwer Academic Publishers, Dordrecht.
[9] van Eijck, J. and H. Kamp (1997). Representing Discourse in Context. In J. van Benthem and A. ter Meulen (eds.), Handbook of Logic and Language. Elsevier.
[10] Shi, Zhouxing and Huang, Minlie (2018). A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues, arXiv preprint arXiv:1812.00176.
[11] Xue, Nianwen and Ng, Hwee Tou and Pradhan, Sameer and Rutherford, Attapol and Webber, Bonnie and Wang, Chuan and Wang, Hongmin (2016). CoNLL 2016 shared task on multilingual shallow discourse parsing, Proceedings of the CoNLL-16 shared task.
