Postdoc – Improvement of vocal tools for pronunciation evaluation in language learning

  • Where: LORIA (Nancy)
  • Team: MULTISPEECH
  • Contact:
    • Slim Ouni
    • Denis Jouvet
  • Duration: 16 months
  • Starting date: Winter 2018



MULTISPEECH studies various aspects of speech modeling, both for speech recognition and for speech synthesis. The approaches developed rely on signal processing and statistical models. The most recent modeling approaches are based on neural networks and deep learning, which have yielded substantial performance gains in many areas.

Voice technologies can also be used for foreign language learning. The objective is then to detect learners' pronunciation defects (in both sounds and intonation), to diagnose them, and to help learners improve their pronunciation by providing multimodal information (textual, audio and visual). Several recent collaborative projects have focused on this theme and have enabled the recording of learner speech corpora (e.g., [Trouvain et al., 2016]), the analysis of learners' non-native speech (e.g., [Jouvet et al., 2015, Zimmerer et al., 2016, Ghosh et al., 2016]) and an investigation of the reliability of automatic feedback to learners (e.g., [Bonneau et al., 2013]).

As part of the e-FRAN METAL collaborative project on the use of digital technologies in education, these techniques will be adapted, enriched and implemented to support foreign language learning at school. Experiments are planned in middle and high school classes.


Research will focus on improving and developing vocal tools for evaluating pronunciation, both at the level of sounds and at the level of intonation. An important point to investigate further is the reliability of the processing and of the measurements made on the speech signal (e.g., sound durations resulting from the phonetic segmentation, and fundamental frequency values). A related question is how to take this reliability information into account when diagnosing pronunciation defects and preparing feedback for learners.

After tools and models have been adapted to the non-native context of foreign language learners, most of the project will be devoted to more innovative aspects. These include the study of deep learning approaches for detecting pronunciation defects, and the estimation of uncertainties in the measurements made (sound durations and fundamental frequency values) in order to ensure the reliability of the diagnoses.


  • Knowledge in speech processing, speech recognition, or speech synthesis
  • Good knowledge of a speech recognition toolkit
  • Knowledge in neural networks, and experience in using a neural network toolkit
  • Good computer and programming skills


References:

[Bonneau et al., 2013] A. Bonneau, D. Fohr, I. Illina, D. Jouvet, O. Mella, L. Mesbahi, L. Orosanu. “Gestion d’erreurs pour la fiabilisation des retours automatiques en apprentissage de la prosodie d’une langue seconde”. Traitement Automatique des Langues, ATALA, 2013, 53 (3). <hal-00834278>

[Ghosh et al., 2016] S. Ghosh, C. Fauth, A. Sini, Y. Laprie. “L1-L2 Interference: The case of final devoicing of French voiced fricatives in final position by German learners”. Interspeech 2016, Sep 2016, San Francisco, United States. pp. 3156–3160. <hal-01397176>

[Jouvet et al., 2015] D. Jouvet, A. Bonneau, J. Trouvain, F. Zimmerer, Y. Laprie, B. Moebius. “Analysis of phone confusion matrices in a manually annotated French-German learner corpus”. Workshop on Speech and Language Technology in Education, Sep 2015, Leipzig, Germany. Proceedings SLaTE 2015, Workshop on Speech and Language Technology in Education. <hal-01184186>

[Trouvain et al., 2016] J. Trouvain, A. Bonneau, V. Colotte, C. Fauth, D. Fohr, D. Jouvet, J. Jügler, Y. Laprie, O. Mella, B. Moebius, F. Zimmerer. “The IFCASL corpus of French and German non-native and native read speech”. LREC’2016, 10th edition of the Language Resources and Evaluation Conference, May 2016, Portorož, Slovenia. Proceedings LREC’2016. <hal-01293935>

[Zimmerer et al., 2015] F. Zimmerer, J. Trouvain, A. Bonneau. “One corpus, one research question, three methods ‘German vowels produced by French speakers’”. Workshop on Phonetic learner corpora. Satellite meeting of ICPhS 2015, Aug 2015, Glasgow, United Kingdom. <hal-01186078>

[Zimmerer et al., 2016] F. Zimmerer, A. Bonneau, B. Andreeva. “Influence of L1 prominence on L2 production: French and German speakers”. Speech Prosody 2016, May 2016, Boston, United States. pp. 370–374. <hal-01399974>

