Engineer – Adaptation and development of vocal tools to support foreign language learning

  • Where: LORIA (Nancy)
  • Team: MULTISPEECH
  • Contact:
    • Slim Ouni
    • Denis Jouvet
  • Duration: 12 months (with possible extension)
  • Starting date: Autumn 2017



MULTISPEECH studies various aspects of speech modeling, for both speech recognition and speech synthesis. The approaches developed rely on signal processing and statistical models. The most recent modeling approaches are based on neural networks and deep learning, which have yielded substantial performance gains in many areas.

Voice technologies can also be used for foreign language learning. The objective is then to detect learners' pronunciation defects (in the pronunciation of sounds and in intonation), to diagnose them, and to help learners improve their pronunciation by providing them with multimodal information (textual, audio and visual). Several recent collaborative projects have focused on this theme and have enabled the recording of learner speech corpora (e.g., [Trouvain et al., 2016]), the analysis of the non-native speech of learners (e.g., [Jouvet et al., 2015, Zimmerer et al., 2016, Ghosh et al., 2016]) and an investigation of the reliability of automatic feedback to learners (e.g., [Bonneau et al., 2013]).

As part of the e-FRAN METAL collaborative project on the use of digital technologies in education, these techniques will be adapted, enriched and implemented to support foreign language learning at school. Experiments are planned in middle and high school classes.


In this context, the first objective will be to consolidate the vocal tools that assist in the evaluation of pronunciations, and to adapt them to the usage envisaged in the project. This will require collecting recordings of teenagers' voices (corresponding to the levels targeted in the middle and high school experiments) and adapting the acoustic models to those voices. Given the computer equipment available in the classes, a client-server operating mode will be preferred.

Further work will focus on developing the complete version of the pronunciation learning system and on experimenting with it in middle and high school classes. The system should integrate the presentation of examples, the evaluation of pronunciations, and feedback to learners on the quality of their pronunciations.


  • Knowledge in speech processing, speech recognition, or speech synthesis
  • Good knowledge of a speech recognition toolkit
  • Good computer and programming skills



[Bonneau et al., 2013] A. Bonneau, D. Fohr, I. Illina, D. Jouvet, O. Mella, L. Mesbahi, L. Orosanu. “Gestion d’erreurs pour la fiabilisation des retours automatiques en apprentissage de la prosodie d’une langue seconde”. Traitement Automatique des Langues, ATALA, 2013, 53 (3). <hal-00834278>

[Ghosh et al., 2016] S. Ghosh, C. Fauth, A. Sini, Y. Laprie. “L1-L2 Interference: The case of final devoicing of French voiced fricatives in final position by German learners”. Interspeech 2016, Sep 2016, San Francisco, United States. pp. 3156-3160. <hal-01397176>

[Jouvet et al., 2015] D. Jouvet, A. Bonneau, J. Trouvain, F. Zimmerer, Y. Laprie, B. Moebius. “Analysis of phone confusion matrices in a manually annotated French-German learner corpus”. Workshop on Speech and Language Technology in Education, Sep 2015, Leipzig, Germany. Proceedings SLaTE 2015. <hal-01184186>

[Trouvain et al., 2016] J. Trouvain, A. Bonneau, V. Colotte, C. Fauth, D. Fohr, D. Jouvet, J. Jügler, Y. Laprie, O. Mella, B. Moebius, F. Zimmerer. “The IFCASL corpus of French and German non-native and native read speech”. LREC’2016, 10th edition of the Language Resources and Evaluation Conference, May 2016, Portorož, Slovenia. Proceedings LREC’2016. <hal-01293935>

[Zimmerer et al., 2015] F. Zimmerer, J. Trouvain, A. Bonneau. “One corpus, one research question, three methods: ‘German vowels produced by French speakers’”. Workshop on Phonetic learner corpora, Satellite meeting of ICPhS 2015, Aug 2015, Glasgow, United Kingdom. <hal-01186078>

[Zimmerer et al., 2016] F. Zimmerer, A. Bonneau, B. Andreeva. “Influence of L1 prominence on L2 production: French and German speakers”. Speech Prosody 2016, May 2016, Boston, United States. pp. 370-374. <hal-01399974>

