The OLKi project: working towards open respectful AI for citizens

OLKi (Open Language and Knowledge for citizens) is a Lorraine University of Excellence (LUE) IMPACT project whose founding principle is to respond to the challenge of language and knowledge engineering. It is an interdisciplinary project combining computing and mathematics with the humanities and social sciences. It is coordinated by Christophe Cerisara, CNRS researcher at the Loria and manager of the Synalp team, and Aurore Coince, the project manager. 

Data leakage: a societal problem and a scientific issue

In May 2018, the political and commercial marketing company Cambridge Analytica announced that it was ceasing its business activities. The cause of this collapse was an international scandal in which the company was accused of having used the personal data of around 87 million Facebook users for political purposes. The growing number of scandals of this type has left citizens understanding less about artificial intelligence (AI) and mistrusting it more.

The issue of data protection also affects scientists. The deep learning methods that are now omnipresent across all application areas of AI lose all their value if they are not continually enriched with large quantities of data. Data is therefore often compared to oil in terms of strategic importance. And yet the great majority of this wealth lies beyond national borders and is controlled by private companies, which limits its use by French and European scientists.

So how can we keep control of these masses of data, and how can data be extracted without harming the privacy of the citizens from whom it originates?

The OLKi project’s mission is to find solutions to these two issues by designing new machine learning algorithms dedicated to extracting knowledge from language data. It also develops solutions that guarantee fair, open and shared control of data and its usage while respecting citizens and their privacy.

Working towards a change in the communication paradigm

Data on social networks holds enormous potential for AI researchers, but is it actually possible to access that data? Who really controls it? The OLKi project proposes to adapt our means of communication to our requirements and to work towards researchers and citizens regaining control of data.

An alternative platform derived from a large-scale citizen movement

At the heart of the project is the aim to provide alternatives and move beyond the usual networks by developing a platform. Institutional and citizens’ initiatives already exist, such as Academic Torrents, P2P, Ortolang, Datagouv… and the Fediverse. The Fediverse was mainly developed in Europe and Japan. It already has 2.5 million users and is a federation of interconnected servers built on open source software.

The platform developed by the OLKi project aims to interconnect with the nodes of the Fediverse and to add a research and scientific knowledge dimension to its existing resources (music, blogs, videos, etc.). Its ambition is to make communication more fluid between different actors – researchers, service providers and citizens. It will host and distribute scientific resources linked to language data and knowledge extracted from such communication.

Finally, as well as achieving progress in control, ethics, openness, transparency and respect for privacy, the platform will solve problems faced by many current scientific platforms such as long-term maintenance, scaling up, cost reduction, controlling data suppliers and the interaction between research and citizens.

OLKi: an interdisciplinary project

The OLKi project brings together five laboratories: the Loria works on computing and artificial intelligence, the IECL provides mathematical formalization, particularly concerning machine learning, the ATILF contributes research into linguistics, the Archives Henri Poincaré provides input on epistemological and ethical questions, and the CREM works on questions of usage of the media and social networks.

By interconnecting these different disciplines, OLKi will carry out research into the production of language learning resources, the detection of hate speech on social networks, large-scale group dynamics, the analysis of discourse and corpora, etc.

The Loria’s involvement with the LUE

The Loria is involved in four LUE Impact projects. The OLKi project is part of the latest wave of projects alongside the DigiTrust project, which works on citizens’ trust in the digital sphere. In previous waves of projects, the Loria had already been involved in the ULHyS project working on hydrogen energy and the GEENAGE project, which aims to define a new diagnosis and care strategy for normal and pathological ageing.

The OLKi project was launched at the Loria on Thursday, March 15th, 2019. The launch involved the research consortium along with 70 participants (academic personnel and socio-economic actors).

For more information please see:

The mailing list which is open to anyone interested: (you can either subscribe directly or make a subscription request to