Gabin Personeni (Orpailleur) will defend his thesis on Friday, November 9th at 2pm in room A008.
His thesis is entitled “Contribution of domain ontologies for knowledge discovery in biomedical data”.
The jury will be composed of the following 8 members:
-Olivier Dameron, Associate Professor, Université de Rennes 1
-Céline Rouveirol, Professor, Université Paris 13
-Jérôme Azé, Professor, Université de Montpellier
-Anne Boyer, Professor, Université de Lorraine
-Adrien Coulet, Associate Professor, Université de Lorraine
-Marie-Dominique, Research Scientist, CNRS
-Michel Dumontier, Distinguished Professor, Maastricht University
-Malika Smaïl-Tabbone, Associate Professor, Université de Lorraine
The Semantic Web proposes standards and tools to formalize and share knowledge on the Web, in the form of ontologies. Biomedical ontologies and their associated data represent a vast collection of complex, heterogeneous and linked knowledge. The analysis of such knowledge presents great opportunities in healthcare, for instance in pharmacovigilance. This thesis explores several ways to make use of this biomedical knowledge in the data mining step of a knowledge discovery process. In particular, we propose three methods in which several ontologies cooperate to improve data mining results.
A first contribution of this thesis describes a method based on pattern structures, an extension of formal concept analysis, to extract associations between adverse drug events from patient data. In this context, a phenotype ontology and a drug ontology cooperate to allow a semantic comparison of these complex adverse events, leading to the discovery of associations between such events at varying degrees of generalization, for instance at the drug or drug class level.
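To give an idea of how an ontology can drive comparison at several levels of generalization, here is a minimal Python sketch of one possible similarity operator for a pattern structure over sets of drug terms. It is only an illustration, not the operator defined in the thesis: the toy ontology `PARENT` and the functions `ancestors`, `lca` and `meet` are hypothetical names introduced for this example, and the ontology is assumed to be a simple tree.

```python
# Hypothetical toy drug ontology (assumption): child -> parent, a tree rooted at "drug".
PARENT = {
    "aspirin": "NSAID",
    "ibuprofen": "NSAID",
    "NSAID": "analgesic",
    "paracetamol": "analgesic",
    "analgesic": "drug",
    "warfarin": "anticoagulant",
    "anticoagulant": "drug",
}


def ancestors(term):
    """Return the path from a term up to the root, the term itself included."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lca(a, b):
    """Least common ancestor of two terms in the tree."""
    seen = set(ancestors(a))
    for term in ancestors(b):
        if term in seen:
            return term
    raise ValueError("terms do not share a root")


def meet(desc1, desc2):
    """Similarity of two descriptions (sets of terms):
    the most specific terms that generalize a pair drawn from both sets."""
    candidates = {lca(a, b) for a in desc1 for b in desc2}
    # Drop every candidate that is a strict ancestor of another candidate.
    return {
        c for c in candidates
        if not any(c != d and c in ancestors(d) for d in candidates)
    }


if __name__ == "__main__":
    print(meet({"aspirin"}, {"ibuprofen"}))
    print(meet({"aspirin", "warfarin"}, {"ibuprofen", "warfarin"}))
```

In this sketch, two patients who took different drugs of the same class are described by that class, while a drug common to both is kept as is, which is the kind of drug-level or class-level generalization mentioned above.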
A second contribution uses a numeric method based on semantic similarity measures to classify different types of genetic intellectual disabilities, characterized by both their phenotypes and the functions of their linked genes. We study two different similarity measures, applied with different combinations of phenotypic and gene function ontologies. In particular, we investigate the influence of each domain of knowledge represented in each ontology on the classification process, and how they can cooperate to improve that process.
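As an illustration of what an ontology-based semantic similarity measure can look like, the following Python sketch compares terms through their shared ancestors (a Jaccard index) and compares sets of terms with a best-match average. These are generic measures chosen for the example and are not claimed to be the two measures studied in the thesis; the ontology fragment `PARENTS` and the names `ancestors`, `term_sim` and `set_sim` are assumptions.

```python
# Hypothetical phenotype ontology fragment (assumption): child -> list of parents.
PARENTS = {
    "seizures": ["neuro_abnormality"],
    "ataxia": ["neuro_abnormality"],
    "neuro_abnormality": ["phenotype"],
    "short_stature": ["growth_abnormality"],
    "growth_abnormality": ["phenotype"],
}


def ancestors(term):
    """Return the term together with all of its ancestors in the ontology."""
    result = {term}
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def term_sim(a, b):
    """Jaccard index between the ancestor sets of two terms."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def set_sim(terms1, terms2):
    """Best-match average: each term is matched to its most similar
    counterpart in the other set, and both directions are averaged."""
    forward = sum(max(term_sim(a, b) for b in terms2) for a in terms1) / len(terms1)
    backward = sum(max(term_sim(a, b) for a in terms1) for b in terms2) / len(terms2)
    return (forward + backward) / 2


if __name__ == "__main__":
    print(term_sim("seizures", "ataxia"))
    print(set_sim({"seizures"}, {"ataxia", "short_stature"}))
```

Applying `set_sim` once with a phenotype ontology and once with a gene function ontology, then combining the two scores, is one simple way to picture how several domains of knowledge could contribute to the same classification.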
Finally, a third contribution uses the data component of the Semantic Web, Linked Open Data (LOD), together with linked ontologies, to characterize genes responsible for intellectual disabilities. We use Inductive Logic Programming (ILP), a method suited to mining relational data such as LOD while exploiting domain knowledge from ontologies through reasoning mechanisms. Here, ILP makes it possible to extract from LOD and ontologies a descriptive and predictive model of genes responsible for intellectual disabilities.
These contributions illustrate the possibility of having several ontologies cooperate to improve various data mining processes.