ORPAILLEUR

Knowledge Discovery guided by Domain Knowledge (KDDK)

Department 4: Knowledge and language management

Team leader: Miguel Couceiro
Mail: miguel.couceiro(at)loria.fr

Presentation

Knowledge discovery in databases (KDD) is a process for extracting, from large databases, knowledge units that can be interpreted and reused within knowledge-based systems. Operationally, the process consists of three main steps: (i) selection and preparation of the data, (ii) data mining, and (iii) interpretation of the extracted units. The KDD process, as implemented in the Orpailleur team, relies on data mining methods that are either symbolic or numerical. Symbolic methods are based on frequent itemset search, association rule extraction, Formal Concept Analysis (FCA [GW99]), and extensions of FCA such as Relational Concept Analysis (RCA [?]) and Pattern Structures [?]. Numerical methods are based on second-order Hidden Markov Models (HMM [MB06]), which are good at locating “stationary segments” in data and are thus well suited to mining temporal and spatial data.
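To make the symbolic side of the data mining step more concrete, here is a minimal sketch of frequent itemset search followed by association rule extraction. It is a generic level-wise (Apriori-style) illustration, not the team's own implementation and not the API of any of the software listed on this page; the function names `frequent_itemsets` and `association_rules`, the thresholds, and the toy transactions are all assumptions chosen for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets found in at least
    min_support transactions (level-wise, Apriori-style search)."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    size = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge pairs of frequent itemsets into larger candidates.
        candidates = list(
            {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        )
    return frequent


def association_rules(frequent, min_confidence):
    """Return (premise, conclusion, confidence) triples derived from the
    frequent itemsets, keeping rules with confidence >= min_confidence."""
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for premise in combinations(sorted(itemset), k):
                premise = frozenset(premise)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is available in `frequent`.
                confidence = support / frequent[premise]
                if confidence >= min_confidence:
                    rules.append((premise, itemset - premise, confidence))
    return rules


# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
itemsets = frequent_itemsets(transactions, min_support=3)
for premise, conclusion, confidence in association_rules(itemsets, 0.7):
    print(sorted(premise), "->", sorted(conclusion), round(confidence, 2))
```

In terms of the three KDD steps, building `transactions` corresponds to data selection and preparation, the two functions to data mining, and reading the printed rules to interpretation of the extracted units.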

Research activities

  • Fundamentals of KDDK
  • Knowledge systems and semantic web
  • Implementing KDDK in Life Sciences
  • Structural systems biology

Software

  • Coron
  • Carottage and GenExp-LandSiTes
  • BioRegistry
  • IntelliGO
  • CreChainDo
  • Kasimir
  • Taaable
  • HexServer
  • KB-Dock

Collaborations

  • Institute of Genetics and Molecular and Cellular Biology (IGBMC) in Strasbourg (Bioinformatics team and platform)
  • INRA
  • UQAM Montréal
  • HSE Moscow
  • UFMG Belo Horizonte (Brazil, Fapemig INRIA Project), UFPE Recife (Brazil, Facepe INRIA Project), and the University of Fianarantsoa (Madagascar)
  • University of Bari and the Institut Químic de Sarrià (IQS) in Barcelona
  • J. Craig Venter Institute in Rockville (USA) and CIRCB in Yaoundé (Cameroon)

Keywords

Knowledge discovery, knowledge representation and reasoning, knowledge system, semantic web, ontology, text mining, knowledge discovery in life sciences, hidden Markov models, systems biology, docking