ORPAILLEUR
Department 4: Knowledge and language management
Team leader: Miguel Couceiro
Mail: miguel.couceiro(at)loria.fr
Presentation
Knowledge discovery in databases (KDD) is a process for extracting, from large databases, knowledge units that can be interpreted and reused within knowledge-based systems. From an operational point of view, this process is based on three main steps: (i) selection and preparation of the data, (ii) data mining, and (iii) interpretation of the extracted units. The KDD process, as implemented in the Orpailleur team, is based on data mining methods that are either symbolic or numerical. Symbolic methods are based on frequent itemset search, association rule extraction, Formal Concept Analysis (FCA [GW99]), and extensions of FCA such as Relational Concept Analysis (RCA [?]) and Pattern Structures [?]. Numerical methods are based on second-order Hidden Markov Models (HMM [MB06]), which have good capabilities for locating “stationary segments” in data and are thus well adapted to the mining of temporal and spatial data.
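To make the symbolic side of the data mining step more concrete, here is a minimal sketch of frequent itemset search followed by association rule extraction. It is not the team's implementation (and not the Coron platform): the function names `frequent_itemsets` and `association_rules`, the thresholds, and the toy transactions are all assumptions chosen for illustration, and the search is a brute-force enumeration that only suits very small data.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Illustrative brute-force search; hypothetical helper, not a library API.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count >= min_support:
                result[frozenset(candidate)] = count
                found = True
        if not found:
            # Support is anti-monotone: if no itemset of this size is
            # frequent, no larger itemset can be frequent either.
            break
    return result


def association_rules(itemsets, min_confidence):
    """Derive rules (lhs, rhs, confidence) from a dict of frequent itemsets."""
    rules = []
    for itemset, support in itemsets.items():
        for size in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), size):
                lhs = frozenset(lhs_items)
                # Every subset of a frequent itemset is frequent, so the
                # lookup below always succeeds.
                confidence = support / itemsets[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Toy data (assumed for illustration): each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

frequent = frequent_itemsets(transactions, min_support=3)
rules = association_rules(frequent, min_confidence=0.75)

for lhs, rhs, confidence in rules:
    print(sorted(lhs), "->", sorted(rhs), f"confidence={confidence:.2f}")
```

In this toy dataset every single item and every pair is frequent, while the full triple is not, so the rules are all of the form "one item implies another". The third KDD step, interpretation of the extracted units, would then consist in having an analyst or a knowledge-based system decide which of these rules are meaningful.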
Research activities
- Fundamentals of KDDK
- Knowledge systems and semantic web
- Implementing KDDK in Life Sciences
- Structural systems biology
Software
- Coron
- Carottage and GenExp-LandSiTes
- BioRegistry
- IntelliGO
- CreChainDo
- Kasimir
- Taaable
- HexServer
- KB-Dock
Collaborations
- Institute of Genetics and Molecular and Cellular Biology (IGBMC) in Strasbourg (Bioinformatics team and platform)
- INRA
- UQAM Montréal
- HSE Moscow
- UFMG Belo Horizonte (Brazil, Fapemig INRIA Project), UFPE Recife (Brazil, Facepe INRIA Project), and the University of Fianarantsoa (Madagascar).
- University of Bari and the Institut Químic de Sarrià (IQS) in Barcelona.
- J. Craig Venter Institute in Rockville (USA), and CIRCB in Yaoundé (Cameroon).
Keywords
Knowledge discovery, knowledge representation and reasoning, knowledge system, semantic web, ontology, text mining, knowledge discovery in life sciences, hidden Markov models, systems biology, docking