PhD defense: Tatiana Makhalova

23 June 2021 @ 14:00 - 17:00

Tatiana Makhalova will defend her thesis, entitled "Contributions to pattern set mining: from complex datasets to significant and useful pattern sets", on Wednesday, June 23, at 2 p.m., online (in English).
Abstract:
We discuss different aspects of pattern mining in binary and numerical tabular datasets. The objective of pattern mining is to discover a small set of non-redundant patterns that can entirely cover a given dataset and be interpreted as useful and significant knowledge units. We focus on such issues as (i) the formal definition of pattern interestingness, (ii) the mitigation of the pattern explosion problem, (iii) measures for evaluating the performance of pattern mining, and (iv) the discrepancy between interestingness and quality of the discovered pattern sets.
The first part of the talk is devoted to the so-called closure structure and the GDPM algorithm for computing it. The closure structure allows for estimating both the data and pattern complexity. Moreover, we discuss how the closure structure allows an analyst to understand the intrinsic data configuration before selecting an interestingness measure for pattern mining.
In the second part, we discuss the difference between interestingness and quality of pattern sets. We present the KeepItSimple algorithm that adopts the best practices of supervised learning in pattern mining and relates interestingness and the quality of pattern sets. We show that KeepItSimple allows for efficient mining of a set of interesting and good-quality patterns without any pattern explosion.
The third part of the talk is devoted to numerical pattern mining. We present an MDL-based algorithm called Mint for mining pattern sets in numerical data. The Mint algorithm relies on a strong theoretical foundation and at the same time has a practical objective in returning a small set of numerical, non-redundant, and informative patterns. Mint has very good behavior in practice and usually outperforms its competitors.
Keywords: Pattern Set Mining; Pattern interestingness; MDL; Minimum Description Length principle; Closed patterns; Equivalence classes; Data complexity; Closure structure; Pattern explosion; Pattern evaluation; Formal Concept Analysis; Interval Pattern Structures; Binary data; Numerical data

Composition of the jury

Reviewers:

Arnaud Soulet, MCf HDR, Université de Tours, Tours 

Jilles Vreeken, Pr. CISPA Helmholtz Center for Information Security, Saarbrücken
 
Examiners:
François Charoy, Pr. Université de Lorraine, Nancy

Antoine Cornuéjols, Pr. AgroParisTech, Paris   

Elisa Fromont, Pr. Université de Rennes, Rennes
Esther Galbrun, CR Inria, University of Eastern Finland, Kuopio
Christel Vrain, Pr. Université d’Orléans, Orléans
Supervisors:
Sergei O. Kuznetsov, Pr. NRU HSE, Moscow
Amedeo Napoli, DR CNRS LORIA, Nancy
