Yacine Abboud (Kiwi) will defend his thesis on Wednesday, November 28th in room A008 at 10.30 am.
His thesis is entitled “Pattern mining: between accessibility and robustness”.
– Sandra BRINGAY: Full Professor, University of Montpellier 3
– Omar BOUCELMA: Full Professor, University of Aix-Marseille
– Vincent GUIGUE: Associate Professor, UPMC – LIP6
– François CHAROY: Full Professor, University of Lorraine
– Anne BOYER: Full Professor, University of Lorraine
– Armelle BRUN: Associate Professor, University of Lorraine
Information now occupies a central place in our daily lives: it is both ubiquitous and easy to access. Yet extracting information from data often remains out of reach. Indeed, even though data mining methods are now available to everyone, their results are often difficult for the user to obtain and exploit. Pattern mining combined with constraints is a very promising direction in the literature, both for improving the efficiency of the mining and for making its results easier for the user to understand. However, the combination of constraints desired by the user is often problematic because it does not always fit the characteristics of the data being mined, such as noise. In this thesis, we propose two new constraints and an algorithm to overcome this issue. The robustness constraint makes it possible to mine noisy data while preserving the added value of the contiguity constraint. The extended closedness constraint makes the set of extracted patterns easier to understand while being more noise-resistant than the conventional closedness constraint. The C3Ro algorithm is a generic sequential pattern mining algorithm that integrates many constraints, including the two new constraints that we have introduced, to provide the user with the most efficient mining possible while reducing the size of the set of extracted patterns. C3Ro competes with the best pattern mining algorithms in the literature in terms of execution time while consuming significantly less memory. C3Ro has been tested on extracting competencies from web-based job postings.
Keywords: data mining, pattern mining, closed contiguous sequential pattern mining, constraints, noise-resistant