My main research is on the evaluation of the performance and correctness of distributed systems in the contexts of High Performance Computing and Grid Computing. I put a particular emphasis on experimentation methodologies. For that, I use several approaches, such as simulation, emulation, and formal methods. For each of them I take a pragmatic approach, striving to provide ready-to-use tools with a strong technical and theoretical basis. Past research also includes monitoring solutions and middleware for scientific applications.
I am also involved in Computer Science Education, although more lightly so far.
Here is a list of the five publications I am the most proud of. You can also find the complete boring list of all my publications over the years here. Reducing this list to only five items is actually a political choice, to fight quantitative evaluations (see this blog post for more details). These publications were selected because they are the most representative of my work, even if some of them have not been published in well-known venues yet. I encourage you to do the same exercise as I did here: select your five main publications, and argue why you consider them as such. In my personal case, I faced what I was refusing to see: even if I'm personally opposed to any quantitative evaluation, I still tend to publish short papers on very specific topics. I have trouble advising a friend on which of my papers (s)he should read...
Towards Scalable, Accurate, and Usable Simulations of Distributed Applications and Systems. Olivier Beaumont, Laurent Bobelin, Henri Casanova, Pierre-Nicolas Clauss, Bruno Donassolo, Lionel Eyraud-Dubois, Stéphane Genaud, Sascha Hunold, Arnaud Legrand, Martin Quinson, Cristian Rosa, Lucas Schnorr, Mark Stillwell, Frédéric Suter, Christophe Thiery, Pedro Velho, Jean-Marc Vincent, Young J. Won.
This paper intends to give a good overview of our work around the SimGrid framework over the past three years (which explains the number of co-authors). Since SimGrid has hosted most of my scientific activity for almost 10 years now, it is naturally very important to me. It is only a research report (RR) for now because the reviewing process is still ongoing. We will see if the reviewers like it or not.
INRIA RR-7761, HAL.
If your religion forbids you to read good research reports (you should choose your religion more carefully, you weirdo), you can read this way too short and quite outdated paper that got accepted at a conference back in 2008. But it's nowhere near as instructive as the RR, I'd say.
Article, Slides, HAL. Published at the 10th IEEE International Conference on Computer Modeling and Simulation, Cambridge, UK, 2008.
Single Node On-Line Simulation of MPI Applications with SMPI. Pierre-Nicolas Clauss, Mark Stillwell, Stéphane Genaud, Frédéric Suter, Henri Casanova, Martin Quinson.
This article presents a very large body of work about the ability to directly simulate MPI applications within SimGrid. In the process, we naturally had to write the interception code, but also to greatly improve the accuracy of our models for cluster-like platforms. Even if I was not as active as I'd have liked to be, this great scientific adventure is very close to what I'd like to live more often: a clear but difficult goal, an intense investigation involving a lot of clever people, and a lot of theoretical work and technical difficulties along the way...
The article is well written (thanks Henri!) and does justice to the great work we did on that topic; this is rare enough in my publication list to be noted. A longer version is currently in preparation for a journal submission.
Article, Slides, HAL. Published at the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS'11), May 16-20, 2011, Anchorage (Alaska), USA.
Parallel Simulation of Peer-to-Peer Systems. Martin Quinson, Cristian Rosa, Christophe Thiéry.
This is another great scientific adventure that we had. The question was "how fast and how big can SimGrid simulations be?". Quite soon, we parallelized the simulator to gain speed, and that was already a difficult contribution: each iteration of the simulation loop can be very short (depending on the user code), and the loop is thus very challenging to parallelize efficiently.
Moreover, we had the nagging feeling that speedup was the wrong quality metric: the best way to get a huge speedup is to write a really bad sequential version first. So, we worked very hard on improving the sequential simulations so as not to use that cheat. It took several months, during which we had a sort of race between the sequential and parallel versions of our code. Each time the profiler allowed us to improve the parallel version, we ran it on the sequential version too. And almost every time, the improvement in the parallel version allowed us to pinpoint a clear inefficiency in the sequential version, which, once fixed, allowed the sequential version to be faster than the parallel one again... The work of Christophe in this process is to be noted: we hated him more than once for improving the sequential version this way!
After a few months of this game, we submitted a first version of this paper to a conference, and one of the reviewers answered that 30 years of research in parallel discrete event simulation is not something that we can easily ignore. So we started thinking about why we do not follow that classical approach, and why our approach is better suited to this specific case.
This resulted in what I consider to be a paper well balanced between fundamental findings and technical results (we can either simulate scenarios 1000 times bigger than our best competitor, or go 1000 times faster on similar scenarios). Well, now we are waiting for the reviewers' feedback on this new version, submitted to CCGrid.
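To make the point about speedup concrete, here is a minimal arithmetic sketch. All timings are made-up numbers chosen for illustration only; they are not measurements from SimGrid or from the paper.

```python
def speedup(sequential_time: float, parallel_time: float) -> float:
    """Classical speedup: how many times faster the parallel run is."""
    return sequential_time / parallel_time

# Hypothetical wall-clock times in seconds (assumptions, not real data).
naive_sequential = 100.0   # a poorly optimized sequential version
tuned_sequential = 8.0     # the same code after profiling and fixing it
parallel = 10.0            # the parallel version

# Against the bad baseline, the parallel version looks great...
inflated = speedup(naive_sequential, parallel)

# ...but against the tuned baseline, it is actually slower.
honest = speedup(tuned_sequential, parallel)

print(f"speedup vs naive sequential: {inflated:.1f}x")
print(f"speedup vs tuned sequential: {honest:.1f}x")
```

The parallel time is identical in both cases; only the baseline changes. This is why a speedup figure says little unless the sequential version it is measured against is itself well optimized.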
INRIA, HAL.
A Simple Model of Communication APIs - Application to Dynamic Partial-order Reduction. Cristian Rosa, Stephan Merz, Martin Quinson.
This paper is much more theoretical than the rest of my work. Our overall goal is to extend SimGrid, a tool for studying the performance of distributed applications through simulation, so that it becomes possible to study the correctness of these applications through dynamic verification (which is a variant of model checking). The classical issue here is the exponential explosion of the state space to explore.
This paper explains how we implemented a limited number of communication primitives at the core of SimGrid. The primitives of this limited set are combined to write all the communication features offered to the users in SimGrid (including the MPI ones!). This design allowed us to propose and prove an efficient independence function between these core communication primitives. We showed that our resulting reduction process does not cut any mandatory transitions, but that it cuts more useless transitions than the previously proposed independence functions.
This work was mainly done by Cristian Rosa during his PhD, and a journal paper is under preparation to present these results (and what came out of them) in more detail.
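To give an intuition of what an independence function buys, here is a minimal brute-force sketch. It is not DPOR and it is not the independence function of the paper: the rule used below (two transitions are independent when they come from different processes and touch different mailboxes) is a toy assumption, and the transition tuples and mailbox names are hypothetical. The sketch only counts how many classes of equivalent interleavings remain once independent transitions are allowed to commute.

```python
def interleavings(seqs):
    """Yield every interleaving that preserves each process's own order."""
    if all(not s for s in seqs):
        yield ()
        return
    for i, s in enumerate(seqs):
        if s:
            rest = seqs[:i] + (s[1:],) + seqs[i + 1:]
            for tail in interleavings(rest):
                yield (s[0],) + tail

def independent(a, b):
    """Toy rule (assumption): different processes and different mailboxes."""
    return a[0] != b[0] and a[2] != b[2]

def trace_class(word):
    """All words reachable by swapping adjacent independent transitions."""
    seen = {word}
    stack = [word]
    while stack:
        w = stack.pop()
        for i in range(len(w) - 1):
            if independent(w[i], w[i + 1]):
                swapped = w[:i] + (w[i + 1], w[i]) + w[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    stack.append(swapped)
    return frozenset(seen)

# Hypothetical transitions: (process id, operation, mailbox).
p0 = ((0, "send", "m1"), (0, "send", "shared"))
p1 = ((1, "send", "m2"), (1, "send", "shared"))

all_runs = list(interleavings((p0, p1)))
classes = {trace_class(run) for run in all_runs}

print(len(all_runs), "interleavings,", len(classes), "equivalence classes")
```

In this toy setting, only the relative order of the two sends on the shared mailbox matters, so exploring one run per class is enough. A more precise independence function declares more pairs of transitions independent, which leaves fewer classes to explore.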
Article, Slides, HAL. Published at the 10th International Workshop on Automated Verification of Critical Systems (AVOCS 2010),
A First Step Towards Automatically Building Network Representations. Lionel Eyraud Dubois, Arnaud Legrand, Martin Quinson and Frédéric Vivien.
This paper presents my latest work on the automatic network tomography front. It is very important to me because it concludes several years of work. Developing a usable automatic network mapper is why I developed GRAS in the first place (which is how I got involved in SimGrid, actually). Also, we had to develop a new evaluation framework for the mapping algorithms before working on the algorithms themselves (that was done during the master's thesis of Ahmed Harbaoui), and the algorithms presented in this article are the result of Lionel's post-doc.
As you can see, this work concluded a lot of previous efforts, and was quite seminal for me on new fronts. Currently, network mapping is not one of my active development efforts anymore. This work is to some extent continued by Laurent Bobelin, who took some inspiration from my PhD and the work presented here, and proposed new solutions in his own PhD and subsequent works. I'm delighted!
Article, Slides, HAL. Published at the 13th International EuroPar Conference, Rennes, France, August 2007, LNCS 4641:160--169 (Springer-Verlag).
I have been an assistant professor at the University of Nancy since 2005. I was an assistant lecturer in the apache research group of the University of Grenoble for the last 6 months of 2004. Earlier in 2004, I was a post-doc in the Mayhem Lab of the UCSB (USA). Before that, between 2000 and 2003, I was a PhD student at the LIP laboratory of the ENS-Lyon (France). I still quite actively collaborate with the apaches and the lippiens. In another life, I was a student at the University of St Etienne, where I got my Maitrise in 1999.