SimGrid "easy tasks" or internships

Edit on <2013-03-20 Wed>: This is a list Martin and I came up with a long time ago. Ideally, I would keep it updated, but right now I'm busy getting my blog working again.

Stock implementations of P2P algorithms

Localization

Skills

C/Java/Lua/… programming. Basic knowledge of distributed algorithms may help, but this may be a good opportunity for learning.

Description

Despite our efforts to provide examples, SimGrid still lacks a decent collection of inspiring ones. The reason is that our own codes based on SimGrid are generally too complex or too specific to demonstrate how to use SimGrid, so we only have a few rather simple examples. SimGrid now has a very efficient simulation engine that makes it possible to conduct extremely scalable Peer-to-Peer simulations without trading accuracy for speed. Currently, we only have an implementation of the Chord protocol. Providing implementations of standard DHTs such as CAN, Pastry, or Tapestry would allow researchers to easily compare new protocols with standard ones. It would also allow teachers to more easily build their lectures on top of SimGrid.
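
To give an idea of what such an example involves, here is a minimal, simulator-independent sketch of the core of Chord: mapping names to identifiers on a ring, finding the successor of a key, and building a finger table. It does not use the SimGrid API; the names `chord_id`, `successor`, `finger_table` and the 16-bit identifier space `M` are assumptions chosen for illustration.

```python
import hashlib
from bisect import bisect_left

M = 16  # number of identifier bits (assumption, kept small for readability)


def chord_id(name):
    """Hash a node or key name onto the identifier ring [0, 2**M)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def successor(ring, key):
    """Return the first node identifier >= key, wrapping around the ring.

    `ring` must be a sorted, non-empty list of node identifiers.
    """
    i = bisect_left(ring, key)
    return ring[i % len(ring)]


def finger_table(ring, node):
    """Finger k of `node` is the successor of node + 2**k (mod 2**M)."""
    return [successor(ring, (node + 2 ** k) % (2 ** M)) for k in range(M)]
```

With a ring such as `[10, 200, 4000]`, `successor(ring, 4001)` wraps around to node 10. A full example would add message exchanges between simulated processes for joins, stabilization and lookups.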

Future Event Set

Localization

Skills

Good level of C programming. Valgrind/callgrind would be appreciated but this may be a good opportunity for learning.

Description

The core of a simulator manages events and dates using a data structure called the future event set. The simulation kernel keeps inserting events ("do this at time t") into this data structure and removing them from it. Currently, SURF (SimGrid's kernel) uses a simple heap, which has guaranteed complexity bounds but is not necessarily that efficient in practice. Recently, a very nice data structure has been proposed for this purpose: the ladder queue. It has an amortized cost of O(1) for every operation, regardless of the statistical properties of the workload, and seems very promising. SURF would need a little refactoring and
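
As a point of reference for the heap-based approach described above, here is a minimal sketch of a future event set built on a binary heap, where insertion and removal cost O(log n). This is not SURF's code and not a ladder queue; the class name `FutureEventSet` and its methods are hypothetical.

```python
import heapq
import itertools


class FutureEventSet:
    """Binary-heap future event set (hypothetical names, not SURF's API)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so that events with equal dates come out in FIFO order.
        self._counter = itertools.count()

    def insert(self, date, action):
        """Schedule `action` at simulated time `date`: O(log n)."""
        heapq.heappush(self._heap, (date, next(self._counter), action))

    def pop_next(self):
        """Remove and return the earliest (date, action) pair: O(log n)."""
        date, _, action = heapq.heappop(self._heap)
        return date, action

    def __len__(self):
        return len(self._heap)


def drain(fes):
    """Pop every event in date order and return them as a list."""
    trace = []
    while len(fes):
        trace.append(fes.pop_next())
    return trace
```

A replacement data structure would only need to offer the same `insert` and `pop_next` operations, which is what makes the two easy to benchmark against each other on real workloads.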

Surfing on multi-core machines

Localization

Skills

Good level of C programming. Average knowledge in concurrency (threads) and synchronization. Valgrind/callgrind would be appreciated but this may be a good opportunity for learning.

Description

Recently, SimGrid started trying to take advantage of multi-core machines. Simix (SimGrid's abstraction for managing user processes and synchronization) is now multi-core ready (see a recent INRIA research report), but other pieces of our code could benefit from the same kind of development, in particular SURF (SimGrid's kernel). An easy approach is simply to run the models in parallel. In principle, there should be no difficulty, but this needs to be verified. Some kernels may then be parallelized internally, but this is more complex.
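
The "run models in parallel" idea can be sketched as follows, assuming that each model can independently report the date of its next event and that the kernel then advances to the earliest one. The `Model` class and `next_date_parallel` function are hypothetical stand-ins, not SURF's interface. In CPython the global interpreter lock prevents any real speedup for such CPU-bound work, so this only illustrates the structure.

```python
from concurrent.futures import ThreadPoolExecutor


class Model:
    """Hypothetical stand-in for a resource model (CPU, network, ...)."""

    def __init__(self, name, completion_dates):
        self.name = name
        self.completion_dates = sorted(completion_dates)

    def next_event_date(self, now):
        """Earliest completion strictly after `now`, or infinity if none."""
        later = [d for d in self.completion_dates if d > now]
        return later[0] if later else float("inf")


def next_date_parallel(models, now, workers=4):
    """Query every model concurrently and return the earliest next date."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        dates = list(pool.map(lambda m: m.next_event_date(now), models))
    return min(dates)
```

This is only safe as long as the models share no mutable state during the query, which is precisely what would have to be checked in the real kernel.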

Improving Lua bindings performance

Localization

Skills

Good level of C and Lua programming. Basic knowledge of Valgrind/callgrind.

Description

SimGrid offers several bindings for higher-level languages (Java, Lua, Ruby,…) that make rapid prototyping of simulations much more efficient for users. Lua is very lightweight and, in a sense, designed for this kind of usage. The performance loss of Lua simulations compared to C simulations should be no more than about 10%, and our current implementation of the Lua bindings seems far from this. One needs to investigate the reason for this performance loss and fix it.

File storage API in MSG

Localization

Skills

C programming. Good knowledge of SimGrid layers and internal logic but this may be a good opportunity for learning.

Description

There is no notion of files in SimGrid, as modeling the performance of disks is very challenging. Yet we have recently started collaborating with CERN researchers who really need such an abstraction and have lots of exciting ideas about storage performance modeling. One needs to propose an API at all levels of SimGrid (MSG, SIMIX, SURF, XML description) and to implement it. The SURF models will need to be traced with SimGrid's tracing mechanism so that disk usage and the impact of I/O can then be visualized with tools like Triva or Paje.

Power consumption

Localization

Skills

C programming. Good knowledge of SimGrid layers and internal logic but this may be a good opportunity for learning.

Description

Reducing power consumption has become a major issue when managing today's distributed platforms (for cloud providers, for large computing centers, …) and applications. For example, a workshop was recently held on this topic. The first presentation was given by Johnatan Pecero (Université du Luxembourg) and evaluated scheduling algorithms with the trade-off between power consumption and application response time in mind. Such a study could be done very easily in SimGrid (especially since we already have a collection of DAG scheduling heuristics implemented with SimDAG) if SimGrid could keep track of power consumption. One needs to propose an API (scale the CPU frequency down/up, turn the machine off/on, …) at all levels of SimGrid (MSG, SIMIX, SURF, XML description) and to implement it. Power consumption should be tracked with SimGrid's tracing mechanism so that it can then be visualized with tools like Triva or Paje.
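
To make the kind of API discussed above more concrete, here is a minimal sketch of a host that supports frequency scaling and on/off switching and that accumulates energy over simulated time. Everything here is an assumption made for illustration: the `Host` class, its method names, and the power model (a single constant wattage per frequency level, with no distinction between idle and busy) are not part of SimGrid.

```python
class Host:
    """Hypothetical host with frequency levels and energy accounting."""

    def __init__(self, pstates, off_watts=0.0):
        # pstates: list of (flops_per_second, watts), one entry per level.
        self.pstates = pstates
        self.pstate = 0
        self.on = True
        self.off_watts = off_watts
        self.clock = 0.0          # current simulated date, in seconds
        self.energy_joules = 0.0  # energy consumed so far

    def _power(self):
        return self.pstates[self.pstate][1] if self.on else self.off_watts

    def advance(self, date):
        """Account for the energy consumed between the clock and `date`."""
        self.energy_joules += self._power() * (date - self.clock)
        self.clock = date

    def set_pstate(self, date, index):
        """Scale the CPU frequency up or down at simulated time `date`."""
        self.advance(date)
        self.pstate = index

    def turn_off(self, date):
        self.advance(date)
        self.on = False

    def turn_on(self, date):
        self.advance(date)
        self.on = True

    def execute(self, flops):
        """Run `flops` at the current speed and return the completion date."""
        speed = self.pstates[self.pstate][0]
        self.advance(self.clock + flops / speed)
        return self.clock
```

With such an interface, a scheduling heuristic can be evaluated on both axes at once: the completion date returned by `execute` and the value of `energy_joules` at the end of the run.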

Support for stochastic models in SimGrid

Localization

Skills

C programming. Basic knowledge in statistics and probabilities would be appreciated but this may be a good opportunity for learning.

Description

Since its creation, SimGrid has had some support for handling resource availability variations with traces. This is very convenient when one wants to study the sensitivity of algorithms to background load or unexpected load variations. A few years ago, we reworked this mechanism (see Bruno De Moura Donassolo, Henri Casanova, Arnaud Legrand, Pedro Velho. Fast and Scalable Simulation of Volunteer Computing Systems Using SimGrid. Workshop on Large-Scale System and Application Performance (LSAP), 2010) so that it works for Volunteer Computing systems. A drawback of the current approach is that one needs to feed every host (possibly thousands of them) with its own large availability trace file, and the simulation has become so fast that a non-negligible amount of time is now spent parsing these trace files. In several situations, one knows that these traces (such as the ones you can get from real systems at the Failure Trace Archive) are well modeled by stochastic models (Bahman Javadi, Derrick Kondo, Jean-Marc Vincent, David P. Anderson. Discovering Statistical Models of Availability in Large Distributed Systems: An Empirical Study of SETI@home. IEEE Transactions on Parallel and Distributed Systems, 2011). Users often just generate these huge traces outside the simulator, so everything would be much more efficient and easier to manage if SimGrid generated such traces on the fly and in a reproducible way. RNGstream (http://www.iro.umontreal.ca/~lecuyer/myftp/streams00/), written by an international expert in random number generation, seems perfectly suited to our needs. Such support could then be used not only for generating traces but for all the stochastic models we plan to implement in SimGrid.
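
Generating availability on the fly and reproducibly could look like the sketch below, which gives each host its own random stream derived from a global seed and the host name. Several things are assumptions here: Python's `random.Random` stands in for RNGstream, the function name `availability_trace` is hypothetical, and the exponential up/down durations are only a placeholder for whatever distribution fits the real traces.

```python
import random


def availability_trace(host, seed, horizon, mean_up, mean_down):
    """Lazily yield (date, is_up) transitions for one host up to `horizon`.

    The stream depends only on (seed, host), so the same host always gets
    the same trace, whatever the other hosts do.
    """
    rng = random.Random(f"{seed}/{host}")  # one independent stream per host
    date, up = 0.0, True
    while date < horizon:
        yield date, up
        mean = mean_up if up else mean_down
        date += rng.expovariate(1.0 / mean)  # placeholder distribution
        up = not up
```

Because the trace is a generator, nothing is parsed or stored up front: the kernel would pull the next transition of a host only when it reaches the previous one.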

Platform Description Archive and Simulacrum

Localization

Skills

GUI builders (Java/Swing if you extend what is already there, C++/Qt or other if you decide to restart from scratch). Basic knowledge in statistics and probabilities would be appreciated but this may be a good opportunity for learning.

Description

Proposing a set of standard platforms that enables researchers to more easily reproduce the results and research of others is one of our concerns. To this end, we set up, years ago, a Platform Description Archive (http://pda.gforge.inria.fr/) and wrote Simulacrum, a platform generator specifically tailored for SimGrid. Unfortunately, SimGrid's platform description format has been heavily modified since then, and these tools are now outdated. Furthermore, Simulacrum was something of a toy and would need some polishing to reach the status of really usable and reproducible software. New platform generation algorithms will probably have to be written, as well as support for communicating with the PDA repository.

Simterpose

Localization

Skills

C programming and operating systems. Knowing what strace and a system call are is mandatory.

Description

SimGrid makes it possible to run models of applications on models of platforms. Some of our APIs (like SMPI or GRAS) make it possible to run real code on top of platform models, but this requires recompiling and/or annotating the code. There are situations where it would be interesting to run unmodified code on top of fake platforms. In a sense, that is what emulation or virtual machines are about, except that there is little control over the platform model. If your code runs for 1 hour on a virtual machine, you cannot really say how long it would have taken on another machine with a faster or slower processor (except by bluntly multiplying this time by some factor, but the various application bottlenecks make this kind of extrapolation quite questionable). The idea of Simterpose is to trap all system calls so as to run the unmodified application on top of SimGrid, which would make it possible to capture traces and to accurately simulate network or I/O performance in a controlled model. Time would be managed by SimGrid instead of being measured outside the emulation and then simply scaled up or down.

Improving Java bindings performance

Localization

Skills

Good level of C and Java. Knowledge about JVM black magic and thread libraries would help.

Description

SimGrid offers several bindings for higher-level languages (Java, Lua, Ruby,…) that make rapid prototyping of simulations much more efficient for users. Java is very common and benefits from very efficient IDEs. Unfortunately, standard JVMs have limitations that prevent running really scalable and fast simulations compared to what we can do when directly using C and low-level system libraries. In particular, the number of threads in a standard JVM is limited. In C, we use either threads or Unix Sys V ucontexts, which are a kind of user-friendly version of setjmp/longjmp. Ucontexts are the basic mechanism for building user-level thread libraries. The only reason why we sometimes build our SimGrid simulations using pthreads instead of ucontexts is that the ucontext implementation is broken on several OSes. Scalable Java simulations require a similar mechanism (Java continuations) that is not standard either but is available in some "experimental" JVMs. If, just like what we set up for the C version, we could have two implementations of our thread backend (one based on Java threads and the other on Java continuations), it would allow users to develop their code on a standard JVM and then move to a scalable JVM when needed.
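
The difference between a thread backend and a continuation backend can be illustrated with Python generators, which play the role of continuations here: each simulated process runs until it yields, and a scheduler resumes the processes in turn within a single OS thread. This is only an analogy for the mechanism; the names `process` and `run_all` are hypothetical and unrelated to the SimGrid bindings.

```python
from collections import deque


def process(name, steps, log):
    """A simulated process that gives control back after each step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # suspension point: the scheduler decides who runs next


def run_all(processes):
    """Resume the processes round-robin until all of them have finished."""
    ready = deque(processes)
    while ready:
        proc = ready.popleft()
        try:
            next(proc)          # resume until the next suspension point
            ready.append(proc)  # still alive: schedule it again
        except StopIteration:
            pass                # the process has terminated
```

Since no OS thread is created per process, the number of simulated processes is bounded by memory rather than by a thread limit, which is the property sought from Java continuations.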

Improving the efficiency of parmap

Localization

Skills

C/assembly programming.

Description

Recently, SimGrid started trying to take advantage of multi-core machines. At carefully chosen places, we use a parmap loop that indicates that a bunch of things can be done in parallel. Several threads can be used, and hence all the classical issues arise (thread management overhead, synchronization overhead, static load-balancing, reactivity, active polling vs. yielding…). Currently, we use POSIX threads, futex, and sometimes test_and_set directly. The point is that writing our code on top of the POSIX API is not the most efficient thing to do. In a sense, we are rewriting our own minimalistic user-level thread library. Maybe existing user-level thread libraries (like marcel) would provide the right tools…
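
For readers unfamiliar with the construct, a parmap with static load-balancing can be sketched as below: worker `r` out of `n` handles items `r`, `r + n`, `r + 2n`, and so on, and the caller waits for all workers. This is a simplified illustration, not SimGrid's implementation; in particular, it creates and joins threads on every call, which is exactly the kind of thread management overhead mentioned above.

```python
import threading


def parmap(fun, items, nthreads=4):
    """Apply `fun` to every item using `nthreads` statically assigned workers."""
    results = [None] * len(items)

    def worker(rank):
        # Static load-balancing: a fixed, interleaved share per worker.
        for i in range(rank, len(items), nthreads):
            results[i] = fun(items[i])

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronization point: wait for every worker
    return results
```

Each worker writes to distinct slots of `results`, so no lock is needed. The static split becomes inefficient when items have very different costs, which is where dynamic schemes and their synchronization overhead come in.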

Improving the website

I had a great meeting with Stéphane Ribas 4 months ago. Here are some notes (translated from French) on what he suggested:

  • "Open source" does not appear on the website. LGPL (legal framework). "The software can be downloaded here." Links to Debian/Ubuntu…
  • Display the values we believe in: open source, open science (in team/about), performance, not reinventing the wheel.
  • Architecture of participation, collaboration guidelines: the mailing list is too far down; a forum? Handling of newcomers.
    • Things to start with
      • Have user documentation: it is the entry ticket for users
      • Have developer documentation
      • There is an IRC chat, a forum, and a bug tracker to put forward; feature requests
  • Links for tool dissemination
  • Links to the SC videos
  • Publications: the titles should be moved up into the tabs; the way it is split is unclear
  • Team: a table with the pieces of the software, identifying who is responsible for each and who contributes. Identifying these roles motivates people.
  • Contrib -> Friends,
  • Scientific Roadmap. Display them with USS, SONGS, software development (replacing the current History). This lets newcomers better see that the project is active and how they can fit in.
  • Communication and posters (how should we be cited?); page ranking needs to move.
  • Google alerts to follow bloggers and those who talk about this.
  • An article in programmez.com if we want.
  • Possibly have a tagline such as "It is hard to use, but here is what you gain", with benchmarks showing how we compare to the competition.
  • Forge link at the top right rather than as a tab.

Additional ideas to detail

FFI (binding unification)

Maestro turns off the light

PlanetSim

24h experiment

P2P algorithms

Pastry/splay

Chord experiment with peer.xml

Nested tasks (API merge)

EPR

Entered on [2012-01-24 Tue 18:16]