SimGrid "easy tasks" or internships

Edit on <2013-03-20 Wed>: This is a list Martin and I came up with a long time ago. Ideally, I would keep it updated, but right now I'm busy getting my blog working again.

Stock implementations of P2P algorithms

Localization

Skills

C/Java/Lua/… programming. Basic knowledge of distributed algorithms may help, but this may also be a good opportunity to learn them.

Description

Despite our efforts to provide examples, SimGrid still lacks a decent collection of inspiring ones. The reason is that our own codes based on SimGrid are generally too complex or too specific to demonstrate how to use SimGrid, so we only have a few rather simple examples. SimGrid now has a very efficient simulation engine that enables extremely scalable Peer-to-Peer simulations without trading accuracy for speed. Currently, we only have an implementation of the Chord protocol. Providing implementations of standard DHTs such as CAN, Pastry, or Tapestry would allow researchers to easily compare new protocols with the standard ones. It would also allow teachers to more easily build their lectures on top of SimGrid.
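
To give an idea of the size of such a stock implementation, here is a minimal Python sketch of a Chord-style lookup with finger tables. It is not SimGrid's Chord example: the ring is a static list of node identifiers, there is no joining, leaving, or messaging, and the names `successor`, `finger_table`, `in_interval` and `lookup` as well as the 6-bit identifier space are assumptions made for illustration.

```python
from bisect import bisect_left

M = 6  # identifier bits (assumption: a tiny 64-slot ring for readability)
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # sorted node ids (hypothetical)


def successor(ring, key):
    """First node id >= key, wrapping around the end of the ring."""
    return ring[bisect_left(ring, key) % len(ring)]


def finger_table(ring, node):
    """Finger i points to the successor of node + 2**i (mod 2**M)."""
    return [successor(ring, (node + 2 ** i) % 2 ** M) for i in range(M)]


def in_interval(x, a, b):
    """True if x lies in the ring interval (a, b]; a == b means the whole ring."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(ring, fingers, start, key):
    """Route from `start` to the node responsible for `key`.

    Returns (responsible node, number of forwarding hops).
    """
    node, hops = start, 0
    while not in_interval(key, node, fingers[node][0]):
        # Forward to the closest finger that strictly precedes the key.
        # The immediate successor always qualifies, so next() cannot fail.
        node = next(f for f in reversed(fingers[node])
                    if f != key and in_interval(f, node, key))
        hops += 1
    return fingers[node][0], hops


FINGERS = {n: finger_table(RING, n) for n in RING}
print(lookup(RING, FINGERS, 8, 54))  # prints (56, 2): 8 -> 42 -> 51, owner is 56
```

A SimGrid version would replace each forwarding step by a simulated message between hosts, which is where the simulator's timing models come into play.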

Future Event Set

Localization

Skills

Good level of C programming. Knowledge of Valgrind/callgrind would be appreciated, but this may also be a good opportunity to learn them.

Description

The core of a simulator manages events and dates thanks to a data structure called the future event set. The simulation kernel keeps inserting events (of the form "do this at time t") into this data structure and removing them from it. Currently, SURF (SimGrid's kernel) uses a simple heap that has a guaranteed complexity but is not necessarily that efficient in practice. Recently, a very nice data structure has been proposed for this purpose: the ladder queue. It has an amortized cost of O(1) for every operation, regardless of the statistical properties of the workload, and seems very promising. SURF would need a little refactoring and
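
For readers unfamiliar with the idea, the following Python sketch shows a heap-based future event set and the loop that drains it. It illustrates the baseline (a binary heap, via the standard `heapq` module), not the ladder queue and not SURF's actual code; the class `FutureEventSet`, the functions `run` and `demo`, and the event format are assumptions.

```python
import heapq
import itertools


class FutureEventSet:
    """Heap-based future event set: O(log n) insert and pop."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so that events with equal dates pop in insertion order
        # and the actions themselves are never compared.
        self._counter = itertools.count()

    def insert(self, date, action):
        heapq.heappush(self._heap, (date, next(self._counter), action))

    def pop_next(self):
        date, _, action = heapq.heappop(self._heap)
        return date, action

    def __len__(self):
        return len(self._heap)


def run(fes):
    """Pop events in date order; each action may insert new events."""
    log = []
    while fes:
        now, action = fes.pop_next()
        log.append((now, action(now, fes)))
    return log


def demo():
    def ping(now, fes):
        if now < 3.0:
            fes.insert(now + 1.0, ping)  # "do this at time t"
        return "ping"

    fes = FutureEventSet()
    fes.insert(0.5, ping)
    fes.insert(0.0, lambda now, fes: "start")
    return run(fes)


print([date for date, _ in demo()])  # prints [0.0, 0.5, 1.5, 2.5, 3.5]
```

Keeping the interface down to `insert` and `pop_next` is what would let another structure, such as a ladder queue, be swapped in and benchmarked against the heap.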

Surfing on multi-core machines

Localization

Skills

Good level of C programming. Average knowledge of concurrency (threads) and synchronization. Knowledge of Valgrind/callgrind would be appreciated, but this may also be a good opportunity to learn them.

Description

Recently, SimGrid started trying to take advantage of multicore machines. Simix (SimGrid's abstraction for managing user processes and synchronization) is now multi-core ready (see a recent INRIA research report), but other pieces of our code could benefit from the same kind of development, in particular SURF (SimGrid's kernel). An easy approach is simply to run the models in parallel. In principle there should be no difficulty, but this needs to be verified. Some kernels could then also be parallelized internally, but this is more complex.
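
The "run models in parallel" idea can be sketched as follows. This is a toy Python illustration, not SURF: the `Model` class, its `next_event_date` and `advance` methods, and the assumption that every action progresses at the model's full speed are all hypothetical simplifications. The point is only the structure: models that share no state are queried and advanced concurrently, and the kernel synchronizes on the earliest date.

```python
from concurrent.futures import ThreadPoolExecutor

INF = float("inf")


class Model:
    """Hypothetical stand-in for an independent resource model (CPU, network)."""

    def __init__(self, name, remaining_work, speed):
        self.name = name
        self.remaining = dict(remaining_work)  # action name -> work left
        self.speed = speed  # assumption: each action gets the full speed

    def next_event_date(self, now):
        """Date at which the first action of this model completes."""
        if not self.remaining:
            return INF
        return now + min(self.remaining.values()) / self.speed

    def advance(self, delta):
        """Progress all actions by `delta` seconds; return those that finished."""
        done = []
        for action in list(self.remaining):
            self.remaining[action] -= delta * self.speed
            if self.remaining[action] <= 1e-9:
                done.append(action)
                del self.remaining[action]
        return done


def simulate(models, workers=2):
    now, finished = 0.0, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            # Step 1: every model computes its next event date in parallel.
            dates = list(pool.map(lambda m: m.next_event_date(now), models))
            next_date = min(dates)
            if next_date == INF:
                break
            # Step 2: every model advances to that common date in parallel.
            delta = next_date - now
            for done in pool.map(lambda m: m.advance(delta), models):
                finished.extend((next_date, action) for action in done)
            now = next_date
    return finished


cpu = Model("cpu", {"task_a": 4.0, "task_b": 10.0}, speed=2.0)
net = Model("net", {"xfer": 3.0}, speed=1.0)
print(simulate([cpu, net]))
# prints [(2.0, 'task_a'), (3.0, 'xfer'), (5.0, 'task_b')]
```

In CPython the threads mostly show the synchronization pattern rather than a speed-up; in C, the two `map` steps would become the parallel regions, and the verification work consists of checking that the models really share no mutable state.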

Improving the performance of the Lua bindings

Localization

Skills

Good level of C and Lua programming. Basic knowledge of Valgrind/callgrind.

Description

SimGrid offers several bindings for higher-level languages (Java, Lua, Ruby, …) that make rapid prototyping of simulations much more efficient for users. Lua is very lightweight and more or less designed for this kind of usage. Lua simulations should be no more than about 10% slower than their C counterparts, and our current implementation of the Lua bindings seems far from this. One needs to investigate the reason for this performance loss and fix it.

File storage API in MSG

Localization

Skills

C programming. Good knowledge of SimGrid's layers and internal logic would help, but this may also be a good opportunity to learn them.

Description

There is no notion of files in SimGrid, as modeling the performance of disks is very challenging. Yet we have recently started collaborating with CERN researchers who really need such an abstraction and have lots of exciting ideas about storage performance modeling. One needs to propose an API at all levels of SimGrid (MSG, SIMIX, SURF, XML description) and to implement it. The SURF models will need to be traced with SimGrid's tracing mechanism so that disk usage and the impact of I/O can then be visualized with tools like Triva or Paje.

Power consumption

Localization

Skills

C programming. Good knowledge of SimGrid's layers and internal logic would help, but this may also be a good opportunity to learn them.

Description

Reducing power consumption has become a major issue when managing today's distributed platforms (for cloud providers, for large computing centers, …) and applications. For example, there was recently a workshop on this topic. The first presentation was given by Johnatan Pecero (Université du Luxembourg) and evaluated scheduling algorithms with the trade-off between power consumption and application response time in mind. Such a study could very easily be done in SimGrid (especially since we already have a collection of DAG scheduling heuristics implemented with SimDAG) if SimGrid could keep track of power consumption. One needs to propose an API (scale the CPU frequency down/up, turn the machine off/on, …) at all levels of SimGrid (MSG, SIMIX, SURF, XML description) and to implement it. Power consumption should be tracked with SimGrid's tracing mechanism so that it can then be visualized with tools like Triva or Paje.
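
To make the kind of API discussed above more concrete, here is a minimal Python sketch of per-host energy accounting. Nothing here is an existing SimGrid interface: the class `HostEnergy`, its methods, the (speed, watts) performance states, and the assumption that power is constant within a state regardless of load are all hypothetical.

```python
class HostEnergy:
    """Hypothetical energy bookkeeping for one host.

    `pstates` is a list of (speed in flop/s, power in watts) pairs.
    Assumption: power depends only on the current state, not on the load.
    """

    def __init__(self, pstates, off_power=0.0):
        self.pstates = pstates
        self.off_power = off_power
        self._pstate = 0
        self._on = True
        self._last_date = 0.0
        self._joules = 0.0

    def _power(self):
        return self.pstates[self._pstate][1] if self._on else self.off_power

    def _account(self, now):
        # Charge the elapsed interval at the power of the state being left.
        self._joules += self._power() * (now - self._last_date)
        self._last_date = now

    def set_pstate(self, now, index):
        """Scale the CPU frequency down/up."""
        self._account(now)
        self._pstate = index

    def turn_off(self, now):
        self._account(now)
        self._on = False

    def turn_on(self, now):
        self._account(now)
        self._on = True

    def consumed(self, now):
        """Energy in joules consumed from date 0 up to `now`."""
        self._account(now)
        return self._joules


host = HostEnergy([(1e9, 200.0), (5e8, 120.0)], off_power=10.0)
host.set_pstate(10.0, 1)    # 10 s at 200 W = 2000 J
host.turn_off(30.0)         # 20 s at 120 W = 2400 J
print(host.consumed(50.0))  # 20 s at 10 W = 200 J; prints 4600.0
```

Each state change is also a natural point at which to emit a tracing event, which is what would feed the visualization tools.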

Support for stochastic models in SimGrid

Localization

Skills

C programming. Basic knowledge of statistics and probability would be appreciated, but this may also be a good opportunity to learn them.

Description

Since its creation, SimGrid has had some support for handling variations in resource availability with traces. This is very convenient when one wants to study the sensitivity of algorithms to background load or unexpected load variations. A few years ago, we reworked this mechanism (see Bruno De Moura Donassolo, Henri Casanova, Arnaud Legrand, Pedro Velho. Fast and Scalable Simulation of Volunteer Computing Systems Using SimGrid. Workshop on Large-Scale System and Application Performance (LSAP), 2010) so that it works for Volunteer Computing systems. A drawback of the current approach is that one needs to feed every host (possibly thousands of them) with its own large availability trace file, and the simulation has become so fast that a non-negligible amount of time is now spent parsing these trace files. In several situations, one knows that
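
As one illustration of what stochastic support could replace per-host trace files with, the Python sketch below draws a host's availability events lazily from a seeded random generator. The exponential distribution, the mean durations, and the names `availability_events` and `first_events` are assumptions chosen for the example, not something SimGrid provides.

```python
import itertools
import random


def availability_events(seed, mean_up=3600.0, mean_down=600.0):
    """Lazily yield (date, is_available) state changes for one host.

    Assumption: up and down durations are exponentially distributed.
    Seeding with the host id makes each host's sequence reproducible
    without storing or parsing any trace file.
    """
    rng = random.Random(seed)
    now, up = 0.0, True
    while True:
        yield now, up
        mean = mean_up if up else mean_down
        now += rng.expovariate(1.0 / mean)  # rate = 1 / mean duration
        up = not up


def first_events(host_id, count):
    """First `count` availability events of the host with this id."""
    return list(itertools.islice(availability_events(seed=host_id), count))


for date, up in first_events(host_id=42, count=4):
    print(f"{date:10.1f} s  {'up' if up else 'down'}")
```

Because events are generated on demand, the cost per host is a few random draws per state change instead of reading a whole file up front.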