INRIA Project Team: MESCAL | Foreign partner institution: UC Berkeley (UCB)
INRIA Research Center: Grenoble
INRIA Theme: Num
Country: USA
Name: Kondo, Derrick | Anderson, David P.
Position: CR2 INRIA | Research Scientist, Director of BOINC & SETI@home
Department/laboratory: MESCAL project team | U.C. Berkeley Space Sciences Laboratory
Postal addresses:
Laboratoire LIG, ENSIMAG - antenne de Montbonnot, ZIRST 51, avenue Jean Kuntzmann, 38330 MONTBONNOT SAINT MARTIN, France
7 Gauss Way, Berkeley, CA 94720, USA
Collaboration topic (in English and French):
Cloud Computing over Internet Volunteer Resources.
Cloud computing sur ressources de calcul bénévoles.
Recently, a new vision of cloud computing has emerged in which the complexity of an IT infrastructure is completely hidden from its users. At the same time, cloud computing platforms provide massive scalability, 99.999% reliability, and high performance at relatively low cost for complex applications and services. In this proposed collaboration, we investigate the use of cloud computing for large-scale, demanding applications and services over the most unreliable but also most powerful resources in the world, namely volunteered resources over the Internet. The motivation is the immense collective power of volunteer resources (as evidenced by FOLDING@home's 3.9-PetaFLOPS system) and the relatively low cost of using such resources. We will focus on three main challenges. First, we will design and implement statistical and prediction methods for ensuring reliability, in particular, that a set of N resources remains available for a time T. Second, we will incorporate the notion of network distance among resources. Third, we will create a resource specification language so that applications can easily leverage these mechanisms for guaranteed availability and network proximity. We will address these challenges drawing on the experience of the BOINC team, which designed and implemented BOINC (a middleware for volunteer computing that is the underlying infrastructure for SETI@home), and the MESCAL team, which designed and implemented OAR (an industrial-strength resource management system that runs across Grid'5000, France's main 5000-node Grid).
Partners: Walfredo Cirne (Google), Sheng Di (INRIA), Derrick Kondo (INRIA)
Prediction of host load in Cloud systems is critical for achieving service-level agreements. However, accurate prediction of host load in Clouds is extremely challenging because it fluctuates drastically at small timescales. We have designed a prediction method based on a Bayes model to predict the mean load over a long-term time interval, as well as the mean load in consecutive future time intervals. We have identified novel predictive features of host load that capture the expectation, predictability, trends, and patterns of host load. We have also determined the most effective combinations of these features for prediction. Our method was evaluated using a detailed one-month trace of a Google data center with thousands of machines. Experiments show that the Bayes method achieves high accuracy, with a mean squared error of 0.0014. Moreover, the Bayes method improves load prediction accuracy by 5.6-50% compared to other state-of-the-art methods based on moving averages, auto-regression, and/or noise filters.
This joint work was published at the IEEE/ACM Supercomputing Conference 2012.
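To illustrate the idea behind this approach, the sketch below estimates the mean load of the next few intervals from a discretized "current load" state, using posterior means learned from a trace. It is a minimal stand-in for the published method: the real work uses several predictive features and a full Bayes model, whereas this sketch uses a single feature and a hypothetical synthetic trace.

```python
from collections import defaultdict

def discretize(load, n_bins=10):
    # Map a fractional load in [0, 1] to one of n_bins discrete states.
    return min(int(load * n_bins), n_bins - 1)

def train(trace, window=3, n_bins=10):
    """Collect, for each current-load state, the observed mean loads of the
    following `window` intervals.  These samples play the role of the
    evidence from which a posterior mean is computed."""
    evidence = defaultdict(list)
    for t in range(len(trace) - window):
        state = discretize(trace[t], n_bins)
        future_mean = sum(trace[t + 1 : t + 1 + window]) / window
        evidence[state].append(future_mean)
    return evidence

def predict(evidence, current_load, prior_mean=0.5, n_bins=10):
    """Posterior-mean prediction of the next window's mean load; falls back
    to the prior mean for states never seen in training."""
    samples = evidence.get(discretize(current_load, n_bins))
    if not samples:
        return prior_mean
    return sum(samples) / len(samples)

# Hypothetical synthetic trace (fractional CPU load per interval).
trace = [0.2, 0.25, 0.22, 0.8, 0.85, 0.78, 0.2, 0.24, 0.21, 0.79, 0.81, 0.8]
ev = train(trace)
pred = predict(ev, 0.23)
```

The fallback to a prior mean mirrors the Bayesian flavor of the approach: with no evidence for a state, the prediction reverts to prior belief rather than failing.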
Partners: Walfredo Cirne (Google), Sheng Di (INRIA), Derrick Kondo (INRIA)

A new era of Cloud Computing has emerged, but the characteristics of Cloud load in data centers are not yet well understood. Yet this characterization is critical for the design of novel Cloud job and resource management systems. In our Cluster 2012 paper, we comprehensively characterize the job/task load and host load in a real-world production data center at Google Inc., using a detailed trace of over 25 million tasks across over 12,500 hosts. We study the differences between a Google data center and other Grid/HPC systems, from the perspective of both workload (w.r.t. jobs and tasks) and host load (w.r.t. machines). In particular, we study the job length, job submission frequency, and the resource utilization of jobs in the different systems, and also investigate statistics of each machine's maximum load, queue state, and relative usage levels, across different job priorities and resource attributes. We find that the Google data center exhibits finer-grained resource allocation with respect to CPU and memory than Grid/HPC systems. Google jobs are submitted at much higher frequency and are much shorter than Grid jobs. As a result, Google host load exhibits higher variance and noise.
This joint work was published at the IEEE Cluster Conference 2012.
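The variance/noise contrast above can be quantified with the coefficient of variation of task lengths. A minimal sketch follows; the sample durations are hypothetical stand-ins, not values from the Google trace:

```python
import math

def coefficient_of_variation(samples):
    """CV = stddev / mean; a higher CV indicates noisier, more variable load."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return math.sqrt(var) / mean

# Illustrative only: short, frequent "Cloud-like" task lengths (seconds)
# versus long "Grid/HPC-like" job lengths.
cloud_tasks = [30, 45, 20, 600, 15, 50, 40, 25, 1200, 35]
hpc_jobs = [3600, 7200, 5400, 4000, 6000, 4800, 5000, 6500]

cv_cloud = coefficient_of_variation(cloud_tasks)
cv_hpc = coefficient_of_variation(hpc_jobs)
```

With these (invented) numbers the Cloud-like workload has a far larger CV, matching the qualitative finding that Google host load is noisier than Grid/HPC load.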
This work was published at the IEEE International Symposium on Cluster Computing and the Grid (CCGrid'12).
During their visits to UCB and Geneva, Bahman Javadi, Derrick Kondo, and David Anderson discussed new methods for modelling resource availability of Internet hosts.
Invariably, volunteer platforms are composed of heterogeneous hosts whose individual availability often exhibits different statistical properties (for example, stationary versus non-stationary behavior) and fits different models (for example, Exponential, Weibull, or Pareto probability distributions). They developed an effective method for discovering subsets of hosts whose availability has similar statistical properties and can be modelled with similar probability distributions. They applied this method to ~230,000 host availability traces obtained from SETI@home at UCB. The main contribution is the finding that hosts clustered by availability form 6 groups, which can be modelled by two probability distributions, namely the hyper-exponential and Gamma distributions. Both distributions are suitable for Markovian modelling. We believe that this characterization is fundamental to the design of stochastic algorithms for resource management across large-scale systems where host availability is uncertain.
This joint work was published at the MASCOTS 2009 conference and in the IEEE Transactions on Parallel and Distributed Systems, 2011. This work is also connected to the creation of the Failure Trace Archive, which was presented at CCGrid 2010.
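A minimal sketch of the modelling step, assuming availability is measured as uninterrupted-uptime intervals per host: fit a Gamma distribution by moment matching and bucket hosts by the fitted shape parameter. The published work uses maximum-likelihood fits, goodness-of-fit tests, and proper clustering; the host names, traces, and bucket boundaries here are hypothetical.

```python
def fit_gamma_moments(samples):
    """Method-of-moments Gamma fit: shape k = mean^2/var, scale = var/mean.
    A simpler stand-in for the maximum-likelihood fitting used in the paper."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean * mean / var, var / mean  # (shape, scale)

def group_by_shape(hosts, boundaries=(1.0, 5.0, 20.0)):
    """Bucket hosts by fitted Gamma shape: a crude analogue of clustering
    hosts whose availability follows similar distributions."""
    groups = {i: [] for i in range(len(boundaries) + 1)}
    for name, intervals in hosts.items():
        shape, _ = fit_gamma_moments(intervals)
        idx = sum(shape > b for b in boundaries)
        groups[idx].append(name)
    return groups

# Hypothetical uptime-interval traces (hours of uninterrupted availability).
hosts = {
    "host-a": [1.0, 2.0, 1.5, 0.5, 3.0],    # moderately variable
    "host-b": [10.0, 12.0, 9.0, 11.0],      # long, regular uptimes
    "host-c": [0.1, 5.0, 0.2, 8.0, 0.3],    # highly erratic
}
groups = group_by_shape(hosts)
```

The erratic host gets a shape below 1 (heavy concentration of short intervals), while the regular host gets a very large shape, which is the kind of separation the clustering exploits.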
As the main goal of this project is to create a cloud computing platform using volunteer resources, we first studied the tradeoffs of a volunteer cloud platform versus a traditional cloud platform hosted on a dedicated data center. In particular, we studied the performance and monetary cost-benefits of two volunteer clouds (SETI@home and XtremLab) and a popular traditional cloud (Amazon's EC2, EBS, and S3). To conduct this study, David Anderson (UCB) provided the resource usage and volunteer statistics of SETI@home, a popular volunteer cloud hosted at UCB, and Paul Malecot (INRIA) provided the resource usage and volunteer statistics of XtremLab, a volunteer cloud at INRIA. Together, David Anderson (UCB), Paul Malecot, Bahman Javadi, and Derrick Kondo (INRIA) carried out this comparative cost-benefit analysis.
During his visit to UCB and China, Derrick Kondo discussed with David Anderson ways in which traditional clouds could be combined with volunteer resources. The cost-benefit analysis in our first study showed that deploying applications across a mixture of traditional cloud resources and volunteer resources could lower costs even further. The results of our second study gave us models of availability that could be applied to improve the reliability of volunteer resources. Thus, in this work, we investigate the benefit for web services of using a mixture of cloud and volunteer resources, leveraging our predictive availability models.
We discuss an operational model which guarantees long-term availability despite host churn, and we study multiple aspects necessary to implement it. These include: ranking of non-dedicated hosts according to their long-term availability behavior, short-term availability modeling of these hosts, and simulation of migration and group availability levels using real-world availability data from 10,000 non-dedicated hosts participating in the SETI@home project, provided by UCB. We also study the tradeoff between a larger share of dedicated hosts versus a higher migration rate in terms of costs and SLA objectives. This yields an optimization approach in which a service provider can choose from a Pareto-optimal set of operational modes to find a suitable balance between costs and service quality. The experimental results show that it is possible to achieve a wide spectrum of such modes, ranging from 3.6 USD/hour to 5 USD/hour for a group of at least 50 hosts available with probability greater than 0.90.
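The group-availability target above (at least 50 hosts available with probability greater than 0.90) can be sketched as a redundancy calculation. Assuming, purely for illustration, independent hosts each available with probability p = 0.7 (a hypothetical figure, not a measured one), a binomial model gives the number of non-dedicated hosts to recruit:

```python
from math import comb

def group_availability(n, k, p):
    """P(at least k of n hosts are simultaneously available), assuming
    independent hosts each available with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def hosts_needed(k, p, target):
    """Smallest n such that at least k hosts are up with probability >= target."""
    n = k
    while group_availability(n, k, p) < target:
        n += 1
    return n

# Example: a group of at least 50 hosts, available with probability >= 0.90.
n = hosts_needed(50, 0.7, 0.90)
```

In practice host availabilities are neither independent nor identical, which is precisely why the ranking and short-term modeling steps described above are needed; this sketch only shows the shape of the redundancy calculation.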
This joint work was published at the NOMS 2010 conference.

With respect to idle data centers, Amazon Inc. is currently selling the idle resources of its data centers using Spot Instances. With the recent introduction of Spot Instances in the Amazon Elastic Compute Cloud (EC2), users can bid for resources and thus control the balance of reliability versus monetary cost. A critical challenge is to determine bid prices that minimize monetary costs for a user while meeting Service Level Agreement (SLA) constraints (for example, sufficient resource availability to complete a computation within a desired deadline). We propose a probabilistic model for the optimization of monetary costs, performance, and reliability, given user and application requirements and dynamic conditions. Using real instance price traces and workload models, we evaluate our model and demonstrate how users should bid optimally on Spot Instances to reach different objectives with desired levels of confidence. To harness idle resources on the Amazon Cloud via BOINC, we have successfully deployed a BOINC client and server in Amazon's cloud and are working on further integration.
This work was published at the 18th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), September 2010, and at the 3rd International Conference on Cloud Computing (IEEE CLOUD 2010), July 2010.
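One ingredient of such a bidding strategy can be sketched as an empirical-quantile heuristic: choose the smallest bid that would have kept the instance running (price at or below the bid) for the desired fraction of past hours. The price history below is hypothetical, and this is a deliberate simplification of the paper's full probabilistic model:

```python
import math

def minimal_bid(price_history, availability_target):
    """Smallest bid b such that the fraction of past spot prices <= b
    meets the desired availability level (an empirical quantile)."""
    prices = sorted(price_history)
    # Smallest order statistic covering the target fraction of observations.
    idx = max(0, math.ceil(availability_target * len(prices)) - 1)
    return prices[idx]

# Hypothetical hourly spot prices (USD) for one instance type.
history = [0.031, 0.030, 0.032, 0.045, 0.030, 0.031, 0.060, 0.033, 0.031, 0.032]
bid = minimal_bid(history, 0.90)
```

Bidding any lower than the returned price would, on this history, have fallen below the target availability; bidding higher only increases cost.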
One sub-goal of Research Thrust #1 was the incorporation of virtual machines in BOINC. A major issue in this incorporation is the configuration of the virtual machine, and in particular of the virtual machine image itself. Scientists should be able to easily customize a virtual image of the operating system and the required libraries and applications. This image can then be deployed across a distributed computing environment, such as BOINC or Grid'5000. Y. Georgiou et al. have implemented a virtualization tool called Kameleon that enables such customization of the virtual image. This tool takes as input the image specification and description. The specification describes various parameters such as the OS distribution (for example, Debian Lenny or CentOS 5), kernel, packages/libraries (OpenMPI, MPICH), benchmarks (LINPACK, NAS), and output image format (ISO, COS, TGZ). The description details what actions to take using those parameters, organized in macrosteps and microsteps. Examples of macrosteps are initializing the module repository and configuring the package manager. Examples of microsteps are checking a command's error code, executing a bash script on the host system, or writing to a file of the image. The Kameleon engine was implemented in Ruby. The configuration files are specified using YAML, a markup language designed to be easily mapped to data structures such as lists, hashes, and scalars.
The tool was intentionally designed to be generic and applicable to any system that uses virtualization, such as Grid'5000 or BOINC. The next step will be to integrate the tool with virtualization in BOINC.
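To make the specification/description structure concrete, the sketch below mirrors the described parameters and macrostep/microstep layout as a Python data structure. The field names and values are illustrative only and do not follow the actual Kameleon YAML schema:

```python
# A Kameleon-style recipe sketched as a Python data structure.  Every field
# name and value here is a hypothetical stand-in for the real YAML schema.
recipe = {
    "specification": {
        "distrib": "debian-lenny",
        "kernel": "2.6.26",                  # hypothetical version string
        "packages": ["openmpi", "mpich"],
        "benchmarks": ["linpack", "nas"],
        "output_format": "ISO",
    },
    "macrosteps": [
        # Each macrostep maps a name to an ordered list of microsteps.
        {"init_repository": [
            {"exec": "apt-get update"},
            {"check_errcode": True},
        ]},
        {"configure_package_manager": [
            {"write_file": {"path": "/etc/apt/sources.list",
                            "content": "deb http://example.org/debian lenny main"}},
        ]},
    ],
}

def microstep_count(recipe):
    """Count microsteps across all macrosteps of a recipe."""
    return sum(len(steps) for macro in recipe["macrosteps"]
               for steps in macro.values())
```

The point of the nesting is that YAML maps directly onto exactly these lists, hashes, and scalars, which is why the engine can interpret a recipe without a custom parser.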
OAR is the current production batch scheduler of Grid'5000. However, it was not originally designed to use Internet volunteer computing resources. In particular, volunteer computing resources must use HTTP to communicate with the OAR server, and must pull (rather than push) jobs from the server to bypass firewalls. We made significant changes to the OAR server API so that a desktop agent can get jobs (executable, inputs), post the job state, post results, and kill jobs from the OAR server, all through a REST architecture. The REST (Representational State Transfer) architecture is advantageous because it inherently allows for scalable component interactions and provides a uniform interface. We implemented a desktop agent that uses this API to retrieve, execute, and return jobs.
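A minimal sketch of such a pull-based agent loop follows. The endpoint paths and job JSON layout are hypothetical (they are not the actual OAR REST API), and the HTTP transport is injected as plain functions so the sketch runs without a live server; a real agent would wrap urllib or a similar HTTP client:

```python
import json

def run_agent_once(base_url, http_get, http_post, execute):
    """One iteration of a pull-based desktop agent:
    pull a job, report state, execute, post the result."""
    # 1. Pull the next job description (executable + inputs) from the server.
    job = json.loads(http_get(base_url + "/jobs/next"))
    if job is None:
        return None                              # no work available
    # 2. Report the state change, run the job, then post the result.
    http_post(base_url + "/jobs/%d/state" % job["id"], {"state": "running"})
    result = execute(job["executable"], job["inputs"])
    http_post(base_url + "/jobs/%d/result" % job["id"], {"output": result})
    return job["id"]

# Usage with a stub transport standing in for HTTP:
server = {"/jobs/next": json.dumps({"id": 7, "executable": "wc",
                                    "inputs": ["a.txt"]})}
posts = []
jid = run_agent_once("http://oar.example",
                     lambda url: server[url.replace("http://oar.example", "")],
                     lambda url, body: posts.append((url, body)),
                     lambda exe, inputs: "ok")
```

Because every interaction is an HTTP request initiated by the agent, the loop works unchanged behind NATs and firewalls, which is the motivation for the pull design.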
The OAR desktop agent will fit into the CloudShare deployment as follows. A virtual machine image will have the desktop agent pre-installed and configured to start on system boot-up. This virtual machine (VM) image will be deployed and controlled by the BOINC client running on volunteer desktops. Once the VM image is deployed and booted, the desktop agent will pull and execute jobs from the OAR server using the REST API. The BOINC server will essentially only be used to deploy the virtual machine, and all job interaction will be between the desktop agent and OAR, bypassing the BOINC server completely. This design choice was made to prevent synchronization issues between the OAR and BOINC servers in terms of their job and host tables.

The source code for the REST API of OAR and the desktop agent is publicly available.
At a joint meeting among David Anderson, Derrick Kondo, and Sangho Yi at UC Berkeley, they discussed ways to improve BOINC so that it can support demanding real-time applications. One major challenge is the server-side management of these tasks, which often number in the tens or hundreds of thousands on a centralized server. In the work described in Euro-Par 2010, we designed and implemented a real-time task management system for many-task computing, called RT-BOINC. The system gives low O(1) worst-case execution time for task management operations, such as task scheduling, state transitioning, and validation. We implemented this system on top of BOINC. Using micro- and macro-benchmarks executed in emulation experiments, we showed that RT-BOINC provides significantly lower worst-case execution times and narrows the gap between average and worst-case performance compared with the original BOINC implementation.
This joint work was published at the European Conference on Parallel and Distributed Computing (Euro-Par) in 2010.
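The O(1) worst-case property can be illustrated with a fixed-capacity task table plus a free list, in which allocation, state transition, and release are all constant-time indexed operations. This is a toy analogue of the RT-BOINC idea, not its actual implementation:

```python
class TaskPool:
    """Fixed-capacity task table with a free list.  No operation searches,
    sorts, or rebalances, so each is constant-time (amortized O(1) for
    Python lists; a true O(1) bound needs a fixed-size C array)."""

    def __init__(self, capacity):
        self.state = ["FREE"] * capacity
        self.free = list(range(capacity - 1, -1, -1))  # stack of free slots

    def alloc(self):
        # Pop a slot off the free stack: no scan over the table.
        if not self.free:
            raise RuntimeError("pool exhausted")
        slot = self.free.pop()
        self.state[slot] = "READY"
        return slot

    def transition(self, slot, new_state):
        # Direct indexed update, e.g. READY -> RUNNING -> DONE.
        self.state[slot] = new_state

    def release(self, slot):
        # Push the slot back on the free stack.
        self.state[slot] = "FREE"
        self.free.append(slot)

# A server managing a hundred thousand task slots.
pool = TaskPool(100000)
t = pool.alloc()
pool.transition(t, "RUNNING")
pool.transition(t, "DONE")
pool.release(t)
```

Pre-allocating the whole table trades memory for a bounded worst case, which is the essential contrast with the database-backed lookups of the original BOINC server.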
During his visit to UC Berkeley, Arnaud Legrand discussed with David Anderson the issue of large-scale volunteer computing simulation and its visualization and analysis.
To evaluate different policies for fault-tolerance achieved via virtual machines, we need a volunteer computing (VC) simulator that allows for reproducible and configurable experiments. The main issue when developing VC simulators is scalability: how can one simulate large-scale VC platforms with reasonable amounts of memory and in reasonable time? To achieve scalability, state-of-the-art VC simulators employ simplistic simulation models and/or target narrow platform and application scenarios. In this work, we enable VC simulations using the general-purpose SimGrid simulation framework, which provides significantly more realistic and flexible simulation capabilities than the aforementioned simulators. The key contribution is a set of improvements to SimGrid so that it brings these benefits to VC simulations while achieving good scalability.
The scalability of the simulations was evaluated using SETI@home traces collected by the BOINC team. The simulator also uses a simulation model of the BOINC client, which applies several complex policies for dealing with host failures, application deadlines, and user constraints.
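The kind of trace-driven experiment such a simulator enables can be sketched as follows: replay a host's availability intervals and count how many fixed-length tasks complete, with and without checkpointing. The trace format and policy here are deliberately simplistic compared to the BOINC client model:

```python
def simulate_work(avail_intervals, task_len, checkpointing=False):
    """Number of tasks completed over a host's availability intervals.

    avail_intervals: list of (start, end) times the host is up; this
    hypothetical format is far simpler than real SETI@home traces.
    Without checkpointing, a task interrupted by a failure restarts from
    scratch, so partial work at the end of each interval is lost.
    """
    done = 0
    progress = 0.0
    for start, end in avail_intervals:
        budget = end - start
        if checkpointing:
            progress += budget                # work carries across failures
            done += int(progress // task_len)
            progress %= task_len
        else:
            done += int(budget // task_len)   # partial work is lost
    return done

# Hypothetical trace: three availability intervals of a single host.
trace = [(0, 10), (12, 15), (20, 37)]
no_ckpt = simulate_work(trace, 4)
with_ckpt = simulate_work(trace, 4, checkpointing=True)
```

Even this toy replay shows why availability modeling matters: the same trace yields different throughput under different fault-tolerance policies, which is exactly what the SimGrid-based simulator measures at scale.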
This work was published at LSAP 2010 and in Concurrency and Computation: Practice and Experience, 2011.