Direction des Relations Internationales (DRI)
ASSOCIATE TEAM | CloudComputing@home | selection 2009 |
INRIA project team: MESCAL | Foreign partner organization: UC Berkeley |
INRIA research center: Grenoble Rhône-Alpes | INRIA theme: Num |
Country: USA |
| French coordinator | Foreign coordinator |
Last name, first name | Kondo, Derrick | Anderson, David P. |
Grade/status | CR INRIA | Research scientist, Director of BOINC & SETI@home |
Home organization (department and/or laboratory) | MESCAL project team | U.C. Berkeley Space Sciences Laboratory |
Postal address | Laboratoire LIG, ENSIMAG - antenne de Montbonnot, ZIRST, 51 avenue Jean Kuntzmann, 38330 MONTBONNOT SAINT MARTIN, France | 7 Gauss Way, Berkeley, CA 94720 |
URL | http://mescal.imag.fr/membres/derrick.kondo/index.html | http://boinc.berkeley.edu/anderson |
Telephone | +33/0.4.76.61.20.61 | +1.510.642.4921 |
Fax | +33/0.4.76.61.20.99 | +1.510.643.7629 |
Email | derrick.kondo@inria.fr | davea@ssl.berkeley.edu |
Title of the collaboration topic (in French and in English): Cloud Computing over Internet Volunteer Resources. Cloud computing sur ressources de calcul bénévoles. |
Description:
Recently, a new vision of cloud computing has emerged where the complexity of an IT infrastructure is completely hidden from its users. At the same time, cloud computing platforms provide massive scalability, 99.999% reliability, and speedy performance at relatively low costs for complex applications and services. In this proposed collaboration, we investigate the use of cloud computing for large-scale and demanding applications and services over the most unreliable but also most powerful resources in the world, namely volunteered resources over the Internet. The motivation is the immense collective power of volunteer resources (as evidenced by FOLDING@home's 3.9-PetaFLOPS system), and the relatively low cost of using such resources. We will focus on three main challenges. First, we will determine, design and implement statistical and prediction methods for ensuring reliability, in particular, that a set of N resources is available for T time. Second, we will incorporate the notion of network distance among resources. Third, we will create a resource specification language so that applications can easily leverage these mechanisms for guaranteed availability and network proximity. We will address these challenges drawing on the experience of the BOINC team, which designed and implemented BOINC (a middleware for volunteer computing that is the underlying infrastructure for SETI@home), and the MESCAL team, which designed and implemented OAR (an industrial-strength resource management system that runs across France's main 5,000-node grid, Grid'5000). La notion de «cloud computing» apparue récemment a pour objectif de complètement masquer la complexité des infrastructures numériques actuelles à leurs utilisateurs. Ces plates-formes de cloud computing doivent donc massivement passer à l'échelle, être extrêmement fiables et fournir d'excellentes performances à faible coût aux applications et aux services qu'elles offrent. 
Dans le cadre de la collaboration que nous proposons, nous regarderons comment mettre en oeuvre cette idée de cloud computing pour des applications complexes et exigeantes sur les plates-formes les plus instables qui soient, mais également les plus puissantes: les ressources de calcul bénévole accessibles à partir d'Internet. La motivation derrière ce défi est l'incroyable puissance de calcul résultant de ces ressources (3.9 PetaFLOPS, rien que pour FOLDING@home) et leur très faible coût. Nous nous concentrerons sur trois axes principaux. Premièrement, nous chercherons à concevoir et à mettre en oeuvre des méthodes statistiques et de prédiction permettant d'améliorer la fiabilité de ces ressources. En particulier, nous nous intéresserons à la probabilité qu'un ensemble de N machines soit disponible durant une période de temps T. Deuxièmement, nous couplerons ces garanties de disponibilité de ressources avec des informations de proximité réseau entre elles. Enfin, nous étendrons un langage de spécification de ressources classique afin que les applications et les services soient à même d'exprimer des besoins en termes de disponibilité et de proximité réseau. Nous relèverons ces défis en nous appuyant sur les expériences respectives des équipes impliquées: l'équipe BOINC a conçu et mis en oeuvre le middleware BOINC qui est utilisé à travers le monde entier et est l'infrastructure sous-jacente à SETI@home; l'équipe MESCAL a conçu et mis en oeuvre OAR, un outil de gestion de ressources et de tâches de qualité industrielle et qui est notamment au coeur de Grid'5000. |
1. Scientific objectives of the proposal
Background on Cloud Computing
Recently, a new vision of cloud computing has emerged where the complexity of an IT infrastructure is completely hidden from its users. At the same time, cloud computing platforms are attractive because they provide massive scalability, 99.999% reliability, and speedy performance at relatively low costs.
From the perspective of a user, there are two main types of cloud computing platforms. The first type is Software-as-a-Service, where applications (such as Google Apps) are accessed through the web. The second type is Platform-as-a-Service, where a third party provides and manages a computing or storage infrastructure, and users purchase time or resources in this infrastructure. Typical Platform-as-a-Service architectures, such as Amazon's EC2 and S3, are large-scale and centralized data and compute centers.
From the perspective of developers, computing clouds consist of three types of entities [2]. The first type is Enablers, which provide the data-center automation or server virtualization technologies that form the fundamental building blocks of cloud computing; examples of Enablers are the hypervisors from Xen or VMware. The second type is Providers, which supply the infrastructure behind Software- or Platform-as-a-Service and maintain their own billing models for the use of those services; an example of a Provider is Amazon's EC2 cloud. The third type is Consumers, which utilize cloud infrastructures for software or platform services; an example of a Consumer is Google Apps.
Background on Volunteer Computing
Goals of collaboration
The goal of this proposal is to develop a cloud computing middleware that enables the execution of complex services and applications over unreliable volunteered resources over the Internet. In terms of reliability, these resources are often unavailable 40% of the time, and exhibit frequent churn (several times a day). In terms of "real, complex services and applications", we refer to large-scale service deployments, such as Amazon's EC2, the TeraGrid, and the EGEE, and also applications with complex dependencies among tasks. These commercial and scientific services and applications need guaranteed availability levels of 99.999% and knowledge of network proximity in order to have efficient and timely execution.
The main motivation for using volunteered resources is that they can be orders of magnitude cheaper and more powerful than dedicated cloud environments. Computing clouds such as Amazon's EC2 have a pay-per-use billing model where charges are incurred per CPU hour or per GB of data transferred. Given that FOLDING@home delivers 3.9 PetaFLOPS, this would cost the project about 6.7 million USD per year if it could run over Amazon's EC2 (assuming a rate of $0.10 per small-instance hour). Even for an "average" volunteer computing project that uses 20 TeraFLOPS, this equates to about 340,000 USD per year. By contrast, we propose to offer a cloud computing service over volunteered resources where costs for the application are almost zero. For example, we would offer our cloud computing platform as a computational co-op where the relatively low monetary costs to maintain the infrastructure would not increase with resource usage or the number of CPUs or volunteers.
With this goal, our collaboration will focus on the following three research thrusts:
The collaboration will draw upon the complementary research interests and experience of the two teams. The Berkeley team will provide its expertise in developing the scalable and fault-tolerant BOINC middleware for use over volunteered resources. The MESCAL team will provide its expertise in availability prediction, simulation of distributed systems, and development of resource management systems. In particular, we will leverage the MESCAL team's work on developing the OAR resource management system over unreliable resources. OAR is an industrial-strength resource management system for the main nation-wide French grid, Grid'5000, where it is used daily to execute thousands of jobs on over 5,000 nodes. Development of OAR is led by Olivier Richard, a member of this proposed collaboration. In essence, we aim to combine and augment the resource management mechanisms of OAR with the novel fault-tolerance mechanisms of BOINC, allowing complex applications and services to leverage the power of volunteer computing.
Research thrust #1: Virtually dedicated platform
Volunteer resources are arguably the most volatile computational resources world-wide. They exhibit low availability (on average 60%), and high churn (on the order of hours). Nevertheless, our vision is to provide the illusion of a fully dedicated platform where resources appear to be 100% available. This in turn would help enable the deployment of complex applications over volatile resources. Our approach is to use statistical and predictive methods combined with virtual machine technology to ensure the availability of a collection of resources.
Our preliminary results on prediction of collective availability described in [1] are promising. In that joint study, we gathered real-world availability data from over 48,000 Internet hosts participating in the SETI@home project. With this trace data, we showed how to reliably and efficiently predict that a collection of N hosts will be available for T time. The results indicate that by using prediction combined with replication it is feasible to deploy enterprise services or applications over volatile resource pools with little overhead.
In step 1, we use our prediction model to determine which set of machines is available during some prediction interval T. Out of those machines, we choose N+X machines, where N is the number of machines required by the application, and X is the number of extra machines used to replicate the service or application. The application or service is then deployed over this initial group for T time. The idea is that if a certain number of predictions are wrong and several machines fail (up to X in number), the application or service can continue to execute over the remaining N machines. In step 2, after the period T expires, we rerun the predictor to determine a new set of available machines and replace any failed machines from the selected pool with these machines.
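As an illustrative sketch of this two-step procedure (not the predictor of [1], which is more sophisticated), a simple threshold-based predictor could drive the selection and refresh as follows; all names and the threshold value here are hypothetical:

```python
def predict_available(history, threshold=0.9):
    """Hypothetical predictor: a host is predicted available for the
    next interval T if its measured availability fraction over past
    intervals exceeds a threshold."""
    return [host for host, avail in history.items() if avail >= threshold]

def select_group(history, n, x, threshold=0.9):
    """Step 1: pick N + X hosts predicted available for the interval T.
    N hosts run the service; X extras absorb mispredictions."""
    candidates = predict_available(history, threshold)
    if len(candidates) < n + x:
        raise RuntimeError("not enough predicted-available hosts")
    return candidates[: n + x]

def refresh_group(group, failed, history, n, x, threshold=0.9):
    """Step 2: after T expires, rerun the predictor and replace failed
    hosts so the group is back to N + X members."""
    survivors = [h for h in group if h not in failed]
    pool = [h for h in predict_available(history, threshold)
            if h not in survivors and h not in failed]
    needed = (n + x) - len(survivors)
    return survivors + pool[:needed]
```

The key design point is that the application only ever sees the N working machines; the X replicas and the periodic refresh are hidden behind the middleware.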
One major issue is how to ensure uninterrupted execution of an application or service despite machine failures during the process described above. We propose to use virtual machines as a means of conducting lightweight process migration across resources. Clearly, in low-bandwidth Internet environments, one cannot afford to migrate entire virtual machines gigabytes in size to checkpoint servers.
Instead, our approach is to migrate only the portion of the virtual machine (VM) that has changed locally on a volunteer resource. Specifically, virtual machines deployed over volunteer machines are often stored locally as a read-only raw image. VM instances are created from that raw image, and modifications to that image cause a copy-on-write (COW). An overlay image stores modifications to the raw image as the application or service executes within the VM instance. These overlay images are often much smaller than the full raw images. As such, we propose to have overlay images stored periodically at a remote site. The remote site could be a centralized and dedicated data server, or a fully P2P network such as BitTorrent.
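A minimal sketch of the underlying idea, shipping only the blocks of the overlay image that changed since the last checkpoint, assuming fixed-size blocks and block hashing (the actual COW overlay format and transfer protocol are left unspecified here):

```python
import hashlib

BLOCK = 4096  # checkpoint granularity in bytes (illustrative value)

def block_hashes(path):
    """Hash each fixed-size block of the overlay image; the resulting
    list is kept by the checkpoint server as the last-known state."""
    hashes = []
    with open(path, "rb") as f:
        while chunk := f.read(BLOCK):
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes

def changed_blocks(path, previous_hashes):
    """Return (index, data) pairs for blocks that differ from the last
    checkpoint; only these need to travel to the remote site."""
    out = []
    with open(path, "rb") as f:
        i = 0
        while chunk := f.read(BLOCK):
            if (i >= len(previous_hashes)
                    or hashlib.sha256(chunk).hexdigest() != previous_hashes[i]):
                out.append((i, chunk))
            i += 1
    return out
```

In a low-bandwidth volunteer setting, this delta checkpointing is what makes periodic remote storage of overlay images affordable compared to shipping the whole image.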
Research thrust #2: Network distance
Clearly, in a distributed Internet environment, distributed and parallel applications need not only high resource availability, but also knowledge of network distance. Fortunately, a plethora of related work has focused on methods for determining network distance in distributed systems. As such, we will apply the most applicable results of this research rather than invent our own. To the best of our knowledge, there is no volunteer computing system that currently provides knowledge of network proximity, and the details of network location in commercial cloud architectures are cloudy. Thus, our contribution will be the design and implementation of a network distance service in the context of a new cloud computing middleware, and a demonstration of its effectiveness.
In particular we will build upon the work in [3]. In this work, the authors describe a method where resources divide and place themselves in bins; nodes are placed in bins depending on how close they are in terms of network latency. This is determined by pinging a small set of well-known landmarks.
There are three main advantages of this approach. First, it is simple in that no special measurement infrastructure is needed nor is intra-node communication required. (Note that intra-node communication would be a major issue as BOINC does not have a P2P architecture but instead relies on a distributed set of project servers.) Nodes simply need to know their distance from a relatively small number (~15) of well-known landmarks, which can be discovered with a simple ping. Nodes do not need to know the distance among landmarks nor the distance of other nodes from these landmarks. Second, it is scalable in that nodes do not need any global information to determine network proximity. Also, the method has been shown to work with over 1 million nodes using fewer than 15 landmarks. The BOINC client that runs on each volunteer resource is by default already installed with knowledge of 25 project servers. As such, these project servers could serve as the landmarks required for determining network proximity. Third, it is accurate, as the authors showed that the method will place nearby nodes in the same bin.
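A sketch of landmark-based binning in the spirit of [3]; the latency thresholds and the three-landmark setup are illustrative assumptions, not values from that work:

```python
def bin_for(latencies_ms, levels=(20, 100)):
    """Map a node's ping latencies to each landmark onto an ordinal
    bin vector: below 20 ms is level 0, below 100 ms level 1,
    otherwise level 2 (thresholds are illustrative). Nodes with equal
    bin vectors are considered close in network distance."""
    def level(latency):
        for lvl, bound in enumerate(levels):
            if latency < bound:
                return lvl
        return len(levels)
    return tuple(level(l) for l in latencies_ms)
```

Because each node computes its own vector independently from a handful of pings, no global coordination is needed, which is what makes the scheme attractive for BOINC's server-centric architecture.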
Research thrust #3: Resource Description Language
In the previous sections, we described mechanisms for ensuring availability of machines and determining their network proximity. Certainly, services and applications will have different requirements for resource selection when using these mechanisms.
In particular, we would like to support resource selection criteria that describe hard constraints on availability and network proximity.
We will augment the OAR resource management system by integrating information about resource availability and predictions, as well as network distance data. OAR itself is built on the relational database engine MySQL. OAR users can already specify detailed resource requirements for their jobs through a human-readable, structured, and hierarchical language. For example, one can specify requirements for clusters, switches, nodes, CPUs and even cores. We will add new capabilities to the resource description language of OAR so that job requirements concerning resource availability and network proximity can easily be expressed.
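As an illustration, a job request in such an extended OAR syntax might look as follows; the `netbin` resource level and the `predicted_availability` property are hypothetical extensions sketched for this proposal, not part of OAR's current language:

```
# Hypothetical extension of OAR's hierarchical resource requests:
# 8 nodes drawn from the same network-proximity bin, predicted
# available for the next 2 hours with probability >= 0.95.
oarsub -l "netbin=1/nodes=8,walltime=2:00:00" \
       -p "predicted_availability >= 0.95"
```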
The participating teams.
The Berkeley Open Infrastructure for Network Computing (BOINC) is a research project that develops middleware for volunteer computing. The BOINC middleware is used by SETI@home, Climateprediction.net, Einstein@home, and a number of other scientific computing projects. Since 2000, over 100 scientific publications (appearing in prestigious journals such as Nature and Science) have resulted from volunteer computing projects. Since its initial creation in 2002, BOINC has become one of the most powerful and most influential distributed computing systems in the world, harnessing about 1.1 PetaFLOPS of computing power from over a million Internet hosts.
MESCAL (Middleware Efficiently SCALable) is a joint team with members from CNRS, INPG, INRIA and UJF, all of whom are in the LIG laboratory. The goal of MESCAL is to design software solutions for the efficient exploitation of large distributed architectures at metropolitan, national and international scales. The main applications are intensive scientific computations.
The methodology is based on:
List of the researchers involved in the proposal and a brief CV of the leader.
David P. Anderson (Research Scientist, UC Berkeley) is a pioneer and global leader in volunteer and distributed computing. D. Anderson received his PhD in Computer Science from the University of Wisconsin - Madison in 1985. Currently, he is director of BOINC (Berkeley Open Infrastructure for Network Computing), a research project that develops middleware for volunteer computing. He also directs the SETI@home project, a research project that uses Internet-connected computers to analyze radio-telescope data in the search for extraterrestrial intelligence (SETI). Since its public launch in May 1999, over 3,000,000 people have contributed 2 million years of computer time, making SETI@home the largest computation ever performed. In the past, he served on the faculty of the Computer Science Department at UC Berkeley, where he led research in operating systems, distributed systems and networks, and multimedia systems. He has authored 69 papers in Computer Science, with 17 papers in refereed journals (including Communications of the ACM, ACM Transactions on Graphics, ACM Transactions on Computer Systems, IEEE Computer, and IEEE Transactions on Software Engineering), 28 conference papers, and 13 unpublished technical reports. He has held a number of leadership positions in industry, including Chief Technology Officer of United Devices, an enterprise desktop grid company. For details, please see David Anderson's CV.
Derrick Kondo (CR, INRIA MESCAL) received his PhD in Computer Science from the University of California at San Diego, and his BS from Stanford University. His interests lie in the area of volunteer computing and desktop grids. In particular, he leads research on the measurement and characterization of Internet distributed systems, their simulation and modelling, and resource management. He founded and continues to serve as co-chair of the Workshop on Volunteer Computing and Desktop Grids, and also co-chaired the last BOINC workshop on volunteer computing and distributed thinking. Currently, he is serving as guest co-editor of a special issue of the Journal of Grid Computing on volunteer computing and desktop grids. His recent paper on characterizing computation errors in Internet-distributed systems won the best paper award at the 13th European Conference on Parallel and Distributed Computing. For details, please see Derrick Kondo's CV.
Rom Walton (Senior Staff, UC Berkeley)
Olivier Richard (Prof. UJF, INRIA MESCAL)
Arnaud Legrand (CR, CNRS, INRIA MESCAL)
Students involved in the proposal.
History of the collaboration between the teams.
Joint publications:
Statistics: 65 attendees (17 young researchers/students from INRIA [MOAIS, GRAAL, MESCAL, ALGORILLE], CNRS, LIP6, or other French organizations), 13 countries, 22 affiliations.
Visitor (team) | Host (team) | Date | Purpose |
David Anderson, Rom Walton (BOINC) | Derrick Kondo (MESCAL) | 09/2008 | Discussed use of prediction in BOINC |
David Anderson (BOINC) | Derrick Kondo (MESCAL) | 04/2008 | Discussed characterization of availability traces |
Derrick Kondo (MESCAL) | David Anderson (BOINC) | 12/2007 | Discussed scheduling in BOINC |
Derrick Kondo (MESCAL) | David Anderson (BOINC) | 12/2006 | Discussed simulation of scheduling heuristics in BOINC |
Derrick Kondo (MESCAL) | David Anderson (BOINC) | 05/2006 | Discussed trace collection and scheduling in BOINC. |
David Anderson (BOINC) | Derrick Kondo (MESCAL) | 11/2004 | Discussed design and implementation of BOINC |
Derrick Kondo (MESCAL) | David Anderson (BOINC) | 04/2002 | Discussed statistics collected in SETI@home |
Joint and Related Professional Activities
David Anderson et al. and Derrick Kondo et al. each started a volunteer computing workshop (the Pan-Galactic BOINC workshop and the PCGrid workshop, respectively). D. Anderson has served on the program committee of PCGrid for the past two years since the workshop's creation, and he has also been an invited speaker. D. Kondo presented his most recent work at the last two BOINC workshops.
Derrick Kondo hopes to extend the collaboration between UC Berkeley and INRIA by forming an Associate Team. Arnaud Legrand is currently co-principal investigator on a "Young Researchers" ANR-funded project on volunteer computing titled "The Design and Optimization of Collaborative Computing Architectures" (DOCCA). The main goal is to combine theoretical tools and metrics from the parallel computing and network communities to develop algorithmic and analytical solutions to specific resource management problems of volunteer computing systems. For example, one issue is the incentives needed to ensure both fair and optimal usage of resources. Potentially, the algorithms developed in that project could be applied to and integrated with the system detailed in this proposal. Thus the expertise and research interests of the MESCAL team complement those of the UC Berkeley team that designed and developed BOINC.
Pages of the people, laboratories, and organizations involved.
3. Impact
In terms of the scientific objectives of the two teams, the formation of this Associate Team would allow the collaboration between INRIA MESCAL and UC Berkeley to grow team-wide, facilitating a bi-directional transfer of science and technology. The strength of the BOINC system is its mechanisms for fault-tolerance and error detection, as it was designed to run over the world's most volatile resources. BOINC was designed to handle failures or errors anywhere in the software stack (application, operating system) or hardware stack (CPUs, memory, disk). As such, it exemplifies a software system built to handle all types of failures. The strength of OAR is its advanced mechanisms for resource management (for example, hierarchical resource requests, Gantt-chart scheduling, and batch and interactive jobs), and its robust and scalable design. For the MESCAL team, the collaboration would orient the design and implementation of OAR to consider the extensive fault-tolerance techniques used in BOINC. For the BOINC team, the collaboration would allow for the improvement of resource management in BOINC so that users can more easily submit jobs and express their requirements. Moreover, with joint efforts focused on building a cloud computing platform from BOINC and OAR components with the integration of new network proximity, VM and prediction technologies, the collaboration could result in a new middleware with the "best of both worlds".
At a high level, the real-world issues affecting BOINC (one of the world's largest, most volatile distributed systems) would be used to steer applicable research in the INRIA MESCAL team and conversely, the novel scheduling and predictive algorithms, and advanced resource management systems (such as OAR) designed by INRIA MESCAL could influence and be implemented and integrated in a planetary-scale distributed system.
Introduction
The goal of this proposal is to develop a cloud computing middleware that enables the execution of real, complex services and applications over unreliable resources. We target computational resources across the Internet that are volunteered. In terms of reliability, these resources are often unavailable 40% of the time, and exhibit frequent churn (several times a day). In terms of "real, complex services and applications", we refer to large-scale service deployments, such as Amazon's EC2, the TeraGrid, and the EGEE, and also applications with complex dependencies among tasks. These commercial and scientific services and applications need guaranteed availability levels of 99.999% and knowledge of network proximity in order to have efficient and timely execution.
With this goal, our collaboration will focus on the following three research thrusts:
To achieve these goals, we would like to leverage research and development efforts in both teams. In particular, we would like to integrate the systems of BOINC and OAR shown in the figure below, which depicts the high-level system architecture.
In the sections below, we detail each research thrust and new component (highlighted in yellow in the figure above), describing the challenges, our current progress in addressing them, and the proposed collaborative work.
Research Thrust #1: Virtually-dedicated platform
Research Thrust #2: Network distance
Conclusion
We propose to develop a new cloud computing middleware based on the BOINC software for volunteer computing and the OAR resource management system for clusters. The goal is to be able to run complex applications and services across unreliable volunteer resources. The approach is to use statistical and prediction methods to guarantee the availability of collections of resources, to use virtual machines to hide machine unavailability, and to use node-binning techniques to provide network proximity information. In addition, we will provide a resource management system that will allow applications or services to easily leverage these new capabilities. Ultimately, we will test the utility of this system for a real barrier synchronization application.
Bibliography
1. Exchanges
The goal of each of the following meetings (ordered chronologically) will be to coordinate the joint scientific and professional activities of the two teams. For each participant, there will be one short visit to the other team's site, grouping visits at the same time when possible.
In addition, we plan to organize two 3-day workshops, one in the 18th month on cloud and volunteer computing at UC Berkeley, and the other in the 36th month at INRIA Rhône-Alpes. The workshops, which will include tutorials, aim to stimulate new developments and activities related to cloud and volunteer computing, by allowing users to share their experience and requirements, giving developers the opportunity to outline their plans, and stimulating new collaborations between participants. We expect about 65 international participants. Note that these workshops would follow last year's successful workshop, organized jointly by the MESCAL and BOINC teams, and hosted at INRIA Rhône-Alpes. We estimate the cost of organizing the workshop to be 3,000 euros each year (based on previous years' costs).
1. ESTIMATED EXPENSES FOR INRIA MISSIONS TO THE PARTNER | Number of people | Estimated cost |
Senior researchers | 3 | 9,999 |
Post-docs | 1 | 3,333 |
PhD students | 2 | 6,666 |
Interns | | |
Other (specify): 2 workshops | | 5,000 |
Total | 6 | 25,000 |
2. ESTIMATED EXPENSES FOR INVITATIONS OF PARTNERS | Number of people | Estimated cost |
Senior researchers | 2 | 6,666 |
Post-docs | | |
PhD students | | |
Interns | | |
Other (specify): | | |
Total | 2 | 6,666 |
2. Co-financing
In the past (including for last year), IBM through UC Berkeley has given 2000 euros towards the organization of a workshop. For the two workshops, IBM should be able to cover 4,000 euros in total. Last year, the conference organization committee of INRIA Rhône-Alpes contributed 1,000 euros towards the workshop organization, and we believe they would also be able to contribute 1,000 euros for the workshop held in France.
PROSPECTIVE ESTIMATE OF CO-FINANCING | |
Organization | Amount |
UC Berkeley | 6,666 |
IBM through UC Berkeley | 4,000 |
INRIA Rhône-Alpes | 1,000 |
Total | 11,666 |
We will actively pursue other opportunities for co-financing this collaboration in order to strengthen ties between UC Berkeley and INRIA. We will apply for a grant offered by the France-Berkeley Fund for supporting collaborative research between the University of California and France.
3. Budget request
| Comments | Amount |
A. Total cost of the proposal (sum of tables 1 and 2: invitations, missions, ...) | 31,666 |
B. Co-financing used (funding other than the Associate Team program) | 11,666 |
"Associate Team" funding requested (A - B) (maximum 20 K€) | 20,000 |