Campus Event Calendar

Event Entry

What and Who

Robust replication

Allen Clement
Max Planck Institute for Software Systems
SWS Colloquium

Allen Clement is a Postdoctoral Researcher at the Max Planck Institute for Software Systems. He received a Ph.D. from the University of Texas at Austin and an A.B. in Computer Science from Princeton University. His research focuses on the challenges of building robust and reliable distributed systems. In particular, he has investigated practical Byzantine fault tolerant replication, systems robust to both Byzantine and selfish behaviors, consistency in geo-replicated environments, and how to leverage the structure of social networks to build Sybil-tolerant systems.
SWS  
MPI Audience
English

Date, Time and Location

Monday, 13 February 2012
10:30
90 Minutes
G26
206
Kaiserslautern

Abstract

The choice between Byzantine and crash fault tolerance is viewed as a fundamental design decision when building fault-tolerant systems. We show that this dichotomy is not fundamental, and present a unified model of fault tolerance in which the number of tolerated faults of each type is a configuration choice. Additionally, we observe that a single fault is capable of devastating the performance of existing Byzantine fault-tolerant replication systems. We argue that fault-tolerant systems should, and can, be designed to perform well even when failures occur. In this talk I will expand on these two insights and describe our experience leveraging them to build a generic fault-tolerant replication library that provides flexible fault tolerance and robust performance. We use the library to build a fault-tolerant version of the Hadoop Distributed File System.
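
As a rough illustration of what "the number of tolerated faults of each type is a configuration choice" can mean, the sketch below computes a replica count from two parameters. The function name, the parameter names `u` and `r`, and the bound `2u + r + 1` are assumptions chosen for illustration and are not taken from this announcement; the bound is one commonly cited parameterization that reduces to the familiar `2f + 1` (crash only) and `3f + 1` (Byzantine) counts at its two extremes.

```python
def replicas_needed(u: int, r: int) -> int:
    """Hypothetical helper: replicas needed for an agreement protocol.

    u -- assumed total number of failures (of any kind) the system
         should stay live through
    r -- assumed number of Byzantine (commission) failures the system
         should stay safe through

    The bound 2u + r + 1 is an illustrative assumption, not a formula
    stated in the talk abstract.
    """
    if u < 0 or r < 0:
        raise ValueError("fault counts must be non-negative")
    return 2 * u + r + 1


# Crash-only configuration: r = 0 gives the classic 2f + 1.
crash_only = replicas_needed(u=1, r=0)

# Fully Byzantine configuration: r = u gives the classic 3f + 1.
byzantine = replicas_needed(u=1, r=1)

# A mixed configuration: tolerate two failures overall, one of them Byzantine.
mixed = replicas_needed(u=2, r=1)
```

Under these assumptions, choosing between crash and Byzantine fault tolerance becomes a matter of picking `r` anywhere between the two extremes rather than selecting a different protocol.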

Contact

Vera Laubscher
+4963193039603

Video Broadcast

Yes
Saarbrücken
E1 5
Wartburg 5th floor

Vera Laubscher, 03/01/2012 09:24
Brigitta Hansen, 02/28/2012 10:01
Vera Laubscher, 02/10/2012 12:43 -- Created document.