Campus Event Calendar

Event Entry

What and Who

Caribou -- Intelligent Distributed Storage for the Datacenter

Zsolt Istvan
ETH Zurich
SWS Colloquium

Zsolt Istvan is a recent PhD graduate of the Systems Group at ETH Zurich. His research looks at
using FPGAs in the context of databases and distributed systems, with the goal of building hybrid
solutions and specialized accelerators for data-intensive tasks. Before graduate school, he was a
Master's student at ETH Zurich, Switzerland, and a Bachelor's student at the
Technical University of Cluj-Napoca, Romania.
SWS, RG1, MMCI  
AG Audience
English

Date, Time and Location

Thursday, 8 February 2018
10:30
90 Minutes
E1 5
029
Saarbrücken

Abstract

In the era of Big Data, datacenter and cloud architectures decouple compute and storage resources from each other for better scalability. While this design choice enables elastic scale-out, it also causes unnecessary data movement. One solution is to push parts of the computation down to storage, where data can be filtered more efficiently. Systems that do this are already in use and rely either on regular server machines as storage nodes or on network-attached storage devices. The former offer complex computation and rich functionality, since plenty of conventional cores are available to run the offloaded computation, but they are quite inefficient because of over-provisioned computing capacity and bandwidth mismatches between storage, CPU, and network. Networked storage devices, on the other hand, are better balanced in terms of bandwidth, but at the price of offering very limited options for offloading data processing.

With Caribou, we explore an alternative design that offers rich offloading functionality while matching the available line-rate processing performance of either storage or network. It does so in a much more efficient package (size, energy consumption) than regular servers. Our FPGA-based prototype system has been designed such that the internal data management logic can saturate the network for most operation mixes without being over-provisioned. As a result, it can extract and process data from storage at multi-GB/s rates before sending it to the computing nodes, while at the same time offering features such as replication for fault tolerance.

Caribou has been released as open source. Its modular design and extensible processing pipeline make it a convenient platform for exploring domain-specific processing inside storage nodes.

Contact

Claudia Richter
93039103

Video Broadcast

Yes
Kaiserslautern
G26
111

Claudia Richter, 02/05/2018 10:34 -- Created document.