Rethinking Storage Space Management in High-Performance Computing Centers
Ali R. Butt
Virginia Tech
SWS Colloquium
Ali R. Butt is an Assistant Professor of Computer Science at Virginia Tech,
USA. He received his Ph.D. in Electrical and Computer Engineering from
Purdue University in 2006. His research interests are in experimental
computer systems, especially in file and storage systems. His current work
focuses on I/O and storage issues of modern High Performance Computing
systems and data-intensive computing. He is the recipient of an NSF CAREER
Award (2008), an IBM Faculty Award (2008), and a Virginia Tech College of
Engineering “Outstanding New Assistant Professor” Award (2009).
Modern scientific applications, such as computer models for analyzing data
from particle colliders or space observatories, process data that is
exponentially growing in size. High-performance computing (HPC) centers that
support such applications are now faced with a data deluge, which can no
longer be managed using the ad hoc approaches in use today. Consequently, a
fundamental reevaluation of the data management tools and techniques is
required. In this talk, I will describe a fresh approach to HPC storage
space management, especially for the center scratch space, the high-speed
storage that services currently running and soon-to-run applications. The
approach effectively treats this storage as a tiered cache and provides
comprehensive, integrated storage management. I will discuss how the caching
model is achieved, and how its mechanisms are supported through just-in-time
staging and timely offloading of data. Finally, I will show how this
approach can also mitigate the effects of center storage failures. The
overall goal is to improve HPC center serviceability and resource
utilization.