New for: D3
high bandwidth in applications involving massive data sets, but
algorithms for parallel disks can be difficult to devise. To
combat this problem, we define a useful and natural duality
between writing to parallel disks and the seemingly more difficult
problem of prefetching. We first explore this duality for applications
involving read-once accesses using parallel disks. We get a
simple linear time algorithm for computing optimal prefetch
schedules and analyze the efficiency of the resulting schedules
for randomly placed data and for arbitrary interleaved accesses to
striped sequences. Duality also provides an optimal schedule for
the integrated caching and prefetching problem, in which blocks
can be accessed multiple times. Another application of this
duality gives us the first parallel disk sorting algorithms
that are provably optimal up to lower order terms.
One of these
algorithms is a simple and practical variant of multiway merge
sort, addressing a question that has been open for some time.