Publications

Proceedings Article, Paper
@InProceedings
Contribution to conference proceedings, workshop

Author, Editor
Author(s):
Strzodka, Robert
Shaheen, Mohammed
Pajak, Dawid
Seidel, Hans-Peter
Not MPG Author(s):
Pajak, Dawid
Editor(s):
BibTeX cite key*:
StShPa_10CORALS
Title, Booktitle
Title*:
Cache oblivious parallelograms in iterative stencil computations
CORALS.pdf (215.87 KB)
Booktitle*:
ICS '10 : Proceedings of the 24th ACM International Conference on Supercomputing
Event, URLs
Conference URL:
http://pcsostres.ac.upc.edu/ics-conference/archive/ics10/
Downloading URL:
http://doi.acm.org/10.1145/1810085.1810096
Event Address*:
Tsukuba, Ibaraki, Japan
Language:
English
Event Date*
(no longer used):
Organization:
Association for Computing Machinery (ACM)
Event Start Date:
1 June 2010
Event End Date:
5 June 2010
Publisher
Name*:
ACM
URL:
Address*:
New York, NY
Type:
Vol, No, Year, pp.
Series:
Volume:
Number:
Month:
Pages:
49-59
Year*:
2010
VG Wort Pages:
ISBN/ISSN:
978-1-4503-0018-6
Sequence Number:
DOI:
10.1145/1810085.1810096
Note, Abstract, ©
(LaTeX) Abstract:
We present a new cache oblivious scheme for iterative stencil computations that performs beyond system bandwidth limitations as though gigabytes of data could reside in an enormous on-chip cache. We compare execution times for 2D and 3D spatial domains with up to 128 million double precision elements for constant and variable stencils against hand-optimized naive code and the automatic polyhedral parallelizer and locality optimizer PluTo and demonstrate the clear superiority of our results. The performance benefits stem from a tiling structure that caters for data locality, parallelism and vectorization simultaneously. Rather than tiling the iteration space from inside, we take an exterior approach with a predefined hierarchy, simple regular parallelogram tiles and a locality preserving parallelization. These advantages come at the cost of an irregular work-load distribution but a tightly integrated load-balancer ensures a high utilization of all resources.
URL for the Abstract:
http://www.mpi-inf.mpg.de/~strzodka/papers/info/StShPa_10CORALS.htm
Download
Access Level:
Public

Correlation
MPG Unit:
Max-Planck-Institut für Informatik
MPG Subunit:
Computer Graphics Group
Appearance:
MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Scientific Advisory Board (Fachbeirat), VG Wort



BibTeX Entry:

@INPROCEEDINGS{StShPa_10CORALS,
AUTHOR = {Strzodka, Robert and Shaheen, Mohammed and Pajak, Dawid and Seidel, Hans-Peter},
TITLE = {Cache oblivious parallelograms in iterative stencil computations},
BOOKTITLE = {ICS '10 : Proceedings of the 24th ACM International Conference on Supercomputing},
PUBLISHER = {ACM},
YEAR = {2010},
ORGANIZATION = {Association for Computing Machinery (ACM)},
PAGES = {49--59},
ADDRESS = {Tsukuba, Ibaraki, Japan},
ISBN = {978-1-4503-0018-6},
DOI = {10.1145/1810085.1810096},
}


Entry last modified by Anja Becker, 03/11/2011
Edit History

Editor(s)
[Library]
Created
01/05/2011 05:23:45 PM
Revisions
Revision  Editor(s)         Edit Date
3.        Anja Becker       11.03.2011 09:55:09
2.        Mohammed Shaheen  03/04/2011 02:35:26 PM
1.        Anja Becker       19.01.2011 12:41:53
0.        Mohammed Shaheen  01/05/2011 05:23:45 PM

