Proceedings Article, Paper
@InProceedings
Contribution to conference proceedings, workshop



Author, Editor

Author(s):

Strzodka, Robert
Shaheen, Mohammed
Pajak, Dawid
Seidel, Hans-Peter

Not MPG Author(s):

Pajak, Dawid

Editor(s):





BibTeX cite key*:

StShPa_10CORALS

Title, Booktitle

Title*:

Cache oblivious parallelograms in iterative stencil computations


CORALS.pdf (215.87 KB)

Booktitle*:

ICS '10 : Proceedings of the 24th ACM International Conference on Supercomputing

Event, URLs

URL of the conference:

http://pcsostres.ac.upc.edu/ics-conference/archive/ics10/

URL for downloading the paper:

http://doi.acm.org/10.1145/1810085.1810096

Event Address*:

Tsukuba, Ibaraki, Japan

Language:

English

Organization:

Association for Computing Machinery (ACM)

Event Start Date:

1 June 2010

Event End Date:

5 June 2010

Publisher

Name*:

ACM

Address*:

New York, NY

Vol, No, Year, pp.

Pages:

49-59

Year*:

2010

ISBN/ISSN:

978-1-4503-0018-6

DOI:

10.1145/1810085.1810096



Note, Abstract, ©


(LaTeX) Abstract:

We present a new cache oblivious scheme for iterative stencil computations that performs beyond system bandwidth limitations as though gigabytes of data could reside in an enormous on-chip cache. We compare execution times for 2D and 3D spatial domains with up to 128 million double precision elements for constant and variable stencils against hand-optimized naive code and the automatic polyhedral parallelizer and locality optimizer PluTo and demonstrate the clear superiority of our results. The performance benefits stem from a tiling structure that caters for data locality, parallelism and vectorization simultaneously. Rather than tiling the iteration space from inside, we take an exterior approach with a predefined hierarchy, simple regular parallelogram tiles and a locality preserving parallelization. These advantages come at the cost of an irregular work-load distribution but a tightly integrated load-balancer ensures a high utilization of all resources.
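
To illustrate the idea of parallelogram tiles in an iterative stencil computation, the following is a minimal sketch in Python. It is not the scheme from the paper: it shows only plain, sequential time skewing for a 1D three-point averaging stencil with fixed boundary values, without the hierarchy, parallelization, vectorization, or load balancing the abstract describes. The function names `naive_stencil` and `skewed_stencil` and the parameters `width` and `height` are assumptions made for this example.

```python
def naive_stencil(u, steps):
    """Reference: sweep the whole domain once per time step."""
    a, b = list(u), list(u)
    for _ in range(steps):
        for x in range(1, len(a) - 1):
            b[x] = (a[x - 1] + a[x] + a[x + 1]) / 3.0
        a, b = b, a
    return a


def skewed_stencil(u, steps, width=8, height=4):
    """Same computation, traversed in parallelogram-shaped space-time tiles.

    Hypothetical illustration: each tile spans `width` points in the skewed
    coordinate s = x + dt and `height` time steps. The tile shifts one point
    to the left per time step, so processing tiles from left to right
    satisfies all dependencies of the three-point stencil.
    """
    n = len(u)
    bufs = [list(u), list(u)]  # ping-pong buffers; boundaries stay fixed
    t0 = 0
    while t0 < steps:
        tb = min(height, steps - t0)  # time steps in this block
        for s0 in range(1, n - 1 + tb, width):  # tiles, left to right
            for dt in range(tb):
                t = t0 + dt
                src, dst = bufs[t % 2], bufs[(t + 1) % 2]
                lo = max(1, s0 - dt)               # clamp to interior
                hi = min(n - 1, s0 + width - dt)
                for x in range(lo, hi):
                    dst[x] = (src[x - 1] + src[x] + src[x + 1]) / 3.0
        t0 += tb
    return bufs[steps % 2]


if __name__ == "__main__":
    data = [float((i * i) % 7) for i in range(40)]
    assert skewed_stencil(data, 10) == naive_stencil(data, 10)
```

Because each tile advances several time steps over a small range of points before moving on, values are reused while they are still close to the processor, which is the locality effect the abstract refers to. Both functions apply the same arithmetic to the same inputs at every point, so their results agree exactly.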

URL for the Abstract:

http://www.mpi-inf.mpg.de/~strzodka/papers/info/StShPa_10CORALS.htm



Download
Access Level:

Public

Correlation

MPG Unit:

Max-Planck-Institut für Informatik



MPG Subunit:

Computer Graphics Group

Appearance:

MPII WWW Server, MPII FTP Server, MPG publications list, university publications list, working group publication list, Fachbeirat, VG Wort



BibTeX Entry:

@INPROCEEDINGS{StShPa_10CORALS,
AUTHOR = {Strzodka, Robert and Shaheen, Mohammed and Pajak, Dawid and Seidel, Hans-Peter},
TITLE = {Cache oblivious parallelograms in iterative stencil computations},
BOOKTITLE = {ICS '10 : Proceedings of the 24th ACM International Conference on Supercomputing},
PUBLISHER = {ACM},
YEAR = {2010},
ORGANIZATION = {Association for Computing Machinery (ACM)},
PAGES = {49--59},
ADDRESS = {Tsukuba, Ibaraki, Japan},
ISBN = {978-1-4503-0018-6},
DOI = {10.1145/1810085.1810096},
}


Entry last modified by Anja Becker, 03/11/2011
Attachment Section

View attachments here:


CORALS.pdf