Publications

The Parallel Random Access Machine, \de{PRAM}, is the dominant

theoretical parallel computer model. It consists of a number of

processing units, \de{PU}s, which operate synchronously, and which

can access a shared memory in constant time. Unfortunately, this

high-level model is hardly realizable in hardware using current

technology. One possibility is to directly design algorithms for the

distributed memory computer, \de{DMC}, under consideration, but this

involves many hairy details, and is not portable at all. More

practical, is to develop for every DMC a library of basic algorithms

(sorting, matrix multiplication, list ranking, \dots), a PRAM

simulator, and a `compiler'. In this way, the programmer can program

in a high-level language, which is essential for the acceptance of

parallel computing. In this paper, we sketch our experiences with

a deterministic PRAM simulator for two-dimensional meshes.

The success of a PRAM simulator depends on the achieved speed-up. We

made considerable progress. Presently, the simulation of one step of

a PRAM with $65536 = 16 \cdot 64^2$ PUs, on a $64 \times 64$ mesh,

takes about 12000 routing steps, if all 65536 PRAM PUs simultaneously

make an access. The time depends on the precise access pattern, and

CRCW steps are slightly more expensive than EREW steps. Larger speed-up

is achieved if the mesh is larger or if more PRAM PUs are simulated on

every mesh PU. It may appear disappointing that we do not achieve

substantially better. However, on an $n \times n$ mesh, we can never

expect to achieve a speed-up larger than $n / (c + o(n))$, for some

constant $c$. In our case $c$ is about 12. Our techniques are directly

applicable to higher dimensional meshes. There the speed-up is larger.

Furthermore, in our conception, the PRAM simulator should only be used

for those operations that are not available in the library of DMC

algorithms. So, a moderate speed-up for them, does not necessarily

impair the speed-up of the whole program.

We developed a small PRAM language which allows to almost directly

simulate the basic PRAM algorithms presented in textbooks. It

was used to simulate a constant time algorithm for computing the

maximum \cite[Sec. 2.6.1]{Ja}, and for the standard logarithmic time

summation algorithm. The summation of 65536 numbers requires 33 PRAM

steps, which can be simulated on a $36 \times 36$ in 55158 routing

steps with maximal queue size~$126$.

In order to be able to analyze the details of the PRAM simulation,

we have performed it on a mesh simulator. Actually, we developed a

complete programming environment, containing many basic mesh

operations: several variants of routing and sorting algorithms,

ranking, etc. All these algorithms have been optimized, in order

to get competitive results for the moderate mesh sizes under

consideration.

PRAM Simulation, Parallel Computation, Meshes, Deterministic Algorithms

January

2021

@MASTERSTHESIS{Meyer95,

AUTHOR = {Meyer, Ulrich},

TITLE = {Deterministische Simulation einer PRAM auf Gitterrechnern},

SCHOOL = {Universit{\"a}t des Saarlandes},

YEAR = {1995},

TYPE = {Master's thesis}

}

AUTHOR = {Meyer, Ulrich},

TITLE = {Deterministische Simulation einer PRAM auf Gitterrechnern},

SCHOOL = {Universit{\"a}t des Saarlandes},

YEAR = {1995},

TYPE = {Master's thesis}

}

Entry last modified by Ulrich Meyer, 03/02/2010

Editor(s)Evelyn Haak | Created04/08/1997 04:28:09 PM | |

Revision1. 0. | EditorUlrich Meyer Evelyn Haak | Edit Date07/09/98 16:44:57 08/04/97 16:30:49 |