Optimal and near optimal parallel algorithms for several fundamental problems are proposed for a parallel organization consistmg of n processors, each having access to a row and a column of an n × n array of memory modules. Parallel computations are implemented on such an organization by decomposing them into alternating orthogonal processing phases. A number of efficient data movement techniques are developed for the proposed organization which lead to optimal or near optimal solutions to several communication-intensive problems such as sorting, performing permutations, list ranking (data dependent parallel prefix), and problems on graphs represented by an unsorted list of n2 edges. It is also shown that the proposed organization is capable of simulating any fixed-degree network on n2 processors with O(n) loss in time, which is optimal. Finally, an enhanced organization having p processors, 1 ≤ p ≤ n2, and O(n2) memory locations is presented, and is shown to provide optimal speedups for adjacency-matrix based graph problems for any number of processors in the range [1, n3/2].
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence