Performance Evaluation and Modeling of Reduction Operations on the IBM RS/6000 SP Parallel Computer

by Paraskevi Fragopoulou and Ole H. Nielsen,
UNI-C, Technical University of Denmark, Bldg. 304, DK-2800 Lyngby, Denmark,
and Center for Atomic-scale Materials Physics (CAMP), Physics Dept., Technical University of Denmark, Bldg. 307, DK-2800 Lyngby, Denmark.

In Proceedings of the Workshop on Applied Parallel Computing in Industrial Problems and Optimization, August 18-21, 1996 (Lyngby, Copenhagen, Denmark), to be published in Springer Lecture Notes in Computer Science.

Abstract

We discuss algorithms for global reduction (or combine) operations (e.g., global sums) for numbers of processors that need not be a power of 2, and implement these using standard message-passing techniques on distributed-memory parallel computers. We present performance results measured on an SP2 parallel computer at UNI-C. Significant performance improvements are obtained by using a recursive doubling method with a vector splice/gather approach.
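
As an illustration of the kind of reduction discussed above, the following is a minimal sketch of a global sum by recursive doubling when the number of processes is not a power of 2. It is a sequential Python simulation, not the paper's implementation: the function name allreduce_sum and the fold-in/fold-out handling of the surplus processes are assumptions chosen for illustration (one common way to treat such process counts), and the vector splice/gather refinement is not shown.

```python
def allreduce_sum(values):
    """Simulate a global sum over p = len(values) processes.

    values[r] is the local contribution of rank r. Returns a list in which
    every rank holds the global sum. p need not be a power of 2.
    (Hypothetical helper for illustration only.)
    """
    p = len(values)
    data = list(values)

    # q = largest power of 2 not exceeding p; the remaining ranks are "extra".
    q = 1
    while q * 2 <= p:
        q *= 2
    extra = p - q

    # Fold-in step (assumed scheme): extra rank q + i sends its value to rank i.
    for i in range(extra):
        data[i] += data[q + i]

    # Recursive doubling among ranks 0..q-1: in each round, rank r exchanges
    # its partial sum with partner r XOR dist, and both add what they receive.
    dist = 1
    while dist < q:
        updated = list(data)
        for r in range(q):
            partner = r ^ dist
            updated[r] = data[r] + data[partner]
        data = updated
        dist *= 2

    # Fold-out step: rank i returns the final result to extra rank q + i.
    for i in range(extra):
        data[q + i] = data[i]

    return data


if __name__ == "__main__":
    # Six processes (not a power of 2), each contributing one number.
    print(allreduce_sum([1, 2, 3, 4, 5, 6]))
```

In this sketch the power-of-2 core needs log2(q) exchange rounds, and a process count that is not a power of 2 costs two additional message steps (fold-in and fold-out). In a real message-passing program each simulated exchange would be a send/receive pair between the two partner ranks.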


PostScript file of the full paper, 157 kB.