by Paraskevi Fragopoulou and
Ole H. Nielsen,
UNI-C,
Technical University of Denmark, Bldg. 304,
DK-2800 Lyngby, Denmark.
and
Center for Atomic-scale Materials Physics (CAMP), Physics Dept., Technical
University of Denmark, Bldg. 307, DK-2800 Lyngby, Denmark.
in the proceedings of the Workshop on Applied Parallel Computing in Industrial Problems and Optimization, August 18-21, 1996 (Lyngby, Copenhagen, Denmark), to be published in Springer Lecture Notes in Computer Science.
We discuss algorithms for global reduction (or combine) operations (e.g., global sums) for numbers of processors that need not be a power of 2, and implement these using standard message-passing techniques on distributed-memory parallel computers. We present performance results measured on an SP2 parallel computer at UNI-C. Significant performance improvements are obtained by using a recursive doubling method with a vector splice/gather approach.
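To make the idea of recursive doubling on a processor count that is not a power of 2 concrete, the following is a minimal sketch that simulates a global sum in a single Python process. It is not the implementation measured in the paper: the function name `allreduce_sum`, the list-of-vectors representation of the processors, and the choice to fold the surplus processors onto the lowest-numbered ones are all assumptions made for illustration. The vector splitting variant mentioned above is not shown.

```python
from typing import List, Sequence


def allreduce_sum(local: Sequence[Sequence[float]]) -> List[List[float]]:
    """Simulate a global sum over p 'processors'; p need not be a power of 2.

    local[r] is the vector initially held by rank r. The return value lists
    the vector each rank holds afterwards (all ranks end with the full sum).
    Hypothetical helper for illustration only.
    """
    p = len(local)
    if p == 0:
        return []
    data = [list(v) for v in local]

    # Largest power of 2 that does not exceed p.
    p2 = 1
    while p2 * 2 <= p:
        p2 *= 2

    # Fold step (assumed scheme): each surplus rank r >= p2 "sends" its
    # vector to rank r - p2, which adds it to its own.
    for r in range(p2, p):
        dst = r - p2
        data[dst] = [a + b for a, b in zip(data[dst], data[r])]

    # Recursive doubling among ranks 0..p2-1: at distance d, rank r
    # exchanges its partial sum with rank r XOR d and both add.
    d = 1
    while d < p2:
        exchanged = []
        for r in range(p2):
            partner = r ^ d
            exchanged.append([a + b for a, b in zip(data[r], data[partner])])
        data[:p2] = exchanged
        d *= 2

    # Unfold step: each surplus rank receives the final result back.
    for r in range(p2, p):
        data[r] = list(data[r - p2])

    return data


if __name__ == "__main__":
    # Six simulated processors, each holding a 2-element vector.
    result = allreduce_sum([[float(r), 1.0] for r in range(6)])
    print(result[0], result[5])
```

In this sketch the power-of-2 core needs log2(p2) exchange steps, and the fold and unfold add one extra message each way for the surplus processors. On a real distributed-memory machine each list operation would be replaced by a send/receive pair between the two ranks involved.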