IBM Books

MPI Subroutine Reference

MPI_REDUCE, MPI_Reduce

Purpose

Applies a reduction operation to the vector sendbuf over the set of tasks specified by comm and places the result in recvbuf on root.

C synopsis

#include <mpi.h>
int MPI_Reduce(void* sendbuf,void* recvbuf,int count,
    MPI_Datatype datatype,MPI_Op op,int root,MPI_Comm comm);

C++ synopsis

#include <mpi.h>
void MPI::Comm::Reduce(const void* sendbuf, void* recvbuf, int count, 
		       const MPI::Datatype& datatype, const MPI::Op& op, 
		       int root) const;

FORTRAN synopsis

include 'mpif.h' or use mpi
MPI_REDUCE(CHOICE SENDBUF,CHOICE RECVBUF,INTEGER COUNT,
    INTEGER DATATYPE,INTEGER OP,INTEGER ROOT,INTEGER COMM,
    INTEGER IERROR)

Parameters

sendbuf
is the address of the send buffer (choice) (IN)

recvbuf
is the address of the receive buffer (choice, significant only at root) (OUT)

count
is the number of elements in the send buffer (integer) (IN)

datatype
is the datatype of elements of the send buffer (handle) (IN)

op
is the reduction operation (handle) (IN)

root
is the rank of the root task (integer) (IN)

comm
is the communicator (handle) (IN)

IERROR
is the FORTRAN return code. It is always the last argument.

Description

This subroutine applies a reduction operation to the vector sendbuf over the set of tasks specified by comm and places the result in recvbuf on root.

The input buffer and the output buffer have the same number of elements with the same type. The arguments sendbuf, count, and datatype define the send or input buffer. The arguments recvbuf, count, and datatype define the output buffer. MPI_REDUCE is called by all group members using the same arguments for count, datatype, op, and root. If a sequence of elements is provided to a task, the reduction operation is executed element-wise on each entry of the sequence. For example, if the operation is MPI_MAX and the send buffer contains two floating-point elements (count = 2 and datatype = MPI_FLOAT), then recvbuf(1) = global max(sendbuf(1)) and recvbuf(2) = global max(sendbuf(2)).

Users can define their own operations or use the predefined operations provided by MPI. User-defined operations can be overloaded to operate on several datatypes, either basic or derived. The argument datatype of MPI_REDUCE must be compatible with op. See IBM Parallel Environment for AIX: MPI Programming Guide for a list of the MPI predefined operations.

The "in place" option for intracommunicators is specified by passing the value MPI_IN_PLACE to the argument sendbuf at the root. In this case, the input data is taken at the root from the receive buffer, where it will be replaced by the output data.

If comm is an intercommunicator, the call involves all tasks in the intercommunicator, but with one group (group A) defining the root task. All tasks in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other tasks in group A pass the value MPI_PROC_NULL in root. Only send buffer arguments are significant in group B and only receive buffer arguments are significant at the root.

MPI_IN_PLACE is not supported for intercommunicators.

When you use this subroutine in a threads application, make sure all collective operations on a particular communicator occur in the same order at each task. See IBM Parallel Environment for AIX: MPI Programming Guide for more information on programming with MPI in a threads environment.

Notes

See IBM Parallel Environment for AIX: MPI Programming Guide.

The MPI standard recommends, but does not require, that MPI implementations use the same evaluation order for reductions every time, even if this negatively affects performance. PE MPI instead adjusts its reduce algorithms for optimal performance on a given task distribution, putting performance ahead of the standard's recommendation. This means that two runs with the same task count may produce results that differ in the least significant bits, because of rounding effects when the evaluation order changes. Two runs that use the same task count and the same distribution across nodes will always give identical results.

Errors

Fatal errors:

Invalid count
count < 0

Invalid datatype

Type not committed

Invalid op

Invalid root

For an intracommunicator: root < 0 or root >= groupsize

For an intercommunicator: root < 0 and is neither MPI_ROOT nor MPI_PROC_NULL, or root >= groupsize of the remote group

Invalid communicator

Unequal message lengths

Invalid use of MPI_IN_PLACE

MPI not initialized

MPI already finalized

Develop mode error if:

Inconsistent op

Inconsistent datatype

Inconsistent root

Inconsistent message length

Related information

MPE_IREDUCE
MPI_ALLREDUCE
MPI_OP_CREATE
MPI_REDUCE_SCATTER
MPI_SCAN

