Discontinuous Galerkin Library
#include "dg/algorithm.h"

Classes

struct  dg::MPIGather< Vector >
 Optimized MPI Gather operation. More...
 
struct  dg::MPIKroneckerGather< Vector >
 Communicator for asynchronous communication of MPISparseBlockMat. More...
 

Functions

template<class MessageType >
std::map< int, MessageType > dg::mpi_permute (const std::map< int, MessageType > &messages, MPI_Comm comm)
 Exchange messages between processes in a communicator.
 
template<class ContainerType >
void dg::mpi_gather (const thrust::host_vector< std::array< int, 2 > > &gather_map, const ContainerType &gatherFrom, ContainerType &result, MPI_Comm comm)
 Un-optimized distributed gather operation.
 
template<class ContainerType >
void dg::mpi_scatter (const thrust::host_vector< std::array< int, 2 > > &scatter_map, const ContainerType &toScatter, ContainerType &result, MPI_Comm comm, bool resize_result=false)
 Un-optimized distributed scatter operation.
 
template<class Integer >
thrust::host_vector< std::array< Integer, 2 > > dg::mpi_invert_permutation (const thrust::host_vector< std::array< Integer, 2 > > &p, MPI_Comm comm)
 Invert a globally bijective index map.
 

Detailed Description

MPI distributed gather and scatter operations

In order to understand what this is about you should first really(!) understand what gather and scatter operations are, so grab pen and paper!

Note
The dg library only implements optimized MPI gather operations. There is an un-optimized dg::mpi_scatter though. The reason is that gather operations are easier to understand and implement than scatter operations, which may involve a reduction when several buffer elements map to the same vector element.

Primer: Gather and scatter operations

First, we note that gather and scatter are most often used in the context of memory buffers. The buffer needs to be filled with values (gather) or these values need to be written back into their original places (scatter).

Imagine a buffer vector w and an index map \( \text{g}\) that assigns to every index \( i\) of w an index \( \text{g}[i]\) into a source vector v.

We can now define: Gather values from v and put them into w according to \( w[i] = v[\text{g}[i]] \)

Loosely, we think of scatter as the reverse operation, i.e. take the values in w and write them back into v. However, simply writing \( v[\text{g}[j]] = w[j] \) is a very bad definition. What should happen if \( \text{g}[j] = \text{g}[k]\) for some \( j \neq k\)? What if some elements \( v_i\) are not mapped at all?
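
To make the two operations concrete, here is a minimal serial sketch in plain C++ (illustrative only, not part of the dg API) of the gather loop and of why the naive scatter loop is ill-defined:

#include <vector>
#include <cstddef>

// Gather: w[i] = v[g[i]]; every buffer entry reads exactly one source entry,
// so the loop is trivially parallel.
std::vector<double> gather( const std::vector<size_t>& g, const std::vector<double>& v)
{
    std::vector<double> w( g.size());
    for( size_t i = 0; i < g.size(); i++)
        w[i] = v[ g[i]];
    return w;
}

// Naive scatter: v[g[j]] = w[j]; if g[j] == g[k] for j != k the result depends
// on the order of the writes, and elements of v that g never references keep
// their old values. A well-defined scatter therefore needs a rule (for example
// a sum reduction) for duplicate indices.
void scatter_naive( const std::vector<size_t>& g, const std::vector<double>& w, std::vector<double>& v)
{
    for( size_t j = 0; j < w.size(); j++)
        v[ g[j]] = w[j]; // last write wins; ambiguous for duplicate g[j]
}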

It is more accurate to represent the gather and scatter operation by a matrix.

Gather matrix: A matrix \( G\) of size \( m \times N\) is a gather matrix if it consists of only 1's and 0's and has exactly one "1" in each row. \( m\) is the buffer size, \( N \) is the vector size, and \( N\) may be smaller than, equal to, or larger than \( m\). If \( \text{g}[i]\) is the index map then

\[ G_{ij} := \delta_{\text{g}[i] j}\]

We have \( w = G v\)

Scatter matrix: A matrix \( S \) is a scatter matrix if its transpose is a gather matrix.

This means that \( S\) has size \( N \times m \), consists of only 1's and 0's and has exactly one "1" in each column. If \( \text{g}[j]\) is the index map then

\[ S_{ij} := \delta_{i \text{g}[j]}\]

We have \( v = S w\)
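
As a small worked example (not part of the original documentation), take \( N = m = 3\) and the index map \( g = (2, 0, 2)\). Then

\[ G = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad S = G^T = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{pmatrix}\]

\[ w = G v = (v_2, v_0, v_2)^T, \qquad S w = (w_1, 0, w_0 + w_2)^T\]

Applying \( S\) sums the two buffer entries that were gathered from \( v_2\) and writes a zero into position 1, which \( g\) never references; this is the reduction mentioned in the Note above.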

Among scatter and gather matrices, permutations are especially interesting: a matrix is a permutation if and only if it is both a scatter and a gather matrix. In such a case it is square \( m \times m\) and

\[ P^{-1} = P^T\]

The buffer \( w\) and vector \( v\) then have the same size \( m\).


MPI distributed gather and scatter

Now we turn to the case that v and w are distributed across processes. Accordingly, the index map \( g\) is also distributed across processes (in the same way w is). The elements of \( g\) are global indices into v that have to be transformed to pairs

\[ i = [r, j]\]

where j is the local index into v and r is the rank within the communicator, according to a user-provided function. The user has to provide the index map as a vector of such pairs.
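
A minimal sketch of such a user-provided conversion, assuming the global vector is distributed in contiguous chunks of equal size chunk_size per rank (the function name and the assumption are illustrative, not part of the dg API):

#include <array>
#include <thrust/host_vector.h>

// Convert global indices into {rank, local index} pairs for a contiguous,
// equally sized decomposition of the global vector v.
thrust::host_vector<std::array<int,2>> global2pair(
    const thrust::host_vector<int>& global_idx, int chunk_size)
{
    thrust::host_vector<std::array<int,2>> pairs( global_idx.size());
    for( unsigned u = 0; u < global_idx.size(); u++)
        pairs[u] = { global_idx[u] / chunk_size,   // rank r that owns the element
                     global_idx[u] % chunk_size }; // local index j on that rank
    return pairs;
}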

Imagine now that we want to perform a globally distributed gather operation. Notice that there is a bootstrap problem involved: the given index map tells each rank from where to receive data, but each rank also needs to know where to send its own data. This means that in order to set up the communication we need to communicate to start with (the dg::mpi_permute function does exactly that):
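
For example, a sketch of this bootstrap (assuming the index map has already been converted to {rank, local index} pairs as above; make_recv_map is an illustrative helper, not part of the dg API):

#include <map>
#include <array>
#include <thrust/host_vector.h>
#include "dg/algorithm.h"

// recvIdx[r] collects the local indices this process wants to gather from rank r.
std::map<int, thrust::host_vector<int>> make_recv_map(
    const thrust::host_vector<std::array<int,2>>& gather_map)
{
    std::map<int, thrust::host_vector<int>> recvIdx;
    for( unsigned u = 0; u < gather_map.size(); u++)
        recvIdx[ gather_map[u][0]].push_back( gather_map[u][1]);
    return recvIdx;
}

// ... inside an MPI program:
// auto recvIdx = make_recv_map( gather_map);
// auto sendIdx = dg::mpi_permute( recvIdx, comm);
// // sendIdx[r] now lists the local indices that rank r wants to receive from us,
// // i.e. every process knows what to send.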

In total we thus describe the global gather as

\[ w = G v = G_1 P_{G,MPI} G_2 v\]

where \( G_2\) gathers locally from v, \( P_{G,MPI}\) is the MPI communication step that moves the gathered values between processes, and \( G_1\) gathers locally from the received data into w.

The global scatter operation is then simply

\[ v = S w = G_2^T P^T_{G,MPI} G^T_1 w = S_2 P_{S,MPI} S_1 w \]

(The scatter operation is constructed in the same way as the gather operation; only the execution differs.)

Note
If the scatter/gather operations are part of a matrix-vector multiplication then \( G_1\) or \( S_1\) can be absorbed into the matrix

\[ M v = R G v = R G_1 P_{G,MPI} G_2 v = R' P_{G,MPI} G_2 v\]

If R is a coo (coordinate format) matrix, the simple way to obtain R' is to replace its column indices with the map \( g_1\).
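
A minimal sketch of that replacement for a coo (row, column, value) matrix; the names are illustrative and not taken from the dg API:

#include <vector>

// Absorb the local gather G_1 (index map g1) into a coo matrix R: replacing
// each column index j by g1[j] yields R' = R G_1, whose columns then index
// directly into the communication buffer.
void absorb_gather( std::vector<int>& column_indices, const std::vector<int>& g1)
{
    for( auto& col : column_indices)
        col = g1[ col];
}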

Note
To give the involved vectors unique names, we call v the "vector", \( s = G_2 v\) the "store", and \( b = P s\) the "buffer".

For the scatter operation we analogously have

\[ M v = S C v = S_2 P_{S,MPI} S_1 C v = S_2 P_{S,MPI} C' v\]

Again, if C is a coo matrix the simple way to obtain C' is to replace its row indices with the map \( g_1\).

Simplifications can be achieved if \( G_2 = S_2 = I\) is the identity or if \( P_{G,MPI} = P_{S,MPI} = P_{MPI}\) is symmetric, which means that in-place communication can be used.

Note
Locally, a gather operation is trivially parallel, but a scatter operation in general is not (because of the possible reduction operation).

Function Documentation

◆ mpi_gather()

template<class ContainerType >
void dg::mpi_gather ( const thrust::host_vector< std::array< int, 2 > > & gather_map,
const ContainerType & gatherFrom,
ContainerType & result,
MPI_Comm comm )

Un-optimized distributed gather operation.

Template Parameters
ContainerType  A (host) vector
Parameters
gather_map  Each element is a {rank within comm, local index on that rank} pair, which is equivalent to the global address of a vector element in gatherFrom
gatherFrom  Local part of the vector from which the calling and other ranks can gather elements
result  On output, has the same size as gather_map and contains the elements that gather_map referenced
comm  The MPI communicator within which to exchange elements
See also
MPI distributed gather and scatter operations
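
A usage sketch (the decomposition and values are made up for illustration; assumes MPI is initialized and dg/algorithm.h is included): every rank gathers the 0th element of the next rank and its own 1st element.

int rank, size;
MPI_Comm_rank( MPI_COMM_WORLD, &rank);
MPI_Comm_size( MPI_COMM_WORLD, &size);
// Every rank owns two local elements, all equal to its rank
thrust::host_vector<double> gatherFrom( 2, (double)rank);
thrust::host_vector<std::array<int,2>> gather_map( 2);
gather_map[0] = { (rank+1)%size, 0}; // 0th element of the next rank
gather_map[1] = { rank, 1};          // 1st element of this rank
thrust::host_vector<double> result( gather_map.size());
dg::mpi_gather( gather_map, gatherFrom, result, MPI_COMM_WORLD);
// Now result[0] == (double)((rank+1)%size) and result[1] == (double)rank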

◆ mpi_invert_permutation()

template<class Integer >
thrust::host_vector< std::array< Integer, 2 > > dg::mpi_invert_permutation ( const thrust::host_vector< std::array< Integer, 2 > > & p,
MPI_Comm comm )

Invert a globally bijective index map.

Parameters
p  Each element is a {rank, local index on that rank} pair, which is equivalent to the global address of a vector element
Attention
Must be bijective, i.e. globally distinct elements in p must map to distinct elements in the result and all elements of the result must be mapped
Parameters
comm  The MPI communicator within which to exchange elements
Returns
inverse map
See also
MPI distributed gather and scatter operations
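
A usage sketch (illustrative; assumes MPI is initialized as in the other examples): every local index i is mapped to local index i on the next rank, which is globally bijective.

int rank, size;
MPI_Comm_rank( MPI_COMM_WORLD, &rank);
MPI_Comm_size( MPI_COMM_WORLD, &size);
unsigned local_size = 4;
thrust::host_vector<std::array<int,2>> p( local_size);
for( unsigned i = 0; i < local_size; i++)
    p[i] = { (rank+1)%size, (int)i};
auto p_inv = dg::mpi_invert_permutation( p, MPI_COMM_WORLD);
// The inverse map should point back to the previous rank, i.e.
// p_inv[i] == { (rank+size-1)%size, i}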

◆ mpi_permute()

template<class MessageType >
std::map< int, MessageType > dg::mpi_permute ( const std::map< int, MessageType > & messages,
MPI_Comm comm )

Exchange messages between processes in a communicator.

This happens in two communication phases (find more details in MPI distributed gather and scatter)

  1. Call MPI_Allgather with given communicator to setup the communication pattern among processes
  2. Call MPI_Alltoallv to send the actual messages.

For example

int rank, size;
MPI_Comm_rank( MPI_COMM_WORLD, &rank);
MPI_Comm_size( MPI_COMM_WORLD, &size);
// Send an integer to the next process in a "circle"
std::map<int, int> messages = { {(rank + 1)%size, rank}};
INFO( "Rank "<<rank<<" send message "<<messages[(rank+1)%size]<<"\n");
auto recv = dg::mpi_permute( messages, MPI_COMM_WORLD);
// Each rank received a message from the previous rank
INFO("Rank "<<rank<<" received message "<<recv[(rank+size-1)%size]<<"\n");
CHECK( recv[(rank+size-1)%size] == (rank+size-1)% size);
// If we permute again we send everything back
auto messages_num = dg::mpi_permute( recv, MPI_COMM_WORLD);
CHECK( messages == messages_num);
Template Parameters
MessageType  Can be one of the following
  1. A primitive type like int or double
  2. A (host) vector of primitive types like std::vector<int> or thrust::host_vector<double>
  3. A (host) vector of std::array of primitive types like thrust::host_vector<std::array<double,3>>
Parameters
messages  (in) messages[rank] contains the message that the calling process sends to the process rank within comm
comm  The MPI communicator within which to exchange messages. All processes in comm need to call this function.
Returns
received_messages[rank] contains the message that the calling process received from the process rank within comm
Note
This can be used to bootstrap MPI gather operations if messages is an index map "recvIdx" of local indices of messages to receive from each rank, because it "tells" every process which messages to send.
This function is a (self-inverse) permutation, i.e.
recvIdx == dg::mpi_permute( dg::mpi_permute( recvIdx, comm), comm);
It can also be used to invert a bijective MPI gather map, see dg::mpi_invert_permutation.
See also
MPI distributed gather and scatter operations

◆ mpi_scatter()

template<class ContainerType >
void dg::mpi_scatter ( const thrust::host_vector< std::array< int, 2 > > & scatter_map,
const ContainerType & toScatter,
ContainerType & result,
MPI_Comm comm,
bool resize_result = false )

Un-optimized distributed scatter operation.

Template Parameters
ContainerType  A (host) vector
Parameters
scatter_map  Each element is a {rank, local index on that rank} pair, which is equivalent to the global address of an element in result
Attention
Must be injective, i.e. globally distinct elements in toScatter must map to distinct elements in result
Parameters
toScatter  Same size as scatter_map. The scatter_map tells where each element in this vector is sent to
result  In principle we must know the size of result beforehand (because how else did you come up with a scatter_map?)
comm  The MPI communicator within which to exchange elements
resize_result  If true, resize result to the correct size (mainly needed for dg::mpi_invert_permutation)
See also
MPI distributed gather and scatter operations
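
A usage sketch (illustrative; assumes MPI is initialized as in the other examples): every rank scatters its local values to the corresponding positions on the next rank.

int rank, size;
MPI_Comm_rank( MPI_COMM_WORLD, &rank);
MPI_Comm_size( MPI_COMM_WORLD, &size);
unsigned local_size = 4;
thrust::host_vector<double> toScatter( local_size, (double)rank);
thrust::host_vector<std::array<int,2>> scatter_map( local_size);
for( unsigned i = 0; i < local_size; i++)
    scatter_map[i] = { (rank+1)%size, (int)i}; // element i goes to index i on the next rank
thrust::host_vector<double> result( local_size);
dg::mpi_scatter( scatter_map, toScatter, result, MPI_COMM_WORLD);
// Every element of result now holds the rank of the previous process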