Optimized MPI Gather operation.
Public Member Functions

MPIGather (MPI_Comm comm=MPI_COMM_NULL)
    No communication.

MPIGather (const std::map< int, thrust::host_vector< int > > &recvIdx, MPI_Comm comm)
    Short for MPIGather( recvIdx, 1, comm).

MPIGather (const std::map< int, thrust::host_vector< int > > &recvIdx, unsigned chunk_size, MPI_Comm comm)
    Construct from global index map.

template<class ArrayVec = thrust::host_vector<std::array<int,2>>, class IntVec = thrust::host_vector<int>>
MPIGather (const ArrayVec &gather_map, IntVec &bufferIdx, MPI_Comm comm)
    Convert an unsorted and possibly duplicate global index list to a unique list, stably sorted by pid, plus a map of duplicates.

template<template< typename > typename OtherVector>
MPIGather (const MPIGather< OtherVector > &src)
    Construct from other execution policy.

MPI_Comm communicator () const
    The internal MPI communicator used.

bool isContiguous () const
    Check whether the message from the constructor is contiguous in memory.

unsigned buffer_size () const
    The local size of the buffer vector w = local map size.

bool isCommunicating () const
    True if the gather/scatter operation involves actual MPI communication.

template<class ContainerType0, class ContainerType1>
void global_gather_init (const ContainerType0 &gatherFrom, ContainerType1 &buffer) const
    \( w' = P_{G,MPI} G_2 v\). Globally (across processes) asynchronously gather data into a buffer.

template<class ContainerType>
void global_gather_wait (ContainerType &buffer) const
    Wait for asynchronous communication to finish and gather received data into buffer.
|
template<template< class > class Vector>
struct dg::MPIGather< Vector >

Optimized MPI Gather operation.

This class stores the communication pattern given in its constructor and derives an optimized MPI communication to implement it.

- Template Parameters
    Vector | a thrust Vector, e.g. thrust::host_vector or thrust::device_vector; determines the internal buffer type of the \( G_2\) gather operation
- See also
    MPI distributed gather and scatter operations. An unoptimized version is available in dg::mpi_gather.
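A minimal end-to-end sketch of typical usage, under the assumption that MPI is initialized and the dg and thrust headers are included (all index and size values are illustrative, not prescribed by the class):

    // Every rank requests local elements 7 and 8 of rank 0's vector
    std::map<int, thrust::host_vector<int>> recvIdx;
    recvIdx[0] = thrust::host_vector<int>( std::vector<int>{7, 8});
    dg::MPIGather<thrust::host_vector> gather( recvIdx, MPI_COMM_WORLD);

    thrust::host_vector<double> gatherFrom( 100, 1.0);         // source vector v
    thrust::host_vector<double> buffer( gather.buffer_size()); // buffer w
    gather.global_gather_init( gatherFrom, buffer);
    gather.global_gather_wait( buffer); // buffer now holds the gathered values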
◆ MPIGather() [1/5]

template<template< class > class Vector>
dg::MPIGather< Vector >::MPIGather ( MPI_Comm comm = MPI_COMM_NULL ) [inline]

No communication.

- Parameters
    comm | optional MPI communicator; the purpose is to be able to store an MPI communicator even if no communication is involved, in order to construct an MPI_Vector with it
◆ MPIGather() [2/5]

template<template< class > class Vector>
dg::MPIGather< Vector >::MPIGather ( const std::map< int, thrust::host_vector< int > > & recvIdx, MPI_Comm comm ) [inline]

Short for MPIGather( recvIdx, 1, comm).
◆ MPIGather() [3/5]

template<template< class > class Vector>
dg::MPIGather< Vector >::MPIGather ( const std::map< int, thrust::host_vector< int > > & recvIdx, unsigned chunk_size, MPI_Comm comm ) [inline]

Construct from global index map.

- Parameters
    recvIdx | recvIdx[PID] consists of the local indices on PID that the calling process receives from that PID (= gather map)
    chunk_size | if the communication pattern consists of equally sized chunks, recvIdx can specify only the starting index of each chunk while chunk_size specifies how many indices each chunk contains
    comm | the MPI communicator participating in the gather operations
- Note
    Messages are received into the buffer in the order in which for( auto& msg : recvIdx) is unrolled.
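A sketch of the chunk_size mechanism, assuming (as the starting-index wording suggests) that a chunk consists of consecutive indices; the values and MPI_COMM_WORLD are illustrative:

    // Receive local elements 4,5,6,7 of rank 0's vector, written out explicitly:
    std::map<int, thrust::host_vector<int>> recvIdx;
    recvIdx[0] = thrust::host_vector<int>( std::vector<int>{4, 5, 6, 7});
    dg::MPIGather<thrust::host_vector> gatherA( recvIdx, 1, MPI_COMM_WORLD);
    // Presumably equivalent pattern: one chunk of 4 indices starting at 4
    std::map<int, thrust::host_vector<int>> recvStart;
    recvStart[0] = thrust::host_vector<int>( std::vector<int>{4});
    dg::MPIGather<thrust::host_vector> gatherB( recvStart, 4, MPI_COMM_WORLD);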
◆ MPIGather() [4/5]

template<template< class > class Vector>
template<class ArrayVec = thrust::host_vector<std::array<int,2>>, class IntVec = thrust::host_vector<int>>
dg::MPIGather< Vector >::MPIGather ( const ArrayVec & gather_map, IntVec & bufferIdx, MPI_Comm comm ) [inline]

Convert an unsorted and possibly duplicate global index list to a unique list, stably sorted by pid, plus a map of duplicates.

- Parameters
    gather_map | each element consists of a {rank, local index on that rank} pair, which is equivalent to the global address of a vector element in gatherFrom. gather_map can be unsorted and contain duplicate entries; the implementation will only send the unique indices through the network.
    bufferIdx | (write only) on output resized to gather_map.size(); contains the indices into the resulting buffer vector of global_gather_init and global_gather_wait that correspond to the requested gather_map
    comm | the MPI communicator participating in the gather operations
- Note
    bufferIdx is the index map for \( G_1\) in MPI distributed gather and scatter. If gather_map stems from the column indices of a row-distributed matrix, then bufferIdx becomes the new column index of that matrix acting on the local buffer.
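A sketch of how this constructor might be called; the concrete {rank, index} pairs are illustrative:

    // Request {rank 1, index 3} twice and {rank 0, index 0} once:
    thrust::host_vector<std::array<int,2>> gather_map(
        std::vector<std::array<int,2>>{ {1,3}, {0,0}, {1,3}});
    thrust::host_vector<int> bufferIdx;
    dg::MPIGather<thrust::host_vector> gather( gather_map, bufferIdx, MPI_COMM_WORLD);
    // bufferIdx now has size 3; its entries 0 and 2 refer to the same buffer
    // position, because the duplicate {1,3} is sent only once over the network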
◆ MPIGather() [5/5]

template<template< class > class Vector>
template<template< typename > typename OtherVector>
dg::MPIGather< Vector >::MPIGather ( const MPIGather< OtherVector > & src )

Construct from other execution policy.

This makes it possible to construct an object on the host and then copy everything on to a device.

- Template Parameters
    OtherVector | other container type
- Parameters
    src | the source object to copy from
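A sketch of the intended use, assuming a CUDA-enabled thrust backend and a recvIdx map as in the examples above:

    // Build the pattern on the host, then copy everything onto the device:
    dg::MPIGather<thrust::host_vector>   host_gather( recvIdx, MPI_COMM_WORLD);
    dg::MPIGather<thrust::device_vector> device_gather( host_gather);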
◆ buffer_size()

template<template< class > class Vector>
unsigned dg::MPIGather< Vector >::buffer_size ( ) const

The local size of the buffer vector w = local map size.

- Returns
    buffer size (may be different for each process)
- Note
    May return 0, which just means that the calling rank does not receive any data from any other rank, including itself. The calling rank may still need to send data in global_gather_init.
- Attention
    It is therefore not valid to check for zero buffer size if you want to find out whether a given rank needs to send MPI messages or not. The right way to do it is to call isCommunicating().
- See also
    isCommunicating()
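For example (a sketch; "gather" stands for an already constructed MPIGather):

    // Allocate the buffer with exactly the size the pattern prescribes:
    thrust::host_vector<double> buffer( gather.buffer_size());
    // buffer.size() == 0 does NOT imply this rank sends nothing;
    // use isCommunicating() for that test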
◆ communicator()

template<template< class > class Vector>
MPI_Comm dg::MPIGather< Vector >::communicator ( ) const

The internal MPI communicator used.

- Returns
    MPI Communicator
◆ global_gather_init()

template<template< class > class Vector>
template<class ContainerType0, class ContainerType1>
void dg::MPIGather< Vector >::global_gather_init ( const ContainerType0 & gatherFrom, ContainerType1 & buffer ) const [inline]

\( w' = P_{G,MPI} G_2 v\). Globally (across processes) asynchronously gather data into a buffer.

- Template Parameters
    ContainerType | can be any shared vector container on host or device, e.g.
        - thrust::host_vector<double>
        - thrust::device_vector<double>
        - thrust::device_vector<thrust::complex<double>>
- Parameters
    gatherFrom | source vector v; data is collected from this vector
    buffer | the buffer vector w; must have size buffer_size()
- Attention
    It is unsafe to write values to gatherFrom or to read values in buffer until global_gather_wait has been called.
- Note
    If !isCommunicating() then this call will not involve MPI communication but will still gather values according to the given index map.
◆ global_gather_wait()

template<template< class > class Vector>
template<class ContainerType>
void dg::MPIGather< Vector >::global_gather_wait ( ContainerType & buffer ) const [inline]

Wait for asynchronous communication to finish and gather received data into buffer.

Calls MPI_Waitall on internal MPI_Request variables and manages host memory in case of cuda-unaware MPI. After this call returns it is safe to use the buffer and the gatherFrom variable from the corresponding global_gather_init call.

- Parameters
    buffer | (write only) where received data resides on return; must be identical to the one given in a previous call to global_gather_init()
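A sketch of the init/wait pattern, which allows overlapping communication with unrelated local work ("gather" and the sizes are illustrative):

    thrust::host_vector<double> gatherFrom( 100, 1.0);         // source vector v
    thrust::host_vector<double> buffer( gather.buffer_size()); // buffer w
    gather.global_gather_init( gatherFrom, buffer);
    // ... unrelated local computation can overlap with the communication here ...
    gather.global_gather_wait( buffer);
    // only now is it safe to read buffer or to overwrite gatherFrom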
◆ isCommunicating()

template<template< class > class Vector>
bool dg::MPIGather< Vector >::isCommunicating ( ) const

True if the gather/scatter operation involves actual MPI communication.

This is more than just a test for zero message size, because even a process with zero message size, which technically does not need to send any data at all, might still need to participate in an MPI communication (sending an empty message to indicate that a certain point in execution has been reached). Only if none of the processes in the process group has anything to send will this function return false. This test can be used to avoid the gather operation altogether, e.g. in the construction of an MPI distributed matrix.

- Note
    This check involves MPI communication itself, because a process needs to check if itself or any other process in its group is communicating.
- Returns
    False if the global gather can be done without MPI communication (i.e. the indices are all local to each calling process) or if the communicator is MPI_COMM_NULL; true otherwise.
- See also
    buffer_size()
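For example, a sketch of how the test might be used to skip the MPI machinery (the surrounding variables are hypothetical):

    if( gather.isCommunicating())
    {
        gather.global_gather_init( gatherFrom, buffer);
        gather.global_gather_wait( buffer);
    }
    // else: no rank in the group sends anything over MPI; any purely local
    // gathering implied by the index map must then be handled separately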
◆ isContiguous()

template<template< class > class Vector>
bool dg::MPIGather< Vector >::isContiguous ( ) const

Check whether the message from the constructor is contiguous in memory.
◆ MPIGather [friend]

template<template< class > class Vector>
template<template< class > class OtherVector>
friend struct MPIGather