This is not very efficient. There are three copies in addition to the exchange of data between processes. We prefer
But this requires either that MPI_Send not return until the data has been delivered, or that we allow a send operation to return before the transfer completes. In the latter case, we need to test for completion later.