Monday, November 10, 2014

FIRST EXAMPLES IN MPI

Goals

The way to learn about the message passing interface (MPI) is to actually use it. Several examples are provided in the following that build gradually in their use of MPI commands. Copy these programs and compile them on the cluster, using either the Fortran or C versions. Make small changes and try to anticipate the consequences. Learn about the MPI commands:
  • MPI_Init
  • MPI_Finalize
  • MPI_Comm_rank
  • MPI_Comm_size
  • MPI_Send
  • MPI_Recv

First MPI Example

Consider the following program, called mpisimple1.c. This program is written in C with MPI commands included. It initializes MPI, executes a single print statement, and then finalizes (exits) MPI.
In C:
#include <mpi.h>   /* PROVIDES THE BASIC MPI DEFINITION AND TYPES */
#include <stdio.h>

int main(int argc, char **argv) {
  
  MPI_Init(&argc, &argv); /*START MPI */
  printf("Hello world\n");
  MPI_Finalize();  /* EXIT MPI */
}
In Fortran:
 program mpisimple
      
      implicit none
      
      integer ierr

      include 'mpif.h'

      call mpi_init(ierr)

c     print message to screen

      write(6,*) 'Hello World!'

      call mpi_finalize(ierr)

      end


To compile this code, type:
 mpicc -o simple1 mpisimple1.c
or
 mpif77 -o simple1 mpisimple.f
To run this compiled code, type:
 mpirun -np 4 simple1
In the above example, the code "simple1" will execute on four processors (-np 4). Output printed to the screen will look like:
Hello world
Hello world
Hello world
Hello world
Discussion: The four processors each perform the exact same task. Each processor prints a single line. MPI_Init and MPI_Finalize take care of all of the details associated with running the code on multiple processors. MPI_Init must be called before any other MPI command. MPI_Finalize must be called last. So, these two commands form a wrapper around the body of the parallel code. Note that errors may result if more processors are specified than are available on the system. A typical (and uninformative!) error message looks like:
tsn004.acomp.usf.edu: No route to host
bm_list_15782:  p4_error: interrupt SIGINT: 2
rm_l_1_18402:  p4_error: interrupt SIGINT: 2
p0_15781:  p4_error: interrupt SIGINT: 2
rm_l_2_18291:  p4_error: interrupt SIGINT: 2
rm_18401:  p4_error: interrupt SIGINT: 2
rm_l_3_18120:  p4_error: interrupt SIGINT: 2
rm_18290:  p4_error: interrupt SIGINT: 2
rm_18119:  p4_error: interrupt SIGINT: 2

Example 2

Most parallel codes assign different tasks to different processors. For example, parts of an input data set might be divided and processed by different processors, or a finite difference grid might be divided among the processors available. This means that the code needs a way to identify each processor. In this example, processors are identified by rank - an integer ranging from 0 to (total number of processors) - 1.
In C:
#include <mpi.h>   /* PROVIDES THE BASIC MPI DEFINITION AND TYPES */
#include <stdio.h>

int main(int argc, char **argv) {
  
  int my_rank; 
  int size;
  MPI_Init(&argc, &argv); /*START MPI */

/*DETERMINE RANK OF THIS PROCESSOR*/
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); 

 /*DETERMINE TOTAL NUMBER OF PROCESSORS*/
  MPI_Comm_size(MPI_COMM_WORLD, &size);


  printf("Hello world! I'm rank (processor number) %d of size %d\n", my_rank, size);

  MPI_Finalize();  /* EXIT MPI */
  
}
In Fortran:
  program simple2
      
      implicit none
      
      integer ierr,my_rank,size

      include 'mpif.h'

      call mpi_init(ierr)

      call mpi_comm_rank(MPI_COMM_WORLD,my_rank,ierr)
      call mpi_comm_size(MPI_COMM_WORLD,size,ierr)

c     print rank and size to screen

      write(6,100) my_rank, size

      call mpi_finalize(ierr)

 100  format('Hello World! I am rank ', I2, ' of size ', I2)

      end


Compiling and executing the code using 4 processors yields this result:
Hello world! I'm rank 0 of size 4
Hello world! I'm rank 2 of size 4
Hello world! I'm rank 1 of size 4
Hello world! I'm rank 3 of size 4
Discussion: Two additional MPI commands appear in this code: MPI_Comm_rank and MPI_Comm_size. MPI_Comm_rank returns the processor id assigned to each processor during start-up. This rank is returned as an integer (in this case called my_rank). The first parameter (MPI_COMM_WORLD) is predefined in MPI and includes information about all the processors started when the program execution begins. Similarly, MPI_Comm_size returns the total number of processors - in this case 4. It is important to know the total number of processors so the problem set (the data, for example) can be divided among all available processors. Note that the code does not print the output in any particular order. In this case, rank 2 prints before rank 1. This order changes depending on the vagaries of communication between the nodes. Additional MPI commands are needed to structure communication more effectively.
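To make rank and size concrete, here is a minimal sketch (not one of the course examples; the problem size N, the loop body, and the variable names are invented for illustration) of one common way to divide a loop over N items among the available processors, with each processor handling a contiguous block of indices:
#include <mpi.h>
#include <stdio.h>

#define N 100   /* illustrative problem size */

int main(int argc, char **argv) {

  int my_rank, size, start, end, i;
  double local_sum = 0.0;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* divide the index range 0..N-1 into (nearly) equal contiguous blocks */
  start = my_rank * N / size;
  end   = (my_rank + 1) * N / size;

  for (i = start; i < end; i++)
    local_sum += (double) i;   /* stand-in for real work on this block */

  printf("Rank %d of %d handled indices %d to %d (partial sum %f)\n",
         my_rank, size, start, end - 1, local_sum);

  MPI_Finalize();
  return 0;
}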

Example 3

A further example illustrates the need to control flow in the program. The following code is essentially the same as Example 2. An additional conditional ("If") statement has been added near the end of the code. If the rank of the processor is 0, let the world know that you are finished.
In C:
#include <mpi.h>   /* PROVIDES THE BASIC MPI DEFINITION AND TYPES */
#include <stdio.h>
#include <stdlib.h>  /* NEEDED FOR exit() */

int main(int argc, char **argv) {
  
  int my_rank; 
  int size;
  MPI_Init(&argc, &argv); /*START MPI */
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); /*DETERMINE RANK OF THIS PROCESSOR*/
  MPI_Comm_size(MPI_COMM_WORLD, &size); /*DETERMINE TOTAL NUMBER OF PROCESSORS*/


  printf("Hello world! I'm rank (processor number) %d of size %d\n", my_rank, size);

  if (my_rank == 0) printf("That is all for now!\n");
  MPI_Finalize();  /* EXIT MPI */
    
  exit(0);
  
}
In Fortran:

 program simple3
      
      implicit none
      
      integer ierr,my_rank,size

      include 'mpif.h'

      call mpi_init(ierr)

      call mpi_comm_rank(MPI_COMM_WORLD,my_rank,ierr)
      call mpi_comm_size(MPI_COMM_WORLD,size,ierr)

c     print message, rank and size to screen

      write(6,100) my_rank, size

      if(my_rank.eq.0) then
         write(6,*) 'That is all for now!'
      end if

      call mpi_finalize(ierr)

 100  format('Hello World! I am rank ', I2, ' of size ', I2)

      end


Output from this code will likely look like:
Hello world! I'm rank (processor number) 0 of size 4
That is all for now!
Hello world! I'm rank (processor number) 1 of size 4
Hello world! I'm rank (processor number) 2 of size 4
Hello world! I'm rank (processor number) 3 of size 4
Discussion: Note that the "If" statement does not have the desired effect. Although the if statement appears later in the code than the earlier print statements, this step happens to execute faster on processor 0 than on the other processors. So the program claims it is finished before it should. This can lead to serious problems if, for example, subsequent calculations depend on prior results from multiple processors. One method of avoiding this problem is given in the next example.

Example 4

Two additional MPI commands may be used to direct traffic (message queuing) during the program execution: MPI_Send() and MPI_Recv(). It is safe to say these two commands are at the heart of MPI. Use of these statements makes the program appear more complicated, but it is well worth it if the flow of the program needs to be controlled.
In C:
#include <mpi.h>   /* PROVIDES THE BASIC MPI DEFINITION AND TYPES */
#include <stdio.h>
#include <string.h>  /* NEEDED FOR strlen() */

int main(int argc, char **argv) {
  
  int my_rank; 
  int partner;
  int size, i,t;
  char greeting[100];
  MPI_Status stat;
  
  MPI_Init(&argc, &argv); /*START MPI */
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); /*DETERMINE RANK OF THIS PROCESSOR*/
  MPI_Comm_size(MPI_COMM_WORLD, &size); /*DETERMINE TOTAL NUMBER OF PROCESSORS*/
  
  
  sprintf(greeting, "Hello world: processor %d of %d\n", my_rank, size);
  
  /* adding a silly conditional statement like the
     following graphically illustrates "blocking" and
     flow control during program execution
     
     if (my_rank == 1) for (i=0; i<1000000000; i++) t=i;
     
  */
  
  if (my_rank ==0) {
    fputs(greeting, stdout);
    for (partner = 1; partner < size; partner++){
      
      MPI_Recv(greeting, sizeof(greeting), MPI_BYTE, partner, 1, MPI_COMM_WORLD, &stat);
      fputs (greeting, stdout);
      
    }
  }
  else {
    MPI_Send(greeting, strlen(greeting)+1, MPI_BYTE, 0,1,MPI_COMM_WORLD);
  }
  
  
  
  if (my_rank == 0) printf("That is all for now!\n");
  MPI_Finalize();  /* EXIT MPI */
  
}
In Fortran:
  program simple4
      
      implicit none
      
      integer ierr,my_rank,size,partner
      CHARACTER*50 greeting

      include 'mpif.h'
      integer status(MPI_STATUS_SIZE)


      call mpi_init(ierr)

      call mpi_comm_rank(MPI_COMM_WORLD,my_rank,ierr)
      call mpi_comm_size(MPI_COMM_WORLD,size,ierr)

      write(greeting,100) my_rank, size


      if(my_rank.eq.0) then
         write(6,*) greeting
         do partner=1,size-1
         call mpi_recv(greeting, 50, MPI_CHARACTER, partner, 1, 
     &    MPI_COMM_WORLD, status, ierr)
            write(6,*) greeting
         end do
      else
         call mpi_send(greeting, 50, MPI_CHARACTER, 0, 1, 
     &    MPI_COMM_WORLD, ierr)
      end if

      if(my_rank.eq.0) then
         write(6,*) 'That is all for now!'
      end if

      call mpi_finalize(ierr)

 100  format('Hello World: processor ', I2, ' of ', I2)

      end

Output from this code looks like:
Hello world: processor 0 of 4
Hello world: processor 1 of 4
Hello world: processor 2 of 4
Hello world: processor 3 of 4
That is all for now!
Discussion: The structure of the program has been changed to assure that the output is in the proper order (the processors are now listed in ascending order). Furthermore, the statement "That is all for now!" prints last. Program execution was effectively blocked until all processors had the opportunity to print. This was done by communicating between processors. Rather than simply printing, most processors now send the greeting back to processor 0 - usually called the root or the master processor. This processor first prints its own greeting, then polls successive processors - waiting to receive a message from each one. Only when the message is received does processor 0 move on. Using the MPI_Send and MPI_Recv commands blocks program execution. This blocking can be illustrated graphically by uncommenting the long loop in the C code above, which causes one of the processors to take a long time to complete its tasks. The cost of this structure is added syntax.
Here is the syntax for MPI_Send and MPI_Recv:
int MPI_Send (
              message,       /* actual information sent */
              length,        /* length of the message */
              datatype,      /* MPI datatype of the message */
              destination,   /* rank of the processor getting the message */
              tag,           /* tag helps sort messages - an int, the same as in MPI_Recv */
              MPI_Comm       /* almost always MPI_COMM_WORLD */
              )

int MPI_Recv (
              message,       /* actual information received */
              length,        /* length of the message */
              datatype,      /* MPI datatype of the message */
              source,        /* rank of the processor sending the message */
              tag,           /* tag helps sort messages - an int, the same as in MPI_Send */
              MPI_Comm,      /* almost always MPI_COMM_WORLD */
              Status         /* a data structure that contains info on what was received */
              )
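As a concrete illustration of this syntax, here is a minimal sketch (not one of the course examples; the variable names value and partner are invented for illustration) in which every non-root processor sends a single integer to processor 0, and the root receives them in rank order:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {

  int my_rank, size, partner, value;
  MPI_Status stat;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if (my_rank == 0) {
    /* the root receives one integer from each of the other processors, in rank order */
    for (partner = 1; partner < size; partner++) {
      MPI_Recv(&value, 1, MPI_INT, partner, 1, MPI_COMM_WORLD, &stat);
      printf("Root received %d from rank %d\n", value, partner);
    }
  }
  else {
    /* every other processor sends one integer (its rank squared) to the root */
    value = my_rank * my_rank;
    MPI_Send(&value, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}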
A few additional notes:
  • MPI_Recv is "blocking" in the sense that when the process (in this case my_rank ==0) reaches the MPI_Recv statement, it will wait until it actually receives the message (another process sends it). If the other process is not ready to Send, then the process running on my_rank == 0 will simply remain idle. If the message is never sent, my_rank==0 will wait a very long time!
  • The message is of type "datatype" and is predefined in MPI. Possible datatypes are summarized in the following table.
    MPI Datatype          C Datatype
    MPI_FLOAT             float
    MPI_DOUBLE            double
    MPI_LONG_DOUBLE       long double
    MPI_INT               signed int
    MPI_LONG              signed long int
    MPI_SHORT             signed short int
    MPI_CHAR              signed char
    MPI_UNSIGNED          unsigned int
    MPI_UNSIGNED_SHORT    unsigned short int
    MPI_UNSIGNED_LONG     unsigned long int
    MPI_UNSIGNED_CHAR     unsigned char
    MPI_BYTE              (no corresponding C datatype)
    MPI_PACKED            (no corresponding C datatype)
  • The length of the received message does not need to be the same as the length of the sent message. Typically the length of the sent message is known exactly, but the space allocated for the received message is often larger, because the incoming length may not be known in advance. One way to find out how much data actually arrived is sketched below.
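A minimal sketch of this last point, assuming at least two processors (MPI_Get_count is a standard MPI call not introduced in the examples above, and the buffer size and message text are invented for illustration): the receiver allocates a 100-byte buffer, the sender transmits only the bytes it actually used, and the receiver asks the status structure how many bytes arrived.
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {

  int my_rank, count;
  char buffer[100];          /* receive buffer larger than any expected message */
  MPI_Status stat;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

  /* run with at least 2 processors, e.g. mpirun -np 2 */
  if (my_rank == 0) {
    MPI_Recv(buffer, sizeof(buffer), MPI_BYTE, 1, 1, MPI_COMM_WORLD, &stat);
    MPI_Get_count(&stat, MPI_BYTE, &count);   /* number of bytes actually received */
    printf("Received %d bytes: %s", count, buffer);
  }
  else if (my_rank == 1) {
    strcpy(buffer, "a short message\n");
    MPI_Send(buffer, strlen(buffer) + 1, MPI_BYTE, 0, 1, MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}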
