Showing posts with label sources. Show all posts

Monday, 30 May 2011

Meeting 6 [25/05] and how to use the tool

The meeting


Even though a month has gone by, few new functionalities have appeared in the project, mostly because it was the revision and examination period. During that period, effort was put into ensuring the continuation of the project through design and research. Some ideas on how to implement the synchronisation and possible functionalities were discussed (like the communicator registration).

The next step is to generate documentation for the project, as a proper webpage (also because some parts of the code aren't much commented - as they were evolving quickly). Effort also has to be made to keep the project's folders organised. And some implementation will be carried on:

  • communicator registration, to organise the MPI_Request saving
  • synchronous communication (wait for the user to click continue before actually performing the MPI function)


The basis for both implementations was already in place (MPI_Request handles are saved into a linked list, and the backbone of the synchronisation is implemented - but not yet functional).

Using the tool


This tool aims to help people learn MPI behaviour. The sources have therefore been opened to the "public". The first hosting attempt, on an internal machine - Ness - didn't work correctly. Therefore the project will be registered on SourceForge, as was planned from the start.


This part explains how to use the current version of the code, and shouldn't change much in future releases.

The project is composed of a library - the profiler - and an executable - the interface. The project should be organised into folders, one per deliverable, and should include tests. A general Makefile should be available to compile each of the deliverables, and a configure script may be available to automate the variable generation (installation path, MPI flags retrieved from the MPI compiler wrappers, Qt path, ...).

Compiling the profiler


Compiling the profiler requires:

  • a C MPI implementation
  • a C compiler
The profiler is available in both static and dynamic linking formats, as only the linking stage changes. It is important for the user to be able to choose one or the other, as it appeared that some MPI installations accept only one type of library for use with the MPI profiling interface.


Either mode can be compiled and installed, but note that if both are installed, dynamic linking appears to be used by default.


Running make static or make dynamic should compile and install the library, by default in a local install folder containing the classical lib and includes folders.


Compiling the interface


Compiling the interface requires:

  • Qt 4.6 or later (note that Qt 4.7 was used, but none of the functionalities used were introduced in that release).
  • a C++ MPI implementation that supports multi-threading (see a previous note).
  • a C++ compiler
  • the headers from the profiler
The interface should be compilable from the main Makefile. A typical Qt project needs a project file, from which the Makefile that compiles it is generated. Normally this process should be automatic, as the main Makefile should take care of it. If a configure script is available it should handle the variable generation; otherwise some variables need to be set up:

  • INSTALL_ROOT should contain the path to the installation folder (default: ../install, as it is relative to the interface folder where the build happens).
  • MPI_INCLUDE should contain the path to the MPI headers. It can be retrieved by using mpicc -showme and generally looks like -I/usr/local/include. However, the -I should be REMOVED from the project option, as QMake will generate it automatically.
  • MPI_LINK should contain the linking options given by mpicc -showme, generally something like -pthread -L/usr/local/lib -lmpi_cxx -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl.
  • MPI_EXTRA_FLAGS should be set to -DMPICH_IGNORE_CXX_SEEK when using MPICH2, to avoid conflicts with standard C++ file handling.

Once the project file is written and named mpidisplay.pro, running make display should take care of the 2 compilation steps and of the installation. The individual steps are:
  • Generating the Makefile with qmake mpidisplay.pro. You can specify the previously listed variables on the command line or in the file itself (example: qmake mpidisplay.pro INSTALL_ROOT=../install).
  • Compiling the executable with the generated Makefile: make -f Makefile.qt
  • Installing the executable by calling make -f Makefile.qt install


Using an MPI program with the library


Compiling


In order to compile your program with the profiler, you need to know where the profiler library is installed. Let's assume ~/local/, meaning that the library is in ~/local/lib and the headers in ~/local/includes. The location of the mpidisplay interface isn't important yet, but it is most likely ~/local/bin.


Note that to compile - even against the dynamic library - you do not need LD_LIBRARY_PATH to be updated, but you will need it to run the software later. You don't need to update the variable at all if you use static linking, as the library is entirely embedded into your executable. To set the path, simply execute export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/local/lib in your shell - or add it to your bashrc file.


Compiling your program is done exactly the same way as any other MPI program with an additional library. You need to add the headers path to the compiler flags, and the library location and name to the linking flags. In a generic Makefile this is done by adding CFLAGS+=-I~/local/includes and LDFLAGS+=-L~/local/lib -lmpi_wrap.


The only source modification is then to add to your code files:

#include <mpi_wrap.h>
In theory you can even remove the flags if you don't want to use the library, but be aware that the compiler will then complain about not finding "mpi_wrap.h". To avoid this, you can define a preprocessor macro WITH_MPIWRAP by adding CFLAGS+=-I~/local/includes -DWITH_MPIWRAP and writing your include as
#ifdef WITH_MPIWRAP
#include <mpi_wrap.h>
#endif


Running


As stated before, your LD_LIBRARY_PATH should be updated if you are using dynamic linking. Otherwise, simply run your MPI program as usual. Assuming your program executable is called ring, you usually run mpiexec -n 4 ring to launch it with 4 MPI processes. With the library it is exactly the same!

The profiler library writes the port on the standard output by default, but you can add command line arguments to choose another channel:

  • Standard output with --port-in-stdout
  • Standard error output with --port-in-stderr
  • A text file with --port-in-file file


Start the interface



Starting message box of the interface (GNU/Linux Gnome 3)

In order to start the mpidisplay interface, you need to add its location to the PATH with the same technique as for LD_LIBRARY_PATH: export PATH=$PATH:~/local/bin. Then simply run mpidisplay to see the connection window. If you exported the ports on the standard output, move to the "manual" tab and write the ports in the fields. You can change the number of processors in the list - and the order of the ports does not matter. If you used a text file, click the button and select it; the port information will be loaded into the text edit area underneath in case you need to edit it (and the number of processors should be updated).

Then simply click OK to start the interface.


For the moment, the interface shows 2 main pieces of information: the number of calls to a sample of MPI routines, and the time spent in them.



Sunday, 20 February 2011

Using MPI sockets

This article presents how to use MPI to create a remote socket and use it through MPI calls. Remember that we have two communicating parts: the profiler - the library that uses the profiling interface of MPI to profile the program - and the display - which displays the information sent by the profiler.

First of all, some research was done in order to find out how to create a socket with MPI on the profiler side and communicate with some other socket library on the display side. So far no examples were found using that approach, and as this is a technical test, no real implementation was done that way.

The approach used here is to bind the profiler and display communicators using a technique similar to MPI_Comm_spawn, but one that doesn't require the 2 pieces of software to be tied together. This is done using the MPI_Open_port family of functions.

The code wasn't modified much from the MPI_Comm_spawn approach, as you are going to see. The reference used to understand and develop this approach was actually the MPI standard website: 5.4.6. Client/Server Examples


The profiler side - server side


The general idea of this approach is for the profiler to open a port and wait for some display to connect to it. The idea can be pushed further, if needed, to allow several displays to connect to a single profiler (sharing the view of the program across several displays, for example).

What was actually modified from the spawn example is the way the profiler and the display are connected together. Rather than calling MPI_Comm_spawn, MPI_Open_port was used, and a few lines were added just before finalizing the execution.


Opening the port

int start_child(char* command, char* argv[])
{
    MPI_Open_port(MPI_INFO_NULL, port_name);

    /* child doesn't find it...
    sprintf(published, "%s-%d", PROFNAME, world_rank);
    MPI_Publish_name(published, MPI_INFO_NULL, port_name);*/

    fprintf(stderr, "!profiler!(%d) open port '%s'\n", world_rank, port_name);
    fprintf(stderr, "!profiler!(%d) waiting for a child...\n", world_rank);

    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);

    fprintf(stderr, "!profiler!(%d) got a child!\n", world_rank);

    int r;
    MPI_Comm_rank(intercomm, &r);
    fprintf(stderr, "!profiler!(%d) is %d on parent!\n", world_rank, r);

    // wait for a message that "I'm dying"
    if ( PMPI_Irecv(&(quitmessage[0]), INTRA_MESSAGE_SIZE, INTRA_MESSAGE_MPITYPE, CHILD_RANK, INTERCOMM_TAG, intercomm, &dead_child) != MPI_SUCCESS )
    {
        fprintf(stderr, "!profiler!(%d) communication failed!\n", world_rank);
        intercomm = MPI_COMM_NULL;
        return FAILURE;
    }

    char mess[INTRA_MESSAGE_SIZE];
    sprintf(mess, "%d IsYourFather", world_rank);

    sendto_child(mess);

    PMPI_Barrier(MPI_COMM_WORLD);

    return SUCCESS;
}

Finalizing the communication

int wait_child(char* mess)
{
    // send my death
    if ( sendto_child(mess) == SUCCESS )
    {
        // wait for its death
        if ( PMPI_Wait(&dead_child, MPI_STATUS_IGNORE) == MPI_SUCCESS )
        {
            fprintf(stderr, "!profiler!(%d) received its child death!\n", world_rank);
            //MPI_Unpublish_name(published, MPI_INFO_NULL, port_name);
            MPI_Close_port(port_name);
            return SUCCESS;
        }
    }

    return FAILURE;
}

The display side - the client side


On the display side, the same kind of modification had to be done. Rather than using information from the parent's communicator, a connection to a port is performed.


The MPIWatch::getWatcher method

MPIWatch* MPIWatch::getWatcher(char port_name[])
{
    if ( instance == 0 )
    {
        MPI::Init();

        std::cout << "Try to connect to " << port_name << std::endl;

        parent = MPI::COMM_WORLD.Connect(port_name, MPI::INFO_NULL, 0);

        if ( parent == MPI::COMM_NULL )
        {
            std::cerr << "Cannot connect with the parent program! Aborting." << std::endl;
            MPI::Finalize();
            return 0;
        }

        std::cout << "Connection with parent completed!" << std::endl;

        instance = new MPIWatch();
    }

    return instance;
}

Running it!

The main difference is that in the previous version the display was started automatically; now it has to be started separately, and actually one per MPI process. Some attempts were made to use the name publication described in the standard (see the reference above), but for an unknown reason the display part never found the profiler's published name. So far, 1 port is opened per MPI process - or 1 name was published - and each display connects to one of them through command line input.

Console 1: run MPI

$> mpiexec -n 2 mpi_ring
!profiler!(0) open port '3449421824.0;tcp://192.168.0.2:48251+3449421825.0;tcp://192.168.0.2:36965:300'
!profiler!(1) open port '3449421824.0;tcp://192.168.0.2:48251+3449421825.1;tcp://192.168.0.2:52304:300'

Console 2-3: run the display

$> mpidisplay '3449421824.0;tcp://192.168.0.2:48251+3449421825.0;tcp://192.168.0.2:36965:300'
$> mpidisplay '3449421824.0;tcp://192.168.0.2:48251+3449421825.1;tcp://192.168.0.2:52304:300'

The current implementation is a little more complicated to run than the spawn version, but doesn't produce any error code when finishing. It also allows more flexibility in the future: more than one display on a single profiler, and any other idea that requires a more flexible approach than a spawned process (like being able to connect a display in the middle of a run and disconnect at will, to see whether the program is deadlocked, etc.).

Limitations

The port information is rather long, and it is not user friendly to have to look up the profiler output and copy/paste the port information into the display. Further investigation has to be made on that part, in order either to track down the name publication problem, or to find a way to look up the port in a more automatic fashion. The actual name publication idea was to publish a name to look up, like 'profiler-mpirank' - or with any string given by the user instead of profiler. This would allow the display to be started with a single command, which would only need 2 pieces of information: the base name of the profiler and the number of MPI processes to connect to!

The other limitation is not a real one, but more like a bug in the current implementation. A barrier was added to wait for every MPI process to get a display; that isn't much of a problem, as no high performance is required for this project. The problem arises when a display is closed while the program is running: the current implementation doesn't catch it, and deadlocks. Further investigation will obviously be done on that problem later on.

Source code

As for the previous version, the source code is available on http://www.megaupload.com/?d=ZXJGHBPQ. It is a test version, not very clean, and buggy (as explained above). Later on, a post will explain how to use the library with MPI code in C.

Further work

The preliminary technical overview of the project is about to be over. Now that the basic techniques of the project are set up, a more detailed reasoning will be done on the project's functional requirements. As part of the Project Preparation course of the MSc, a risk analysis and a workplan for the overall project have to be done, and will be published here as well.

Tuesday, 15 February 2011

A bit of software engineering

This article will only detail some changes made to the code in order to have a more adaptable test software. It will also explain how to use the library with an MPI program in C.


The Project


The project is so far organised around 2 parts: the profiler and the display. The profiler is produced as a library that patches some of the MPI calls. The display is an executable that only displays information from the profiler.

The current directory architecture reflects that organisation, where the display is actually in a subdirectory of the profiler (the interface one).

When built, 3 folders are created:

  • dynamic, containing the library as a .so - or static, with the .a
  • includes, which contains the headers to add to the MPI executable you want to use with the profiler (mpi_wrap.h is the one to include; intra_comm.h just defines some of the ways the display and profiler communicate, and can be used later on to develop another display)
  • display, which obviously is the folder where the display executable is stored.

The actual profiler is written in C, and therefore uses mpicc (backed by GCC on my machine - no build was really done on Ness, as Qt isn't installed on it for the moment).

The display is implemented in C++ using both Qt and MPI, and uses the powerful .pro files to handle compilation.


The profiler


The profiler is organised so far around:

  • mpi_basic.c and mpi_communication.c, which implement the MPI functions defined in mpi_wrap.h.
  • child_comm.c, child_comm.h and intra_comm.h, which implement the profiler/display communication.

MPI overloading


Only the functions defined in mpi_wrap.h are overloaded, and this is so far the only file that has to be included from the original MPI program. Each of these functions calls into the child_comm module to communicate with the child, so the user doesn't have to bother with it.


The child_comm module


Actually, very few types of communication are required with the display. The header is rather small:


child_comm.h

#ifndef CHILDCOMM
#define CHILDCOMM

int start_child(char* command, char* argv[]);
int alive_child();
int sendto_child(char* mess);
int wait_child(char* mess);

#endif // CHILDCOMM

  • start_child starts the child, and therefore is called in MPI_Init()
  • sendto_child sends information to the child; the message is of a size defined in intra_comm.h
  • wait_child waits for the child's death (i.e. makes sure it received every piece of information before closing communication) and is thus called in MPI_Finalize()
  • alive_child() returns either SUCCESS or FAILURE (defined in intra_comm.h) to indicate whether the child is still running or not.

Such an approach allows different ways of communicating with the child without directly affecting the overloaded MPI functions, and vice versa.


The display


The display is developed using the Qt library, and uses a classical directory organisation. Qt provides an excellent tool, qmake, to generate Makefiles from a project file (here mpidisplay.pro). From one platform to another, only minor modifications have to be made to this file, such as the first 2 lines that define the MPI flags. Note that Qt uses GCC as a compiler.


Extract of the mpidisplay.pro

# using 'mpicxx -showme:compile' and 'mpicxx -showme:link'
MPICXX_COMPILE = -I/usr/local/include -pthread
MPICXX_LINK = -pthread -L/usr/local/lib -lmpi_cxx -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

Qt also provides a good interface designer, which will be used to generate the GUI; the generated forms are stored in the forms folder. The src folder contains the sources.


The code organisation


The display code is organised around 2 classes so far:

  • MPIWatch, which is implemented as a singleton and is the only class that deals with MPI communication (i.e. communicates with the profiler). It therefore uses some information from intra_comm.h.

    It inherits from the QThread class, which provides portable threads for Qt (using pthreads on Unix, most likely) and allows the communication and the display refresh to be separated.

    The communication with the other class is done through Qt's internal signals, which are a kind of remote call. When a message is received from the profiler, it is stored on a message stack, and the signal newMessage() is emitted.
  • CommStat, a classical QWidget displaying basic information on the number of sends and receives. It pops information from the MPIWatch object each time the latter signals a new message.

How to use the mpi_wrap library?


Using the library is very easy, and standard.

  • Add the #include line to the code that uses MPI.
  • Compile the files with the path to the include files (usually -I).
  • Link the executable with the path to the library and the library name (usually -L and -lmpi_wrap).


Example in a Makefile

# path where the library is installed
MPI_WRAPPER = /home/workspace/project/current
# linking is either static or dynamic, will look in $MPI_WRAPPER/$linking
linking = dynamic

DEFINES+=
CC= mpicc
CFLAGS= -g $(DEFINES) -I${MPI_WRAPPER}/includes


LFLAGS= -lm -L${MPI_WRAPPER}/$(linking) -lmpi_wrap

EXE= ring

SRC= ring.c

OBJ= $(SRC:.c=.o)

.c.o:
	$(CC) $(CFLAGS) -c $<

all: $(EXE)

$(EXE): $(OBJ)
	$(CC) $(CFLAGS) -o $@ $(OBJ) $(LFLAGS)
	@echo "don't forget export LD_LIBRARY_PATH='$(MPI_WRAPPER)/$(linking)'"
	@echo "don't forget to add $(MPI_WRAPPER)/display to the PATH!"

clean:
	rm -f $(OBJ) $(EXE)

The sources


The sources are available on http://www.megaupload.com/?d=DDUQP5QH.