[DRMAA-WG] About obtaining the machines names in a parallel job
yves.caniou at ens-lyon.fr
Thu Mar 18 04:11:25 CDT 2010
Discussions yesterday were great.
I understand why you don't want to provide a means to get the hostnames file
for an MPI code, since that should be handled transparently in the configName
(if I remember the name correctly).
But I thought of a different use case: a code is simply launched on all the
machines. This code is socket-based, so it needs to know the names of the
other machines to be able to run correctly.
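To make the use case concrete, here is a minimal sketch in Python of what such
a socket-based code might do at startup if a hostnames file were available on
each machine. The function name read_peers, the file path, and the file format
(one hostname per line, optionally followed by extra fields such as slot
counts) are assumptions for illustration only; nothing here is part of DRMAA
or of any particular DRM system.

```python
import socket
from pathlib import Path


def read_peers(hostfile, own_name=None):
    """Return the hostnames listed in `hostfile`, excluding this machine.

    Assumed (hypothetical) format: one host per line, hostname first,
    optional extra fields after it; blank lines and '#' comments are skipped.
    """
    if own_name is None:
        own_name = socket.gethostname()
    peers = []
    for line in Path(hostfile).read_text().splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        name = fields[0]
        # Skip ourselves and any host that is listed more than once.
        if name != own_name and name not in peers:
            peers.append(name)
    return peers
```

Each process could then open its socket connections to the names returned by
read_peers, without needing any external registry.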
Of course, the missing hostnames could be worked around by using an external
machine where a daemon runs and where the running codes can register (I am
thinking of something like a running omniNames, for example). Another solution
is to wrap the applications in an MPI code just to, maybe, obtain that
information.
But don't you think that the cost is very high (if it is possible at all: many
sites have a policy of not letting user code run on the front-end node, and a
machine only knows that it is itself taking part in the parallel run) compared
to at least having the possibility to copy the file containing the hostnames to all
Good luck with the discussions today!
Associate Professor at Université Lyon 1,
Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
* in Information Technology Center, The University of Tokyo,
2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
* in National Institute of Informatics
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan