[DRMAA-WG] missing exceptions and thread safety

Peter Tröger peter at troeger.eu
Tue Mar 2 15:35:12 CST 2010

GFD.133 has a good statement in the description of the control() routine:

"This routine SHALL return once the action has been acknowledged by  
the DRM system, but does not necessarily wait until the action has  
been completed."

This supports Dan's argument: the point of synchronization, and
therefore of atomicity, is the DRMS itself. This was not a problem in
DRMAAv1, since we carefully avoided demanding any kind of state saving
in the library. That changed with the new persistency features. We
discussed possible new race conditions in Hamburg, but couldn't find
anything unsolvable. So far, the new concept only demands the storage
of identifiers, for sessions (if supported by the DRM) and for jobs.
The state must still be retrieved from the DRM on every use.
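
To make the "identifiers only" idea concrete, here is a minimal Python
sketch. It is only an illustration under assumptions: FakeDrm, Job,
query_state() and the state strings are hypothetical names invented for
this mail, not part of GFD.133 or of any DRMAA binding.

```python
from dataclasses import dataclass


class FakeDrm:
    """Hypothetical stand-in for the DRM system; it owns all job state."""

    def __init__(self):
        self._states = {}

    def submit(self, job_id):
        self._states[job_id] = "RUNNING"

    def set_state(self, job_id, state):
        # Models a change made outside of DRMAA, e.g. by a native DRM tool.
        self._states[job_id] = state

    def query_state(self, job_id):
        return self._states[job_id]


@dataclass
class Job:
    """Library-side job handle: stores the identifier, never the state."""

    job_id: str
    drm: FakeDrm

    def get_state(self):
        # No caching: every call asks the DRM again.
        return self.drm.query_state(self.job_id)


drm = FakeDrm()
drm.submit("42")
job = Job("42", drm)
first = job.get_state()
drm.set_state("42", "SUSPENDED")  # state changed behind the library's back
second = job.get_state()
```

Since the handle holds nothing but the identifier, a state change made
outside of DRMAA is visible on the very next call, and there is nothing
in the library that could become stale.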

>>>>    The Job object methods should throw following exceptions:
>>>>    - "JobAlreadySuspendedException" from suspend method when
>>>>      job is already suspended. The DRMAA implementation have
>>>>      to make sure that suspend job is just called once. It is
>>>>      not enough for the DRMAA implementation to rely on own
>>>>      state, it should
>>>>      check the state automatically in order to avoid problems when
>>>>      the state is set outside of DRMAA. Should DRMAA deal with
>>>>      such cases?

Can you provide a link for this text? I cannot find it. It also makes
no real sense: job state should NEVER EVER be persisted in the DRMAA
library itself.

>>> *Can* DRMAA deal with such cases?  These are two operations which
>>> are usually not atomic (1: check for state, 2: suspend) - so how can
>>> a DRMAA client side library ensure that the remote state does not
>>> change between these two calls, e.g. due to a 3rd part API call?

It cannot, and that is not a problem. "Test-and-set" semantics are not
expected from the library here. The DRMS should tell the library that
suspend() is not allowed in the current state. In other words, we
expect the job control functions of the DRM system to act (more or
less) like their DRMAA equivalents. So far, this has worked out.
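
As a rough Python sketch of that division of labour (again purely
illustrative: FakeDrm, InvalidStateError and the boolean return value of
the DRM call are assumptions made up for this example, not the API of a
real DRM system or DRMAA binding):

```python
import threading


class InvalidStateError(Exception):
    """Hypothetical library exception for a control action the DRM rejected."""


class FakeDrm:
    """Hypothetical DRM stand-in; the check and the state change are atomic here."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}

    def submit(self, job_id):
        with self._lock:
            self._states[job_id] = "RUNNING"

    def suspend(self, job_id):
        # Returns True if the action is acknowledged, False if it is rejected.
        with self._lock:
            if self._states[job_id] != "RUNNING":
                return False
            self._states[job_id] = "SUSPENDED"
            return True


def suspend(drm, job_id):
    """Library-side suspend: no prior state check, forward and map the answer."""
    if not drm.suspend(job_id):
        raise InvalidStateError(
            f"job {job_id} cannot be suspended in its current state"
        )


drm = FakeDrm()
drm.submit("7")
suspend(drm, "7")  # acknowledged by the DRM
try:
    suspend(drm, "7")  # rejected: the job is already suspended
    second_call_rejected = False
except InvalidStateError:
    second_call_rejected = True
```

The library never reads the state before acting, so there is no window
between "check" and "suspend" on the client side. The only place where
the two steps happen together is inside the DRM, which is exactly where
the synchronization belongs.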

