[DRMAA-WG] DRMAA2 comments
mamonski at man.poznan.pl
Wed May 4 13:58:34 CDT 2011
On 29 April 2011 13:10, Nadav Brandes <nadavbrandes at gmail.com> wrote:
> Hi guys,
> My team and I have finished going over the latest draft of DRMAA2, and we
> have some comments, suggestions and questions about it.
> We want to hear your opinion about these issues.
> Given a jobId, you can easily get its Job object using the method
> JobSession::getJobs(in JobInfo filter), if you pass as the filter a
> JobInfo with the wanted jobId (maybe a shorthand method
> JobSession::getJob(string jobId) would be easier, but this is a different
> issue). But given a jobArrayId, there is no way to get its JobArray object.
> This is a serious limitation of DRMAA: it prevents users from using the
> JobArray feature the way it is used in most batch systems. I think a
> similar method should be added, JobSession::getJobArrays(in
> JobArrayInfo filter), or at least a method JobSession::getJobArray(string
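To make the request concrete, here is a minimal Python sketch of such a lookup. Everything in it is an assumption for illustration: MockJobSession, JobArray, run_bulk_jobs and get_job_array are hypothetical names, not part of any existing DRMAA binding, and the session just keeps an in-memory map from jobArrayId to JobArray.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class JobArray:
    """Hypothetical stand-in for a DRMAA2 JobArray object."""
    job_array_id: str
    job_ids: List[str] = field(default_factory=list)


class MockJobSession:
    """Toy in-memory session; not a real DRMAA2 binding."""

    def __init__(self) -> None:
        self._arrays: Dict[str, JobArray] = {}

    def run_bulk_jobs(self, begin: int, end: int, step: int = 1) -> JobArray:
        # Create one job id per index and remember the array by its id.
        array_id = f"array-{len(self._arrays) + 1}"
        job_ids = [f"{array_id}.{i}" for i in range(begin, end + 1, step)]
        array = JobArray(array_id, job_ids)
        self._arrays[array_id] = array
        return array

    def get_job_array(self, job_array_id: str) -> Optional[JobArray]:
        # Proposed lookup: recover the JobArray from its identifier alone.
        return self._arrays.get(job_array_id)


session = MockJobSession()
submitted = session.run_bulk_jobs(1, 4)
found = session.get_job_array(submitted.job_array_id)
```

The point of the sketch is only that a client holding nothing but the identifier (for example after a restart) can get back to the JobArray object.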
> A very important feature that many batch systems support is the ability to
> limit the number of jobs in a job array that may run simultaneously (in LSF
> it's called "Slot Limit" and you can read about it at
> I think that DRMAA can also support this feature by:
> Change the method JobSession::runBulkJobs so it will also accept an optional
> argument in long slotLimit (if it's UNSET then no slot limit will be
> assigned to the new job array).
Torque also supports this feature. What about Grid Engine?
> Add a new method JobArray::changeSlotLimit(in long slotLimit)
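As a rough illustration of the semantics being asked for, the Python sketch below models a job array whose number of simultaneously running tasks is capped by a slot limit. MockJobArray, schedule, finish and change_slot_limit are hypothetical names chosen for this example, and None stands in for UNSET; no real DRM system or DRMAA binding is involved.

```python
from typing import List, Optional


class MockJobArray:
    """Toy job array with a slot limit; not a real DRMAA2 binding."""

    def __init__(self, num_tasks: int, slot_limit: Optional[int] = None) -> None:
        self.queued: List[int] = list(range(1, num_tasks + 1))
        self.running: List[int] = []
        self.slot_limit = slot_limit  # None plays the role of UNSET

    def change_slot_limit(self, slot_limit: Optional[int]) -> None:
        # Takes effect on the next scheduling pass.
        self.slot_limit = slot_limit

    def schedule(self) -> None:
        """Start queued tasks until the slot limit (if any) is reached."""
        while self.queued and (
            self.slot_limit is None or len(self.running) < self.slot_limit
        ):
            self.running.append(self.queued.pop(0))

    def finish(self, task: int) -> None:
        self.running.remove(task)


array = MockJobArray(num_tasks=5, slot_limit=2)
array.schedule()            # tasks 1 and 2 start, tasks 3 to 5 stay queued
array.finish(1)
array.change_slot_limit(3)
array.schedule()            # tasks 3 and 4 start alongside task 2
```

In this model, raising the limit on a live array lets more tasks start without touching the ones already running, which is the behaviour a changeSlotLimit method would presumably expose.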
> There are some parameters that most batch systems allow changing for already
> submitted jobs, but DRMAA doesn't support changing them. For example, DRMAA
> doesn't let you change the priority or queue of an already submitted job. I
> think that methods Job::changePriority(in long priority) and
> Job::changeQueue(in string queueName) should be added.
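A small Python sketch of what these two calls might look like on a job object. MockJob and its fields are hypothetical, and the rule that finished jobs cannot be modified is an assumption made for the example, not something the draft states.

```python
from dataclasses import dataclass


@dataclass
class MockJob:
    """Toy job record; not a real DRMAA2 binding."""
    job_id: str
    queue_name: str
    priority: int = 0
    state: str = "QUEUED"

    def _require_modifiable(self) -> None:
        # Assumption: only jobs that have not finished may be altered.
        if self.state in ("DONE", "FAILED"):
            raise RuntimeError(f"job {self.job_id} has already finished")

    def change_priority(self, priority: int) -> None:
        self._require_modifiable()
        self.priority = priority

    def change_queue(self, queue_name: str) -> None:
        self._require_modifiable()
        self.queue_name = queue_name


job = MockJob(job_id="42", queue_name="short")
job.change_priority(10)
job.change_queue("long")
```

A real implementation would also have to decide which job states allow each change, since DRM systems differ on whether a running job can be moved between queues.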
> Many batch systems allow rerunning existing jobs. Although DRMAA has a field
> called rerunnable in the JobTemplate struct, it doesn't allow users to
> actually rerun jobs. Maybe a method Job::rerun() could be added to DRMAA.
> I have a question. Does DRMAA support Generic Resources? For example, if I
> have a cluster where some of the nodes have GPU cards, and I want to submit
> jobs that require a certain number of GPUs, I would like the batch system
> to manage this for me, as many batch systems already can.
> Thank you for reading all of this. I would very much like to hear what you think
> about each of the bullets above.