[DRMAA-WG] maxSlots attribute

Andre Merzky andre at merzky.net
Wed Mar 24 13:30:25 CDT 2010

Quoting [Daniel Templeton] (Mar 24 2010):
> So the program is going to go read the queue names via the API and
> then present them to the user to pick from?  Doesn't the user
> still need to go reference a web page somewhere to figure out
> which queue is the right one for what he's doing?

Yes, the user needs to have some idea what the queue names mean.
Queue names like 'bigjob', 'preempt', 'checkpt', 'workq' are,
however, rather intuitive.  That's of course not always the case.
BTW: the above names are from our current systems on loni.org.

Anyway, as said: that is not a hard requirement, but our users seem
to find this useful.
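
To make the idea concrete, here is a minimal sketch in Python of a
program that reads the queue names through an API and lets the user
pick one.  Everything in it is an assumption for illustration:
FakeMonitoringSession, drms_queue_names() and choose_queue() are
hypothetical names, not part of any existing DRMAA or SAGA binding.
The only properties taken from this thread are that the names are
opaque strings, that every returned name would be valid as a job
template queue name, and that an empty list must be a valid answer.

```python
from typing import Callable, List, Optional


class FakeMonitoringSession:
    """Hypothetical stand-in for a monitoring session; not a real binding."""

    def __init__(self, queue_names: List[str]) -> None:
        self._queue_names = list(queue_names)

    def drms_queue_names(self) -> List[str]:
        # Opaque strings only. An empty list is a valid result,
        # e.g. for a DRM system that has no queue concept.
        return list(self._queue_names)


def choose_queue(session: FakeMonitoringSession,
                 pick: Callable[[List[str]], str]) -> Optional[str]:
    """Return the queue name selected by `pick`, or None if none are listed."""
    names = session.drms_queue_names()
    if not names:
        # Nothing to offer; the caller submits without a queue name.
        return None
    choice = pick(names)
    if choice not in names:
        raise ValueError(f"unknown queue name: {choice!r}")
    return choice


# Example: a non-interactive "user" that simply takes the first entry.
session = FakeMonitoringSession(["bigjob", "preempt", "checkpt", "workq"])
queue = choose_queue(session, lambda names: names[0])
```

In a real tool, `pick` would be a menu or a prompt, and the result
would be copied into the job template's queue name before submission.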

Cheers, Andre.

> Daniel
> On 03/24/10 10:59, Andre Merzky wrote:
> >Quoting [Daniel Templeton] (Mar 24 2010):
> >>
> >>What does that mean when you say "retrieve them programmatically?"  What
> >>tool are they using to do it?  What is the interface?
> >
> >
> >I mean retrieving via an API like DRMAA or SAGA, instead of
> >retrieving by some other means outside of the program (reading a web
> >page and entering the queue name in the code or some config file).
> >
> >Cheers, Andre.
> >
> >
> >>Daniel
> >>
> >>On 03/24/10 06:03, Andre Merzky wrote:
> >>>Hi,
> >>>
> >>>Quoting [Peter Tröger] (Mar 23 2010):
> >>>>Hi,
> >>>>
> >>>>>As long as the MonitoringSession::drmsQueueNames is nothing more
> >>>>>than an opaque set of strings that are the valid values for
> >>>>>JobTemplate::queueName, I can live with it.  I can see where that
> >>>>>would be useful for a portal.  I thought, however, that we had come
> >>>>>to the conclusion previously that portals and user interfaces were
> >>>>>not really our target applications.  (Anyone remember what feature
> >>>>>spawned that conclusion?)  I thought that DRMAA was specifically
> >>>>>focused on applications integrating with clusters.  If so, a list of
> >>>>>opaque strings is useless.
> >>>>
> >>>>We dropped the portal example, that's true. The most convincing DRMAA
> >>>>applications at the moment are high-level APIs and meta-schedulers on
> >>>>top of / with DRMAA support.
> >>>>
> >>>>I did some field study to get the picture right. LSF, PBS, SGE,
> >>>>LoadLeveler, SAGA, Globus and GridWay can submit jobs to particular
> >>>>queues. In LoadLeveler, queues are called "classes". Condor, JSDL and
> >>>>OGSA-BES seem to have no queue concept - correct me if I am wrong. The
> >>>>retrieval of the list of queue names is only supported in:
> >>>>
> >>>>LSF: bqueues (http://www.vub.ac.be/BFUCC/LSF/man/bqueues.1.html)
> >>>>PBS: qstat -q (http://linux.die.net/man/1/qstat)
> >>>>SGE: qstat -f
> >>>>LoadLeveler: llclass -l (http://www.ccs.ornl.gov/Cheetah/LL.html#Classes)
> >>>>
> >>>>So if we add the monitoring facility, an empty return value must
> >>>>still be valid.
> >>>>
> >>>>>By the way, you'll also have to give a little thought to reconciling
> >>>>>the 1:1 queue/host model with the 1:n and n:m models, as far as
> >>>>>identifying them in a list goes.
> >>>>
> >>>>This is the true counter argument. If DRMAA monitoring gives no
> >>>>additional hints here, invalid combinations of valid machine / queue
> >>>>names in the job template could occur.
> >>>>Let's wait and see whether any defender of queue list monitoring
> >>>>stands up. Otherwise, I propose to keep only the queue name
> >>>>attribute in the job template.
> >>>
> >>>It's not a hard requirement from our side, but I think it's utterly
> >>>useful.  In general, our end users are complaining more often than
> >>>not if they need to manually retrieve resource details like service
> >>>contact URLs or queue names from some obscure web page, instead of
> >>>being able to retrieve that information programmatically.  The
> >>>manual way is simply too error-prone, tedious and static.
> >>>
> >>>Cheers, Andre.
> >>>
> >>>
> >>>>
> >>>>/Peter.
> >>>>
> >>>>
> >>>>>Daniel
> >>>>>
> >>>>>On 3/23/10 10:27 AM, Peter Tröger wrote:
> >>>>>>>As I said in the email I just wrote, I'm willing to be convinced of
> >>>>>>>the value of adding queues to the job submission side of things.
> >>>>>>>I am, however, fundamentally opposed to adding queues to the
> >>>>>>>monitoring side.
> >>>>>>
> >>>>>>I will heavily insist on queue support in DRMAAv2. This is a
> >>>>>>long-demanded feature, which also popped up again in the survey.
> >>>>>>
> >>>>>>>The various concepts of queues are too different for that to make
> >>>>>>>any sense.  There is absolutely no way we will be able to model
> >>>>>>>both LSF and SGE queues in a way that is abstract enough to be
> >>>>>>>consistent and still specific enough to be meaningful and
> >>>>>>>accurate.  We'll talk on the next call. :)
> >>>>>>
> >>>>>>The intention of the current model is that JobTemplate::queueName
> >>>>>>and MonitoringSession::drmsQueueNames act as counterparts. DRMAA
> >>>>>>would promise that all strings that show up in
> >>>>>>MonitoringSession::drmsQueueNames are valid input for
> >>>>>>JobTemplate::queueName. Nothing more.
> >>>>>>The use cases are DRMAA-based portals and command-line applications.
> >>>>>>The interpretation of what a queue is can be provided by the
> >>>>>>library implementation. In the end, the user has to reason about
> >>>>>>the meaning of queue names anyway.
> >>>>>>
> >>>>>>We could relax the conditions so that other values are also allowed
> >>>>>>in JobTemplate::queueName. This would allow
> >>>>>>MonitoringSession::drmsQueueNames to return nothing in SGE. This
> >>>>>>must be possible anyway - Condor has no queue concept at all.
> >>>>>>
> >>>>>>I could also agree to remove
> >>>>>>MonitoringSession::queueMaxWallclockTime and
> >>>>>>MonitoringSession::queueMaxSlotsAllowed, since these two attributes
> >>>>>>are the ones that demand a particular understanding of what a
> >>>>>>queue is.
> >>>>>>
> >>>>>>Best,
> >>>>>>Peter.
> >>
> >
> >
> >

Nothing is ever easy.
