From Stephen.Pickles at manchester.ac.uk Tue Oct 4 09:04:23 2005
From: Stephen.Pickles at manchester.ac.uk (Stephen M Pickles)
Date: Tue, 4 Oct 2005 15:04:23 +0100
Subject: [graap-wg] Usage Scenarios Document
Message-ID: <20051004150423398.00000001460@Wombat>

The GRAAP Usage Scenarios document
"Usage Scenarios for a Grid Resource Allocation Agreement Protocol"
has never actually been published.

It occurs to me that now might be a good time to
remedy this. Flushing out public comments on the
use cases could be useful while the working group
is considering its next steps. It would also make
it possible to cite the document properly.

Would anyone object if I took steps to get this
document into the GGF Editor pipeline as an informational
document?

Does anyone think it needs another revision first?

Stephen

==================== Stephen M. Pickles ====================

Technical Director, Grid Operations Support Centre
Software Infrastructure Manager, RealityGrid
Manchester Computing
Room G49.1, Kilburn Building
The University of Manchester   tel: +44 161 275 5974
Oxford Road                    fax: +44 161 275 6800
Manchester M13 9PL             stephen.pickles at manchester.ac.uk

From Wolfgang.Ziegler at scai.fraunhofer.de Tue Oct 4 10:17:16 2005
From: Wolfgang.Ziegler at scai.fraunhofer.de (Wolfgang Ziegler)
Date: Tue, 04 Oct 2005 17:17:16 +0200
Subject: [graap-wg] Usage Scenarios Document
In-Reply-To: <20051004150423398.00000001460@Wombat>
References: <20051004150423398.00000001460@Wombat>
Message-ID: <43429CFC.5010504@scai.fraunhofer.de>

Dear Stephen,

Good point; indeed it should be published. We probably don't need
another revision. However, I would like to read it again, which I
could do later today, and then come back to the list with a
better-founded statement.

Best regards

Wolfgang

Stephen M Pickles wrote:
> The GRAAP Usage Scenarios document
> "Usage Scenarios for a Grid Resource Allocation Agreement Protocol"
> has never actually been published.
>
> It occurs to me that now might be a good time to
> remedy this. Flushing out public comments on the
> use cases could be useful while the working group
> is considering its next steps. It would also make
> it possible to cite the document properly.
>
> Would anyone object if I took steps to get this
> document into the GGF Editor pipeline as an informational
> document?
>
> Does anyone think it needs another revision first?
>
> Stephen
>
> ==================== Stephen M. Pickles ====================
>
> Technical Director, Grid Operations Support Centre
> Software Infrastructure Manager, RealityGrid
> Manchester Computing
> Room G49.1, Kilburn Building
> The University of Manchester   tel: +44 161 275 5974
> Oxford Road                    fax: +44 161 275 6800
> Manchester M13 9PL             stephen.pickles at manchester.ac.uk
>

--
Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven, D-53754 Sankt Augustin, Germany
Tel: +49 2241 14 2258
Fax: +49 2241 14 42258
http://www.scai.fraunhofer.de
"Heut ist nicht so kalt wie gestern, trotzdem dass heut kaelter ist"

From maclaren at cct.lsu.edu Fri Oct 7 07:05:38 2005
From: maclaren at cct.lsu.edu (Jon MacLaren)
Date: Fri, 7 Oct 2005 08:05:38 -0400
Subject: [graap-wg] A Highly available, Fault tolerant Co-scheduling System
Message-ID:

For those of you not in Boston this week, I gave an overview of a
Highly available, Fault tolerant Co-scheduling System, which I've
built at LSU. We successfully used this system at the iGrid 2005
meeting to co-schedule 10 compute jobs and 2 Calient DiamondWave
switches.

The system uses the Paxos Commit protocol (Lamport, Gray) to overcome
the problems associated with distributed 2-phase commit.
The co-scheduler, when deployed as 5 processes, has a
mean-time-to-failure of about 12 years.

The software is available for download, but, as I said at the meeting,
I'm still writing up the installation instructions. There is a mailing
list about the work, though. If you go to the web-site for this
stuff, you can sign up and receive updates. The link is:

http://www.cct.lsu.edu/personal/maclaren/CoSched/

Cheers,

Jon MacLaren.

From karlcz at univa.com Sat Oct 8 01:36:22 2005
From: karlcz at univa.com (Karl Czajkowski)
Date: Sat, 8 Oct 2005 13:36:22 +0700
Subject: [graap-wg] A Highly available, Fault tolerant Co-scheduling System
In-Reply-To:
References:
Message-ID: <20051008063622.GB19929@moraine.localdom>

On Oct 07, Jon MacLaren modulated:
...
> The system uses the Paxos Commit protocol (Lamport, Gray) to overcome
> the problems associated with distributed 2-phase commit.

Jon:

As no doubt you'll remember, it has been proposed that advance
reservation is an approach to distributed co-allocation.
Specifically, advance reservation agreements can be seen as the
"prepare" step in the 2PC protocol and the subsequent claiming
agreements can be seen as the "commit" step. As such, we can envision
WS-Agreement being used in the protocol between the 2PC transaction
manager and the resources (as well as between the initiator and the
transaction manager).

Anyway, I read up on Paxos a bit, and as far as I can tell it has
these same underlying mechanisms of prepare/commit at the individual
resources. In essence, it is a way of making distributed transaction
managers as a group consensus on top of the same basic parties: one
who initiates the transaction and N who participate in it. It adds 2F
additional processes in between the initiator and resources to
tolerate F process failures. Having actually studied and implemented
such a system, do you think this is an accurate summary?
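
The fault-tolerance arithmetic behind this summary can be sketched in a
few lines. This is only an illustration, assuming the usual Paxos
Commit arrangement of 2F+1 acceptors that need a majority (F+1) to
make progress; the function names are hypothetical and are not taken
from the LSU implementation.

```python
def majority(n_acceptors: int) -> int:
    """Smallest number of acceptors that forms a majority quorum."""
    return n_acceptors // 2 + 1


def tolerated_failures(n_acceptors: int) -> int:
    """How many acceptors may fail while a majority can still be formed."""
    return n_acceptors - majority(n_acceptors)


def acceptors_needed(f: int) -> int:
    """Acceptors required to tolerate f failures (assumed 2F+1 layout)."""
    return 2 * f + 1


if __name__ == "__main__":
    # The 5-process deployment mentioned in the thread.
    print(majority(5), tolerated_failures(5), acceptors_needed(2))
```

Under these assumptions a 5-acceptor deployment keeps working as long
as any 3 acceptors remain operable.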

Is there anything you can identify that is missing from
WS-Agreement that would allow it to be used at each resource in the
Paxos Commit protocol in the same manner that we have intended it to
be used in the "prepare" and "commit" steps of the 2PC protocol?
E.g. two separate agreements at each resource to represent the two
phases?

My understanding is that there needs to be a way to name the agreement
such that each of the Paxos processes can find the same answer to the
"prepare" step at each resource. Can Paxos elect a "leader" who
initiates the prepare step, e.g. CreateAgreement, so that the others
can just check the result status, e.g. RP query? Or would a truly
idempotent CreateAgreement process be required so that any process can
initiate the prepare step and all will learn the same result using the
same message pattern, regardless of which contacts the resource first?
By the way, I think this latter behavior could be solved at the WS
binding level, using the current WS-Agreement definitions. This
would be a different application of the same idempotent-submit
mechanism we use in WS-GRAM for simple reliability...

Of course, this use of agreements for the phases requires a certain
set of additional assumptions about how deterministic the claim step
is, once a reservation is held; otherwise, the semantics of the
"prepare" step (and the whole transaction) become wishy-washy.
Particularly, if the reservation agreements are constrained in time
(e.g. a typical wall-clock advance reservation scenario), the commit
protocol can be violated because the preparation can expire before the
commit phase is completed (violating the ACID properties). As I
understand it, Paxos can reduce the likelihood of delays due to
transaction manager failure, but arbitrary delay is still a hazard
with realistic messaging models, i.e. Internet-based services, because
of unbounded message delay/loss to the distributed resources that are
being coordinated.

Thoughts?

karl

--
Karl Czajkowski
karlcz at univa.com

From maclaren at cct.lsu.edu Mon Oct 10 10:58:19 2005
From: maclaren at cct.lsu.edu (Jon MacLaren)
Date: Mon, 10 Oct 2005 10:58:19 -0500
Subject: [graap-wg] A Highly available, Fault tolerant Co-scheduling System
In-Reply-To: <20051008063622.GB19929@moraine.localdom>
References: <20051008063622.GB19929@moraine.localdom>
Message-ID: <49415BC6-1249-4EB7-A83F-5C8E6100E627@cct.lsu.edu>

Karl,

Thanks for the email. I was sorry that you weren't at the
presentation.

I've replied to stuff inline below. However, a couple of general
points/observations.

First, what would I gain from using WS-Agreement in the way you
propose? At the moment, we have a nice co-allocation scheme, where the
co-allocators don't need to know anything about the payload of the
message - it can even be encrypted (an important separation, in my
opinion). Also, my scheme currently uses XML over HTTP. It could use
WS-I, if we wanted to add SOAP. But it's just XML messages,
essentially. If WS-Agreement was "merged" with this scheme, I'd then
need to use WS-RF, which is not amenable to everyone. (In any case,
the impression from reading what's below is that using WS-Agreement in
this way feels like a bit of a hack.)

Also, below, you are talking about using WS-Agreement as the protocol
between the entity doing the co-scheduling and the resource managers
(RMs). I'm not envisaging the user doing co-scheduling directly - it's
complex, and I'd rather encapsulate it, as in my implementation. If
you had a service doing this for you, your scheme would need two levels
of WS-Agreement: between the user and co-scheduler, and between the
co-scheduler and RMs. Is this what you imagine, or do you think the
user should take this on directly?

On Oct 8, 2005, at 1:36 AM, Karl Czajkowski wrote:

> On Oct 07, Jon MacLaren modulated:
> ...
>
>> The system uses the Paxos Commit protocol (Lamport, Gray) to overcome
>> the problems associated with distributed 2-phase commit.
>>
>
> Jon:
>
> As no doubt you'll remember, it has been proposed that advance
> reservation is an approach to distributed co-allocation.
> Specifically, advance reservation agreements can be seen as the
> "prepare" step in the 2PC protocol and the subsequent claiming
> agreements can be seen as the "commit" step. As such, we can envision
> WS-Agreement being used in the protocol between the 2PC transaction
> manager and the resources (as well as between the initiator and the
> transaction manager).

I remember. As I pointed out, this hides the nature of what is going
on (co-allocation) from the resource manager (RM). In my
implementation, the RM is aware of the difference between a
prepare->prepared->abort sequence and a normal reserve followed by
cancellation. Hiding this difference, as you propose, has implications
for charging schemes, allocation quotas, etc. - it is, I believe, too
restrictive.

> Anyway, I read up on Paxos a bit, and as far as I can tell it has
> these same underlying mechanisms of prepare/commit at the individual
> resources. In essence, it is a way of making distributed transaction
> managers as a group consensus on top of the same basic parties: one
> who initiates the transaction and N who participate in it. It adds 2F
> additional processes in between the initiator and resources to
> tolerate F process failures. Having actually studied and implemented
> such a system, do you think this is an accurate summary?

Basically that's correct. Only a majority of acceptors have to remain
operable. There are some other nice properties too, though:

1. All messages are idempotent (multiple delivery is OK)
2. Messages do not have to be reliable or arrive in a timely manner

There are other "goodies" too - others looking at this should chase
down the references (see my coscheduling web page).

There is one instance of Paxos Consensus for each RM decision
(prepared/aborted).

> Is there anything you can identify that is missing from
> WS-Agreement that would allow it to be used at each resource in the
> Paxos Commit protocol in the same manner that we have intended it to
> be used in the "prepare" and "commit" steps of the 2PC protocol?
> E.g. two separate agreements at each resource to represent the two
> phases?

Yes. If you look again at the "Consensus on Transaction Commit" paper,
you'll see that the RMs have to report their prepared/aborted decision
to all the acceptors. WS-Agreement does not do this. (It could work
with classic 2-phase commit, with a single transaction manager, but
neither of us is interested in this.)

> My understanding is that there needs to be a way to name the agreement
> such that each of the Paxos processes can find the same answer to the
> "prepare" step at each resource. Can Paxos elect a "leader" who
> initiates the prepare step, e.g. CreateAgreement, so that the others
> can just check the result status, e.g. RP query? Or would a truly
> idempotent CreateAgreement process be required so that any process can
> initiate the prepare step and all will learn the same result using the
> same message pattern, regardless of which contacts the resource first?
> By the way, I think this latter behavior could be solved at the WS
> binding level, using the current WS-Agreement definitions. This
> would be a different application of the same idempotent-submit
> mechanism we use in WS-GRAM for simple reliability...

The initial leader *does* initiate the prepare (see the paper again).
However, you don't know if that message got through. Prepare is not
resent to the RMs. Instead, if the response does not arrive within a
given time, the leader will ask for another ballot in the RM's
instance of Paxos. If the ballot is "free", i.e. no acceptor has seen
a response from the RM, then the leader will propose the value
"aborted" for this round of Paxos.
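
The leader's choice after a timeout can be sketched as below. This is
a minimal illustration of the standard Paxos rule for picking a value
in a new ballot, not code from the LSU implementation; the names
`choose_value` and `Vote`, and the representation of acceptor replies,
are assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# (ballot number, value) previously accepted by an acceptor for this RM's
# Paxos instance; value is "prepared" or "aborted". None means the acceptor
# has accepted nothing yet for this instance.
Vote = Tuple[int, str]


def choose_value(replies: Sequence[Optional[Vote]],
                 n_acceptors: int) -> Optional[str]:
    """Value the leader should propose in a new ballot, or None if it
    has not yet heard from a majority of acceptors."""
    if len(replies) < n_acceptors // 2 + 1:
        return None  # no majority responded; try again with a higher ballot
    accepted = [r for r in replies if r is not None]
    if not accepted:
        # The ballot is "free": no acceptor has seen the RM's decision,
        # so the leader proposes "aborted".
        return "aborted"
    # Otherwise the leader must re-propose the value from the highest
    # ballot reported, so an earlier decision is never overturned.
    return max(accepted, key=lambda vote: vote[0])[1]


if __name__ == "__main__":
    print(choose_value([None, None, None], n_acceptors=5))
    print(choose_value([None, (0, "prepared"), None], n_acceptors=5))
```

Note that the RM is never asked again; the outcome is settled among
the acceptors alone, which is why an unreachable RM ends up "aborted"
rather than blocking the transaction.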

You can't rely on the creation of the RP thing in order to discover
the decision later on. What if the RM is down?

> Of course, this use of agreements for the phases requires a certain
> set of additional assumptions about how deterministic the claim step
> is, once a reservation is held; otherwise, the semantics of the
> "prepare" step (and the whole transaction) become wishy-washy.
> Particularly, if the reservation agreements are constrained in time
> (e.g. a typical wall-clock advance reservation scenario), the commit
> protocol can be violated because the preparation can expire before the
> commit phase is completed (violating the ACID properties). As I
> understand it, Paxos can reduce the likelihood of delays due to
> transaction manager failure, but arbitrary delay is still a hazard
> with realistic messaging models, i.e. Internet-based services, because
> of unbounded message delay/loss to the distributed resources that are
> being coordinated.

This is true. However, Paxos handles message delays/non-arrival by
having subsequent ballots. It recovers automatically from this - it
doesn't just block. So individual messages being delayed is not a
problem. For Paxos not to make progress, you need to engineer a
situation where there is no majority of acceptors still working. What
do you think the chances are of messages being systematically delayed
between a number of processes?

If you crunch the numbers on all these failures (I used an example of
acceptors being inoperable for one hour out of 24 hours), you find that
the likelihood of a 5-acceptor Paxos round blocking is very, very small
(once in a number of years).

That's good enough for me.

Jon.

> Thoughts?
>
> karl
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.ogf.org/pipermail/graap-wg/attachments/20051010/9a72e36f/attachment.html

From karlcz at univa.com Tue Oct 11 00:03:23 2005
From: karlcz at univa.com (Karl Czajkowski)
Date: Tue, 11 Oct 2005 12:03:23 +0700
Subject: [graap-wg] A Highly available, Fault tolerant Co-scheduling System
In-Reply-To: <49415BC6-1249-4EB7-A83F-5C8E6100E627@cct.lsu.edu>
References: <20051008063622.GB19929@moraine.localdom> <49415BC6-1249-4EB7-A83F-5C8E6100E627@cct.lsu.edu>
Message-ID: <20051011050323.GD12255@moraine.localdom>

On Oct 10, Jon MacLaren modulated:
> Karl,
>
> Thanks for the email. I was sorry that you weren't at the
> presentation.
>
> I've replied to stuff inline below. However, a couple of general
> points/observations.
>
> First, what would I gain from using WS-Agreement in the way you
> propose? At the moment, we have a nice co-allocation scheme, where the
> co-allocators don't need to know anything about the payload of the
> message - it can even be encrypted (An important separation, in my
> opinion.) Also, my scheme currently uses XML over HTTP. It could use
> WS-I, if we wanted to add SOAP. But it's just XML messages,
> essentially. If WS-Agreement was "merged" with this scheme, I'd then
> need to use WS-RF, which is not amenable to everyone. (In any case,
> the impression from reading this below, is that to use WS-Agreement in
> this way feels like a bit of a hack.)
>

What you would gain is access to a world full of resources managed by
WS-Agreement!

What the community would gain is precisely the goal of GRAAP: to
improve resource federation in practice by having common/normalized RM
services that are able to support a range of different virtual
organizations and distributed management strategies.

> Also, below, you are talking about using WS-Agreement as the protocol
> between the entity doing the co-scheduling and the resource managers
> (RMs).
> I'm not envisaging the user doing co-scheduling directly - it's
> complex, and I'd rather encapsulate it, as in my implementation. If
> you had a service doing this for you, your scheme would need two levels
> of WS-Agreement, between the user and co-scheduler and the co-scheduler
> and RMs. Is this what you imagine, or do you think the user should
> take this on directly?
>

Yes, I imagine two (or more) levels.

I find it telling that you brought "user" into the picture. The core
thrust of SNAP and all my inputs to GRAAP have been on the
machine-to-machine RM agreement angle. I completely expect to see
WS-Agreement in the layer between brokers and resources. That it
might also serve in the layer between users/clients and the "first"
broker is almost accidental. I say almost, because the ability for
the model to recurse cleanly was always important in its design.

> I remember. As I pointed out, this hides the nature of what is going
> on (co-allocation) from the resource manager (RM). In my
> implementation, the RM is aware of the difference between a prepare->
> prepared->abort sequence and a normal reserve followed by cancellation.
> Hiding this difference, as you propose, has implications on charging
> schemes, allocation quotas, etc. - it is, as I believe, too
> restrictive.
>

I agree it has implications, but I think it is also a prerequisite to
true federation of resources. Just as I make reservations with
multiple providers when I co-allocate my travel itinerary, I think
users (or their agents/services) in a large-scale Grid are going to
have to make agreements with autonomous resource providers. These
resource providers may have no interest (or trust) in each other, but
only in their relationship to the user/consumer.

I would point out though, that the domain-specific semantics of the
agreement could be extended to include contextual information about
the co-allocation goals.
This could be useful for audit,
authorization, or even differentiated pricing if the co-allocation is
being managed by a broker/transaction manager that the resources trust
to do a good job and not cause thrashing! On the other extreme, you
could imagine some RMs who _only_ want to hear from accredited
brokers, because they do not trust end-users to make worthwhile
requests.

> You can't rely on the creation of the RP thing in order to discover the
> decision later on. What if the RM is down?
>

I guess I don't understand Paxos from my quick reading... how is the
RM being down any different than message loss that is supposed to be
tolerated? Any remote process has a tri-state understanding of the
RM's status in the transaction: prepared, not prepared, or unknown.
Right?

I think it would be interesting to see how something like Paxos can be
layered on top of "normal" RM messages instead of deploying
Paxos-specific entities to each resource. It's all just a matter of
syntax, assuming the semantics of preparation and commitment can be
mapped appropriately. I understand there could be significant legwork
to re-validate the formal proofs about Paxos, and no I am not
volunteering. :-)

> This is true. However, Paxos handles message delays/non-arrival by
> having subsequent ballots. It recovers automatically from this - it
> doesn't just block. So individual messages being delayed is not a
> problem. For Paxos not to make progress, you need to engineer a
> situation where there is no majority of acceptors still working. What
> do you think the chances are of messages being systematically delayed
> between a number of processes?
>

It isn't a concern of blocking that I raised. It is that
co-allocation of real resources, e.g. simultaneous use of computers,
is based in wall-clock time and the abstract ACID transactions only
hold water as long as the commit phase completes before the actual
wall-clock time when the co-allocation is meant to commence.
You cannot "fix it up" with persistent logs after the fact, if the
time actually elapsed and the resources were not operating in the
allocated mode. (Conversely, you cannot do speculative allocation and
then undo in the event of a transaction abort. There is an
opportunity cost either way.)

This is the core concern I have been raising for years now about the
inherent hazard of distributed resource management. I keep raising it
because I think people focus on the wrong kinds of "reliability" and
"correctness" metrics when talking about things like co-scheduling of
distributed computations and data-paths. I worry that we're somehow
talking past each other if you still think I am just talking about
"progress" in the abstract transactional sense. Failed transactions
that lead to idle resources are a potential livelock hazard for the
resource operators, no matter how elegantly the consensus problem is
phrased to suggest it completed. :-)

One solution to this problem is markets: specifically by having cost
models for reservation and cancellation, market forces can push the
risks out to the coordinators who are trying to make risky
transactions. This gives them incentive to act as efficiently as they
know how.

> If you crunch the numbers on all these failures (I used an example of
> acceptors being inoperable for one hour out of 24 hours), you find that
> the likelihood of a 5-acceptor Paxos round blocking is very, very small
> (once in a number of years).
>
> That's good enough for me.
>
> Jon.
>

I have my doubts about the failure model, e.g. pairwise messaging
failures between an RM and an acceptor are not really independent in
the Internet, since quite likely a large number of acceptors are using
the same network links to talk to a particular RM. Partitioning can
be unkind.
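
The effect of the independence assumption can be illustrated with a
small calculation. This is a sketch only: it computes the probability
that, at a given instant, no majority of acceptors is reachable, which
is not the same quantity as the blocking frequency or the
mean-time-to-failure quoted in the thread, and it does not attempt to
reproduce those figures. The shared-link probability `q` and both
function names are hypothetical, introduced purely for the example.

```python
from math import comb


def p_no_majority(n: int, p_down: float) -> float:
    """Probability that fewer than a majority of n acceptors are up,
    assuming each is down independently with probability p_down."""
    max_down_ok = n - (n // 2 + 1)  # failures a majority quorum survives
    return sum(comb(n, k) * p_down**k * (1 - p_down)**(n - k)
               for k in range(max_down_ok + 1, n + 1))


def p_no_majority_shared_link(n: int, p_down: float, q: float) -> float:
    """Same, but with an assumed shared network link that fails with
    probability q and cuts off every acceptor at once."""
    return q + (1 - q) * p_no_majority(n, p_down)


if __name__ == "__main__":
    # One hour out of 24 inoperable, 5 acceptors, as in the example above.
    print(p_no_majority(5, 1 / 24))
    # Hypothetical 1% chance that a common link partitions all acceptors.
    print(p_no_majority_shared_link(5, 1 / 24, 0.01))
```

Under these assumptions the shared-link term dominates as soon as `q`
exceeds the independent figure, so adding acceptors behind the same
link helps much less than the independent model suggests.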

However, I am not really trying to debate the pros or cons of Paxos
per se, but to understand how we can get to a world where standard,
normalized, and interoperable RM services can be deployed and shared
by different brokers, VOs, and coordination strategies. I think the
architecture should be very agnostic and policy-free so that different
policies and "markets" can evolve. Your pursuit of this other
coordination strategy makes you an interesting candidate to talk to
about WS-Agreement mechanisms... it isn't so interesting to preach to
the choir.

I think the future of Grid computing is in the human policies and
federating models, and not in the plumbing. The plumbing just needs to
be there and be well behaved, without obstructing the kinds of
experimental and production policies that organizations wish to
deploy.

karl

--
Karl Czajkowski
karlcz at univa.com

From jim_pruyne at hp.com Wed Oct 19 01:36:28 2005
From: jim_pruyne at hp.com (Jim Pruyne)
Date: Wed, 19 Oct 2005 01:36:28 -0500
Subject: [graap-wg] reminder: telecon on 10/19
Message-ID: <4355E96C.70805@hp.com>

All,

We will have a telecon. on Wed. morning/evening at our usual time.
Dial-in numbers will be the same. The time is:
    10:00AM Central Time US (GMT-0500)
which should be:
    11:00AM Eastern Time US
    1600 UK
    1700 Germany
    midnight Japan
    2200 Thailand

Phone Number: 866-673-8466 in the US. 702-477-6031 for those outside
the US. Conference code #8578310.

--- Jim

From Wolfgang.Ziegler at scai.fraunhofer.de Wed Oct 19 03:32:36 2005
From: Wolfgang.Ziegler at scai.fraunhofer.de (Wolfgang Ziegler)
Date: Wed, 19 Oct 2005 10:32:36 +0200
Subject: [graap-wg] reminder: telecon on 10/19
In-Reply-To: <4355E96C.70805@hp.com>
References: <4355E96C.70805@hp.com>
Message-ID: <435604A4.9070108@scai.fraunhofer.de>

Dear all,

please find attached the draft of the GGF15 session notes taken by
Philipp Wieder. Please send comments and modifications to Philipp
and me by end of this week.
We will then finalise the minutes and
upload them to the gridforge space.

Best regards

Wolfgang

Jim Pruyne wrote:
> All,
>
> We will have a telecon. on Wed. morning/evening at our usual time.
> Dial-in numbers will be the same. The time is:
>     10:00AM Central Time US (GMT-0500)
> which should be:
>     11:00AM Eastern Time US
>     1600 UK
>     1700 Germany
>     midnight Japan
>     2200 Thailand
>
> Phone Number: 866-673-8466 in the US. 702-477-6031 for those outside
> the US. Conference code #8578310.
>
> --- Jim
>

--
Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven, D-53754 Sankt Augustin, Germany
Tel: +49 2241 14 2258
Fax: +49 2241 14 42258
http://www.scai.fraunhofer.de
"Heut ist nicht so kalt wie gestern, trotzdem dass heut kaelter ist"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GRAAP-GGF15-minutes.doc
Type: application/msword
Size: 56832 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/graap-wg/attachments/20051019/41127d8d/attachment.doc

From jim_pruyne at hp.com Wed Oct 19 10:40:58 2005
From: jim_pruyne at hp.com (Jim Pruyne)
Date: Wed, 19 Oct 2005 10:40:58 -0500
Subject: [graap-wg] [Fwd: [ogsa-wg] UML Diagram for today's call]
Message-ID: <4356690A.3090701@hp.com>

I think that we should work with the OGSA-EMS team on the inclusion of
WS-Agreement at the level shown in this sequence diagram. Comments?

--- Jim

-------- Original Message --------
Subject: [ogsa-wg] UML Diagram for today's call
Date: Wed, 19 Oct 2005 10:27:27 +0100
From: Steven Newhouse
To: ogsa-wg

Following on from the GGF15 F2F and last week's call, here is a UML
diagram (illustrating stage 2.3 of the EMS Composition Roadmap) for
the next call.
Send comments to the list, or save them for the call if you will be
there!

Steven
--
----------------------------------------------------------------
Dr Steven Newhouse                    Tel:+44 (0)2380 598789
Deputy Director, Open Middleware Infrastructure Institute (OMII)
Suite 6005, Faraday Building (B21), Highfield Campus,
Southampton University, Highfield, Southampton, SO17 1BJ, UK
-------------- next part --------------
A non-text attachment was scrubbed...
Name: EMS-Stage3.pdf
Type: application/pdf
Size: 14830 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/graap-wg/attachments/20051019/18f111f6/attachment.pdf

From nakata at mtg.biglobe.ne.jp Wed Oct 19 11:06:58 2005
From: nakata at mtg.biglobe.ne.jp (Toshiyuki Nakata)
Date: Thu, 20 Oct 2005 01:06:58 +0900
Subject: [graap-wg] reminder: telecon on 10/19
In-Reply-To: <435604A4.9070108@scai.fraunhofer.de>
References: <4355E96C.70805@hp.com> <435604A4.9070108@scai.fraunhofer.de>
Message-ID: <43566F22.5060407@mtg.biglobe.ne.jp>

Hi:

Thank you very much for the minutes.

\Quote
Unpublished GRAAP documents (Wolfgang Ziegler)
- The "Usage Scenarios for a Grid Resource Allocation Agreement
  Protocol" document has not been published yet, it is still in the
  GGF7 state. The suggestion is to revise it and submit it as an
  Informal Document to the GGF editor. This will be done by Toshiyuki
  Nakata.
\End Quote

I mentioned that I would be interested in looking at the doc, but I
think the submission should be done by the original authors.

Best Regards
Toshi

Wolfgang Ziegler wrote:
> Dear all,
>
> please find attached the draft of the GGF15 session notes taken by
> Philipp Wieder. Please send comments and modifications to Philipp
> and me by end of this week. We will then finalise the minutes and
> upload them to the gridforge space.
>
> Best regards
>
> Wolfgang
>
>
>
> Jim Pruyne wrote:
>
>> All,
>>
>> We will have a telecon. on Wed. morning/evening at our usual time.
>> Dial-in numbers will be the same.
>> The time is:
>>     10:00AM Central Time US (GMT-0500)
>> which should be:
>>     11:00AM Eastern Time US
>>     1600 UK
>>     1700 Germany
>>     midnight Japan
>>     2200 Thailand
>>
>> Phone Number: 866-673-8466 in the US. 702-477-6031 for those outside
>> the US. Conference code #8578310.
>>
>> --- Jim
>>
>

--
Toshiyuki Nakata
t-nakata at cw.jp.nec.com
+81-44-431-7653 (NEC Internal 8-22-60210)

From maclaren at cct.lsu.edu Mon Oct 24 09:56:33 2005
From: maclaren at cct.lsu.edu (Jon MacLaren)
Date: Mon, 24 Oct 2005 09:56:33 -0500
Subject: [graap-wg] A Highly available, Fault tolerant Co-scheduling System
In-Reply-To: <20051011050323.GD12255@moraine.localdom>
References: <20051008063622.GB19929@moraine.localdom> <49415BC6-1249-4EB7-A83F-5C8E6100E627@cct.lsu.edu> <20051011050323.GD12255@moraine.localdom>
Message-ID:

Hi Karl,

I guess that ultimately, I don't have a lot of enthusiasm for the
combination you propose in the message. The messaging in Paxos has
been well thought out. I don't think layering it on top of the
WS-Agreement call-response pattern would work well (at least it would
not work as well for failure cases - when everything is working it'd
be fine). Also, it's more messages than are necessary.

The devil, as always, is in the details. Indeed, it's when you
consider the messaging patterns that you see the problems with what
you proposed before about using WS-Agreement between the Acceptors and
RMs. Although you wouldn't make Paxos inconsistent (this is
impossible), you would raise the chances of the transaction being
needlessly aborted in certain failure modes.

As for mentioning the user, I make no apology for this. I don't think
that it's "telling" at all. The protocol we proposed can do
machine-to-machine stuff. I know this, because I have written clients
for the protocol, and it's very easy. It should be clear from the
slides I presented that this is the case.
(There is certainly nothing as complicated as the Template stuff in
WS-Agreement, where a client has to download a template, understand
the format and then construct a document based upon that.)

Cheers,

Jon.

On Oct 11, 2005, at 12:03 AM, Karl Czajkowski wrote:

> On Oct 10, Jon MacLaren modulated:
>
>> Karl,
>>
>> Thanks for the email. I was sorry that you weren't at the
>> presentation.
>>
>> I've replied to stuff inline below. However, a couple of general
>> points/observations.
>>
>> First, what would I gain from using WS-Agreement in the way you
>> propose? At the moment, we have a nice co-allocation scheme, where the
>> co-allocators don't need to know anything about the payload of the
>> message - it can even be encrypted (An important separation, in my
>> opinion.) Also, my scheme currently uses XML over HTTP. It could use
>> WS-I, if we wanted to add SOAP. But it's just XML messages,
>> essentially. If WS-Agreement was "merged" with this scheme, I'd then
>> need to use WS-RF, which is not amenable to everyone. (In any case,
>> the impression from reading this below, is that to use WS-Agreement in
>> this way feels like a bit of a hack.)
>>
>
> What you would gain is access to a world full of resources managed by
> WS-Agreement!
>
> What the community would gain is precisely the goal of GRAAP: to
> improve resource federation in practice by having common/normalized RM
> services that are able to support a range of different virtual
> organizations and distributed management strategies.
>
>> Also, below, you are talking about using WS-Agreement as the protocol
>> between the entity doing the co-scheduling and the resource managers
>> (RMs). I'm not envisaging the user doing co-scheduling directly - it's
>> complex, and I'd rather encapsulate it, as in my implementation.
>> If you had a service doing this for you, your scheme would need two
>> levels of WS-Agreement, between the user and co-scheduler and the
>> co-scheduler and RMs. Is this what you imagine, or do you think the
>> user should take this on directly?
>>
>
> Yes, I imagine two (or more) levels.
>
> I find it telling that you brought "user" into the picture. The core
> thrust of SNAP and all my inputs to GRAAP have been on the
> machine-to-machine RM agreement angle. I completely expect to see
> WS-Agreement in the layer between brokers and resources. That it
> might also serve in the layer between users/clients and the "first"
> broker is almost accidental. I say almost, because the ability for
> the model to recurse cleanly was always important in its design.
>
>> I remember. As I pointed out, this hides the nature of what is going
>> on (co-allocation) from the resource manager (RM). In my
>> implementation, the RM is aware of the difference between a
>> prepare->prepared->abort sequence and a normal reserve followed by
>> cancellation. Hiding this difference, as you propose, has
>> implications on charging schemes, allocation quotas, etc. - it is,
>> as I believe, too restrictive.
>>
>
> I agree it has implications, but I think it is also a prerequisite to
> true federation of resources. Just as I make reservations with
> multiple providers when I co-allocate my travel itinerary, I think
> users (or their agents/services) in a large-scale Grid are going to
> have to make agreements with autonomous resource providers. These
> resource providers may have no interest (or trust) in each other, but
> only in their relationship to the user/consumer.
>
> I would point out though, that the domain-specific semantics of the
> agreement could be extended to include contextual information about
> the co-allocation goals.
> This could be useful for audit,
> authorization, or even differentiated pricing if the co-allocation is
> being managed by a broker/transaction manager that the resources trust
> to do a good job and not cause thrashing! On the other extreme, you
> could imagine some RMs who _only_ want to hear from accredited
> brokers, because they do not trust end-users to make worthwhile
> requests.
>
>> You can't rely on the creation of the RP thing in order to discover
>> the decision later on. What if the RM is down?
>>
>
> I guess I don't understand Paxos from my quick reading... how is the
> RM being down any different than message loss that is supposed to be
> tolerated? Any remote process has a tri-state understanding of the
> RM's status in the transaction: prepared, not prepared, or unknown.
> Right?
>
> I think it would be interesting to see how something like Paxos can be
> layered on top of "normal" RM messages instead of deploying
> Paxos-specific entities to each resource. It's all just a matter of
> syntax, assuming the semantics of preparation and commitment can be
> mapped appropriately. I understand there could be significant legwork
> to re-validate the formal proofs about Paxos, and no I am not
> volunteering. :-)
>
>> This is true. However, Paxos handles message delays/non-arrival by
>> having subsequent ballots. It recovers automatically from this - it
>> doesn't just block. So individual messages being delayed is not a
>> problem. For Paxos not to make progress, you need to engineer a
>> situation where there is no majority of acceptors still working.
>> What do you think the chances are of messages being systematically
>> delayed between a number of processes?
>>
>
> It isn't a concern of blocking that I raised. It is that
> co-allocation of real resources, e.g.
> simultaneous use of computers,
> is based in wall-clock time and the abstract ACID transactions only
> hold water as long as the commit phase completes before the actual
> wall-clock time when the co-allocation is meant to commence. You
> cannot "fix it up" with persistent logs after the fact, if the time
> actually elapsed and the resources were not operating in the allocated
> mode. (Conversely, you cannot do speculative allocation and then undo
> in the event of a transaction abort. There is an opportunity cost
> either way.)
>
> This is the core concern I have been raising for years now about the
> inherent hazard of distributed resource management. I keep raising it
> because I think people focus on the wrong kinds of "reliability" and
> "correctness" metrics when talking about things like co-scheduling of
> distributed computations and data-paths. I worry that we're somehow
> talking past each other if you still think I am just talking about
> "progress" in the abstract transactional sense. Failed transactions
> that lead to idle resources are a potential livelock hazard for the
> resource operators, no matter how elegantly the consensus problem is
> phrased to suggest it completed. :-)
>
> One solution to this problem is markets: specifically by having cost
> models for reservation and cancellation, market forces can push the
> risks out to the coordinators who are trying to make risky
> transactions. This gives them incentive to act as efficiently as they
> know how.
>
>> If you crunch the numbers on all these failures (I used an example of
>> acceptors being inoperable for one hour out of 24 hours), you find
>> that the likelihood of a 5-acceptor Paxos round blocking is very,
>> very small (once in a number of years).
>>
>> That's good enough for me.
>>
>> Jon.
>>
>
> I have my doubts about the failure model, e.g.
> pairwise messaging
> failures between an RM and an acceptor are not really independent in
> the Internet, since quite likely a large number of acceptors are using
> the same network links to talk to a particular RM. Partitioning can
> be unkind.
>
> However, I am not really trying to debate the pros or cons of Paxos
> per se, but to understand how we can get to a world where standard,
> normalized, and interoperable RM services can be deployed and shared
> by different brokers, VOs, and coordination strategies. I think the
> architecture should be very agnostic and policy-free so that different
> policies and "markets" can evolve. Your pursuit of this other
> coordination strategy makes you an interesting candidate to talk to
> about WS-Agreement mechanisms... it isn't so interesting to preach to
> the choir.
>
> I think the future of Grid computing is in the human policies and
> federating models, and not in the plumbing. The plumbing just needs
> to be there and be well behaved, without obstructing the kinds of
> experimental and production policies that organizations wish to
> deploy.
>
>
> karl
>
> --
> Karl Czajkowski
> karlcz at univa.com
>

From jim_pruyne at hp.com Wed Oct 26 01:54:21 2005
From: jim_pruyne at hp.com (Jim Pruyne)
Date: Wed, 26 Oct 2005 01:54:21 -0500
Subject: [graap-wg] reminder: telecon on 10/26
Message-ID: <435F281D.5000101@hp.com>

All,

We will have a telecon on Wed. morning/evening at our usual time.
Dial-in numbers will be the same.

The time is:
  10:00AM Central Time US (GMT-0500)
which should be:
  11:00AM Eastern Time US
  1600 UK
  1700 Germany
  midnight Japan
  2200 Thailand

Phone Number: 866-673-8466 in the US. 702-477-6031 for those outside
the US. Conference code #8578310.
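
The conversions in the list above can be checked with a short script.
This is only a sketch: the mapping from the list's labels to IANA zone
names (for example "Central Time US" to America/Chicago) is an
assumption, and the names ZONES, local_times and utc_offset_hours are
made up for illustration. It treats 10:00 Central on 26 October 2005 as
daylight time, which is what the listed UK and German times imply, and
it can also show where the same slot lands once US daylight saving time
has ended.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Assumed mapping from the labels used in the reminder to IANA zones.
ZONES = {
    "Central US": "America/Chicago",
    "Eastern US": "America/New_York",
    "UK": "Europe/London",
    "Germany": "Europe/Berlin",
    "Japan": "Asia/Tokyo",
    "Thailand": "Asia/Bangkok",
}


def local_times(year, month, day, hour, anchor="America/Chicago"):
    """Return the local wall-clock time in each zone for a meeting that
    starts at `hour`:00 in the anchor zone on the given date."""
    start = datetime(year, month, day, hour, 0, tzinfo=ZoneInfo(anchor))
    return {
        label: start.astimezone(ZoneInfo(name)).strftime("%Y-%m-%d %H:%M")
        for label, name in ZONES.items()
    }


def utc_offset_hours(year, month, day, hour, zone="America/Chicago"):
    """Return the zone's offset from UTC, in hours, at that local time."""
    moment = datetime(year, month, day, hour, 0, tzinfo=ZoneInfo(zone))
    return moment.utcoffset().total_seconds() / 3600


if __name__ == "__main__":
    # The 26 October 2005 telecon, then the same slot one week later.
    for when in [(2005, 10, 26, 10), (2005, 11, 2, 10), (2005, 11, 2, 9)]:
        print(when, "Central offset:", utc_offset_hours(*when))
        for label, stamp in local_times(*when).items():
            print("   ", label, stamp)
```

Anchoring the meeting to a UTC instant instead of to Central time would
be another way to keep the Japanese start time fixed across the
daylight saving change.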
--- Jim

From jim_pruyne at hp.com Wed Oct 26 10:06:23 2005
From: jim_pruyne at hp.com (Jim Pruyne)
Date: Wed, 26 Oct 2005 10:06:23 -0500
Subject: [graap-wg] Tracker for public comments
Message-ID: <435F9B6F.1060108@hp.com>

All,

Now that we are open for public comments, there is a tracker set-up for
them as before. The URL for the tracker is:

https://forge.gridforum.org/forum/forum.php?forum_id=577

--- Jim

From jim_pruyne at hp.com Wed Oct 26 10:39:06 2005
From: jim_pruyne at hp.com (Jim Pruyne)
Date: Wed, 26 Oct 2005 10:39:06 -0500
Subject: [graap-wg] Minutes from Oct. 26 telecon
Message-ID: <435FA31A.1070104@hp.com>

Minutes from October 26 '05 Telecon

Attendees
---------

Jim Pruyne
Toshi Nakata
Heiko Ludwig
Asit Dan

Discussion
----------

- Let's move the meeting time one hour earlier because of daylight
  savings time in the US and EU. We'll do that starting next week.

- The URL for the public comments has been sent out.

- Toshi to group the large comment list from Stephen Pickles into the
  grammar ones vs. the ones that require discussion.

- Managing the responses could be handled a little more formally, so
  we should be more clear on exactly what to be posted in a reply on
  the public comment tracker.

- There was some concern at GGF that people could not tell how we were
  making decisions. Heiko proposes that we make our process clear in
  an e-mail to the mailing list. Jim will do this.

From Stephen.Pickles at manchester.ac.uk Wed Oct 26 11:11:46 2005
From: Stephen.Pickles at manchester.ac.uk (Stephen M Pickles)
Date: Wed, 26 Oct 2005 17:11:46 +0100
Subject: [graap-wg] Minutes from Oct. 26 telecon
In-Reply-To: <435FA31A.1070104@hp.com>
Message-ID: <20051026171146266.00000003484@Wombat>

All,

1.) In case it helps Toshi to complete his action,
I've attached a word document containing the comments I
posted to the WS-Agreement public comment tracker,
and I've indicated the items that _I_ think might require
discussion.

2.)
In what timezone(s) will the meetings be one hour
earlier? I ask because in today's telcon of GGF compute area
chairs (which started at the same time as the GRAAP telcon),
we thought it would be a good idea in future to hold the compute
area telcon (which will be held on the last Wednesday of each
month) half an hour earlier to avoid clashing with the GRAAP
telcon. So if we do that, will we just be ensuring that we
clash again next month?

Best wishes,

Stephen

> -----Original Message-----
> From: owner-graap-wg at ggf.org [mailto:owner-graap-wg at ggf.org]
> On Behalf Of Jim Pruyne
> Sent: 26 October 2005 16:39
> To: GRAAP-WG
> Subject: [graap-wg] Minutes from Oct. 26 telecon
>
> Minutes from October 26 '05 Telecon
>
> Attendees
> ---------
>
> Jim Pruyne
> Toshi Nakata
> Heiko Ludwig
> Asit Dan
>
> Discussion
> ----------
>
> - Let's move the meeting time one hour earlier because of daylight
>   savings time in the US and EU. We'll do that starting next week.
>
> - The URL for the public comments has been sent out.
>
> - Toshi to group the large comment list from Stephen Pickles into the
>   grammar ones vs. the ones that require discussion.
>
> - Managing the responses could be handled a little more formally, so
>   we should be more clear on exactly what to be posted in a reply on
>   the public comment tracker.
>
> - There was some concern at GGF that people could not tell how we were
>   making decisions. Heiko proposes that we make our process clear in
>   an e-mail to the mailing list. Jim will do this.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: WS-Agreement-Comments-SP2.doc
Type: application/msword
Size: 64512 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/graap-wg/attachments/20051026/c5e9afa3/attachment.doc

From jim_pruyne at hp.com Wed Oct 26 11:38:21 2005
From: jim_pruyne at hp.com (Jim Pruyne)
Date: Wed, 26 Oct 2005 11:38:21 -0500
Subject: [graap-wg] Minutes from Oct. 26 telecon
In-Reply-To: <20051026171146266.00000003484@Wombat>
References: <20051026171146266.00000003484@Wombat>
Message-ID: <435FB0FD.6070807@hp.com>

Stephen M Pickles wrote:

>All,
>
>1.) In case it helps Toshi to complete his action,
>I've attached a word document containing the comments I
>posted to the WS-Agreement public comment tracker,
>and I've indicated the items that _I_ think might require
>discussion.
>

Thanks, Stephen. I also cut-and-pasted the comments into a top-level
response in the public comment tracker.

>2.) In what timezone(s) will the meetings be one hour
>earlier? I ask because in today's telcon of GGF compute area
>chairs (which started at the same time as the GRAAP telcon),
>we thought it would be a good idea in future to hold the compute
>area telcon (which will be held on the last Wednesday of each
>month) half an hour earlier to avoid clashing with the GRAAP
>telcon. So if we do that, will we just be ensuring that we
>clash again next month?
>

It will be one hour earlier in the US, and this is due to the daylight
savings time change here. It will be the same time relative to GMT.

--- Jim

>Best wishes,
>
>Stephen
>
>>-----Original Message-----
>>From: owner-graap-wg at ggf.org [mailto:owner-graap-wg at ggf.org]
>>On Behalf Of Jim Pruyne
>>Sent: 26 October 2005 16:39
>>To: GRAAP-WG
>>Subject: [graap-wg] Minutes from Oct. 26 telecon
>>
>> Minutes from October 26 '05 Telecon
>>
>>Attendees
>>---------
>>
>>Jim Pruyne
>>Toshi Nakata
>>Heiko Ludwig
>>Asit Dan
>>
>>Discussion
>>----------
>>
>>- Let's move the meeting time one hour earlier because of daylight
>>  savings time in the US and EU. We'll do that starting next week.
>>
>>- The URL for the public comments has been sent out.
>>
>>- Toshi to group the large comment list from Stephen Pickles into the
>>  grammar ones vs. the ones that require discussion.
>>
>>- Managing the responses could be handled a little more formally, so
>>  we should be more clear on exactly what to be posted in a reply on
>>  the public comment tracker.
>>
>>- There was some concern at GGF that people could not tell how we were
>>  making decisions. Heiko proposes that we make our process clear in
>>  an e-mail to the mailing list. Jim will do this.
>>

From nakata at mtg.biglobe.ne.jp Wed Oct 26 18:30:56 2005
From: nakata at mtg.biglobe.ne.jp (Toshiyuki Nakata)
Date: Thu, 27 Oct 2005 08:30:56 +0900
Subject: [graap-wg] Minutes from Oct. 26 telecon
In-Reply-To: <20051026171146266.00000003484@Wombat>
References: <20051026171146266.00000003484@Wombat>
Message-ID: <436011B0.6090801@mtg.biglobe.ne.jp>

Hi:

Stephen M Pickles wrote:
> All,
>
> 1.) In case it helps Toshi to complete his action,
> I've attached a word document containing the comments I
> posted to the WS-Agreement public comment tracker,
> and I've indicated the items that _I_ think might require
> discussion.
>

Thank you very much.

> 2.) In what timezone(s) will the meetings be one hour
> earlier? I ask because in today's telcon of GGF compute area
> chairs (which started at the same time as the GRAAP telcon),
> we thought it would be a good idea in future to hold the compute
> area telcon (which will be held on the last Wednesday of each
> month) half an hour earlier to avoid clashing with the GRAAP
> telcon. So if we do that, will we just be ensuring that we
> clash again next month?
>

The reason is that daylight saving time ends in the US and Europe.
Since Japan is too near the equator and does not have daylight saving
time, if the time is not moved one hour earlier for the US and Europe,
the starting time in JST (GMT+9) would be one hour later. (That means
the telecon would start at 1 AM in Japan, which would be a bit tough
for people in Japan.)
So, one option, which the other participants kindly agreed to, was to
shift the meeting one hour earlier in the US and Europe after daylight
saving time ends, so that at least for people in Japan it would still
start at midnight.

Best Regards
Toshi

> Best wishes,
>
> Stephen
>
>>-----Original Message-----
>>From: owner-graap-wg at ggf.org [mailto:owner-graap-wg at ggf.org]
>>On Behalf Of Jim Pruyne
>>Sent: 26 October 2005 16:39
>>To: GRAAP-WG
>>Subject: [graap-wg] Minutes from Oct. 26 telecon
>>
>> Minutes from October 26 '05 Telecon
>>
>>Attendees
>>---------
>>
>>Jim Pruyne
>>Toshi Nakata
>>Heiko Ludwig
>>Asit Dan
>>
>>Discussion
>>----------
>>
>>- Let's move the meeting time one hour earlier because of daylight
>>  savings time in the US and EU. We'll do that starting next week.
>>
>>- The URL for the public comments has been sent out.
>>
>>- Toshi to group the large comment list from Stephen Pickles into the
>>  grammar ones vs. the ones that require discussion.
>>
>>- Managing the responses could be handled a little more formally, so
>>  we should be more clear on exactly what to be posted in a reply on
>>  the public comment tracker.
>>
>>- There was some concern at GGF that people could not tell how we were
>>  making decisions. Heiko proposes that we make our process clear in
>>  an e-mail to the mailing list. Jim will do this.
>>
>

--
Toshiyuki Nakata
t-nakata at cw.jp.nec.com
+81-44-431-7653 (NEC Internal 8-22-60210)

From t-nakata at cw.jp.nec.com Sat Oct 29 02:09:08 2005
From: t-nakata at cw.jp.nec.com (Toshiyuki Nakata)
Date: Sat, 29 Oct 2005 16:09:08 +0900
Subject: [graap-wg] Minutes from Oct. 26 telecon
In-Reply-To: <20051026171146266.00000003484@Wombat>
References: <20051026171146266.00000003484@Wombat>
Message-ID: <43632014.3070904@cw.jp.nec.com>

Hi:

Thanks to Stephen's word document (which I mostly followed, although I
moved a few more items into the Discussion group), I've categorised
Stephen's comments into 39 grammatical ones and 19 discussion ones.
Hats off to Stephen for really detailed checking. (It must have taken
time to record them as comments with page numbers, line numbers, etc.)

The attached excel book has two sheets: one for discussion and one for
grammatical / expression-wise corrections.

I've also changed the Public Comment format somewhat.
1) Finally found out how to use a hyperlink in excel so that the page
   can be referred to by just clicking on the title.
2) Decided to include all the contents in the excel sheet rather than
   requiring readers to follow the link. This makes the table larger
   but should be quicker to refer to. (Makes 1) somewhat unnecessary.)
3) Will be distributing them in excel rather than in html format, as
   you can browse more quickly with excel than with html.
If anyone has problems, please let me know.

I've also edited the draft document to reflect Stephen's "grammatical"
corrections (the ones I marked as tentatively done in the excel sheet).
I will send the tentatively revised version to the co-chairs (and to
Stephen) so that the co-chairs can put it in Grid-Forge as a
tentative-correction draft that people can look into. (The numbers
there are Stephen's and not my excel-based numbers.) I would suggest
the authors look into it, see if they agree with the corrections, and
point out by email the ones they'd like to see handled otherwise,
rather than discuss them on the telecon.

One burning question:
"37. Page 27, Section 4.2.6.3.1 Importance, first paragraph, second
line: Remove spurious ';'."
=> Couldn't find it. Perhaps you mean the comma after "therefore"?

Best Regards
Toshi

PS I had intended to put some of my nit-picking on the list as well,
but I should probably send it to the Public Comment tracker first?

PPS I'll never try to participate in a teleconference with a cold and
an unclear head again :-)

Stephen M Pickles wrote:

>All,
>
>1.) In case it helps Toshi to complete his action,
>I've attached a word document containing the comments I
>posted to the WS-Agreement public comment tracker,
>and I've indicated the items that _I_ think might require
>discussion.
>
>2.) In what timezone(s) will the meetings be one hour
>earlier? I ask because in today's telcon of GGF compute area
>chairs (which started at the same time as the GRAAP telcon),
>we thought it would be a good idea in future to hold the compute
>area telcon (which will be held on the last Wednesday of each
>month) half an hour earlier to avoid clashing with the GRAAP
>telcon. So if we do that, will we just be ensuring that we
>clash again next month?
>
>Best wishes,
>
>Stephen
>
>>-----Original Message-----
>>From: owner-graap-wg at ggf.org [mailto:owner-graap-wg at ggf.org]
>>On Behalf Of Jim Pruyne
>>Sent: 26 October 2005 16:39
>>To: GRAAP-WG
>>Subject: [graap-wg] Minutes from Oct. 26 telecon
>>
>> Minutes from October 26 '05 Telecon
>>
>>Attendees
>>---------
>>
>>Jim Pruyne
>>Toshi Nakata
>>Heiko Ludwig
>>Asit Dan
>>
>>Discussion
>>----------
>>
>>- Let's move the meeting time one hour earlier because of daylight
>>  savings time in the US and EU. We'll do that starting next week.
>>
>>- The URL for the public comments has been sent out.
>>
>>- Toshi to group the large comment list from Stephen Pickles into the
>>  grammar ones vs. the ones that require discussion.
>>
>>- Managing the responses could be handled a little more formally, so
>>  we should be more clear on exactly what to be posted in a reply on
>>  the public comment tracker.
>>
>>- There was some concern at GGF that people could not tell how we were
>>  making decisions.
>>  Heiko proposes that we make our process clear in
>>  an e-mail to the mailing list. Jim will do this.
>>
>

--
Toshiyuki Nakata
Chief Engineer, Central Research Lab. NEC
1753, Shimonumabe, Nakahara-Ku, Kawasaki, Kanagawa 211-8666, Japan
Tel +81-44-431-7653 (NEC Internal 22-60035)
Fax +81-44-431-7609 (NEC Internal 22-60509)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PublicComments1029.xls
Type: application/vnd.ms-excel
Size: 59904 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/graap-wg/attachments/20051029/ec2262b1/attachment.xls