[Nsi-wg] error management issues
imonga at es.net
Wed May 5 09:50:13 CDT 2010
>> 1. I don't think maintaining state is hard - every point to point protocol has state on either side whether it be PPP, IPsec/IKE, TCP etc. This is not just ATM.
> The problem with state here is that it has to be maintained for all connection reservations, by all NSAs involved in making a reservation. The set of NSAs involved in a reservation for a particular connection may be different depending on who made the reservation. If an NSA goes down, the state of the connection as a whole becomes indeterminate. Or so it seems to me.
What we have suggested in the error document are the following assertions:
1. A failure in the service plane (i.e., an NSA) should not affect connections that are provisioned and active in the transport plane.
2. An NSA recovering from a failed condition cannot depend solely on its peer NSAs to reconstruct its state.
3. An NSA must be able to recover the state of the local transport plane from its NRM(s).
We feel that, with these assertions, the state machine and its recovery during service plane errors should not affect the connection in the transport plane, and the steps in the document provide a way to reach a determinate state after the recovery process.
During a recovery process, regardless of which protocol you examine, state machines may be inconsistent.
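To make the three assertions concrete, here is a rough sketch of what a recovering NSA might do. It is only an illustration under my own assumptions: the `Nsa` class, the `ConnState` values, and the dictionary-shaped NRM and peer "views" are hypothetical and are not taken from the error document. The point is that local state is rebuilt from the NRM, peers are only used to flag disagreements, and nothing in the recovery path touches the transport plane.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConnState(Enum):
    # Hypothetical state names, chosen only for this illustration.
    RESERVED = "reserved"
    PROVISIONED = "provisioned"
    UNKNOWN = "unknown"


@dataclass
class Nsa:
    """Hypothetical NSA holding per-connection state in the service plane."""
    name: str
    connections: dict = field(default_factory=dict)  # connection id -> ConnState

    def recover(self, nrm_view, peer_views):
        """Rebuild service-plane state after a restart.

        nrm_view:   {conn_id: ConnState} as reported by the local NRM(s)
        peer_views: list of {conn_id: ConnState} as reported by peer NSAs
        Returns the connection ids that still need reconciliation.

        Assertion 1: this method only reads state; it never tears down
        or re-provisions anything in the transport plane.
        """
        # Assertion 3: the local transport-plane state comes from the NRM.
        self.connections = dict(nrm_view)
        conflicts = set()
        for view in peer_views:
            for conn_id, peer_state in view.items():
                local = self.connections.get(conn_id)
                if local is None:
                    # Assertion 2: a peer's word alone does not define
                    # local state, so record the connection as unknown.
                    self.connections[conn_id] = ConnState.UNKNOWN
                    conflicts.add(conn_id)
                elif local is not peer_state:
                    # The NRM view is kept; the mismatch is only flagged.
                    conflicts.add(conn_id)
        return conflicts


nsa = Nsa("nsa-a")
nrm = {"c1": ConnState.PROVISIONED, "c2": ConnState.RESERVED}
peers = [
    {"c1": ConnState.PROVISIONED, "c3": ConnState.RESERVED},
    {"c2": ConnState.PROVISIONED},
]
conflicts = nsa.recover(nrm, peers)
```

In this example the NSA ends up with `c1` and `c2` as the NRM reported them, `c3` marked unknown, and `c2` and `c3` returned for reconciliation. How those conflicts get resolved is exactly the part the steps in the document would need to define.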
>> 2. Each NSA pair maintains state between them. What they do is inform the service tree under them when an error happens that requires notification up and down the service tree. The biggest issue here is a "state cleanup" problem, i.e., when things go bad, you need to ensure that every NSA will clean up its state.
>> 3. On slide 3, I do not understand the requirement:
>> • Signals from NRM and Requestor should be the same
>>   – Need to define what these are, because they are carried on NSU
> By this I mean that what the NRM signals about local resources should be the same as what gets signaled by NSI about remote resources. I think we agreed to this on the calls.
I am sorry, maybe we should discuss this offline. What do you mean by "signals about local resources"? Provisioning of local resources? Errors from local resources?
>> 4. Slide 5 - signaling of the connections happens in the transport plane or by the service plane. The connection service resides in the service plane. The connection resides in the transport plane.
>> Look forward to the discussion tomorrow, and finalizing the error handling portion of the document if it fits within the agenda tomorrow.
>> On May 4, 2010, at 8:46 AM, John Vollbrecht wrote:
>> Inder Monga http://100gbs.lbl.gov
>> imonga at es.net http://www.es.net
>> (510) 499 8065 (c)
>> (510) 486 6531 (o)