Leaky Abstractions, Underlying Protocols and Message Exchange Patterns
The fault is all Faults. Normal application messages can easily be modeled as one-way or request-response, but it's support for Faults that complicates the message exchange patterns and bindings enormously.
I believe that all abstractions eventually leak, it's just a question of how much. Web service Faults are the catalyst for some serious cascading leaks in asynchronous messaging, and WS-Addressing compounds matters. With HTTP, SOAP HTTP Binding, SOAP MEPs, WSDL MEPs and WS-Addressing, the underlying protocol's ability to deliver Faults leaks through to every level. There ends up being no worthwhile abstraction.
Faults, and the delivery of Faults over HTTP, result in the need for extra SOAP and WSDL MEPs, and this abstraction leaks ever upward into application design.
It's even worse, because we have to retain SOAP meps even though we have WSDL meps.
Let's assume that you can deploy your application on 2 different protocols: HTTP and some marvy true one-way (SOAP over JMS or SOAP over UDP perhaps, or …)
Scenario 1: One-way
We start at the "top", that is the interface level. You write your application to send a one-way message. In WSDL terms this is a "one-way" MEP. SOAP provides a "request-response" MEP, and we probably will end up having a "one-way" SOAP MEP. Let's assume for now that we can use a "one-way" SOAP MEP. The logical SOAP mep choice for a WSDL one-way would be a SOAP one-way.
It's at the binding layer that the abstraction now leaks. You deploy on both HTTP and a real one-way. Things are great on the real one-way as the WSDL MEP, SOAP MEP and the protocol line up. But HTTP is a connection oriented protocol and gives a return code and optionally a response body. What happens with the HTTP return code and a response body? Should the application wait for the HTTP return code? And what should it do with it?
Imagine there's a failure, and a Fault is generated. An HTTP 4xx or 5xx is returned, and the Fault could be reported. Should the binding layer throw the return code + fault on the floor and say nothing to the application? Throwing away an error at the underlying protocol seems very counter-intuitive. We'd probably like the underlying protocol to report back the fault. Which means we picked the wrong WSDL MEP, as we'd like to be able to report the underlying protocol error to the application.
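To make the choice concrete, here is a minimal sketch of the two behaviours a binding layer could adopt. The `HttpResponse` record and both function names are made up for illustration and do not come from any SOAP toolkit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HttpResponse:
    """Hypothetical stand-in for what the HTTP layer hands to the binding."""
    status: int
    body: Optional[str] = None  # may carry a serialized SOAP Fault


def one_way_binding(response: HttpResponse) -> None:
    """One-way: there is no channel back to the application, so the
    status code and any Fault body are silently dropped."""
    return None


def fault_reporting_binding(response: HttpResponse) -> Optional[str]:
    """Surface a 4xx/5xx to the application instead of discarding it."""
    if 400 <= response.status <= 599:
        return response.body or f"HTTP {response.status}"
    return None
```

With the first function the application never learns that the send failed. With the second it does, but only because the interface now admits that something can come back.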
Scenario 2: WSDL Robust in-only
Let's pick a better WSDL MEP in scenario 2. WSDL 2.0 has a "robust in-only" for *exactly* this pattern of in + optional fault out. This is the obvious WSDL MEP to use if a Fault can be reported from the underlying protocol up to the application. Now we don't actually have the right SOAP MEP for this. Amazing, but true. The WSDL MEP correspondingly needs a SOAP in-optional-out MEP. Another option is a SOAP in-optional-fault MEP which, when bound to HTTP, specifies that an HTTP 2xx must contain no body. Let's call this one-way mep mythical, because we haven't decided to standardize one yet, though I wrote one up many months ago.
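As a rough sketch of what an in-optional-fault MEP bound to HTTP would ask of a SOAP sender, assuming the rule above that a 2xx carries no body: the `Outcome` enum and the function name are hypothetical, and status codes outside 2xx/4xx/5xx are deliberately left unhandled because the text says nothing about them.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    DELIVERED = "delivered"
    FAULT = "fault"


def interpret_in_optional_fault(status: int, body: Optional[str]) -> Outcome:
    """Classify an HTTP response under a hypothetical in-optional-fault MEP."""
    if 200 <= status <= 299:
        if body:
            # The MEP allows a Fault or nothing, so a 2xx must be empty.
            raise ValueError("2xx with a body is not allowed by in-optional-fault")
        return Outcome.DELIVERED
    if 400 <= status <= 599:
        return Outcome.FAULT
    raise ValueError(f"status {status} is not covered by this sketch")
```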
The choice of the WSDL MEP at the application layer is guided by which underlying protocol(s) are expected to be used. This is the main thesis of this article: the application layer modeling is affected by the underlying protocol. An application using HTTP would probably never want to deploy a true in-only as the HTTP response code will be thrown away. The underlying protocol's ability to report faults has leaked into the SOAP mep which has then leaked into WSDL mep. The abstraction between the application MEP and the underlying protocol (via the SOAP MEP) has leaked.
Point #1: The myth of protocol independence
This comparison of in-only vs robust in-only has shown that the application level WSDL MEP will be determined by the underlying protocol, assuming that the underlying protocol's information is fully utilized.
Another possibility is to throw out anything "extra" from the underlying protocol, that is, effectively dumbing HTTP down to UDP. In general, Web services using SOAP and WSDL 1.1 have already done that by ignoring the HTTP Operation. The Web services "architecture" has further done that by ignoring the protocol capabilities, such as security, encoding, caching. We could have utilized the capabilities of HTTP and other protocols if we'd agreed how to describe the capabilities, but the "features and properties" work of SOAP 1.2 and WSDL 2.0 looks pretty much DOA.
Corollary to point #1: True protocol independence means dumbing down every protocol to UDP OR a framework for expressing protocol capabilities
I've shown how we are achieving true protocol independence by throwing away everything that makes up HTTP: the operations, status code, response, encodings, security, and all that.
What's the point of SOAP meps
As I've been exploring this complexity of wsdl meps, soap meps, and bindings, I've been wondering if we could simplify or even not use SOAP meps. Why not just have WSDL MEPs and Bindings and skip the SOAP mep abstraction? Also, Jonathan Marsh made an intriguing proposal that maybe we could have an "uber MEP" of in optional-out rather than an in-out and an in-only SOAP MEP.
It turns out that we need both a one-way AND an in-optional-out, because the SOAP sender has to know what it can expect, and a receiver has to know what it can do. Argh!
Imagine a world without soap meps…
If a WSDL in-only MEP is specified, then there is no response allowed, period. So the SOAP MEP should be a one-way. This tells the SOAP sender to ignore any response, and if communicated in WSDL it tells the SOAP responder to not send any response. If a WSDL robust in-only is specified, then a response is allowed. An in-optional-out SOAP mep is needed. A 200 indicates no fault, and a 4/5xx indicates a Fault.
We need the 2 new SOAP meps (one-way and in-optional-out) to tell the SOAP sender what to expect and the SOAP receiver what it can do. If we only have the in-optional-out without the in-only, then how does a SOAP sender know whether it could get a Fault or some other content, and how does a SOAP receiver know that it can send a Fault? There has to be something somewhere that says "SOAP receiver, you can return a Fault on the HTTP connection". This is what the SOAP meps do.
If we used only the in-optional-out, that is ignoring request-response and in-only, then the receiver doesn't know what it can do with status codes and responses including faults. A sender wouldn't know whether it will get a response (request-response), may get a response (in-optional-out) or won't get a response (in-only). Imagine these two get out of synch, say the soap sender always waits for a response and the soap receiver lets the application specify whether a response is to be sent or not. How does the soap software "know" to keep the connection open? Imagine that at the WSDL level the operation is a one-way. The receiver has dispatched the message, but the application won't send a response. The application *must* tell the SOAP layer that it's not going to send a response. How does it do that? It probably says "this is a one-way message".
Voila. We've just invented the main point of SOAP meps. Perhaps the SOAP meps could be simplified or rejigged. But at the end of the day, you still have to say things like can/may/must there be a response, and is the response a fault or a soap response or either. Maybe we could have a "SOAP Request-response profile" that says "assume all the stuff in the request-response mep, except there can't be a response."
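One way to picture the can/may/must question is as a small table that both the SOAP sender and the SOAP receiver consult. This is only a sketch of my reading of the MEPs discussed here; the `MepContract` record, the table and the helper function are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Response(Enum):
    MUST = "must"    # a response is always expected
    MAY = "may"      # a response is optional
    NEVER = "never"  # no response is allowed


@dataclass(frozen=True)
class MepContract:
    response: Response
    fault_allowed: bool    # may the response be a SOAP Fault?
    message_allowed: bool  # may the response be a normal SOAP message?


# Names follow the article, which treats the last three as not yet standardized.
SOAP_MEPS = {
    "request-response": MepContract(Response.MUST, True, True),
    "in-optional-out": MepContract(Response.MAY, True, True),
    "in-optional-fault": MepContract(Response.MAY, True, False),
    "one-way": MepContract(Response.NEVER, False, False),
}


def sender_keeps_connection_open(mep: str) -> bool:
    """The sender waits only if the MEP says a response can or must arrive."""
    return SOAP_MEPS[mep].response is not Response.NEVER
```

However the MEPs end up being named or profiled, some such table has to exist so that the two sides cannot get out of synch.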
The MEP return optionality naturally gets passed down to the binding as "can a response come with an http 200, can a response come with any http code, can only a 200 without a body or a 4/5xx with a body, etc.". It's worth pointing out that the only public SOAP HTTP binding spec says that an HTTP 200 requires that the SOAP sender start making an abstraction of the response message AND it says nothing about any other 2xx return codes. For example, it does not say that a 204 indicates the message is finished.
Point #2: SOAP MEP functionality needed
This exercise has shown that the needs of the SOAP sender and SOAP receiver dictate that SOAP meps are required. The SOAP meps are used to tell the SOAP software what it can do. We need a SOAP HTTP Binding that supports an in-only and an in-optional-out MEP, not coincidentally matching up WSDL meps with underlying protocol capabilities.
Presumably WSDL 2.0 will need an update to support the new MEPs and HTTP bindings, and I'd suggest that the SOAP MEP defaults to the WSDL meps. One could bind the meps in a different manner, and some extension could over-ride the default. For example, a WSDL robust in-only could use a SOAP one-way over a SOAP one-way underlying protocol. And a WSDL in-out could be mapped to 2 different SOAP in-optional-faults, each to a separate HTTP connection for the application request/response using asynchronous underlying protocol (aka asynch request response).
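The default-plus-override idea can be sketched as a lookup with an escape hatch. The table and the `bind` function are hypothetical, and picking in-optional-fault as the default for robust in-only is an assumption of this sketch (in-optional-out would serve equally well).

```python
from typing import Dict, List, Optional

# Default: each WSDL MEP binds to the matching SOAP MEP (names follow the article).
DEFAULT_SOAP_MEPS: Dict[str, List[str]] = {
    "in-only": ["one-way"],
    "robust-in-only": ["in-optional-fault"],
    "in-out": ["request-response"],
}


def bind(wsdl_mep: str, override: Optional[List[str]] = None) -> List[str]:
    """Return the SOAP MEP(s) used for a WSDL MEP; an extension may override."""
    if override is not None:
        return list(override)
    return list(DEFAULT_SOAP_MEPS[wsdl_mep])
```

A robust in-only deployed on a true one-way protocol would then be `bind("robust-in-only", override=["one-way"])`, and asynch request response over two HTTP connections would be `bind("in-out", override=["in-optional-fault", "in-optional-fault"])`.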
WSDL mep and SOAP mep needs
1. A SOAP one-way MEP.
2. A SOAP in-optional-out MEP or a SOAP in-optional-fault MEP.
3. One or more SOAP HTTP binding(s) that support one-way and in-optional-out|in-optional-fault MEPs.
4. Updated WSDL 2.0 to relate WSDL meps to SOAP meps.
Enter sandman: WS-Addressing
Now you decide that you want to use WS-Addressing. You are again going to deploy your app on both HTTP and some one-way protocol. What should you do about faults and meps? Let's say you pick the robust in-only. You will probably specify a FaultTo. FaultTo has 2 options: an explicit address and "anonymous". The anonymous is for the same connection, ie an HTTP connection. Anonymous seems pretty ideal for the HTTP 4/5xx + SOAP Fault scenario. But which address to pick, explicit or anonymous?
Those pesky underlying protocols are different and this leaks upwards into WS-Addressing and ReplyTo. In the true one-way case, an explicit FaultTo address is the obvious choice. There are many different places where a Fault could be generated and all of them can be sent to the FaultTo address.
Using HTTP, a fault can be generated 1) during reception of the SOAP message (ie at the protocol level) or 2) after the HTTP connection has closed. If you put an anonymous FaultTo address, you lose the ability to transmit the 2nd type of Fault. If you put an explicit address, what happens to a fault generated while the HTTP connection is open? It seems extremely inefficient to report errors of type #1 by closing the HTTP connection, and then opening up another HTTP connection to send the Fault. Especially if a Fault is generated and the FaultTo can't be read, eg the envelope is encrypted and no appropriate key is available.
What we need is the ability for both anonymous and explicit FaultTo addresses in WS-Addressing when HTTP is used. The intent is that any faults generated during HTTP connection time can be returned in the HTTP response, and any faults generated after the connection is torn down have an explicit address. I think my favored design is an "AllowAnonymous" on FaultTo, with a default of True. The default is that a fault generated during the HTTP processing could be returned over the HTTP connection, but this can be turned off in the FaultTo. Some other ideas are to allow 2 EPRs for FaultTo and require one to be anonymous if 2 are present, or to have "allowAnonymous" default to False so that an explicit action is required.
Returning to the leaky abstraction theme, notice how the protocol abstraction is broken. Using HTTP with a FaultTo means that a Fault could come back on the HTTP connection OR to the FaultTo address. The application will need to be prepared for a Fault when it does the send. This does not occur if a true asynch protocol is used. An application that can be bound to true asynch as well as HTTP must be prepared for Faults in 2 places. The design of the WS-Addressing FaultTo mechanism itself reflects the underlying protocol's capabilities.
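Here is a sketch of how a receiver might route a Fault under the proposed flag. Nothing here is defined by WS-Addressing: `fault_destination`, its parameters and the `ANONYMOUS` placeholder are illustrative assumptions, and the flag itself is only my proposal.

```python
from typing import Optional

ANONYMOUS = "anonymous"  # placeholder for the WS-Addressing anonymous address


def fault_destination(fault_to: Optional[str],
                      connection_open: bool,
                      allow_anonymous: bool = True) -> Optional[str]:
    """Decide where a receiver sends a Fault.

    Returns ANONYMOUS for the HTTP response, an explicit address for a
    new connection, or None when the Fault cannot be delivered.
    """
    explicit = fault_to not in (None, ANONYMOUS)
    if connection_open and (not explicit or allow_anonymous):
        return ANONYMOUS  # ride back on the still-open HTTP connection
    if explicit:
        return fault_to   # open a separate connection to the FaultTo address
    return None           # anonymous FaultTo but the connection is gone
```

The last branch is the lost Fault of type 2 described earlier, and the first branch is the efficiency win for Faults of type 1.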
There are 3 scenarios for FaultTo with no ReplyTo:
1. No FaultTo or anonymous: must have at least a WSDL robust in-only and SOAP in+optional-out.
2. FaultTo and allowAnonymous="true": equivalent to 2 WSDL operations and 2 SOAP meps, first is robust-in variety, the second is one-way or robust-in variety.
3. FaultTo and allowAnonymous="false": equivalent to 2 WSDL operations and 2 SOAP meps, first is in-only and second is one-way or robust-in variety.
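The three scenarios can be written down as a lookup from the FaultTo settings to the MEP varieties each operation needs. This is a sketch only; `required_meps` and the `ANONYMOUS` placeholder are hypothetical names, and each tuple lists the acceptable WSDL MEP varieties for one operation.

```python
from typing import List, Optional, Tuple

ANONYMOUS = "anonymous"  # placeholder for the WS-Addressing anonymous address


def required_meps(fault_to: Optional[str],
                  allow_anonymous: bool = True) -> List[Tuple[str, ...]]:
    """Return, per operation, the WSDL MEP varieties the scenario calls for."""
    if fault_to in (None, ANONYMOUS):
        # Scenario 1: the Fault can only come back on the same connection.
        return [("robust-in-only",)]
    second = ("in-only", "robust-in-only")  # delivery of the Fault itself
    if allow_anonymous:
        # Scenario 2: Fault on the connection or at the explicit address.
        return [("robust-in-only",), second]
    # Scenario 3: the Fault only ever goes to the explicit address.
    return [("in-only",), second]
```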
Point #3: WS-Addressing Faults must reflect underlying protocol capabilities
The previous section has shown that WS-Addressing Faults should support the capabilities of HTTP by allowing Faults over the HTTP connection as well as Faults being sent to the FaultTo address. A particular design is proposed that cleanly adds this into the FaultTo.
Mix-in some ReplyTo
In the case of asynchronous request-response, where the application sees request/response but there are 2 separate connections, WS-Addressing provides a ReplyTo address. But which SOAP meps should be used for a WS-A message with a ReplyTo and a FaultTo?
In the case of asynch underlying protocol, the request/response/fault messages each use SOAP one-ways and WSDL one-way messages. In the case of HTTP, things can be more efficient. I'd previously mentioned that WS-Addressing ought to support a Fault over the HTTP Connection. This implies a SOAP in-optional-fault aka robust-in MEP for the request message. The application will know to wait for a SOAP fault if the HTTP response code is 4/5xx.
It's conceivable and possible that there may be faults in delivering the Reply or the Fault on the separate HTTP connection. Presumably, the application sending the Reply or Fault will want to know if they can't be delivered. This implies a SOAP in-optional-fault MEP for the response and the fault messages.
Returning yet again to leaky abstractions, using WS-Addressing has done nothing to insulate or abstract the SOAP meps from the protocol binding. It's simple: if you use HTTP, then the SOAP meps and WSDL meps should be in-optional-fault (aka robust in-only). If you use a true asynch protocol, then the meps can be in-only. The safest option is to use in-optional-fault for the SOAP and WSDL meps. I'd say that the default for a WS-Addressing aware service ought to be in-optional-fault. If the underlying protocol doesn't support returning faults, then the optional fault part will never be used.
Point #4: WS-Addressing ReplyTo or FaultTo needs new SOAP MEP(s)
The very existence of ReplyTo implies that the exchange is not synchronous request response at some layer, and invariably it means not synchronous at the protocol layer. If you wanted to use the HTTP response for the application response, you'd just use request-response. ReplyTo is clearly intended for an asynchronous callback and thus 2 HTTP connections when using HTTP. Two connections means using two bindings and two SOAP meps. I've already mentioned the need for SOAP one-way and either in-optional-out or in-optional-fault.
I also think that it calls for a breakup of the 1:1 mapping between WSDL and SOAP meps for a natural programming environment.
I think it's pretty natural to want to express a request-response using ReplyTo as a single operation. The whole point of doing "WSDL" operations is to associate one or more messages into an operation. My pitch is that we should be able to take a WSDL in-out that has WS-A engaged and express it as 2 SOAP in-optional-fault meps if there is a ReplyTo and 3 SOAP in-optional-fault meps if there is a ReplyTo and non-anonymous FaultTo. Some folks will argue that there should be a 1:1 correspondence, and so ReplyTo should use 2 WSDL meps and then some kind of correlation aka BPEL service links. I proposed this to WSDL 2.0 in June 2004, but WS-A wasn't at the W3C at the time.
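The pitch amounts to a one-to-many expansion, sketched below. The function name and the `ANONYMOUS` placeholder are hypothetical, and treating an absent ReplyTo the same as an anonymous one is an assumption of this sketch.

```python
from typing import List, Optional

ANONYMOUS = "anonymous"  # placeholder for the WS-Addressing anonymous address


def soap_meps_for_in_out(reply_to: Optional[str],
                         fault_to: Optional[str] = None) -> List[str]:
    """List the SOAP MEPs behind one WSDL in-out operation with WS-A engaged."""
    if reply_to in (None, ANONYMOUS):
        # The reply rides on the HTTP response: plain request-response.
        return ["request-response"]
    meps = ["in-optional-fault",   # request
            "in-optional-fault"]   # reply sent to ReplyTo
    if fault_to not in (None, ANONYMOUS):
        meps.append("in-optional-fault")  # fault sent to FaultTo
    return meps
```

The application still sees a single in-out operation. Only the binding knows that it takes two or three SOAP exchanges to carry it.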
Point #5: WS-Addressing asynch request-response should allow 2 soap meps for wsdl request-response
The argument comes down to whether there is a leaky abstraction or not. Ironically, I want to hide the SOAP meps from the WSDL meps, so a wsdl request-response can map to more than 1 soap mep. Those that want a 1:1 relationship are arguing that the application protocol should mirror the underlying protocol.
My argument is that the constructs in WS-A - ReplyTo, FaultTo, messageId, and relatesTo - are sufficient information to do the correlation and manage the connections and URI-space. The key is to make sure the programming model knows that request-response can be deployed asynchronously and doesn't make synchronous assumptions.
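A minimal sketch of that correlation, assuming each request carries a unique messageId and each reply or fault echoes it in relatesTo. The `Correlator` class is illustrative and not part of any WS-Addressing implementation.

```python
from typing import Dict, Optional


class Correlator:
    """Match asynchronous replies and faults to the request that caused them."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}  # messageId -> operation name

    def sent(self, message_id: str, operation: str) -> None:
        """Remember an outgoing request until its reply or fault arrives."""
        self._pending[message_id] = operation

    def received(self, relates_to: str) -> Optional[str]:
        """Return the operation an incoming message answers, or None if unknown."""
        return self._pending.pop(relates_to, None)
```

Because the lookup is keyed on the message identifiers alone, it does not matter whether the reply arrives on the original HTTP connection or on a new one, which is exactly the synchronous assumption the programming model has to avoid.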
Diversion: WSDL 1.1
What to do about WSDL 1.1 MEPs, SOAP and HTTP Bindings? The situation is quite simply dreadful. If you want to use an asynch protocol, then you can only use the WSDL 1.1 one-way. Which means if you also use HTTP, then any Faults are simply thrown away. If you use HTTP and you want faults over the HTTP connection, you are forced to use in-out. You have 2 bad choices: Use one-way and throw-away HTTP Faults, or get HTTP faults but be forced into request-response.
WS-Callback spent considerable time examining all the protocol possibilities for callbacks. One decision was to not allow an "acknowledgement" message (aka out or response) on the HTTP response. For an acknowledgement header block to be returned for a one-way mep, it would have required a "not a fault" SOAP fault, which seemed far too complicated.
We clearly need to move to WSDL 2.0 and its MEPs to enable more efficient error handling.
What's needed, and why
What do we really need from all of this for faults and asynch request-response, and why:
1. A SOAP one-way MEP, so that a SOAP sender will know not to wait for any optional (ie Fault) or required response over a one-way protocol.
2. A SOAP in-optional-out MEP or a SOAP in-optional-fault MEP, so that a SOAP sender can wait for an optional response, such as a Fault over HTTP.
3. One or more SOAP HTTP binding(s) that support one-way and in-optional-out MEPs.
4. Updated WSDL 2.0 to relate WSDL meps to SOAP meps.
5. A fix to WS-Addressing FaultTo so that a Fault can be returned upon an initial request and not just the FaultTo, and I propose an "AllowAnonymous" flag.
6. Support for asynchronous request-response to override the WSDL 1.1/2.0 mep defaults and specify different soap meps than the default. I'd proposed this in WSDL 2.0 but it could be done in Addressing as well.
7. Ideally, a simplified SOAP MEP structure to say simple things like in-only, in-optional-out, in-optional-fault.
Returning to the leaks
We need these new SOAP meps and SOAP HTTP bindings to support asynch and HTTP Fault reporting. The HTTP faults have caused leakage because
1. the specific binding leaks into the SOAP mep selected, because HTTP bindings match up well with a to-be-standardized (TBS) SOAP in-optional-out to support faults on the HTTP connection
2. the specific binding leaks into the WSDL mep selected because the WSDL mep will logically match up with the SOAP mep
3. the specific binding leaks into the WS-Addressing Fault, because an HTTP binding will suggest Faults can be returned on the inbound HTTP connection AND on a separate connection for the FaultTo.
This paper has shown how the underlying protocol leaks into the SOAP MEP (#1), the WSDL MEP (#2), and the WS-Addressing Fault (#3).
These conclusions are definitely not what I had wanted, but I can't see any other solutions that enable asynch request/response, binding to multiple protocols, and proper reporting of HTTP faults to the application.
This article is reprinted from Dave Orchard's blog entry found at http://www.pacificspirit.com