WebServices dot org

Getting Your Message Across in Binary

27th Jun 06:

Whatever happened to “Binary XML”? Is the Fast Web Services initiative going to bring solace?

A surprisingly large number of people subscribe to the opinion that XML is too heavyweight for high-performance applications. Typically their basis is not much more than the “bloat” of XML (think redundant data like closing tags), which has a negative impact on wire speed (particularly for large messages) and adds overhead in parsing and space requirements. The consequence is that many people don’t consider XML text desirable for high-volume data transmission or performant (a French word for efficient) systems.

The XML performance issue was given further credence when the W3C formed the XML Binary Characterization Working Group. That working group recently closed; its outputs were a use case document to determine the minimum set of requirements which “Binary XML” must meet, a document comparing different approaches to measuring performance, and a problem definition document. No document was chartered to actually specify a common solution for Binary XML, although in its defense it was a “characterization” working group, set up to ascertain whether “alternate (binary) encodings [of XML] provide the required properties”.

An important posting to the XML Group’s newsgroup gives good background on why it never went further:

“The TAG believes there are disadvantages as well as potential advantages that will result from even a well crafted Binary XML Recommendation. The advantages are clear: a successful binary format is likely to provide speed gains or size reductions, at least for certain use cases. The drawbacks are likely to include reduced interoperability with XML 1.0 and XML 1.1 software, and an inability to leverage the benefits of text-based formats.”

Basically, what opponents of the working group were saying is: “Bad side of XML is addressed by throwing away (much of) the good side”. Of course, “Binary XML” is an oxymoron, as Tim Bray points out:

“I don’t care if anyone wants to go off and produce their own data interchange format, binary or not, open or not, standardized or not, mapped to XML or not; as long as they don’t call it XML. “Binary XML” is an oxymoron. And I should point out that the people at Sun who are building a binary data format with a mapping to XML are calling it something else entirely. These Binary-XML people are charging headlong onto the top of a very long, very steep, very slippery slope.”

The opponents also questioned the performance benefits of “Binary XML” and whether it was in fact required at all: “if XML 1.x is inherently capable of meeting the needs of users, then our efforts should go into tuning our XML implementations, not designing new formats.” Since “Binary XML” was first debated, the effect of faster and cheaper CPUs and greater bandwidth has probably done much to remove the need for many potential users.

There are, however, many proponents of Binary XML. The need for speed is strong. When it comes to measuring the performance of the “Binary XML” implementations that do exist, Elliotte Rusty Harold has some pertinent warnings. Make sure you understand exactly what is happening in an implementation, otherwise you are not making a like-for-like comparison. After investigating one such implementation, Harold comments:

“It took till Q&A to find out for sure, but I was right. They aren't doing full well-formedness checking. For instance, they don't check for legal characters in names. That's why XTalk is faster. They claim expat isn't either (I need to check that) but there was disagreement from the audience on this point. But the bottom line is they failed to demonstrate a difference in speed due to binary vs. text.”

Harold then goes on to conclude:

“My specific prediction is that they'll speed up XML processing by failing to do well-formedness checking. Of course, they'll hide that in a binary encoding and claim the speed gain is from using binary rather than text. However, I've seen a dozen of these binary panaceas and they never really work the way the authors think they work. An XML processor needs to be able to accept any stream of data and behave reliably, without crashing, freezing, or reporting incorrect content. And indeed a real XML parser can do exactly this.”
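To make Harold's point about well-formedness concrete, here is a minimal sketch in Python. It relies on the standard library's xml.etree.ElementTree parser (which is built on expat); the helper name is_well_formed and the sample documents are my own, purely for illustration. It shows the kind of check, such as rejecting an element name that starts with a digit, that a parser pays for on every document and that a benchmark can quietly skip.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the standard-library parser accepts `text`."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, chosen only to exercise the check.
samples = {
    "legal names": "<order><item>1</item></order>",
    "name starts with a digit": "<1order/>",
    "mismatched closing tag": "<order><item>1</order></item>",
}

for label, doc in samples.items():
    print(f"{label}: well-formed = {is_well_formed(doc)}")
```

When comparing a binary format against text XML, it is worth checking that both sides of the benchmark reject the same broken inputs before reading anything into the timings.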

Now for something else entirely, Fast Web Services

As Tim Bray pointed out above, Sun are not calling their effort to define a binary data format “Binary XML”. Sun’s initiative to make Web services more performant (that French word again) is called Fast Web Services. It is an initiative that seems to have gained a good deal of airtime and support, although many of those supporters are still waiting to see how it works out.

Aiming at the “identification of performance problems in existing implementations of Web Services standards”, Sun is putting forward two solutions, Fast Infoset and Fast Schema. They define standards for binary messages that are designed to consume less bandwidth and/or require less memory to process. I use the word “standards”, but they are still in the process of being ratified.

The main page on Fast Web Services at Sun describes Fast Infoset and Fast Schema as follows:

“As its name suggests, the Fast schema approach uses information from a document's schema to optimize which parts of the infoset to include in a message. Fast infoset, on the other hand, is a pure infoset-based drop-in replacement for XML. Both alternatives have advantages and disadvantages; application requirements will dictate which of the two approaches is more appropriate. Performance characteristics aside, the main difference between Fast schema and Fast infoset, is that the latter can be used as a replacement for XML in any application. Think of an encoding that is tokenized and thus more compact but also faster to both generate and consume. Fast schema can only be used when both the producer and consumer of the data have access to its schema. This is exactly the case for (literal) Web services described using a WSDL file.”
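The phrase “tokenized and thus more compact” is easier to picture with a toy. The sketch below is emphatically not the Fast Infoset wire format (which is specified in terms of ASN.1); it is a made-up encoding of my own, with hypothetical one-byte opcodes, that only illustrates the idea of a vocabulary table: the first time an element name appears it is written out in full, and every later occurrence is replaced by a small index. Attributes, namespaces and mixed content are ignored to keep it short.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical one-byte opcodes for this toy format (not Fast Infoset).
START_NEW, START_REF, TEXT, END = 0, 1, 2, 3


def _chunk(data: bytes) -> bytes:
    """Length-prefix a byte string (toy limit: 255 bytes)."""
    if len(data) > 255:
        raise ValueError("toy format only supports chunks up to 255 bytes")
    return bytes([len(data)]) + data


def encode(xml_text: str) -> bytes:
    """Encode elements and text, replacing repeated names with indexes."""
    names = {}  # element name -> index in the vocabulary table
    out = bytearray()

    def visit(elem):
        if elem.tag in names:
            out.extend([START_REF, names[elem.tag]])
        else:
            if len(names) >= 256:
                raise ValueError("toy format only supports 256 names")
            names[elem.tag] = len(names)
            out.append(START_NEW)
            out.extend(_chunk(elem.tag.encode("utf-8")))
        if elem.text and elem.text.strip():
            out.append(TEXT)
            out.extend(_chunk(elem.text.strip().encode("utf-8")))
        for child in elem:
            visit(child)
        out.append(END)  # no name needed: the decoder tracks open elements

    visit(ET.fromstring(xml_text))
    return bytes(out)


def decode(data: bytes) -> str:
    """Rebuild XML text from the toy encoding."""
    names, open_tags, parts = [], [], []
    i = 0
    while i < len(data):
        op = data[i]
        i += 1
        if op == START_NEW:
            length = data[i]
            name = data[i + 1:i + 1 + length].decode("utf-8")
            i += 1 + length
            names.append(name)
        elif op == START_REF:
            name = names[data[i]]
            i += 1
        elif op == TEXT:
            length = data[i]
            parts.append(escape(data[i + 1:i + 1 + length].decode("utf-8")))
            i += 1 + length
            continue
        elif op == END:
            parts.append(f"</{open_tags.pop()}>")
            continue
        else:
            raise ValueError(f"unknown opcode {op}")
        open_tags.append(name)
        parts.append(f"<{name}>")
    return "".join(parts)


# A repetitive sample document, invented for the comparison.
doc = "<items>" + "".join(
    f"<item><sku>{n}</sku></item>" for n in range(100)
) + "</items>"
encoded = encode(doc)
print("text XML bytes:", len(doc.encode("utf-8")))
print("toy encoding bytes:", len(encoded))
print("round trip ok:", decode(encoded) == doc)
```

The saving comes almost entirely from not repeating element names and closing tags, which is why repetitive documents benefit most. How the real Fast Infoset tables work is described in the specification and the paper linked below.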

Abstract Syntax Notation One (ASN.1) is the underlying technology behind Fast Schema and Fast Infoset. It has been in development for nearly 20 years and is widely used in the telecommunications industry, where bandwidth and device capabilities really do matter. See: Mapping W3C XML Schema Definitions into ASN.1.

A good slideset showing the main points about Fast Infoset can be found here. The inner workings of Fast Infoset are rather complicated but well described in this XTech paper. An important point to note is that the Fast Infoset standard is not Java-specific. There are, for example, C++ implementations already available.

With the main objective being faster Web services, you would of course expect to see some significant performance gains. Typically those cited in the tutorials and papers range from a 2-10X improvement in parsing. But as the manual says, “As the saying goes, when it comes to performance, 'your mileage will vary'. Fast Infoset is designed to optimize parsing and serialization, so the key to understanding the potential gains associated with using this technology is understanding the percentage of time your application spends in these two tasks.”

How much time your application spends on “parsing and serialization” is critical in deciding how much benefit will be achieved from Fast Web Services. A multiple of 2-10X may sound like a lot, but if the time originally spent on these tasks is small in comparison to the end-to-end time, there isn’t much overall gain. I don’t want to give the impression that XML performance is a non-issue, but it needs to be seen in the context of your application as a whole. That said, if you are doing XSL transformations you should look at the huge performance gains to be made there with a specialist hardware device, like those from DataPower, which can speed up XSLT transforms by huge factors.
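The arithmetic behind that argument is the familiar Amdahl's law calculation: if a fraction p of the end-to-end time is spent parsing and serializing, and that part becomes s times faster, the overall speedup is 1 / ((1 - p) + p / s). A small sketch, where the function name and the sample fractions are hypothetical values I picked for illustration rather than measurements:

```python
def overall_speedup(parse_fraction: float, parse_speedup: float) -> float:
    """End-to-end speedup when only the parsing share gets faster.

    parse_fraction: share of total time spent parsing/serializing (0..1).
    parse_speedup:  factor by which that share is accelerated (e.g. 2 or 10).
    """
    if not 0.0 <= parse_fraction <= 1.0:
        raise ValueError("parse_fraction must be between 0 and 1")
    if parse_speedup <= 0:
        raise ValueError("parse_speedup must be positive")
    return 1.0 / ((1.0 - parse_fraction) + parse_fraction / parse_speedup)


# Hypothetical workloads: 5%, 25% and 75% of time spent on XML handling.
for fraction in (0.05, 0.25, 0.75):
    for factor in (2, 10):
        gain = overall_speedup(fraction, factor)
        print(f"{fraction:.0%} parsing, {factor}X faster parser "
              f"-> {gain:.2f}X overall")
```

With only 5% of the time spent on XML handling, even a 10X faster parser gives roughly a 1.05X improvement end to end, which is why it pays to profile before switching encodings.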

I would personally buy into the Microsoft strategy, which is to use text-based XML for interoperability across systems, while for in-memory messaging and platform-specific tools you use your internal binary encoding of choice (such as Fast Web Services or the proprietary binary XML encoding in WCF).

It would of course be beneficial if all vendors used the same binary encoding. Rich Turner of Microsoft does suggest that Microsoft, at least, will allow Fast Web Services to be used. As Rich says:

“However, without a "standard" for this encoding, we'll end up with 20 different BinaryXML implementations, few of which will be interoperable with one another... Many next-generation distributed systems technologies, such as WCF, do (or will) fully support the notion of simultaneously exposing a given service via several transport & encoder "bindings" resulting in services being made available via several communication vehicles simultaneously… However, once the world agrees on one (or maybe more) BinaryXML formats, you can be sure that we or someone else will ship a compliant encoder for WCF.”

And, of course, it is already happening. Sun are working on a solution called FastInfoset For Indigo (FIFI). It is described as:

“FIFI exploits the extensibility of the WCF framework and implements FastInfoset on this platform. By doing so, it enables WCF clients to efficiently interoperate with Java technology-based web services providers, such as the Sun Java Application Server. It is a primary example of how Sun is enabling its customers to cross the interoperability chasm between the open Java platform and proprietary solutions.”

Sun's JAX-WS 2.0 implementation supports the Fast Infoset standards, although beware: “JAX-WS RI 2.0.1 is going through a re-architecture to efficiently support all those additional Web services related functionality, like addressing, reliable messaging and secure conversation. At this point Fast Infoset is currently not implemented in the rearchitectured implementation, but it will be soon!”

The key question remains, as one comment here describes, “whether the actual size/speed technology advantages of such a (hypothetical) encoding would outweigh the business/social advantages of having one and only one XML format. [Actually it's more complex than that because of the multiple character encodings already allowed, the de-facto subset of XML that SOAP uses, and so on]. That is pretty much where the MS people agree with Elliotte -- we are all scrambling to make XML *text* interop a reality today, and wildcards such as XML 1.1 and "binary XML" just make that all the harder.”

Aside

In my last weblog I discussed MTOM and XOP as mechanisms to attach binary documents to SOAP messages. How do they compare with Fast infoset?

Fast Infoset is a specific binary encoding of the XML information set, whereas MTOM and XOP keep an infoset that is compatible with Base64-encoded content while still allowing the binary attachments to travel as raw binary. Conceptually they are similar in their use of binary data, but Fast Infoset could result in even better performance: because the message is binary by nature, there is no need for a packaging technology such as MIME.
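For a sense of what is at stake, here is a minimal sketch of the cost of the plain-text alternative, where binary content is embedded in the XML as Base64 characters. It uses only Python's standard base64 module; the 30,000-byte random payload is an arbitrary stand-in for a real attachment.

```python
import base64
import os

# Arbitrary stand-in for a binary attachment (random bytes).
payload = os.urandom(30_000)

# Base64 turns every 3 bytes into 4 text characters.
as_text = base64.b64encode(payload)

print("raw bytes:   ", len(payload))
print("Base64 chars:", len(as_text))
print("expansion:    {:.0%}".format(len(as_text) / len(payload) - 1))
```

Base64 always produces four characters for every three input bytes, so inline binary grows by about a third before any parsing cost is counted. Both approaches discussed above avoid paying that on the wire; they differ in whether the rest of the message stays as text.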



Comments

Fast Infoset for .NET

An open source C# implementation of Fast Infoset for .NET has just been released; it is available at www.fastinfoset.net.