June 6, 2007

The truth about XML performance


The WS-* standards are all built on top of XML. When people who care about performance hear this, they go crazy: XML, they say, is not efficient.
The problem is that they do not understand exactly what XML really is.

The definition of XML is: “A standard for information representation using the hierarchical model”.

This “hierarchical model” is a tree of nodes called the “XML Infoset”.


It is true that in most cases this tree is expressed as text. This is the source of the misconception about XML. What many people do not know is that the XML Infoset can be expressed in any encoding. Text is only one example of an encoding.

Let us prove this by looking at some code.

Assume we have a Customer class that holds customer information, and an instance of it named cm1.

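//2. XML Serialization with binary encoding

            XmlSerializer xs =
                new XmlSerializer(typeof(Customer));

            // Assumption: a binary writer over an in-memory stream
            // (ms1 is an illustrative name).
            MemoryStream ms1 = new MemoryStream();
            XmlDictionaryWriter writer1 =
                XmlDictionaryWriter.CreateBinaryWriter(ms1);

            xs.Serialize(writer1, cm1);
            writer1.Flush();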

We create an XML serializer for the Customer class and use it to serialize the instance cm1. Note that NO ENCODING INFORMATION IS SUPPLIED TO THE SERIALIZER. Why?

The reason lies in the definition of XML serialization: “Creation of an XML Infoset to represent the class state”. There is not one word about encoding.

This means we can XML-serialize an object and then encode the resulting XML Infoset in binary.


This is still XML, only binary-encoded.
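One way to check that the binary bytes really are "still XML" is to read them back with an XML reader and re-encode the same nodes as text. The following is a minimal sketch of that idea, not the code from the original sample: the `Customer` class shown here is a hypothetical stand-in, and the helper names (`InfosetRoundTrip`, `ToBinaryXml`, `FromBinaryXml`, `BinaryToText`) are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the article's customer class.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class InfosetRoundTrip
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(Customer));

    // Serialize to the binary XML encoding; the serializer itself is unchanged.
    public static byte[] ToBinaryXml(Customer customer)
    {
        var stream = new MemoryStream();
        using (XmlDictionaryWriter writer =
            XmlDictionaryWriter.CreateBinaryWriter(stream))
        {
            Serializer.Serialize(writer, customer);
        }
        return stream.ToArray(); // ToArray still works after the stream is closed
    }

    // Rebuild the object from the binary bytes through an XmlReader-derived reader.
    public static Customer FromBinaryXml(byte[] data)
    {
        using (XmlDictionaryReader reader = XmlDictionaryReader.CreateBinaryReader(
            data, XmlDictionaryReaderQuotas.Max))
        {
            return (Customer)Serializer.Deserialize(reader);
        }
    }

    // Re-encode the same Infoset as text, node by node, without the serializer.
    public static string BinaryToText(byte[] data)
    {
        var text = new StringWriter();
        using (XmlDictionaryReader reader = XmlDictionaryReader.CreateBinaryReader(
            data, XmlDictionaryReaderQuotas.Max))
        using (XmlWriter writer = XmlWriter.Create(text))
        {
            writer.WriteNode(reader, false);
        }
        return text.ToString();
    }
}
```

Calling `InfosetRoundTrip.BinaryToText(InfosetRoundTrip.ToBinaryXml(customer))` should give back ordinary angle-bracket XML for the same customer, which suggests that only the encoding changed and the tree of nodes did not.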


By the way, this is exactly what WCF does in netTcpBinding: it creates XML using the XML or DataContract serializers, but encodes it in binary.


The results are interesting: the size of XML serialization with binary encoding can be smaller than that of binary serialization.
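To get a feel for such size differences, the output of each combination can simply be measured in bytes. The sketch below is an assumption-laden illustration rather than the original benchmark: the `Customer` type and the `EncodingSizes` helper are hypothetical, and the classic binary serializer (`BinaryFormatter`) is left out because it is obsolete in current .NET versions, so only the XML-based variants are compared. Actual numbers depend on the data and on the runtime.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical sample type; the attributes are only used by DataContractSerializer.
[DataContract]
public class Customer
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
    [DataMember] public string City { get; set; }
}

public static class EncodingSizes
{
    // Run a "write into an XmlDictionaryWriter" action and return the bytes produced.
    private static byte[] Capture(
        Func<Stream, XmlDictionaryWriter> createWriter,
        Action<XmlDictionaryWriter> write)
    {
        var stream = new MemoryStream();
        using (XmlDictionaryWriter writer = createWriter(stream))
        {
            write(writer);
        }
        return stream.ToArray();
    }

    // XmlSerializer, text (UTF-8) encoding.
    public static int XmlSerializerText(Customer c)
    {
        var xs = new XmlSerializer(typeof(Customer));
        return Capture(
            s => XmlDictionaryWriter.CreateTextWriter(s, Encoding.UTF8),
            w => xs.Serialize(w, c)).Length;
    }

    // XmlSerializer, binary encoding: same serializer, different writer.
    public static int XmlSerializerBinary(Customer c)
    {
        var xs = new XmlSerializer(typeof(Customer));
        return Capture(
            s => XmlDictionaryWriter.CreateBinaryWriter(s),
            w => xs.Serialize(w, c)).Length;
    }

    // DataContractSerializer, binary encoding.
    public static int DataContractBinary(Customer c)
    {
        var dcs = new DataContractSerializer(typeof(Customer));
        return Capture(
            s => XmlDictionaryWriter.CreateBinaryWriter(s),
            w => dcs.WriteObject(w, c)).Length;
    }

    // Format the three measurements for display.
    public static string Report(Customer c)
    {
        return "XmlSerializer, text:        " + XmlSerializerText(c) + " bytes\n"
             + "XmlSerializer, binary:      " + XmlSerializerBinary(c) + " bytes\n"
             + "DataContract, binary:       " + DataContractBinary(c) + " bytes";
    }
}
```

Printing `EncodingSizes.Report(new Customer { Id = 1, Name = "Ada", City = "Haifa" })` shows the three sizes side by side. The serializers are identical in the first two rows, so any difference there comes from the encoding alone.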


This means that XML can be used in applications that need performance, because XML really means a tree of nodes. We can represent this tree in any way we want. Some representations are lighter than others, but the information stays the same.

I include a code sample, “Compare XML Binary and DataContract Serialization”, that compares the three major serialization methods and shows that XML is not an “anti-performance” technology.




Manu Cohen-Yashar

