Sun developed XML Test to compare the processing performance of common XML operations in Java and .NET, and used it to demonstrate that Java outperformed .NET 1.1.
Microsoft renamed it to XML Mark, fixed some issues and then used it to demonstrate that .NET 2.0 had significantly better XML performance than Java.
We renamed it to FI Mark and used it to measure the processing performance of:
- XmlFastInfosetReader/Writer of FastInfoset.NET 2.2.0
- XmlFastInfosetDictionaryReader/Writer of WCF-Xtensions 2.0.0
- XmlTextReader/Writer of .NET 2.0
- XmlDictionaryReader/Writer with CreateTextReader/Writer of .NET 3.0
- XmlDictionaryReader/Writer with CreateBinaryReader/Writer of .NET 3.0
The source code, executables, test documents, initialization files and results of this benchmark can be downloaded from here.
All tests were run in June 2007 using an IBM R50e Type 1834-U2G as provided by the manufacturer (Intel Pentium M 735 1.70GHz 2MB, 512 MB DDR SDRAM, 60 GB ATA-100) with Windows XP Home Edition SP2, .NET Framework v2.0.50727, .NET Framework v3.0, and WCF-Xtensions 2.0.0 Enterprise edition installed. The executables were compiled using Visual Studio 2005 Professional Edition v8.0.50727.42. The test documents were converted from the text XML format to the Fast Infoset XML format using Fast Infoset Converter 2.2.0 and to the .NET Binary XML format using MSBXML Converter 1.1.0.
Note on MTOM
The MTOM encoding was designed to decrease the size of text-encoded SOAP messages that contain large binary data. We did not include it in this benchmark as it is not a general-purpose XML encoding. However, separate tests indicate that it provides the lowest processing performance of all encodings.
The following description was taken directly from the paper by Sun available here.
XML Test is an XML processing test developed at Sun Microsystems. It is designed to mimic the processing that takes place in the lifecycle of an XML document. Typically that involves the following steps:
- Parse - Scan through the XML document processing elements and attributes and possibly build an in-memory tree (for DOM parsers). Parsing is a pre-requisite for any processing of an XML document.
- Access - Extract the data from the elements and attributes of parts of the document into the application program. For example, given an XML document for an invoice, the application might want to retrieve the prices for each item in the invoice.
- Modify - Change the textual content of elements or attributes and possibly also the structure of the document by inserting or deleting elements. This does not apply to streaming parsers. As an example, an application might wish to update the prices of some of the items in an invoice or may want to insert or delete some items.
- Serialize - Convert the in-memory tree representation to a textual form that is written to a disk file or forwarded to a network stream. This makes sense only for tree building parsers and is necessary in the cases where the XML document has been modified in memory.
XML Test simulates a multi-threaded server program that processes multiple XML documents in parallel. This is very similar to an application server that deploys web services and concurrently processes a number of XML documents that arrive in client requests. Since we wanted to concentrate on XML processing performance, rather than use some sort of web container, we designed a standalone multi-threaded program implemented in both Java and C# that processed XML document files. To avoid the effect of file I/O, the documents are read from and written to memory streams.
XML Test measures the throughput of a system processing XML documents. The notion of an XML transaction here corresponds to a complete lifecycle of an XML document. For tree building parsers this requires the four steps of parse, access, modify and serialize while for streaming parsers it just involves parse and access. XML Test reports one metric: Throughput - Average number of XML transactions executed per second.
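As a rough illustration of the measurement described above, the following Python sketch runs streaming transactions (parse and access) on several threads against an in-memory document and reports transactions per second. It is not the XML Test or FI Mark code: the sample document, the `PriceHandler` class, and the `transaction` and `measure_tps` functions are hypothetical names invented for this sketch. Note also that threads in standard CPython do not execute Python code in parallel, so the sketch shows the structure of the test rather than realistic throughput.

```python
import io
import time
import xml.sax
from concurrent.futures import ThreadPoolExecutor

# Hypothetical miniature invoice; the real test documents are far larger.
SAMPLE = (
    b"<Invoice>"
    b"<LineItem><PriceAmount>10.00</PriceAmount></LineItem>"
    b"<LineItem><PriceAmount>2.50</PriceAmount></LineItem>"
    b"</Invoice>"
)


class PriceHandler(xml.sax.ContentHandler):
    """Collects the value of every PriceAmount element (the access step)."""

    def __init__(self):
        super().__init__()
        self.prices = []
        self._buffer = None  # text chunks collected while inside PriceAmount

    def startElement(self, name, attrs):
        if name == "PriceAmount":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "PriceAmount":
            self.prices.append(float("".join(self._buffer)))
            self._buffer = None


def transaction(data: bytes) -> list:
    """One streaming transaction: parse from a memory stream and access prices."""
    handler = PriceHandler()
    xml.sax.parse(io.BytesIO(data), handler)  # no file I/O involved
    return handler.prices


def measure_tps(data: bytes, threads: int = 4, transactions: int = 200) -> float:
    """Average number of transactions completed per second of wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(transaction, [data] * transactions))
    return transactions / (time.perf_counter() - start)


if __name__ == "__main__":
    print(f"{measure_tps(SAMPLE):.0f} TPS")
```

Reading from `io.BytesIO` mirrors the use of memory streams to keep file I/O out of the measurement.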
The XML documents used with XML Test are based on a business invoice document. Each invoice has a fixed length header and summary and a variable number of lineitems. The schematic of the invoice schema is given below in XML schema form.
<element name="Header" type="InvoiceHeaderType"/>
<element name="LineItem" type="InvoiceLineItemType" minOccurs="1" maxOccurs="unbounded"/>
<element name="Summary" type="InvoiceSummaryType"/>
Though this schema can be used to generate an invoice document of almost any size, we made use of two particular document sizes in our experiments. Streaming parsers are typically used to process large XML documents, and so to compare SAX and the pull parser, we used an invoice with 1000 lineitems (about 900 KB). Tree-building DOM parsers have to operate under memory constraints since they construct a representation of the whole document in memory. Therefore for the DOM parsers we used a smaller invoice document containing 100 lineitems (about 90 KB).
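To make the idea of a fixed header and summary with a variable number of lineitems concrete, here is a minimal Python sketch that generates invoices of both sizes in memory. Only the Header, LineItem and Summary element names come from the schema fragment above. The root element name, the child elements, the placement of the Currency attribute and the `build_invoice` function are assumptions made for illustration, so the generated documents will be much smaller than the 90 KB and 900 KB test documents.

```python
def build_invoice(line_items: int, currency: str = "USD") -> bytes:
    """Build an in-memory invoice with a fixed header/summary and N lineitems.

    The child elements and attribute placement are assumptions of this sketch,
    not the actual XML Test document layout.
    """
    if line_items < 1:
        raise ValueError("the schema requires at least one LineItem (minOccurs=1)")
    parts = ["<Invoice>", "<Header><InvoiceID>1</InvoiceID></Header>"]
    for line_id in range(1, line_items + 1):
        parts.append(
            f"<LineItem><LineID>{line_id}</LineID>"
            f'<PriceAmount Currency="{currency}">{line_id * 1.25:.2f}</PriceAmount>'
            f"</LineItem>"
        )
    parts.append(f"<Summary><LineItemCount>{line_items}</LineItemCount></Summary>")
    parts.append("</Invoice>")
    return "".join(parts).encode("utf-8")


small = build_invoice(100)    # size class used for the tree-building (DOM) tests
large = build_invoice(1000)   # size class used for the streaming tests
```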
Based on the above schema the lifecycle of an XML document as applicable to XML Test can be defined as follows.
- Parse - Build a complete document tree in memory (in the case of DOM). In the case of streaming parsers, scan through the XML document.
- Access - Retrieve the Currency attribute and the PriceAmount for the number of LineItems specified.
- Content Modification - Increase the PriceAmount for all lineitems by 10%. Not implemented for SAX / pull parser.
- Structure Modification - Delete a specified number of lineitems and insert the same number at the end of the set of LineItems. The new lineitems must have LineIDs in the properly increasing sequence. Not implemented for SAX / pull parser.
- Serialize - Convert the entire in-memory tree to a serialized XML form written to a stream. Not implemented for SAX / pull parser.
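The tree-based steps listed above can be sketched in Python with the standard library DOM implementation. This is an illustration under stated assumptions, not the benchmark's code: the document layout (LineID as a child element, Currency as an attribute of PriceAmount), the choice to remove the first lineitems and re-insert them as the new ones, and the helper names `text_of`, `set_text` and `run_lifecycle` are all hypothetical.

```python
import xml.dom.minidom

# Hypothetical miniature invoice; the element layout is an assumption.
SAMPLE = (
    "<Invoice><Header/>"
    + "".join(
        f"<LineItem><LineID>{i}</LineID>"
        f'<PriceAmount Currency="USD">{10 * i}.00</PriceAmount></LineItem>'
        for i in range(1, 5)
    )
    + "<Summary/></Invoice>"
)


def text_of(element) -> str:
    """Concatenate the text children of an element."""
    return "".join(n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE)


def set_text(element, value: str) -> None:
    """Replace the element's children with a single text node."""
    while element.firstChild:
        element.removeChild(element.firstChild)
    element.appendChild(element.ownerDocument.createTextNode(value))


def run_lifecycle(xml_text: str, access_count: int, move_count: int):
    # Parse: build the complete document tree in memory.
    doc = xml.dom.minidom.parseString(xml_text)
    items = doc.getElementsByTagName("LineItem")

    # Access: read Currency and PriceAmount for the requested number of items.
    accessed = []
    for item in items[:access_count]:
        price = item.getElementsByTagName("PriceAmount")[0]
        accessed.append((price.getAttribute("Currency"), float(text_of(price))))

    # Content modification: raise every PriceAmount by 10%.
    for item in items:
        price = item.getElementsByTagName("PriceAmount")[0]
        set_text(price, f"{float(text_of(price)) * 1.10:.2f}")

    # Structure modification: remove items from the front and insert the same
    # number after the last LineItem, continuing the LineID sequence.
    last_id = int(text_of(items[-1].getElementsByTagName("LineID")[0]))
    parent = items[0].parentNode
    anchor = items[-1].nextSibling  # node following the last LineItem, or None
    for offset, item in enumerate(list(items[:move_count]), start=1):
        parent.removeChild(item)
        set_text(item.getElementsByTagName("LineID")[0], str(last_id + offset))
        parent.insertBefore(item, anchor)

    # Serialize: convert the whole tree back to text.
    return accessed, doc.toxml()


accessed, serialized = run_lifecycle(SAMPLE, access_count=2, move_count=2)
```

A streaming parser would stop after the access step, which is why the last three steps are not implemented for SAX and the pull parser.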
For each reader/writer being benchmarked, nine tests were performed that measure throughput as the average number of XML transactions executed per second (TPS).
The following three tables summarize the benchmark results. The complete set of results can be found within FI Mark 1.4.
Shows the relative performance of all readers/writers, using the performance of Fast Infoset as the baseline.
Shows how much faster the Fast Infoset reader/writer is when compared to the text reader/writer and the .NET Binary dictionary reader/writer. For .NET Binary the dictionary reader/writer was used as no other is available.
Shows how much faster the Fast Infoset dictionary reader/writer is when compared to the .NET Binary dictionary reader/writer and the text dictionary reader/writer. The dictionary readers/writers are used by the Windows Communication Foundation.
- FI: XmlFastInfosetReader/Writer of FastInfoset.NET 2.2.0
- FI-D: XmlFastInfosetDictionaryReader/Writer of WCF-Xtensions 2.0.0
- Text: XmlTextReader/Writer of .NET 2.0
- Text-D: XmlDictionaryReader/Writer with CreateTextReader/Writer of .NET 3.0
- .NET Binary: XmlDictionaryReader/Writer with CreateBinaryReader/Writer of .NET 3.0
Fast Infoset returned the highest number of transactions per second in all SAX and DOM tests, with the performance increase ranging from 20% up to 605% on comparable measurements.
Averaged across the three SAX tests, Fast Infoset delivered 155% more TPS than text and 181% more than .NET Binary. Across the six DOM tests, the average increase was 31% and 55% respectively.
The Fast Infoset dictionary reader/writer was on average 157% faster than .NET Binary and 601% faster than the text dictionary reader/writer in the SAX tests, while in the DOM tests it was 51% and 145% faster respectively.
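For readers converting these percentages into ratios, the short Python sketch below shows the arithmetic. It assumes that "X% faster" means the throughput ratio minus one, expressed as a percentage. The page does not state the formula used, so treat this as an interpretation. The function names and the TPS values in the usage lines are hypothetical and are not benchmark results.

```python
def percent_faster(candidate_tps: float, baseline_tps: float) -> float:
    """Percentage by which the candidate's throughput exceeds the baseline's."""
    return (candidate_tps / baseline_tps - 1.0) * 100.0


def speedup_factor(percent: float) -> float:
    """Convert an 'X% faster' figure into a throughput ratio."""
    return 1.0 + percent / 100.0


# Hypothetical TPS values, chosen only to illustrate the calculation.
example = percent_faster(300.0, 100.0)   # 200.0, i.e. three times the throughput
low, high = speedup_factor(20), speedup_factor(605)
```

Under this reading, the reported range of 20% to 605% corresponds to throughput ratios of roughly 1.2 to 7.05.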
The results of this benchmark demonstrate that XML processing is consistently and significantly faster when using the Fast Infoset encoding provided by Noemax than when using the text XML or .NET Binary encodings provided by .NET.
You can use FI Mark to run these benchmarks on your computer.