December 2008 - Posts
OK, this is the last post on the subject. You can find the first post here and the previous one here.
In this post, I want to show you how the tips and tricks we have been discussing make our C++ code quite elegant for the three projects we haven’t reviewed yet: XMLBuilder, XSDValidator and XSLTTransformer.
As usual, my benchmark for elegance is a comparison with equivalent C# code. In the last two projects, however, the code is not identical in the two languages due to differences in the COM and System.Xml APIs for validation and transformation.
And yet, putting aside these differences, I think you will agree that the clarity of the code in both languages is comparable. This is certainly not the case for C++ code that is written against raw COM interfaces.
So, here we go.
XMLBuilder
In this project we build a DOM document programmatically and display the result as an XML string. In this case the code in C++ and C# is almost identical.
C++:

    #include "stdafx.h"

    void CloneSomeNodes (XmlDocument xmlDoc)
    {
        XmlNode oldNode = xmlDoc->getElementsByTagName("node1")->item[0];
        for (int i=0; i<5; i++)
        {
            XmlNode newNode = oldNode->cloneNode(VARIANT_TRUE);
            xmlDoc->documentElement->appendChild(
                xmlDoc->createTextNode("\n\t"));
            xmlDoc->documentElement->appendChild(
                newNode);
        }
    }

    void BuildAndPrintXml()
    {
        XmlDocument xmlDoc (MSXML2::CLSID_DOMDocument60);
        xmlDoc->preserveWhiteSpace = VARIANT_TRUE;

        // Create a processing instruction element.
        XmlProcessingInstruction pi =
            xmlDoc->createProcessingInstruction (
                "xml", "version='1.0'");
        xmlDoc->appendChild(pi);

        // Create a comment element.
        XmlComment comment = xmlDoc->createComment(
            "sample xml file created using XML DOM object.");
        xmlDoc->appendChild(comment);

        // Create the root element.
        XmlElement root = xmlDoc->createElement("root");

        // Create a "created" attribute for the <root> element, and
        // assign the "using dom" character data as the attribute value.
        XmlAttribute attr = xmlDoc->createAttribute("created");
        attr->value = "using dom";
        root->setAttributeNode (attr);
        xmlDoc->appendChild (root);

        // Next, we will create and add three nodes to the <root> element.
        _bstr_t newline ("\n");
        _bstr_t newlineTab ("\n\t");
        _bstr_t newlineTabTab ("\n\t\t");

        // Add NEWLINE+TAB for indentation before <node1>.
        root->appendChild(
            xmlDoc->createTextNode(newlineTab));

        // Create a <node1> to hold text content.
        XmlElement element = xmlDoc->createElement("node1");
        element->text = "some character data";

        // Append <node1> to <root>.
        root->appendChild (element);

        // Add NEWLINE+TAB for indentation before <node2>.
        root->appendChild(
            xmlDoc->createTextNode(newlineTab));

        // Create a <node2> to hold a CDATA section.
        element = xmlDoc->createElement("node2");
        XmlCDataSection cdata =
            xmlDoc->createCDATASection("<some mark-up text>");
        element->appendChild (cdata);

        // Append <node2> to <root>.
        root->appendChild (element);

        // Add NEWLINE+TAB for indentation before <node3>.
        root->appendChild(
            xmlDoc->createTextNode(newlineTab));

        // Create <node3> to hold a doc fragment with three sub-elements.
        element = xmlDoc->createElement("node3");

        // Create a document fragment to hold three sub-elements.
        XmlDocumentFragment fragment =
            xmlDoc->createDocumentFragment();

        // Add NEWLINE+TAB+TAB for indentation before <subnode1>.
        fragment->appendChild(
            xmlDoc->createTextNode(newlineTabTab));

        // Create and append <subnode1>.
        fragment->appendChild (
            xmlDoc->createElement("subnode1"));

        // Add NEWLINE+TAB+TAB for indentation before <subnode2>.
        fragment->appendChild(
            xmlDoc->createTextNode(newlineTabTab));

        // Create and append <subnode2>.
        fragment->appendChild (
            xmlDoc->createElement("subnode2"));

        // Add NEWLINE+TAB+TAB for indentation before <subnode3>.
        fragment->appendChild(
            xmlDoc->createTextNode(newlineTabTab));

        // Create and append <subnode3>.
        fragment->appendChild (
            xmlDoc->createElement("subnode3"));

        // Add NEWLINE+TAB after </subnode> in fragment.
        fragment->appendChild(
            xmlDoc->createTextNode(newlineTab));

        // Append fragment to <node3> (element).
        element->appendChild (fragment);

        // Append <node3> to <root>.
        root->appendChild (element);

        // Add NEWLINE for indentation before </root>.
        root->appendChild(
            xmlDoc->createTextNode(newline));

        printf(
            "Dynamically created DOM:\n%s\n",
            (const char*) xmlDoc->xml);

        CloneSomeNodes (xmlDoc);

        printf(
            "After cloning some nodes:\n%s\n",
            (const char*) xmlDoc->xml);
    }

    void main (int argc, char* argv[])
    {
        ComInit com;
        try
        {
            BuildAndPrintXml();
        }
        catch (Error e)
        {
            printf (e);
        }
        catch (_com_error e)
        {
            printf (e.ErrorMessage());
        }
        printf ("\nDone\n");
        _getch ();
    }

C#:

    using System;
    using System.Xml;

    namespace XmlBuilderCS
    {
        class Program
        {
            void CloneSomeNodes(XmlDocument xmlDoc)
            {
                XmlNode oldNode = xmlDoc.GetElementsByTagName("node1").Item(0);
                for (int i = 0; i < 5; i++)
                {
                    XmlNode newNode = oldNode.CloneNode(true);
                    xmlDoc.DocumentElement.AppendChild(
                        xmlDoc.CreateTextNode("\n\t"));
                    xmlDoc.DocumentElement.AppendChild(
                        newNode);
                }
            }

            void BuildAndPrintXml()
            {
                XmlDocument xmlDoc = new XmlDocument();
                xmlDoc.PreserveWhitespace = true;

                // Create a processing instruction element.
                XmlProcessingInstruction pi =
                    xmlDoc.CreateProcessingInstruction(
                        "xml", "version='1.0'");
                xmlDoc.AppendChild(pi);

                // Create a comment element.
                XmlComment comment = xmlDoc.CreateComment(
                    "sample xml file created using XML DOM object.");
                xmlDoc.AppendChild(comment);

                // Create the root element.
                XmlElement root = xmlDoc.CreateElement("root");

                // Create a "created" attribute for the <root> element, and
                // assign the "using dom" character data as the attribute value.
                XmlAttribute attr = xmlDoc.CreateAttribute("created");
                attr.Value = "using dom";
                root.SetAttributeNode(attr);
                xmlDoc.AppendChild(root);

                // Next, we will create and add three nodes to the <root> element.
                string newline = "\n";
                string newlineTab = "\n\t";
                string newlineTabTab = "\n\t\t";

                // Add NEWLINE+TAB for indentation before <node1>.
                root.AppendChild(
                    xmlDoc.CreateTextNode(newlineTab));

                // Create a <node1> to hold text content.
                XmlElement element = xmlDoc.CreateElement("node1");
                element.InnerText = "some character data";

                // Append <node1> to <root>.
                root.AppendChild(element);

                // Add NEWLINE+TAB for indentation before <node2>.
                root.AppendChild(
                    xmlDoc.CreateTextNode(newlineTab));

                // Create a <node2> to hold a CDATA section.
                element = xmlDoc.CreateElement("node2");
                XmlCDataSection cdata =
                    xmlDoc.CreateCDataSection("<some mark-up text>");
                element.AppendChild(cdata);

                // Append <node2> to <root>.
                root.AppendChild(element);

                // Add NEWLINE+TAB for indentation before <node3>.
                root.AppendChild(
                    xmlDoc.CreateTextNode(newlineTab));

                // Create <node3> to hold a doc fragment with three sub-elements.
                element = xmlDoc.CreateElement("node3");

                // Create a document fragment to hold three sub-elements.
                XmlDocumentFragment fragment =
                    xmlDoc.CreateDocumentFragment();

                // Add NEWLINE+TAB+TAB for indentation before <subnode1>.
                fragment.AppendChild(
                    xmlDoc.CreateTextNode(newlineTabTab));

                // Create and append <subnode1>.
                fragment.AppendChild(
                    xmlDoc.CreateElement("subnode1"));

                // Add NEWLINE+TAB+TAB for indentation before <subnode2>.
                fragment.AppendChild(
                    xmlDoc.CreateTextNode(newlineTabTab));

                // Create and append <subnode2>.
                fragment.AppendChild(
                    xmlDoc.CreateElement("subnode2"));

                // Add NEWLINE+TAB+TAB for indentation before <subnode3>.
                fragment.AppendChild(
                    xmlDoc.CreateTextNode(newlineTabTab));

                // Create and append <subnode3>.
                fragment.AppendChild(
                    xmlDoc.CreateElement("subnode3"));

                // Add NEWLINE+TAB after </subnode> in fragment.
                fragment.AppendChild(
                    xmlDoc.CreateTextNode(newlineTab));

                // Append fragment to <node3> (element).
                element.AppendChild(fragment);

                // Append <node3> to <root>.
                root.AppendChild(element);

                // Add NEWLINE for indentation before </root>.
                root.AppendChild(xmlDoc.CreateTextNode(newline));

                Console.WriteLine(
                    "Dynamically created DOM:\n{0}\n", xmlDoc.OuterXml);

                CloneSomeNodes(xmlDoc);

                Console.WriteLine(
                    "After cloning some nodes:\n{0}\n", xmlDoc.OuterXml);
            }

            static void Main(string[] args)
            {
                try
                {
                    new Program().BuildAndPrintXml();
                }
                catch (Exception e)
                {
                    Console.WriteLine(e.Message);
                }
                Console.WriteLine("\nDone\n");
                Console.ReadLine();
            }
        }
    }
XSDValidator
This project demonstrates how to validate an XML file against a schema in three scenarios:
- The schema is inline.
- The schema is stored in a separate file referenced by the XML file.
- The schema is cached in memory and applied to an XML file.
The C++ example is based on this MSDN article. As you can see, the C++ code and C# code are not identical here. MSXML works with an XmlSchemaCollection, whereas System.Xml.Schema works with an XmlSchemaSet. Actually, an XmlSchemaCollection class is defined in System.Xml.Schema too, but its use for XmlDocument validation has been deprecated (see here).
C++:

    #include "stdafx.h"

    void Validate (char* xmlFileName, char* xsdFileName, char* namespaceURI)
    {
        XmlDocument xmlDoc (CLSID_DOMDocument60);
        if (xsdFileName != NULL)
        {
            XmlSchemaCollection schemas (CLSID_XMLSchemaCache60);
            schemas->add(
                namespaceURI, xsdFileName);
            xmlDoc->schemas = schemas.GetInterfacePtr();
        }
        else
        {
            xmlDoc->resolveExternals = VARIANT_TRUE;
            xmlDoc->setProperty("UseInlineSchema", VARIANT_TRUE);
        }
        xmlDoc->async = VARIANT_FALSE;
        xmlDoc->validateOnParse = VARIANT_TRUE;
        VARIANT_BOOL ok = xmlDoc->load(xmlFileName);
        //xmlDoc->validate();
        XmlParseError parseError = xmlDoc->parseError;
        if (parseError->errorCode != S_OK)
        {
            printf ("Validation failed validating %s\nReason: %s\nLine %d\nPosition %d\n",
                xmlFileName,
                (const char*) parseError->Getreason(),
                (int) parseError->Getline(),
                (int) parseError->Getlinepos());
        }
        else
        {
            printf ("Validation succeeded for %s\n", xmlFileName);
        }
    }

    void CheckedValidate (char* xmlFileName, char* xsdFileName, char* namespaceURI)
    {
        try
        {
            Validate (
                xmlFileName, xsdFileName, namespaceURI);
        }
        catch (Error e)
        {
            printf (e);
        }
        catch (_com_error e)
        {
            printf ("%s\n", e.ErrorMessage());
        }
    }

    void main(int argc, char* argv[])
    {
        ComInit com;
        CheckedValidate ("inline-valid.xml", NULL, NULL);
        CheckedValidate ("inline-invalid.xml", NULL, NULL);
        CheckedValidate ("external-valid.xml", NULL, NULL);
        CheckedValidate ("external-invalid.xml", NULL, NULL);
        CheckedValidate ("noschema-valid.xml", "sc.xsd", "urn:books");
        CheckedValidate ("noschema-invalid.xml", "sc.xsd", "urn:books");
        printf ("\nDone\n");
        _getch ();
    }

C#:

    using System;
    using System.Xml;
    using System.Xml.Schema;

    namespace XSDValidatorCS
    {
        class Program
        {
            void Validate(string xmlFileName, string xsdFileName, string namespaceURI)
            {
                XmlDocument document = new XmlDocument();
                XmlReaderSettings settings = new XmlReaderSettings();
                if (xsdFileName != null)
                {
                    settings.Schemas.Add(namespaceURI, xsdFileName);
                }
                else
                {
                    settings.ValidationFlags =
                        XmlSchemaValidationFlags.ProcessSchemaLocation | // resolveExternals
                        XmlSchemaValidationFlags.ProcessInlineSchema;    // UseInlineSchema
                }
                settings.ValidationType = ValidationType.Schema; // validateOnParse
                settings.ValidationEventHandler += settings_ValidationEventHandler;
                XmlReader reader = XmlReader.Create(xmlFileName, settings);
                ok = true;
                document.Load(reader);
                if (!ok)
                {
                    Console.WriteLine("Validation failed validating {0}\nReason: {1}\nLine {2}\nPosition {3}\n",
                        xmlFileName, reason, lineNumber, position);
                }
                else
                {
                    Console.WriteLine("Validation succeeded for {0}\n", xmlFileName);
                }
            }

            bool ok;
            string reason;
            int lineNumber;
            int position;

            void settings_ValidationEventHandler(object sender, ValidationEventArgs e)
            {
                ok = false;
                reason = e.Exception.Message;
                lineNumber = e.Exception.LineNumber;
                position = e.Exception.LinePosition;
            }

            void CheckedValidate(string xmlFileName, string xsdFileName, string namespaceURI)
            {
                try
                {
                    Validate(xmlFileName, xsdFileName, namespaceURI);
                }
                catch (Exception e)
                {
                    Console.WriteLine(e.Message);
                }
            }

            static void Main(string[] args)
            {
                Program program = new Program();
                System.IO.Directory.SetCurrentDirectory(@"..\..\Samples\");
                program.CheckedValidate("inline-valid.xml", null, null);
                program.CheckedValidate("inline-invalid.xml", null, null);
                program.CheckedValidate("external-valid.xml", null, null);
                program.CheckedValidate("external-invalid.xml", null, null);
                program.CheckedValidate("noschema-valid.xml", "sc.xsd", "urn:books");
                program.CheckedValidate("noschema-notvalid.xml", "sc.xsd", "urn:books");
                Console.WriteLine("\nDone\n");
                Console.ReadLine();
            }
        }
    }
XSLTTransformer
This project demonstrates how to apply a transform stored in a file to XML stored in another file and display the result. Again, the APIs available in System.Xml and System.Xml.Xsl are a little different to those in the MSXML2 namespace of MSXML. But I tried to close that gap a little with a helper class called StringWriter (you can probably guess why).
C++:

    #include "stdafx.h"

    void Transform (char* xmlFileName, char* xslFileName)
    {
        XmlDocument xmlDoc (MSXML2::CLSID_DOMDocument);
        XmlDocument xslDoc (CLSID_FreeThreadedDOMDocument);
        if (xmlDoc == NULL || xslDoc == NULL)
            throw Error ("Failed creating XSL and XML document objects");

        VARIANT_BOOL ok;
        ok = xmlDoc->load (xmlFileName);
        if (! ok)
            throw Error ("Could not open file %s", xmlFileName);
        ok = xslDoc->load (xslFileName);
        if (! ok)
            throw Error ("Could not open file %s", xslFileName);

        XslTemplate xslTemplate (CLSID_XSLTemplate);
        xslTemplate->stylesheet = xslDoc;
        XslProcessor xslProcessor = xslTemplate->createProcessor();

        Stream stream;
        xslProcessor->output = stream;
        xslProcessor->input = (IUnknown*)xmlDoc;
        HRESULT hr = xslProcessor->addParameter("maxprice", "35", "");
        if (FAILED(hr))
            throw Error (hr);

        // transform() reports success through its VARIANT_BOOL result,
        // so test that rather than the HRESULT left over from addParameter.
        ok = xslProcessor->transform ();
        if (! ok)
            throw Error ("Transformation failed");

        // get results of transformation and print it to stdout
        size_t size = (size_t) stream.Size();
        char* str = new char[size + 1];
        if (! str)
            throw Error ("Failed to allocate buffer of size %d", size+1);
        stream.ReadFromOrigin(str, size);
        str[size] = 0;
        printf("%s", str);
        delete [] str;
    }

    void main(int argc, char* argv[])
    {
        ComInit com;
        try
        {
            Transform ("books.xml", "trans.xsl");
        }
        catch (Error e)
        {
            printf (e);
        }
        catch (_com_error e)
        {
            printf (e.ErrorMessage());
        }
        printf ("\nDone\n");
        _getch ();
    }

C#:

    using System;
    using System.IO;
    using System.Xml;
    using System.Xml.Xsl;

    namespace XSLTTransformerCS
    {
        class Program
        {
            void Transform(string xmlFileName, string xslFileName)
            {
                XmlDocument xmlDoc = new XmlDocument();
                XmlDocument xslDoc = new XmlDocument();
                if (xmlDoc == null || xslDoc == null)
                    throw new Exception("Failed creating XSL and XML document objects");

                xmlDoc.Load(xmlFileName);
                xslDoc.Load(xslFileName);

                XslCompiledTransform transform = new XslCompiledTransform();
                transform.OutputSettings.CloseOutput = true;
                transform.Load(xslDoc);

                StringWriter writer = new StringWriter();
                XsltArgumentList prms = new XsltArgumentList();
                prms.AddParam("maxprice", "", "35");
                transform.Transform(xmlFileName, prms, writer);

                // get results of transformation and print it to stdout
                Console.WriteLine(writer.ToString());
            }

            static void Main(string[] args)
            {
                try
                {
                    Directory.SetCurrentDirectory(@"..\..\Samples");
                    new Program().Transform("books.xml", "trans.xsl");
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
                Console.WriteLine("Done");
                Console.ReadLine();
            }
        }
    }
In the C# code, a StringWriter can be passed in as an argument to the Transform method of the XslCompiledTransform object. Calling ToString() on the StringWriter will return the output of the transform as text.
In the C++ code, the XslProcessor COM interface expects its output property to be set to an IStream interface. It writes the transformed information to that interface when its transform method is called. IStream is similar to a Stream object in the System.IO namespace of .NET and works with unstructured binary data. Extracting the information from the object behind the IStream interface is a little tricky, so I wrote the StringWriter class (in Utils.h) to do that work.
Here is the implementation of the StringWriter class:
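class StringWriter
{
    IStream *stream;

    // Not copyable: a copy would release the same stream twice.
    StringWriter (const StringWriter&);
    StringWriter& operator= (const StringWriter&);

public:
    StringWriter () : stream (NULL)
    {
        // Create a growable in-memory stream; TRUE frees the memory on final release.
        HRESULT hr = CreateStreamOnHGlobal (NULL, TRUE, &stream);
        if (FAILED(hr))
            throw Error (hr);
    }

    ~StringWriter ()
    {
        // Release the reference obtained from CreateStreamOnHGlobal.
        if (stream)
            stream->Release();
    }

    // The returned _variant_t adds its own reference to the stream.
    operator _variant_t() { return _variant_t(stream); }

    ULONG Write (char* str, size_t count)
    {
        ULONG written;
        HRESULT hr = stream->Write(str, (ULONG) count, &written);
        if (FAILED(hr))
            throw Error (hr);
        return written;
    }

    // Current position in the stream; right after writing, this is the number of bytes written.
    LONGLONG Size ()
    {
        LARGE_INTEGER zero; zero.QuadPart = 0;
        ULARGE_INTEGER position;
        stream->Seek (zero, STREAM_SEEK_CUR, &position);
        return position.QuadPart;
    }

    // Copies up to toRead bytes from the start of the stream into str.
    ULONG ToString (char* str, ULONG toRead)
    {
        LONGLONG maxRead = Size();
        if (toRead > maxRead)
            toRead = (ULONG) maxRead;
        LARGE_INTEGER zero; zero.QuadPart = 0;
        stream->Seek (zero, STREAM_SEEK_SET, NULL);
        ULONG read;
        stream->Read (str, toRead, &read);
        return read;
    }
};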
Summary
Programming raw COM interfaces with C++ presents many challenges, the most significant being object lifetime management, error handling and type conversions.
COM programmers often overcome these challenges using macros, ‘goto’ statements, meticulous checking of HRESULT return values and peer reviews (lots). Unfortunately, the resulting code is often difficult to understand and difficult to maintain. Moreover, it remains error prone, because C++ used in a traditional way cannot check for some of the fatal coding errors that may occur.
In this article we have seen how it’s possible to harness some of the powerful features of C++ to address these challenges in a more readable, maintainable and less error-prone way.
The key elements of this approach are:
- RAII (Resource Acquisition Is Initialization) to pair every CoInitialize with a CoUninitialize
- Smart pointers to manage COM interface lifetimes safely and reliably
- Casting operator overloads on wrapper classes such as _variant_t and _bstr_t
- Structured exception management using _com_error to eliminate excessive, unreliable error handling code.
- C++ Properties using a Microsoft extension to the language.
Any set of raw COM interfaces can be wrapped with a thin layer to implement these elements. MFC and ATL provide such layers, but if you don’t want to burden your project with either of those, you could provide your own.
For MSXML, Microsoft has provided such a layer for us. ATL and MFC are not needed.
In the article, we have seen how, equipped with this layer, basic usage scenarios of the COM MSXML library can become as clean and elegant as equivalent implementations in C#.
Part 1 is the first post in this series.
In Part 4 I described the first of the 5 project pairs provided with this article, DOMAndXPath.
In this post I will review the second project, SAXReader. The SAX programming model is very different from the DOM model: SAX models the parser, whereas DOM models the XML document. SAX provides a forward-only push model, whereas DOM provides random access to nodes in the document. The SAXReader project is based on the MSDN example you can find here. It defines a content handler, attaches it to a SAX reader and then directs the SAX reader to load an XML file. Appropriate content handler methods are called when the SAX reader encounters significant syntactical elements in the file.
The .NET Framework does not come equipped with a push-mode SAX reader. XmlReader, the nearest equivalent, is a pull-mode reader.
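To make the push/pull distinction concrete, here is a small portable C++ sketch. It does not use MSXML or System.Xml: the Events type, PushParse and PullReader are hypothetical names made up for illustration, and the "document" is just a pre-tokenized list of event names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed XML document: a flat list of event names.
typedef std::vector<std::string> Events;

// Push model (SAX style): the parser owns the loop and calls the handler.
inline void PushParse(const Events& doc,
                      const std::function<void(const std::string&)>& handler)
{
    for (std::size_t i = 0; i < doc.size(); ++i)
        handler(doc[i]);
}

// Pull model (XmlReader style): the caller owns the loop and asks for the next event.
class PullReader
{
    const Events& doc_;
    std::size_t next_;
public:
    explicit PullReader(const Events& doc) : doc_(doc), next_(0) {}

    // Returns false when there is nothing left to read.
    bool Read(std::string& out)
    {
        if (next_ >= doc_.size())
            return false;
        out = doc_[next_++];
        return true;
    }
};

// Collect the events seen through each model so they can be compared.
inline Events CollectPushed(const Events& doc)
{
    Events seen;
    PushParse(doc, [&seen](const std::string& e) { seen.push_back(e); });
    return seen;
}

inline Events CollectPulled(const Events& doc)
{
    Events seen;
    PullReader reader(doc);
    std::string e;
    while (reader.Read(e))
        seen.push_back(e);
    return seen;
}

// Usage: both models deliver the same events; only the control flow differs.
inline void DemoPushVersusPull()
{
    Events doc;
    doc.push_back("startElement:book");
    doc.push_back("characters:Title");
    doc.push_back("endElement:book");
    assert(CollectPushed(doc) == doc);
    assert(CollectPulled(doc) == doc);
}
```

In the push version the handler is the only place your code runs, which is why the SAXReader project revolves around a content handler class.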
So, what C# code can we compare our C++ code with for elegance?
In this project I used COM interop to call the SAX COM interfaces from managed code. I used the Add Reference wizard to add the MSXML6 dll from the COM tab. This provides a proxy for all the COM visible types in the MSXML2 namespace (recall, we are using the MSXML2 interface in C++ too).
I will admit that I didn't clean up the implementation of the content handler, but let's compare the main methods in C++ and in C#, which are quite similar. Here they are:
C++:

    #include "stdafx.h"
    #include "MyContentHandler.h"
    #include "SAXErrorHandlerImpl.h"

    inline unsigned short* ToUnsignedShort (_bstr_t bstr)
    {
        return (unsigned short*) (wchar_t*) bstr;
    }

    void ReadXmlFile (char* xmlFileName)
    {
        ISAXXMLReaderPtr saxReader (CLSID_SAXXMLReader);
        if (saxReader == NULL)
        {
            throw Error ("Failed to create ISAXXMLReader");
        }

        MyContentHandler *contentHandler = new MyContentHandler();
        HRESULT hr = saxReader->putContentHandler(contentHandler);
        if (FAILED(hr))
        {
            throw Error ("Failed to set content handler");
        }

        // An illustration how to set other handlers
        // SAXErrorHandlerImpl * eh = new SAXErrorHandlerImpl();
        // hr = saxReader->putErrorHandler(eh);
        // SAXDTDHandlerImpl * dh = new SAXDTDHandlerImpl();
        // hr = saxReader->putDTDHandler(dh);

        hr = saxReader->parseURL (ToUnsignedShort(xmlFileName));
        saxReader->putContentHandler(NULL);
        delete contentHandler;
        if (FAILED(hr))
        {
            throw Error ("Failed to parse file with SAX Reader");
        }
    }

    void main(int argc, char* argv[])
    {
        ComInit comInit;
        try
        {
            ReadXmlFile ("Sample.xml");
        }
        catch (Error e)
        {
            printf (e);
        }
        catch (_com_error e)
        {
            printf (e.ErrorMessage());
        }
        printf ("\nDone\n");
        _getch ();
    }

C#:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace SAXReaderCS
    {
        class Program
        {
            void ReadXmlFile(string xmlFileName)
            {
                MSXML2.SAXXMLReader saxReader = new MSXML2.SAXXMLReader();
                if (saxReader == null)
                {
                    throw new Exception("Failed to create ISAXXMLReader");
                }

                saxReader.contentHandler = new MyContentHandler();

                // An illustration how to set other handlers
                // SAXErrorHandlerImpl * eh = new SAXErrorHandlerImpl();
                // hr = saxReader.putErrorHandler(eh);
                // SAXDTDHandlerImpl * dh = new SAXDTDHandlerImpl();
                // hr = saxReader.putDTDHandler(dh);

                saxReader.parseURL(xmlFileName);
            }

            static void Main(string[] args)
            {
                try
                {
                    new Program().ReadXmlFile(
                        @"..\..\Samples\Sample.xml");
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex);
                }
                Console.WriteLine("Done");
                Console.ReadLine();
            }
        }
    }
As in the DOMAndXPath project (see Part 4), the following key points make our C++ code look like the C# code.
- ComInit is used for handling the calls to CoInitialize and CoUninitialize
- Exceptions are handled using the _com_error and Error classes.
- Smart pointers are used to hold COM interfaces (for instance ISAXXMLReaderPtr).
- The _bstr_t class is used to convert a char* string to an unsigned short* string.
In Part 3 I mentioned that some of the tricks we are using are provided by MSXML in a set of interfaces that wrap the raw interfaces. I also mentioned that in some cases we might actually want to import only the raw interfaces. The SAXReader project is one of those cases.
You see, in this project we are not only using COM interfaces implemented by MSXML; we are also implementing our own. For instance the MyContentHandler class implements the ISAXContentHandler interface.
In this case, we do not want to burden ourselves with the implementation of two sets of additional interfaces, so we opt out by defining RAW_INTERFACES_ONLY in the stdafx.h precompiled header. This makes sure the raw_interfaces_only keyword is added to the #import statement in ImportMSXML.h.
The interface we are now implementing is therefore only the raw interface. See MyContentHandler.h. There are no smart types or smart pointers there. However, if we were to import the non-raw interfaces too, we would have to implement both these methods (which would then have the prefix raw_) and the smart set.
At this point, we have reviewed all the key points in the 5 project pairs provided with this article. Hopefully I have convinced you that C++ and MSXML provide us with the tools to write C++ code that is as elegant as C#.
In the next and final post in the article, I will simply be showing the fruit of our work – a comparison of the C++ and the C# code for the remaining 3 projects.
Part 1 is the first post in this series.
In Part 3 I described 3 simple steps that will help simplify your MSXML enabled C++ project. With those in place we are ready to examine each of the 5 project pairs provided with this article in more detail. Each project demonstrates how to implement a set of basic XML functions using MSXML in C++.
In this post I will review the first of the five - DOMAndXPath. This project loads an XML file into a DOM, recursively traverses its nodes and displays them. It then displays a subset of the document’s nodes using an XPath expression.
Here is the C++ code together with the C# (also shown in Part 2 of this article):

C++:

    #include "stdafx.h"

    inline void IndentedPrint (int indent, char* format, ...)
    {
        char m_Message[512];
        va_list args;
        va_start(args, format);
        vsprintf_s(m_Message, format, args);
        va_end(args);
        printf ("%*s%s", indent, "\t", m_Message);
    }

    void DisplayAttribute (XmlNode node, int depth)
    {
        IndentedPrint (depth, "type: %s name: %s value: %s \n",
            NodeTypeString (node->nodeType),
            (const char*) node->nodeName,
            (const char*) (_bstr_t) node->nodeValue);
    }

    void DisplayNode (XmlNode node, int depth = 0);

    void DisplayNodes (XmlNodeList nodes, int depth)
    {
        for (int i=0; i<nodes->length; i++)
        {
            DisplayNode (nodes->item[i], depth);
        }
    }

    void DisplayNode (XmlNode node, int depth)
    {
        IndentedPrint (depth, "type: %s name: %s",
            NodeTypeString (node->nodeType),
            (const char*) node->nodeName);
        if (node->nodeType == MSXML2::NODE_TEXT)
        {
            IndentedPrint (0, "text: %s\n",
                (const char*) (_bstr_t) node->nodeValue);
        }
        else
        {
            printf ("\n");
            XmlNamedNodeMap attrMap = node->attributes;
            if (attrMap)
            {
                for (int i=0; i<attrMap->length; i++)
                {
                    DisplayAttribute (
                        attrMap->item[i], depth);
                }
                IndentedPrint (depth, "\n");
            }
            XmlNodeList childNodes = node->childNodes;
            if (childNodes)
                DisplayNodes (childNodes, depth + 5);
        }
    }

    void DisplayXPath (char* xmlFileName, char* xpath)
    {
        XmlDocument xmlDoc (CLSID_DOMDocument60);
        VARIANT_BOOL ok = xmlDoc->load(xmlFileName);
        if (! ok)
            throw Error ("Failed to load %s", xmlFileName);

        printf ("Entire document\n");
        DisplayNode (xmlDoc->documentElement, 5);

        printf ("Nodes in %s\n", xpath);
        XmlNodeList nodeList = xmlDoc->selectNodes(xpath);
        DisplayNodes (nodeList, 5);
    }

    void main(int argc, char* argv[])
    {
        ComInit com;
        try
        {
            DisplayXPath ("books.xml", "//@*");
        }
        catch (Error e)
        {
            printf (e);
        }
        catch (_com_error e)
        {
            printf (e.ErrorMessage());
        }
        printf ("\nDone\n");
        _getch ();
    }

C#:

    using System;
    using System.Xml;

    class Program
    {
        void IndentedPrint (int indent, string format, params object[] array)
        {
            string indentedString = "".PadLeft(indent);
            Console.WriteLine(indentedString + format, array);
        }

        void DisplayAttribute (XmlNode node, int depth)
        {
            IndentedPrint (depth, "type: {0} name: {1} value: {2} \n",
                node.NodeType.ToString(),
                node.Name,
                node.Value);
        }

        void DisplayNodes (XmlNodeList nodes, int depth)
        {
            for (int i=0; i<nodes.Count; i++)
            {
                DisplayNode (nodes[i], depth);
            }
        }

        void DisplayNode (XmlNode node, int depth)
        {
            IndentedPrint (depth, "type: {0} name: {1}",
                node.NodeType.ToString(),
                node.Name);
            if (node.NodeType == XmlNodeType.Text)
            {
                IndentedPrint (0, "text: {0}\n",
                    node.Value);
            }
            else
            {
                Console.WriteLine();
                XmlNamedNodeMap attrMap = node.Attributes;
                if (attrMap != null)
                {
                    for (int i=0; i<attrMap.Count; i++)
                    {
                        DisplayAttribute (
                            attrMap.Item(i), depth);
                    }
                    IndentedPrint (depth, "\n");
                }
                XmlNodeList childNodes = node.ChildNodes;
                if (childNodes != null)
                    DisplayNodes (childNodes, depth + 5);
            }
        }

        void DisplayXPath (string xmlFileName, string xpath)
        {
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.Load(xmlFileName);

            Console.WriteLine("Entire document\n");
            DisplayNode (xmlDoc.DocumentElement, 5);

            Console.WriteLine("Nodes in {0}\n", xpath);
            XmlNodeList nodeList = xmlDoc.SelectNodes(xpath);
            DisplayNodes (nodeList, 5);
        }

        static void Main(string[] args)
        {
            try
            {
                string xmlFileName = @"..\..\Samples\books.xml";
                new Program().DisplayXPath(xmlFileName, "//@*");
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
            Console.WriteLine("\nDone\n");
            Console.ReadLine();
        }
    }
Here are the key points in the code.
- Resource Acquisition Is Initialization
First, take a look at the main method at the bottom of the C++ listing. I am using the ComInit helper class to initialize COM and to uninitialize it when main exits.
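The ComInit class itself ships with the sample projects and is not listed in this post. As a rough idea of what such a guard does, here is a portable sketch. FakeCoInitialize, FakeCoUninitialize and ComInitSketch are hypothetical stand-ins (a real guard would call CoInitialize and CoUninitialize), and the counter exists only so the behavior can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for CoInitialize/CoUninitialize so the sketch runs anywhere.
static int g_comInitDepth = 0;
inline void FakeCoInitialize()   { ++g_comInitDepth; }
inline void FakeCoUninitialize() { --g_comInitDepth; }

// RAII guard: initialize in the constructor, uninitialize in the destructor.
class ComInitSketch
{
public:
    ComInitSketch()  { FakeCoInitialize(); }
    ~ComInitSketch() { FakeCoUninitialize(); }
private:
    // Not copyable: a copy would uninitialize twice.
    ComInitSketch(const ComInitSketch&);
    ComInitSketch& operator=(const ComInitSketch&);
};

// Returns the init depth observed while the guard was alive; optionally throws.
inline int RunGuarded(bool fail)
{
    ComInitSketch com;
    int depthInside = g_comInitDepth;
    if (fail)
        throw std::runtime_error("simulated failure");
    return depthInside;
}

inline void DemoComInitSketch()
{
    assert(RunGuarded(false) == 1);   // initialized while the guard is in scope
    assert(g_comInitDepth == 0);      // uninitialized on normal exit
    try { RunGuarded(true); } catch (const std::runtime_error&) {}
    assert(g_comInitDepth == 0);      // uninitialized during exception unwinding too
}
```

The point of the pattern is the last assertion: no cleanup code appears at the throw site, yet the uninitialize call still happens.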
- Exception handling
There are two catch clauses, one that catches _com_error exceptions issued by a smart pointer and the other to catch Error exceptions. Error is a helper exception class that I use to describe application level errors.
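The idea of trading return-code checks for exceptions can be sketched without any COM headers. Everything below is a hypothetical stand-in: HResultSketch plays the role of HRESULT, ComErrorSketch the role of _com_error, and RawLoad imitates a raw interface method that reports failure through its return value.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Portable stand-ins: a real project would use HRESULT, FAILED() and _com_error.
typedef long HResultSketch;
const HResultSketch kOk   = 0;
const HResultSketch kFail = -1;   // negative values mean failure, as with HRESULT

class ComErrorSketch : public std::runtime_error
{
    HResultSketch hr_;
public:
    explicit ComErrorSketch(HResultSketch hr)
        : std::runtime_error("COM-style call failed"), hr_(hr) {}
    HResultSketch Code() const { return hr_; }
};

// Turns a failing status code into an exception, so callers need no if/goto chains.
inline void CheckHr(HResultSketch hr)
{
    if (hr < 0)
        throw ComErrorSketch(hr);
}

// Hypothetical raw-style call that reports errors through its return value.
inline HResultSketch RawLoad(const std::string& fileName)
{
    return fileName.empty() ? kFail : kOk;
}

// Wrapper in the style of the smart interfaces: throws instead of returning a code.
inline void Load(const std::string& fileName)
{
    CheckHr(RawLoad(fileName));
}

// One catch site handles every failure below it; returns the code that was caught.
inline HResultSketch TryLoad(const std::string& fileName)
{
    try
    {
        Load(fileName);
        return kOk;
    }
    catch (const ComErrorSketch& e)
    {
        return e.Code();
    }
}

inline void DemoCheckHr()
{
    assert(TryLoad("books.xml") == kOk);
    assert(TryLoad("") == kFail);
}
```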
- Get an Interface with a Smart Pointer
Now, scroll up to review the DisplayXPath method.
The first statement here is: XmlDocument xmlDoc (CLSID_DOMDocument60);
This creates an instance of the smart pointer XmlDocument. Within the constructor there are calls to QueryInterface for the interface behind IXMLDOMDocument2Ptr and to AddRef, and a _com_error exception is thrown if any of these fail. Notice how, using smart pointers, this looks like a simple declaration and initialization statement.
- Calling Methods Through a Smart Pointer
Note also how loading the document is done by calling the load method of the IXMLDOMDocument2Ptr through the overloaded -> operator of the xmlDoc smart pointer.
- Properties in C++
After loading the document from the xmlFileName file, this method recursively calls DisplayNode for the root node and all its children. As defined in the DOM, the root node is held in the documentElement property of xmlDoc.
documentElement looks like a field, eh? Well, it isn't. It's a property!
Select the documentElement property and browse to its definition by hitting F12. You will find the following declaration in the automatically generated msxml6.tlh header file:
__declspec(property(get=GetdocumentElement,put=PutRefdocumentElement))
IXMLDOMElementPtr documentElement;
What you can see here is use of the Microsoft C++ property extension that supports property-like syntax similar to that of C#. It enables you to use the property in your code as if it were a field while the compiler implements calls to the getter and setter methods specified in the declaration.
This is particularly elegant because if an error occurs in either the setter or the getter method, a _com_error is thrown and caught in our main. Use of such property syntax is sprinkled all over the code of this and the other four C++ projects. Note also, that the pointer that is returned from this method is itself a smart pointer.
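Here is a minimal sketch of the same mechanism outside MSXML. ElementSketch and RenameAndRead are hypothetical names invented for this example. Because __declspec(property) is a Microsoft extension, the property lines are guarded with _MSC_VER; on other compilers the sketch falls back to the explicit getter and setter calls that the property syntax stands for.

```cpp
#include <cassert>
#include <string>

// Hypothetical class with a getter/setter pair, named in the style of the generated headers.
class ElementSketch
{
    std::string name_;
public:
    std::string GetName() const { return name_; }
    void PutName(const std::string& value) { name_ = value; }

#if defined(_MSC_VER)
    // Microsoft extension: 'name' is not a data member; the compiler rewrites
    // reads into GetName() and writes into PutName().
    __declspec(property(get = GetName, put = PutName)) std::string name;
#endif
};

inline std::string RenameAndRead(ElementSketch& element, const std::string& newName)
{
#if defined(_MSC_VER)
    element.name = newName;      // compiled as element.PutName(newName)
    return element.name;         // compiled as element.GetName()
#else
    // Other compilers: spell out the calls the property syntax stands for.
    element.PutName(newName);
    return element.GetName();
#endif
}

inline void DemoPropertySketch()
{
    ElementSketch element;
    assert(RenameAndRead(element, "root") == "root");
    assert(element.GetName() == "root");
}
```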
The attribute collection is accessed as a property too: XmlNamedNodeMap attrMap = node->attributes;
Again property syntax is used to turn this into a call to a method that returns a smart pointer (of type MSXML2::IXMLDOMNamedNodeMapPtr). The smart pointer is copied into the attrMap variable invoking a copy constructor which makes sure to call AddRef and Release as required.
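The AddRef/Release bookkeeping done by the copy constructor can be illustrated with a toy reference-counted object. This is a simplified sketch, not the _com_ptr_t implementation: RefCountedSketch and PtrSketch are hypothetical, and the object lives on the stack so that its counter can be inspected after the smart pointers are gone.

```cpp
#include <cassert>

// Hypothetical COM-like object: its lifetime is controlled by AddRef/Release.
class RefCountedSketch
{
    long refs_;
public:
    RefCountedSketch() : refs_(1) {}
    long AddRef()  { return ++refs_; }
    long Release() { return --refs_; }   // a real COM object would delete itself at zero
    long Count() const { return refs_; }
};

// Minimal smart pointer: copying adds a reference, destruction releases one.
class PtrSketch
{
    RefCountedSketch* p_;
public:
    explicit PtrSketch(RefCountedSketch* p) : p_(p) { if (p_) p_->AddRef(); }
    PtrSketch(const PtrSketch& other) : p_(other.p_) { if (p_) p_->AddRef(); }
    ~PtrSketch() { if (p_) p_->Release(); }

    PtrSketch& operator=(const PtrSketch& other)
    {
        if (other.p_) other.p_->AddRef();   // add first, so self-assignment is safe
        if (p_) p_->Release();
        p_ = other.p_;
        return *this;
    }

    RefCountedSketch* operator->() const { return p_; }
};

inline void DemoCopyCounts()
{
    RefCountedSketch object;              // starts with one reference (the creator's)
    {
        PtrSketch first(&object);         // count is now 2
        assert(object.Count() == 2);
        {
            PtrSketch second = first;     // copy constructor: count is now 3
            assert(second->Count() == 3);
        }                                 // second destroyed: back to 2
        assert(first->Count() == 2);
    }
    assert(object.Count() == 1);          // every AddRef was matched by a Release
}
```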
- Indexed Properties in C++
Now take a look at the loop that iterates over all attributes in attrMap using ‘attrMap->item[i]’. Indexing semantics are also made possible through the new property syntax for indexers:
__declspec(property(get=Getitem))
IXMLDOMNodePtr item[];
- Handling BSTR with Smart Pointers
A BSTR is a string representation used by many of the COM APIs. A BSTR consists of a length prefix, a wide char array and a two-byte null terminator. When you work directly with COM interfaces you need to allocate and free BSTR strings with care (typically using APIs such as SysAllocString and SysFreeString).
The _bstr_t class defined in comutil.h is another smart pointer class that manages BSTR object lifetime, provides efficient sharing of strings and implicit casts to other string types and implements structured error handling with _com_error exceptions. Thanks to _bstr_t you see no L”” constants, no SysAllocString and SysFreeString calls and no clumsy error handling in these projects.
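For readers who have never looked inside a BSTR, the layout described above can be mimicked in a plain byte buffer. This is an illustration only, with hypothetical helper names (MakeBstrLikeBuffer, ReadByteLength); real code should let SysAllocString or _bstr_t do the allocation. The sketch assumes the usual layout of a 4-byte prefix holding the byte count of the characters, terminator excluded.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Builds a byte buffer laid out like a BSTR: 4-byte length prefix (byte count,
// terminator excluded), then 16-bit characters, then a two-byte null terminator.
// A real BSTR pointer points at the first character, just past the prefix.
inline std::vector<unsigned char> MakeBstrLikeBuffer(const std::u16string& text)
{
    const std::uint32_t byteLength =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    std::vector<unsigned char> buffer(
        sizeof(byteLength) + byteLength + sizeof(char16_t), 0);

    std::memcpy(&buffer[0], &byteLength, sizeof(byteLength));
    if (byteLength > 0)
        std::memcpy(&buffer[sizeof(byteLength)], text.data(), byteLength);
    // The last two bytes stay zero: that is the terminator.
    return buffer;
}

// Reads the length prefix back from the start of the buffer.
inline std::uint32_t ReadByteLength(const std::vector<unsigned char>& buffer)
{
    std::uint32_t byteLength = 0;
    std::memcpy(&byteLength, &buffer[0], sizeof(byteLength));
    return byteLength;
}

inline void DemoBstrLayout()
{
    const std::vector<unsigned char> buffer = MakeBstrLikeBuffer(u"abc");
    assert(buffer.size() == 4 + 6 + 2);          // prefix + three 2-byte chars + terminator
    assert(ReadByteLength(buffer) == 6);         // prefix counts bytes, not characters
    assert(buffer[10] == 0 && buffer[11] == 0);  // two-byte null terminator
}
```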
In the next post I will describe the SAX Reader project.
See the previous posts for this article here: Part 1, Part 2.
Before I review the projects for this article, I would like to describe the basics: a few simple steps that will give you a C# experience while programming MSXML with C++.
Step 1: Import MSXML
There are a number of ways to import COM libraries in a C++ project. I think the simplest way is to add the following line in a common header (best precompiled).
#import <msxml6.dll> named_guids
This will create the headers (with extensions .tli and .tlh) that we need to access the COM objects created by MSXML. It will also automatically add them to your project. These also include other header files that we will be using later (comdef.h and comip.h).
The 6 in <msxml6.dll> stands for version 6. If you have this version installed, it should be in your path (under system32), so specifying its name is sufficient. Microsoft recommends that you use MSXML version 6 or version 3 unless you need some specific feature from another version. Choose version 6 to get the best in performance and security. Choose 3 (replace the 6 with a 3) if you want to target the broadest audience. Both versions work for the projects in these posts. You can read more about MSXML versions here.
We will be using types from the MSXML2 namespace (yes, for msxml3 and msxml6 too) so I recommend you add the following line too:
using namespace MSXML2;
The named_guids attribute on the #import line will allow you to refer to GUIDs by their names later in the code.
Step 2: Enter Smart Pointers
Simply put, a smart pointer is a C++ class that has the semantics of a pointer to another class but does not need to be released explicitly.
Smart pointers are able to achieve this due to three powerful C++ features which I will review briefly:
- Reliable object lifetime management
- Operator overloading
- Templates
C++ manages the lifetime of an object by calling the object’s destructor when it goes out of scope or, for a member object, when the destructor of its containing class runs. The destructor is also called if an exception is thrown from within the scope of an object or from within a nested call made from that scope. In this sense, the mechanism is reliable and ensures that class destructors can be used to release resources reliably.
C++ also supports overloading of the ‘->’ operator. This allows an object of a class to return a pointer to an object other than itself, thereby giving it the semantics of that pointer.
The method of releasing a pointer differs from domain to domain (for instance using the ‘delete’ operator for memory, or by calling some domain specific Release function). But often, within a domain, pointers of different types can be released in the same way. It would therefore seem rather cumbersome to have to write the same smart pointer logic for each pointer type in the domain.
C++ templates allow you to write a smart pointer once as a template for many classes in a domain. STL provides classic examples of smart pointer templates with its auto_ptr and shared_ptr classes. For COM objects, Microsoft has implemented a smart pointer template called ‘_com_ptr_t’. _com_ptr_t uses the specific COM mechanisms to manage any COM object’s lifetime and can be found in comip.h, which is automatically included in your code by the #import statement.
As a convenience, for many COM interfaces, Microsoft also provides a type definition (typedef) to instantiate a smart pointer type for that interface. According to the naming convention for these types, they usually have a ‘Ptr’ suffix.
Moreover, MSXML offers two sets of interfaces for many objects. The raw interfaces use the ‘good’ old COM types (like VARIANT, BSTR and HRESULT) and ‘dumb’ pointers (you know what I mean – not smart pointers). The second set of interfaces wrap the raw interfaces and are defined in terms of wrapper types that wrap raw COM types and manage their resources safely. If you only want the raw interfaces, you can add the keyword “raw_interfaces_only” after the #import statement above.
You may be asking yourself – why would I not want to import the non-raw interfaces? Why work so hard to manage resources safely, manage object lifetime, convert types safely and handle errors, if I can get it all for free? I will answer that in Part 5 when we review the SAXReader project.
Now, in order to make our C++ code look like code written in C#, we will use that second set of interfaces, and the smart pointers that are defined for them. We will also add our own type definitions to map the smart pointer types from the MSXML2 namespace to equivalent types in the System.Xml namespace.
typedef MSXML2::IXMLDOMNodePtr XmlNode;
typedef MSXML2::IXMLDOMDocument2Ptr XmlDocument;
typedef MSXML2::IXMLDOMElementPtr XmlElement;
typedef MSXML2::IXMLDOMAttributePtr XmlAttribute;
typedef MSXML2::IXMLDOMCommentPtr XmlComment;
typedef MSXML2::IXMLDOMNamedNodeMapPtr XmlNamedNodeMap;
typedef MSXML2::IXMLDOMNodeListPtr XmlNodeList;
typedef MSXML2::IXMLDOMDocumentFragmentPtr XmlDocumentFragment;
typedef MSXML2::IXMLDOMCDATASectionPtr XmlCDataSection;
typedef MSXML2::IXMLDOMProcessingInstructionPtr XmlProcessingInstruction;
typedef MSXML2::IXMLDOMSchemaCollectionPtr XmlSchemaCollection;
typedef MSXML2::IXMLDOMParseErrorPtr XmlParseError;
typedef MSXML2::IXSLProcessorPtr XslProcessor;
typedef MSXML2::IXSLTemplatePtr XslTemplate;
Feel free to remove some of these if you don’t need them or add more, similar types if you use other interfaces.
You may be asking why I explicitly specified the MSXML2 namespace in these definitions. Would it not suffice to include the ‘using’ directive from the previous step?
Well, one of the few differences between the Visual C++ 6.0 environment and that of Visual Studio 2008 with regard to MSXML is that in the latter, some of the COM smart pointers (on the left side of my typedefs) were redefined in the global namespace. As we specifically need those from the msxml2 namespace, and to avoid an ambiguity compilation error, this has to be specified explicitly. On the whole, that makes the left side pretty ugly, but this will be of no concern to you once you include the typedefs as I propose.
Step 3: Add Some Helper Classes
A CoUninitialize Helper
Applications must call CoInitialize in a thread before any other call to COM in that thread. They must also call CoUninitialize when COM is no longer needed. Forgetting to call CoUninitialize is not a problem in a single threaded application, because when the process exits any clean-up that needs to be done will be done for you. However, in multi-threaded applications, every thread that runs and exits without calling CoUninitialize generates a resource leak in your application.
Seasoned C++ programmers like us probably won’t forget to call CoUninitialize before exiting a thread, but remember, you have to make the call even if your thread exits due to an unhandled exception. Altogether, managing all cases can make your code a little messy – which is a big NO, NO :)
The simple solution for such problems in C++ is Resource Acquisition Is Initialization (RAII). RAII refers to the use of C++ object lifetime management to ensure that a resource is released automatically, as we would expect it to be.
The following class does the trick. Just instantiate a local variable of this type at the beginning of the outermost block in your thread and forget about CoUninitialize.
class ComInit
{
    HRESULT m_hr;
public:
    ComInit() : m_hr(::CoInitialize(NULL)) {}
    // Balance only a successful CoInitialize call.
    ~ComInit() { if (SUCCEEDED(m_hr)) ::CoUninitialize(); }
};
An Error Handling Helper
Another aspect of COM programming that we must address is error management.
C++ supports structured error handling very well, but unfortunately, it’s mostly ‘do it yourself’ with COM. Most COM methods return the cryptic HRESULT which immediately causes the following problems:
- HRESULT is not an enumerated type, so providing useful information to callers and users usually requires additional steps. Yes you could stay with the FAILED(hr) macro, but is that really enough information?
- When every line contains a call to a COM function returning an HRESULT, you have only a few options:
- You can check the return code of every function adding ~3 lines for each function call, rendering your code utterly unreadable. (75% of the code deals with error handling).
- You might take your chances and ignore some of the errors. A catastrophe waiting to happen.
- You can use macros to check the return code and throw an exception, as in the MSDN code quoted in my first post in this article. Macros make code difficult to browse and debug.
Well, Microsoft defines a very useful class called ‘_com_error’ in the comdef.h include file. comdef.h is automatically included in your code by the #import statement. _com_error is the natural class to throw when an HRESULT value indicates an error. It takes an HRESULT in its constructor and provides a formatted message string through the ErrorMessage method. As you probably know, some COM objects support the IErrorInfo interface which provides more detailed error information. _com_error can optionally take one of those in its constructor too and provide easy access to that information.
So? Where does that get us? COM doesn’t throw this class.
Well, first of all, _com_error is used by the _com_ptr_t class to report errors that occur in the COM methods that it calls. Thus, by wrapping a COM object with a _com_ptr_t you create a COM object in one line and use C++ try/catch syntax to handle errors in a structured way.
Second, you can use _com_error objects yourself to access more information about an HRESULT error.
But what about errors that occur in your application and are not generated by COM? Well, just for convenience, I added my own Error class that can optionally handle HRESULT errors by reusing _com_error. Nothing clever here. You can write your own class to wrap an error with an exception, but please do something, because structured exception handling is the way to go. Here is mine:
class Error
{
    char m_Message[512];
public:
    // Wraps an HRESULT. Assumes an MBCS (non-Unicode) build, where TCHAR is char.
    Error (HRESULT hr)
    {
        m_Message[0] = '\0';
        _com_error comError (hr);
        const TCHAR* message = comError.ErrorMessage();
        if (message)
            strncpy_s (m_Message, message, _TRUNCATE); // truncate long messages instead of failing
        m_Message[sizeof(m_Message)-1] = '\0';
    }
    // printf-style constructor for errors raised by the application itself.
    Error (const char* format, ...)
    {
        va_list args;
        va_start(args, format);
        vsprintf_s(m_Message, format, args);
        va_end(args);
    }
    Error(const Error& r)
    {
        strcpy_s (m_Message, r.m_Message);
    }
    operator char*() { return m_Message; }
};
Visual C++ 6.0 and Visual Studio 2008 Compatibility
Oh, and one last point. I used a few of the new safe CRT calls provided with Visual Studio 2008. So, for backward compatibility with Visual C++ 6.0, define the following.
#if _MSC_VER <= 1200 // Visual C++ 6.0
#define strncpy_s(dest, src, size) strcpy (dest, src)
#define strcpy_s(dest, src) strcpy (dest, src)
#define vsprintf_s vsprintf
#define wcsncpy_s wcsncpy
#endif
In each of the C++ projects (download here) you will find my implementation of Step 1 and Step 2 in ImportMSXML.h and my implementation of Step 3 in Utils.h.
In the next post(s) I will briefly describe each of the 5 project pairs (one in C# and one in C++) in more detail.
See the previous posts for this article here: Part 1, Part 2.
Stay tuned.
So, in my opening post on this subject (Part 1), I promised to show you some useful sample code using MSXML, written in C++ yet as elegant as C#.
Well. Here is an example of what can be achieved.
C++:
#include "stdafx.h"

inline void IndentedPrint (int indent, char* format, ...)
{
    char m_Message[512];
    va_list args;
    va_start(args, format);
    vsprintf_s(m_Message, format, args);
    va_end(args);
    printf ("%*s%s", indent, "\t", m_Message);
}

void DisplayAttribute (XmlNode node, int depth)
{
    IndentedPrint (depth, "type: %s name: %s value: %s \n",
        NodeTypeString (node->nodeType),
        (const char*) node->nodeName,
        (const char*) (_bstr_t) node->nodeValue);
}

void DisplayNode (XmlNode node, int depth = 0);

void DisplayNodes (XmlNodeList nodes, int depth)
{
    for (int i=0; i<nodes->length; i++)
    {
        DisplayNode (nodes->item[i], depth);
    }
}

void DisplayNode (XmlNode node, int depth)
{
    IndentedPrint (depth, "type: %s name: %s",
        NodeTypeString (node->nodeType),
        (const char*) node->nodeName);
    if (node->nodeType == MSXML2::NODE_TEXT)
    {
        IndentedPrint (0, "text: %s\n", (const char*) (_bstr_t) node->nodeValue);
    }
    else
    {
        printf ("\n");
        XmlNamedNodeMap attrMap = node->attributes;
        if (attrMap)
        {
            for (int i=0; i<attrMap->length; i++)
            {
                DisplayAttribute (attrMap->item[i], depth);
            }
            IndentedPrint (depth, "\n");
        }
        XmlNodeList childNodes = node->childNodes;
        if (childNodes)
            DisplayNodes (childNodes, depth + 5);
    }
}

void DisplayXPath (char* xmlFileName, char* xpath)
{
    XmlDocument xmlDoc (CLSID_DOMDocument60);
    VARIANT_BOOL ok = xmlDoc->load(xmlFileName);
    if (! ok)
        throw Error ("Failed to load %s", xmlFileName);
    printf ("Entire document\n");
    DisplayNode (xmlDoc->documentElement, 5);
    printf ("Nodes in %s\n", xpath);
    XmlNodeList nodeList = xmlDoc->selectNodes(xpath);
    DisplayNodes (nodeList, 5);
}

void main(int argc, char* argv[])
{
    ComInit com;
    try
    {
        DisplayXPath ("books.xml", "//@*");
    }
    catch (Error e)
    {
        printf (e);
    }
    catch (_com_error e)
    {
        printf (e.ErrorMessage());
    }
    printf ("\nDone\n");
    _getch ();
}

C#:
using System;
using System.Xml;

class Program
{
    void IndentedPrint (int indent, string format, params object[] array)
    {
        string indentedString = "".PadLeft(indent);
        Console.WriteLine(indentedString + format, array);
    }

    void DisplayAttribute (XmlNode node, int depth)
    {
        IndentedPrint (depth, "type: {0} name: {1} value: {2} \n",
            node.NodeType.ToString(), node.Name, node.Value);
    }

    void DisplayNodes (XmlNodeList nodes, int depth)
    {
        for (int i=0; i<nodes.Count; i++)
        {
            DisplayNode (nodes[i], depth);
        }
    }

    void DisplayNode (XmlNode node, int depth)
    {
        IndentedPrint (depth, "type: {0} name: {1}",
            node.NodeType.ToString(), node.Name);
        if (node.NodeType == XmlNodeType.Text)
        {
            IndentedPrint (0, "text: {0}\n", node.Value);
        }
        else
        {
            Console.WriteLine();
            XmlNamedNodeMap attrMap = node.Attributes;
            if (attrMap != null)
            {
                for (int i=0; i<attrMap.Count; i++)
                {
                    DisplayAttribute (attrMap.Item(i), depth);
                }
                IndentedPrint (depth, "\n");
            }
            XmlNodeList childNodes = node.ChildNodes;
            if (childNodes != null)
                DisplayNodes (childNodes, depth + 5);
        }
    }

    void DisplayXPath (string xmlFileName, string xpath)
    {
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load(xmlFileName);
        Console.WriteLine("Entire document\n");
        DisplayNode (xmlDoc.DocumentElement, 5);
        Console.WriteLine("Nodes in {0}\n", xpath);
        XmlNodeList nodeList = xmlDoc.SelectNodes(xpath);
        DisplayNodes (nodeList, 5);
    }

    static void Main(string[] args)
    {
        try
        {
            string xmlFileName = @"..\..\Samples\books.xml";
            new Program().DisplayXPath(xmlFileName, "//@*");
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
        Console.WriteLine("\nDone\n");
        Console.ReadLine();
    }
}

The first listing is written in C++ using MSXML6, the second in C# using the System.Xml namespace.
Both snippets perform the same task: Loading an xml file as a DOM object and displaying its nodes recursively. I threw in a bit of XPath too.
Very similar, no?
This is the code for this project pair (DOMAndXPathCpp and DOMAndXPathCs) along with 4 others.
In my next posts (next is Part 3) I will explain each project pair and how I did away with all those QueryInterfaces, Releases and HRESULTS.
Well, I did say that I would change subjects from time to time : )
This week I taught a class of C++ Programmers how to use MSXML and I sat down to write samples and demos for the lesson.
Well, I love C++, but C# has spoiled me.
Programming COM directly can get ugly and working with MSXML in C++ is no exception: all those AddRefs and Releases you need to call, HRESULTS you need to handle, and CLSIDs you need to find.
Code that accesses COM directly is often error prone, difficult to understand and difficult to maintain. (Isn’t it interesting how ugly code nearly always has these problems whereas beautiful code nearly always doesn’t?).
Well, eventually I created some elegant examples, and I would like to share them with you.
My benchmark for elegance was quite simple. It should look like it does in C# : )
Here is an example of what I mean. The following example (from MSDN) loads an XML file as a DOM document. Scroll down below this long snippet to see the goal that I set myself for the samples.
// LoadDOM.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <stdio.h>
#include <windows.h>
#import <msxml3.dll> raw_interfaces_only
// Macro that calls a COM method returning HRESULT value:
#define HRCALL(a, errmsg) \
do { \
hr = (a); \
if (FAILED(hr)) { \
dprintf( "%s:%d HRCALL Failed: %s\n 0x%.8x = %s\n", \
__FILE__, __LINE__, errmsg, hr, #a ); \
goto clean; \
} \
} while (0)
// Helper function that put output in stdout and debug window
// in Visual Studio:
void dprintf( char * format, ...)
{
static char buf[1024];
va_list args;
va_start( args, format );
vsprintf_s( buf, format, args );
va_end( args);
OutputDebugStringA( buf);
printf("%s", buf);
}
// Helper function to create a DOM instance:
IXMLDOMDocument * DomFromCOM()
{
HRESULT hr;
IXMLDOMDocument *pxmldoc = NULL;
HRCALL( CoCreateInstance(__uuidof(MSXML2::DOMDocument30),
NULL,
CLSCTX_INPROC_SERVER,
__uuidof(IXMLDOMDocument),
(void**)&pxmldoc),
"Create a new DOMDocument");
HRCALL( pxmldoc->put_async(VARIANT_FALSE),
"should never fail");
HRCALL( pxmldoc->put_validateOnParse(VARIANT_FALSE),
"should never fail");
HRCALL( pxmldoc->put_resolveExternals(VARIANT_FALSE),
"should never fail");
return pxmldoc;
clean:
if (pxmldoc)
{
pxmldoc->Release();
}
return NULL;
}
int _tmain(int argc, _TCHAR* argv[])
{
IXMLDOMDocument *pXMLDom=NULL;
IXMLDOMParseError *pXMLErr=NULL;
BSTR bstr = NULL;
VARIANT_BOOL status;
VARIANT var;
HRESULT hr;
CoInitialize(NULL);
pXMLDom = DomFromCOM();
if (!pXMLDom) goto clean;
VariantInit(&var);
V_BSTR(&var) = SysAllocString(L"stocks.xml");
V_VT(&var) = VT_BSTR;
HRCALL(pXMLDom->load(var, &status), "");
if (status!=VARIANT_TRUE) {
HRCALL(pXMLDom->get_parseError(&pXMLErr),"");
HRCALL(pXMLErr->get_reason(&bstr),"");
dprintf("Failed to load DOM from stocks.xml. %S\n",
bstr);
goto clean;
}
HRCALL(pXMLDom->get_xml(&bstr), "");
dprintf("XML DOM loaded from stocks.xml:\n%S\n",bstr);
clean:
if (bstr) SysFreeString(bstr);
if (&var) VariantClear(&var);
if (pXMLErr) pXMLErr->Release();
if (pXMLDom) pXMLDom->Release();
CoUninitialize();
return 0;
}
I don’t know about you but I cringed on every line.
Why can’t COM in C++ look as elegant as this equivalent code in C#?
using System;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        try
        {
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.Load("stocks.xml");
            Console.WriteLine("Loaded stocks.xml");
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
Well, it can!
As you probably know, there are some neat tricks in C++ that allow you to clean up the mess and hide the details of resource management and type conversions. I am referring to RAII (resource acquisition is initialization), smart pointers and operator overloads.
MSDN does document how to use this smart approach for MSXML, but, in my humble opinion, these tools can do even more than the samples there demonstrate.
In my next post(s) I will provide a solution with 5 samples implementing the most basic MSXML services:
- DOMAndXPath
Load an XML file into a DOM, recursively traverse and display its nodes, select and display a subset of its nodes using XPath.
- SAXReader
Read an XML file using the SAX model.
- XMLBuilder
Build a DOM model programmatically and display the result as an XML string.
- XSDValidator
Validate an XML file against a schema in three scenarios:
- The schema is inline
- The schema is stored in a separate file referenced by the xml file.
- The schema is cached in memory and applied to an xml file.
- XSLTTransformer
Apply a transform stored in a file to XML stored in another file and display the result.
Well, there are quite a few samples like these to be found on the net.
But my emphasis is not on what the samples can do, but on what they look like. In my humble opinion, if I can get them to look as clean and elegant as they would in C#, not only will they do the job, but they will also be easier to understand, easier to maintain and less error prone.
To demonstrate this, each project comes as a pair for comparison: One in C# and one in C++.
The C++ code compiles and runs in both Visual C++ 6.0 and in Visual Studio 2008. That might sound trivial, but in the case of MSXML, unfortunately it is not. I will point out the differences when we go through the code.
Stay tuned for Part 2
Hi everyone,
Developer Academy 3 was a great success. The convention center was packed ... and all my demos worked :)
You can now view all the lectures online on the DevAcademy3 website.
If you scroll down to DEV305 you can view my presentation, “Leverage SQL Server 2008 in your .Net Code with Visual Studio 2008 SP1”. You can also download the code for the demos.
In the fourth demo you will see how FILESTREAM allowed me to stream high quality HD videos directly from the server.
Following is a description of all the demos.
If you have any questions or problems with the demos, please post your comments and I will take a look.
Enjoy
Introduction
In my session in Developer Academy 3 I demonstrated four new features of SQL Server 2008 that you can easily leverage in your .Net code with Visual Studio 2008 SP1.
To remind you, these features are:
- Table Valued Parameters (TVP)
- The new MERGE SQL command
- Change Tracking
- FILESTREAM
When you open the zip you will find two folders: “Session Demos” and “Performance Tests”.
“Session Demos” contains the demos that I presented in the session. “Performance Tests” are the two projects that I discussed earlier here and here.
In order to run these projects, do make sure you have a running instance of SQL Server 2008 which you will find here and that you have installed Visual Studio 2008 SP1 which you will find here.
Session Demos
Under “Session Demos” you will find two folders: “SQL Server Management Studio” and “Visual Studio 2008”. “SQL Server Management Studio” contains the SSMS projects for the demos and “Visual Studio 2008” contains the corresponding Visual Studio solutions.
- First, open “Demos.ssmssln” from the “SQL Server Management Studio” folder with SQL Server Management Studio. In Solution Explorer this is what you will see.
- Now open “Demos.sln” from the “Visual Studio 2008” folder with Visual Studio 2008. In Solution Explorer this is what you will see.
- Note that in both environments we have solution folders corresponding to two demo environments: “Schedule” and “Album”. The “Schedule” demo environment demonstrates the Table Valued Parameters, Merge and Change Tracking features. The “Album” demo environment demonstrates FILESTREAM.
- Run the Setup.sql script in the Schedule folder under SSMS. This creates the DevAcademy database with one table, ‘Courses’, representing the presentation schedule in the Tavor hall where I presented.
Starting Point
- Compile the “1. DataAdapter” project in Visual Studio 2008. This project creates a simple WPF application binding a strongly typed dataset to a DataGrid.
I am using the DataGrid provided in the latest release of the WPF control kit. If you want, you can download the control kit with source code from here, but you do not have to as I have included the required dll in this package.
- Check and, if necessary, modify the connection strings to point to the SQL Server 2008 instance you are working with.
- Run the application. You can make changes to the data in the grid and use the Update and Fill buttons to send and retrieve data to and from the database respectively.
- Browse the Schedule.xaml.cs file and you will see how I use the strongly typed data adapter to perform these tasks.
Table Valued Parameters
- In SSMS run the “Setup TVP and Merge.sql” to define the CourseTableType and the stored procedures that use it.
- Browse the script and note how the stored procedures InsertCourses, DeleteCourses, and UpdateCourses use their single table valued parameter in set operations as if it were a local table.
- In VS2008 compile the “2. Table Valued Parameters” project.
- Run the application. Edit the data in the grid and use the Update and Fill buttons to verify that the functionality has not changed.
- Now browse the Schedule.xaml.cs file in this project. The Fill button click event handler still uses the data adapter to retrieve data from the database. However, the Update button click event handler now uses an SQL Command with a Table Valued Parameter to invoke the InsertCourses, UpdateCourses and DeleteCourses stored procedures.
The MERGE command
- In SSMS browse the “Setup TVP and Merge.sql” again and note how the MERGE command is used in the MergeCourses stored procedure.
- In VS2008 compile the “3. Merge” project.
- Run the application. Edit the data in the grid and use the Update and Fill buttons to verify that the functionality has not changed.
- Browse the Schedule.xaml.cs file in this project. Note that the Fill button click event handler still uses the data adapter to retrieve data from the database. However, the Update button click event handler now uses an SQL Command with a Table Valued Parameter to invoke the MergeCourses stored procedure.
- Note also that the table passed to the MergeCourses stored procedure is defined as follows:
As you can see we are passing unchanged rows to the MergeCourses stored procedure. This corresponds to the ‘when not matched’ clause in the stored procedure:
As an alternative to this approach you can delete the highlighted code in the .Net code and the highlighted clause in the MergeCourses definition. Then, you might use the DeleteCourses stored procedure as in the “2. Table Valued Parameters” project to perform deletes. This alternative approach would be more efficient if there are many unchanged rows in the table.
Change Tracking
In VS compile and run the “4. Change Tracking” project. This project uses Synchronized Services with Change Tracking. I advise you to read about Local Database Cache here before studying this demo.
You can create the demo for yourself as follows:
- Copy the original “1. DataAdapter” project and rename it as “4. Change Tracking”.
- Delete the DevAcademyDataSet.xsd. (The project won’t compile now).
- Add a new item of type “Local Database Cache” named LocalScheduleCache.sync. (You can find this item under the Data category) and complete the wizard as follows:
In my session I created this application twice: first, without checking the “Use SQL Server change tracking” check box, and a second time, checking it. This check box only appears in Visual Studio 2008 SP1 and when the remote server is SQL Server 2008. Checking the checkbox does not affect the next steps at all, nor does it change the functionality of the application, but, as I noted during the session, it makes a big difference on the database server.
Without change tracking, Sync Services for ADO.NET changes the schema of the DevAcademy database, adding two columns to the Courses table and an additional table called Courses_Tombstone. These are required to enable correct management of changes in the data. However, with “Use SQL Server change tracking” enabled, no schema changes are required.
- After completing the wizard you will find some new items in the project:
- A Local Database Cache item named LocalScheduleCache.sync.
- SQL scripts to undo and redo the changes in the database.
- A new local database (SQL Compact) named ‘DevAcademy.sdf’.
Note the difference between the contents of those SQL scripts when change tracking is enabled and when it is not. When not enabled the scripts remove columns and tables that were added by the wizard. When enabled the scripts only disable change tracking at the database and table levels.
- Next, add a new Data Source of type Database (Data -> Add New Data Source). This time select the new connection string called “ClientDevAcademyConnectionString” that represents the local database. We are now binding the client application to the local database and not to the database on SQL Server 2008.
- As in the original “1. DataAdapter” project, the Fill and Update button click event handlers use the data adapter to send data to update and retrieve data from the database. However, as we are using the dataset from the previous step, we are now accessing the local database and not the one on the SQL Server.
- Now to add Sync Services for ADO.NET. Add the Sync method that you can find in the Schedule.xaml.cs file of the downloaded project. Also add the invocations of the Sync method as I did in the button event handlers (and in the constructor). In this (rather contrived) scenario we are downloading updates from the remote database to the local database immediately before filling the dataset from the local database. Similarly, we are uploading updates from the local database to the remote database immediately after updating the local database from the dataset. This demonstrates how you can implement Sync Services for ADO.NET in your application. In a real world scenario you would not apply the Sync method in this way; however, the example does demonstrate how you can generate the code you need for that. Synchronization will usually be performed “occasionally”, that is, when a connection is available, and not every time the GUI updates or needs to be updated locally.
FILESTREAM
Now let’s run through the FILESTREAM demo. We will see how FILESTREAM enables us to stream HD video directly from the database!
The FILESTREAM demo was inspired by this sample at Codeplex, but I made a few changes to simplify the demo. In particular, I made use of an .ashx HttpHandler to stream the video, instead of streaming from an .asmx page and I used Windows Media Player instead of the WPF MediaElement control.
- In SSMS, delete the DevAcademy database. We will rebuild it in the next step.
- In SSMS, open the solution folder “Album” and run the “Setup Filestream.sql” script. Note that FILESTREAM needs to be enabled at three levels.
- The “filestream access level” configuration of the server must be ‘2’
- The database must have a secondary filegroup that is declared as containing a FILESTREAM.
- A table with a FILESTREAM column must have a ROWGUIDCOL column.
- In VS2008 Open the “Album” Solution folder. You will find three projects here.
- AlbumDataAccess
AlbumDataAccess is a class library that creates an SqlFileStream object to represent a FILESTREAM blob in a row in the database.
- AlbumUploader
AlbumUploader is a console application that copies files from a specified folder to a blob in the database. It also calculates the bitrate at which the data is uploaded.
- AlbumStreamer
AlbumStreamer contains an HttpHandler that simply reads a blob and writes it to the http response stream.
- Rather than uploading the video files to the website, I invite you to download some excellent HD footage for yourselves from here.
- Compile the AlbumDataAccess application.
- Compile and run the AlbumUploader. When your uploads are complete, open Windows Media Player and open a url such as this: http://localhost:55555/AlbumStreamer/Handler.ashx?title=<name of file>
Voilà! You are streaming video.
Hi again,
Recently I have been making progress towards my MCPD certification (one more exam to go :).
Like me, you may have been a little overwhelmed, at first, by the large number of exams and the detailed information on Microsoft’s MCP site. But I assure you, it’s really quite simple.
I have prepared the attached document to describe the exams, the certifications, and the relations between them. The document is in Hebrew, but the conclusions are posted below in English.
John Bryce Training is offering a dedicated preparation program to speed you on your way toward these prestigious certifications. I recommend you take a look and register!
I can make no commitment as to the correctness of this information, but I hope it will help you plan your exam taking. Your source for correct and updated information should always be Microsoft’s MCP site.
Conclusions
Here are my recommendations for which exams you should take and in which order. I believe they get you the most important certifications for the least number of exams.
If you have already embarked on the .Net 2.0 MCPD program, I recommend you complete these 7 exams:
This way you will earn all the MCTS certifications for .Net 2.0, the broadest of the three MCPD certifications for .Net 2.0, the four most important (of the six) MCTS certifications for .Net 3.5 and the broadest MCPD certification for .Net 3.5.
If you are just starting out and would like to go directly to the .Net 3.5 certifications, I recommend you take the following 6 exams:
This way you will earn the four most important (of the six) MCTS certifications for .Net 3.5 and the broadest MCPD certification for .Net 3.5.
Good luck on those exams!
This is my second post about some of the performance improvements you will experience when upgrading to SQL Server 2008 and Visual Studio 2008 SP1.
In this one we will be taking FILESTREAM for a test ride.
During my session at Developer Academy 3 I will be describing how this new feature in SQL Server 2008 can be leveraged from Visual Studio 2008 (SP1) to provide the data integrity and manageability of SQL Server with the access speed of an NTFS file.
In preparation, I decided to measure the performance of FILESTREAM using the project attached to this post.
The project compares the reading and writing speeds to and from a BLOB using three methods:
- NTFS: Reading and writing from a plain vanilla NTFS file. This method is used when BLOBs are stored on the file system and only their URLs are stored in the database.
- SQL Server 2005: Reading a BLOB from the database using DataReader in SequentialAccess mode. Writing a BLOB using the TEXTPTR and UPDATETEXT SQL commands.
- SQL Server 2008: Using FILESTREAM.
Speeds are calculated as the throughput in bytes per second of reads/writes of 100 blocks of 102400 bytes each.
The following table presents the average and standard deviation of these speeds over 100 repetitions:
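The arithmetic behind these figures can be sketched as follows. This is only an illustration, not the code from the attached project: the names `throughput_bytes_per_sec` and `summarize` are hypothetical, and I am assuming the sample (n - 1) form of the standard deviation, which may differ from what the project uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Figures taken from the post: each run moves 100 blocks of 102400 bytes.
const std::size_t kBlockCount = 100;
const std::size_t kBlockSize  = 102400;

// Throughput of one run in bytes per second (hypothetical helper).
// elapsed_seconds is the measured wall-clock time for the whole run.
double throughput_bytes_per_sec(double elapsed_seconds) {
    assert(elapsed_seconds > 0.0);
    return static_cast<double>(kBlockCount * kBlockSize) / elapsed_seconds;
}

struct Summary {
    double mean;
    double stddev;
};

// Mean and sample standard deviation of the per-run throughputs.
// The n - 1 divisor is an assumption; the post does not say which form it uses.
Summary summarize(const std::vector<double>& samples) {
    assert(samples.size() >= 2);
    double sum = 0.0;
    for (double s : samples) sum += s;
    const double mean = sum / samples.size();

    double squared_deviations = 0.0;
    for (double s : samples) squared_deviations += (s - mean) * (s - mean);

    return Summary{mean, std::sqrt(squared_deviations / (samples.size() - 1))};
}
```

To use it, time each of the 100 repetitions, convert every elapsed time with `throughput_bytes_per_sec`, collect the results in a vector and pass that vector to `summarize`.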
Using the code:
- You must be using SQL Server 2008 and Visual Studio 2008 SP1.
- Download the code from the bottom of this post.
- Run the RunThisFirst.sql against the server.
- Modify the connection string in the project Settings.
- Press F5 to build and run the project.
- The code should be self-explanatory, but let me know if not.
Conclusions:
- Comparison to NTFS: FILESTREAM is as fast as NTFS on writes and about 10% slower than NTFS on reads.
- Comparison to BLOB technologies on SQL Server 2005: FILESTREAM improves throughput by a factor of 45 for writes and 7 for reads.
See you on Monday!
This is my first post of two (maybe more, we’ll see :) about some of the performance improvements you will experience when upgrading to SQL Server 2008 and Visual Studio 2008 SP1.
In this one we will be taking TVP (Table Valued Parameters) for a test ride.
Let’s compare the time it takes to insert a large number of rows into a table using the following methods:
- SqlDataAdapter (with different values of the UpdateBatchSize property)
- Multiple Inserts (delimited by semicolons)
- Packaging the rows to insert as XML and calling a stored procedure that receives it.
- A stored procedure with a Table Valued Parameter
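To make the second method concrete, here is a rough sketch of how a semicolon-delimited batch of INSERT statements might be assembled before it is sent to the server in one round trip. It is not taken from the attached project: the `Row` shape, the `Id` and `Name` columns and the helpers `quote_literal` and `build_insert_batch` are all hypothetical. Concatenating values into SQL text is shown only to illustrate the batching idea; parameterized commands are the safer choice in real code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical row shape; the post does not describe the real table.
struct Row {
    int id;
    std::string name;
};

// Wrap a value in single quotes, doubling any embedded single quote
// so that the result is a valid T-SQL string literal.
std::string quote_literal(const std::string& value) {
    std::string out = "'";
    for (char c : value) {
        if (c == '\'') out += "''";
        else out += c;
    }
    out += "'";
    return out;
}

// Join one INSERT per row into a single semicolon-delimited batch.
// The column names Id and Name are assumptions made for this sketch.
std::string build_insert_batch(const std::string& table,
                               const std::vector<Row>& rows) {
    assert(!table.empty());
    std::string batch;
    for (const Row& row : rows) {
        batch += "INSERT INTO " + table + " (Id, Name) VALUES ("
               + std::to_string(row.id) + ", "
               + quote_literal(row.name) + ");";
    }
    return batch;
}
```

The resulting string would be executed as a single command, which is what saves the per-row round trips compared with issuing each INSERT separately.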
Using the code:
- You must be using SQL Server 2008 and Visual Studio 2008 SP1.
- Download the code from the bottom of this post.
- Run the Setup.sql against the server.
- Modify the connection string in the project Settings.
- Press F5 to build and run the project.
- The code should be self-explanatory, but let me know if not.
Here are the results of just one run on my laptop for the insertion of 10,000 rows.
Conclusions:
TVP performs better than DataAdapter generated updates, and better than the relatively fast method of sending tables as XML.
The results aren’t entirely consistent from run to run, but TVP wins every time :)
Comments are welcome
Hi everyone
I am sure you have all heard about Developer Academy 3 (15th December at the ‘Avenue’ at Airport City).
I will be there (at 09:00 on Floor -1) giving a session about some of the exciting new features of SQL Server 2008 and how Visual Studio 2008 SP1 helps us utilize them in .Net applications.
In the next post or so I will be taking those two for a test ride with some performance testing that you can try out for yourselves.
See you there!
Hello world,
This is my first post, and here is a short description of what you will find here.
I try to keep up to date with the latest and greatest in .Net, but there is so much going on that this is a near-impossible task. I therefore enjoy reading blogs that give me a short description of what’s new and that provide a succinct yet complete sample project I can run and learn from.
So, I would like to provide you with exactly that. You will find short descriptions, and clearly written code that you can run or reuse as you please. I hope that these posts will be useful to you.
At first, I expect my subject matter will be quite broad. That’s because I teach in a variety of areas, and I will be sharing with you discoveries and insights from my lessons. However, with time, I think I will focus on two or three areas that you and I find most interesting.
Oh, and the most important thing. I hope to learn from this blog, so any comments, enhancements, bug reports or corrections you may have are most welcome.
So, off we go
David