The World Wide Web Consortium (W3C) has developed a set of standards that give XML its power and potential. Without these standards, XML would not have the impact on the development world that it does. The W3C Web site (www.w3.org) is a valuable source for all things XML.
The .NET Framework supports the following W3C standards:
- XML 1.0 (www.w3.org/TR/1998/REC-xml-19980210), including DTD support
- XML namespaces (www.w3.org/TR/REC-xml-names), both stream level and DOM
- XML schemas (www.w3.org/2001/XMLSchema)
- XPath expressions (www.w3.org/TR/xpath)
- XSLT transformations (www.w3.org/TR/xslt)
- DOM Level 1 Core (www.w3.org/TR/REC-DOM-Level-1)
- DOM Level 2 Core (www.w3.org/TR/DOM-Level-2-Core)
The level of standards support will change as the framework matures and the W3C updates the recommended standards. Because of this, you need to make sure you stay up-to-date with the standards and the level of support provided by Microsoft.
Introducing the System.Xml Namespace
Support for processing XML is provided by the classes in the System.Xml namespace in .NET. This section looks (in no particular order) at some of the more important classes that the System.Xml namespace provides. The following table lists the main XML reader and writer classes.
The following table lists some other useful classes for handling XML.
Many of the classes in the System.Xml namespace provide a means to manage XML documents and streams, whereas others (such as the XmlDataDocument class) provide a bridge between XML data stores and the relational data stored in DataSets.
It is worth noting that the System.Xml namespace is available to any language that is part of the .NET family. This means that all of the examples in this chapter could also be written in Visual Basic.
Using System.Xml Classes
The following examples use books.xml as the source of data. You can download this file from the csharp Web site (www.csharpaid.com), but it is also included in several examples in the .NET SDK. The books.xml file is a book catalog for an imaginary bookstore. It includes book information such as genre, author name, price, and ISBN number. As with the other chapters, you can download all code examples in this chapter from the csharp Web site (www.csharpaid.com).
This is what the books.xml file looks like:
Reading and Writing Streamed XML
The XmlReader and XmlWriter classes will feel familiar if you have ever used SAX. XmlReader-based classes provide a very fast, forward-only, read-only cursor that streams the XML data for processing. Because it is a streaming model, the memory requirements are not very demanding. However, you don’t have the navigation flexibility and the read or write capabilities that would be available from a DOM-based model. XmlWriter-based classes produce an XML document that conforms to the W3C’s XML 1.0 Namespace Recommendations.
XmlReader and XmlWriter are both abstract classes. The following classes are derived from XmlReader:
The following classes are derived from XmlWriter:
XmlTextReader and XmlTextWriter work with either a stream-based object from the System.IO namespace or TextReader/TextWriter objects. XmlNodeReader uses an XmlNode as its source instead of a stream. The XmlValidatingReader adds DTD and schema validation and therefore offers data validation. You look at these a bit more closely later.
Using the XmlReader Class
XmlReader is a lot like SAX in the MSXML SDK. One of the biggest differences, however, is that whereas SAX is a push type of model (that is, it pushes data out to the application, and the developer has to be ready to accept it), the XmlReader has a pull model, where data is pulled into an application requesting it. This provides an easier and more intuitive programming model. Another advantage is that a pull model can be selective about the data that is sent to the application: if you don’t want all of the data, you don’t need to process it. In a push model, all of the XML data has to be processed by the application, whether it is needed or not.
The following is a very simple example of reading XML data; later you take a closer look at the XmlReader class. You’ll find the code in the XmlReaderSample folder. Here is the code for reading in the books.xml document. As each node is read, the NodeType property is checked. If the node is a text node, the value is appended to the text box:
private void button3_Click(object sender, EventArgs e)
{
    XmlReader rdr = XmlReader.Create("books.xml");
    while (rdr.Read())   // advance one node at a time until the end of the document
        if (rdr.NodeType == XmlNodeType.Text)
            richTextBox1.AppendText(rdr.Value + "\r\n");
}
As previously discussed, XmlReader is an abstract class, so in order to use the XmlReader class directly, a static Create method has been added. The Create method returns an XmlReader object. The overload list for the Create method contains nine entries. In the preceding example, a string that represents the file name of the XML document is passed in as a parameter. Stream-based objects and TextReader-based objects can also be passed in.
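As a minimal sketch of the TextReader-based overload, the following helper does the same work as the button handler but reads from any TextReader and returns the text values in a list. The class name, method name, and the idea of returning a list instead of filling a text box are assumptions made for illustration only.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class CreateOverloadsSketch
{
    // Collects the value of every text node found in the source.
    public static List<string> ReadTextNodes(TextReader source)
    {
        var values = new List<string>();
        // XmlReader.Create also accepts a file name (string) or a Stream.
        using (XmlReader rdr = XmlReader.Create(source))
        {
            while (rdr.Read())
            {
                if (rdr.NodeType == XmlNodeType.Text)
                    values.Add(rdr.Value);
            }
        }
        return values;
    }
}
```

You could call it with new StringReader(someXml) for in-memory data, or with a StreamReader opened over books.xml.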
An XmlReaderSettings object can also be used. XmlReaderSettings specifies the features of the reader. For example, a schema can be used to validate the stream. Set the Schemas property to a valid XmlSchemaSet object, which is a cache of XSD schemas. Then the XsdValidate property on the XmlReaderSettings object can be set to true. Several Ignore properties exist that can be used to control the way the reader processes certain nodes and values. These properties include IgnoreComments, IgnoreIdentityConstraints, IgnoreInlineSchema, IgnoreProcessingInstructions, IgnoreSchemaLocation, and IgnoreWhitespace. These properties can be used to strip certain items from the document.
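To see the effect of the Ignore properties, here is a small sketch that lists the node types a reader reports with and without stripping. It uses only IgnoreComments, IgnoreWhitespace, and IgnoreProcessingInstructions, which are the Ignore properties available on XmlReaderSettings in released versions of the framework; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderSettingsSketch
{
    // Returns the sequence of node types the reader reports for the given XML.
    public static List<XmlNodeType> NodeTypes(string xml, bool strip)
    {
        var settings = new XmlReaderSettings
        {
            IgnoreComments = strip,
            IgnoreWhitespace = strip,
            IgnoreProcessingInstructions = strip
        };

        var seen = new List<XmlNodeType>();
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml), settings))
        {
            while (rdr.Read())
                seen.Add(rdr.NodeType);
        }
        return seen;
    }
}
```

With strip set to true, comment and whitespace nodes never reach your loop, so the calling code does not have to filter them out itself.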
Several ways exist to move through the document. As shown in the previous example, Read() takes you to the next node. You can then verify whether the node has a value (HasValue) or, as you see shortly, whether the node has any attributes (HasAttributes). You can also use the ReadStartElement() method, which verifies whether the current node is the start element and then positions you on the next node. If you are not on the start element, an XmlException is raised. Calling this method is the same as calling the IsStartElement() method followed by a Read() method.
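The behavior of ReadStartElement() can be sketched as follows. The class and method names are made up for illustration; the first method steps past the root start tag and reports the name of the node that follows, and the second shows the XmlException raised when the reader is sitting on a text node instead of a start element.

```csharp
using System.IO;
using System.Xml;

public static class StartElementSketch
{
    // Steps over the root start tag and returns the name of the next content node.
    public static string FirstChildName(string xml)
    {
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml)))
        {
            rdr.ReadStartElement();   // checks for a start element, then reads on
            rdr.MoveToContent();      // skip any whitespace before the child
            return rdr.Name;
        }
    }

    // Returns true if a second ReadStartElement call fails because
    // the reader is no longer positioned on a start element.
    public static bool SecondCallThrows(string xml)
    {
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml)))
        {
            rdr.ReadStartElement();   // now on the root's first child node
            try
            {
                rdr.ReadStartElement();
                return false;
            }
            catch (XmlException)
            {
                return true;
            }
        }
    }
}
```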
ReadElementString() is similar to ReadString(), except that you can optionally pass in the name of an element. If the next content node is not a start tag, or if the Name parameter does not match the current node Name, an exception is raised.
Here is an example of how ReadElementString() can be used. Notice that this example uses FileStreams, so you will need to make sure that you include the System.IO namespace via a using statement:
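A minimal sketch of that loop follows. It assumes the reader is an XmlTextReader built over a stream (a FileStream opened on books.xml would work the same way) and collects the titles in a list rather than a list box so that it can run outside a form. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class TitleReaderSketch
{
    // Returns the text of every <title> element in the stream.
    public static List<string> ReadTitles(Stream source)
    {
        var titles = new List<string>();
        XmlTextReader rdr = new XmlTextReader(source);

        while (!rdr.EOF)
        {
            // MoveToContent skips whitespace, comments, and similar nodes.
            if (rdr.MoveToContent() == XmlNodeType.Element && rdr.Name == "title")
            {
                // Consumes the whole element and moves to the node after it.
                titles.Add(rdr.ReadElementString());
            }
            else
            {
                // Not a title element, so move on.
                rdr.Read();
            }
        }
        return titles;
    }
}
```

For example, ReadTitles(new FileStream("books.xml", FileMode.Open)) would read the sample file from disk.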
In the while loop, you use MoveToContent() to find each node of type XmlNodeType.Element with the name title. You use the EOF property of the XmlTextReader as the loop condition. If the node is not of type Element or not named title, the else clause will issue a Read() method to move to the next node. When you find a node that matches the criteria, you add the result of a ReadElementString() to the list box. This should leave you with just the book titles in the list box. Note that you don’t have to issue a Read() call after a successful ReadElementString() because ReadElementString() consumes the entire Element and positions you on the next node.
If you remove && rdr.Name == "title" from the if clause, you will have to catch the XmlException when it is thrown. If you look at the data file, you will see that the first element that MoveToContent() will find is the <bookstore> element. Because it is an element, it will pass the check in the if statement. However, because it does not contain a simple text type, it will cause ReadElementString() to raise an XmlException. One way to work around this is to put the ReadElementString() call in a function of its own.
Then, if the call to ReadElementString() fails inside this function, you can deal with the error and return to the calling function. Go ahead and do this; call this new method LoadTextBox() and pass in the XmlTextReader as a parameter. This is what the LoadTextBox() method looks like with these changes:
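The following is only a sketch of how such a method might look, not the original listing. It assumes well-formed input, uses two static lists as stand-ins for the form's list box and for whatever error reporting you prefer, and includes a hypothetical Run method with the calling loop so the sample is complete. Keep in mind that when ReadElementString() fails, the reader has already moved past the start tag, so exactly which values end up in the list depends on the shape of the document.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class LoadTextBoxSketch
{
    // Stand-ins for the list box and for error reporting (assumptions).
    public static readonly List<string> Items = new List<string>();
    public static readonly List<string> Errors = new List<string>();

    // Isolates ReadElementString so that a failure is handled here
    // and the caller's loop can carry on.
    public static void LoadTextBox(XmlTextReader reader)
    {
        try
        {
            Items.Add(reader.ReadElementString());
        }
        catch (XmlException ex)
        {
            Errors.Add(ex.Message);
        }
    }

    // Hypothetical driver: tries every element, not just <title>.
    public static void Run(Stream source)
    {
        Items.Clear();
        Errors.Clear();
        XmlTextReader tr = new XmlTextReader(source);
        while (!tr.EOF)
        {
            if (tr.MoveToContent() == XmlNodeType.Element)
                LoadTextBox(tr);
            else
                tr.Read();   // otherwise move on
        }
    }
}
```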
This section from the previous example:
will have to change to the following:
if (tr.MoveToContent() == XmlNodeType.Element)
{
    LoadTextBox(tr);
}
else
{
    //otherwise move on
    tr.Read();
}
After running this example, the results should be the same as before. What you are seeing is that there is more than one way to accomplish the same goal. This is where the flexibility of the classes in the System.Xml namespace starts to become apparent.
The XmlReader can also read strongly typed data. There are several ReadElementContentAs methods, such as ReadElementContentAsDouble, ReadElementContentAsBoolean, and so on. The following example shows how to read in the values as a decimal and do some math on the value. In this case, the value from the price element is increased by 25 percent:
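A minimal sketch of the idea, assuming each price element holds a plain decimal number. The class and method names are hypothetical, and the raised prices are returned in a list instead of being shown on a form.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class TypedReadSketch
{
    // Reads every <price> element as a decimal and raises it by 25 percent.
    public static List<decimal> RaisedPrices(string xml)
    {
        var prices = new List<decimal>();
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml)))
        {
            while (!rdr.EOF)
            {
                if (rdr.MoveToContent() == XmlNodeType.Element && rdr.Name == "price")
                {
                    // Reads the element content as a decimal and moves past the end tag.
                    decimal price = rdr.ReadElementContentAsDecimal();
                    prices.Add(price + price * 0.25m);
                }
                else
                {
                    rdr.Read();
                }
            }
        }
        return prices;
    }
}
```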
If the value cannot be converted to a decimal value, a FormatException is raised. This is a much more efficient method than reading the value as a string and casting it to the proper data type.
Retrieving Attribute Data
As you play with the sample code, you might notice that when the nodes are read in, you don’t see any attributes. This is because attributes are not considered part of a document’s structure. When you are on an element node, you can check for the existence of attributes and optionally retrieve the attribute values. For example, the HasAttributes property returns true if there are any attributes; otherwise, it returns false. The AttributeCount property tells you how many attributes there are, and the GetAttribute() method gets an attribute by name or by index. If you want to iterate through the attributes one at a time, you can use the MoveToFirstAttribute() and MoveToNextAttribute() methods.
The following is an example of iterating through the attributes of the books.xml document:
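One way to write this, sketched with hypothetical class and method names and a list in place of the list box, is to index through the attributes with AttributeCount and GetAttribute(). A second helper shows the MoveToFirstAttribute()/MoveToNextAttribute() alternative, which also gives you the attribute names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttributeSketch
{
    // Collects every attribute value, element by element, using an index.
    public static List<string> AttributeValues(string xml)
    {
        var values = new List<string>();
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml)))
        {
            while (rdr.Read())
            {
                if (rdr.NodeType == XmlNodeType.Element)
                {
                    for (int i = 0; i < rdr.AttributeCount; i++)
                        values.Add(rdr.GetAttribute(i));
                }
            }
        }
        return values;
    }

    // Same walk, but moving the reader onto each attribute in turn.
    public static List<string> AttributePairs(string xml)
    {
        var pairs = new List<string>();
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml)))
        {
            while (rdr.Read())
            {
                if (rdr.NodeType == XmlNodeType.Element && rdr.MoveToFirstAttribute())
                {
                    do
                    {
                        pairs.Add(rdr.Name + "=" + rdr.Value);
                    } while (rdr.MoveToNextAttribute());
                    rdr.MoveToElement();   // return to the owning element
                }
            }
        }
        return pairs;
    }
}
```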
This time you are looking for element nodes. When you find one, you loop through all of the attributes and, using the GetAttribute() method, you load the value of the attribute into the list box. In this example, those attributes would be genre, publicationdate, and ISBN.
Validating with XmlReader
Sometimes it’s important to know not only that the document is well-formed but also that the document is valid. An XmlReader can validate the XML according to an XSD schema by using the XmlReaderSettings class. The XSD schema is added to the XmlSchemaSet that is exposed through the Schemas property. The XsdValidate property must also be set to true; the default for this property is false.
The following example demonstrates the use of the XmlReaderSettings class. The following is the XSD schema that will be used to validate the books.xml document:
This schema was generated from the books.xml in Visual Studio. Notice that the publicationdate attribute has been commented out. This will cause the validation to fail.
The following is the code that uses the schema to validate the books.xml document:
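The sketch below shows the general shape of such code under several assumptions. The schema is a tiny hypothetical stand-in held in a string rather than books.xsd on disk; it declares genre but not publicationdate, so a document carrying that attribute fails validation. In released versions of the .NET Framework, validation is switched on through the ValidationType property and the event is named ValidationEventHandler, so those are the names used here. Problems are collected in a list instead of being shown in a MessageBox.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ValidationSketch
{
    // Hypothetical schema: the publicationdate attribute is deliberately not declared.
    private const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='book'><xs:complexType>" +
        "<xs:sequence><xs:element name='title' type='xs:string'/></xs:sequence>" +
        "<xs:attribute name='genre' type='xs:string'/>" +
        "</xs:complexType></xs:element></xs:schema>";

    // Returns one message per validation error or warning.
    public static List<string> Validate(string xml)
    {
        var problems = new List<string>();
        var settings = new XmlReaderSettings();

        // A null target namespace means "use the schema's own targetNamespace".
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationEventHandler +=
            (sender, e) => problems.Add(e.Severity + ": " + e.Message);

        using (XmlReader rdr = XmlReader.Create(new StringReader(xml), settings))
        {
            // Problems are reported only as the offending nodes are read.
            while (rdr.Read()) { }
        }
        return problems;
    }
}
```

Because a handler is attached, validation errors are reported through the event rather than thrown, so the read loop runs to the end of the document.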
After the XmlReaderSettings object is created, the schema books.xsd is added to the XmlSchemaSet object. The Add method for XmlSchemaSet has four overloads. One takes an XmlSchema object. The XmlSchema object can be used to create a schema on-the-fly without having to create the schema file on disk. Another overload takes another XmlSchemaSet object as a parameter. Another takes two string values: the first is the target namespace and the other is the URL for the XSD document. If the target namespace parameter is null, the targetNamespace of the schema will be used. The last overload takes the target namespace as the first parameter as well, but it uses an XmlReader-based object to read in the schema. The XmlSchemaSet preprocesses the schema before the document to be validated is processed.
After the schema is referenced, the XsdValidate property is set to one of the ValidationType enumeration values. These valid values are DTD, Schema, or None. If the value selected is set to None, then no validation will occur.
Because the XmlReader object is being used, if there is a validation problem with the document, it will not be found until that attribute or element is read by the reader. When the validation failure does occur, an XmlSchemaValidationException is raised. This exception can be handled in a catch block; however, handling exceptions can make controlling the flow of the data difficult. To help with this, a ValidationEvent is available in the XmlReaderSettings class. This way, the validation failure can be handled without your having to use exception handling. The event is also raised by validation warnings, which do not raise an exception. The ValidationEvent passes in a ValidationEventArgs object that contains a Severity property. This property determines whether the event was raised by an error or a warning. If the event was raised by an error, the exception that caused the event to be raised is passed in as well. There is also a Message property. In the example, the message is displayed in a MessageBox.
Using the XmlWriter Class
The XmlWriter class allows you to write XML to a stream, a file, a StringBuilder, a TextWriter, or another XmlWriter object. Like XmlTextReader, it does so in a forward-only, non-cached manner. XmlWriter is highly configurable, allowing you to specify such things as whether or not to indent content, the amount to indent, what quote character to use in attribute values, and whether namespaces are supported. Like the XmlReader, this configuration is done using an XmlWriterSettings object.
Here’s a simple example that shows how the XmlTextWriter class can be used:
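As a rough sketch of such an example, the code below writes one book entry through XmlWriter.Create with an XmlWriterSettings object. To keep it self-contained it writes into a StringBuilder rather than a file, and every value in it (genre, date, ISBN, title, author, price) is a made-up placeholder. Passing a file name to Create instead of the StringBuilder would produce a file on disk.

```csharp
using System.Text;
using System.Xml;

public static class WriterSketch
{
    // Builds a single <book> element and returns the generated XML text.
    public static string WriteBook()
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            NewLineOnAttributes = true   // one attribute per line
        };

        var output = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();                 // XML declaration
            writer.WriteStartElement("book");
            writer.WriteAttributeString("genre", "sample-genre");
            writer.WriteAttributeString("publicationdate", "2000");
            writer.WriteAttributeString("ISBN", "0-000-00000-0");
            writer.WriteElementString("title", "Sample Title");
            writer.WriteStartElement("author");          // opens <author>
            writer.WriteElementString("name", "Sample Author");
            writer.WriteEndElement();                    // closes </author>
            writer.WriteElementString("price", "9.99");
            writer.WriteEndElement();                    // closes </book>
            writer.WriteEndDocument();
        }
        return output.ToString();
    }
}
```

The name element ends up nested inside author because it is written between the WriteStartElement("author") call and the matching WriteEndElement() call.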
Here, you are writing to a new XML file called newbook.xml, adding the data for a new book. Note that XmlWriter will overwrite an existing file with a new one. You look at inserting a new element or node into an existing document later in this chapter. You are instantiating the XmlWriter object using the Create static method. In this example, a string representing a file name is passed as a parameter along with an instance of an XmlWriterSettings class.
The XmlWriterSettings class has properties that control the way that the XML is generated. The CheckCharacters property is a Boolean that will raise an exception if a character in the XML does not conform to the W3C XML 1.0 recommendation. The Encoding property sets the encoding used for the XML being generated; the default is Encoding.UTF8. The Indent property is a Boolean value that determines if elements should be indented. The IndentChars property is set to the character string that is used to indent. The default is two spaces. The NewLineChars property is used to determine the characters for line breaks. In the preceding example, the NewLineOnAttributes property is set to true. This will put each attribute on a separate line, which can make the XML generated a little easier to read.
WriteStartDocument() adds the document declaration. Now you start writing data. First comes the book element; then you add the genre, publicationdate, and ISBN attributes. Then you write the title, author, and price elements. Note that the author element has a child element name.
When you click the button, you produce the booknew.xml file, which looks like this:
The nesting of elements is controlled by paying attention to when you start and finish writing elements and attributes. You can see this when you add the name child element to the author element. Note how the WriteStartElement() and WriteEndElement() method calls are arranged and how that arrangement produces the nested elements in the output file.
To go along with the WriteElementString() and WriteAttributeString() methods, there are several other specialized write methods. WriteCData() outputs a CDATA section (<![CDATA[...]]>), writing out the text it takes as a parameter. WriteComment() writes out a comment in proper XML format. WriteChars() writes out the contents of a char buffer. This works in a similar fashion to the ReadChars() method that you looked at earlier; they both use the same type of parameters.
WriteChars() needs a buffer (an array of characters), the starting position for writing (an integer), and the number of characters to write (an integer).
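These three methods can be tried out with a short sketch like the following. The element names, the text, and the class name are all invented for illustration, and the XML declaration is omitted so that only the written nodes appear in the result.

```csharp
using System.Text;
using System.Xml;

public static class SpecializedWritesSketch
{
    // Writes a comment, a CDATA section, and a slice of a char buffer.
    public static string Write()
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var output = new StringBuilder();
        char[] buffer = "Hello, XML".ToCharArray();

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("notes");
            writer.WriteComment("generated sample");

            writer.WriteStartElement("raw");
            writer.WriteCData("a < b && b > c");   // markup characters left unescaped
            writer.WriteEndElement();

            writer.WriteStartElement("chars");
            writer.WriteChars(buffer, 7, 3);       // buffer, start index, count
            writer.WriteEndElement();

            writer.WriteEndElement();
        }
        return output.ToString();
    }
}
```

Here WriteChars() starts at index 7 of the buffer and writes three characters, so only the last word of the buffer reaches the chars element.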
Reading and writing XML using the XmlReader- and XmlWriter-based classes is surprisingly flexible and simple to do. Next you’ll see how the DOM is implemented in the System.Xml namespace through the XmlDocument and XmlNode classes.