What the ContentHandler Doesn't Tell You
The ContentHandler interface is designed to provide everything most applications need to know about an XML instance document. What it leaves out are things you rarely care about, although most of these are available through other callback interfaces discussed in the next chapter. These include
Comments, the boundaries of parsed entities that were not skipped, and the boundaries of CDATA sections, all of which are available through the LexicalHandler interface
The names, public IDs, system IDs, and notations for unparsed entities; and the names, public IDs, and system IDs for notations—all of which are available through the DTDHandler interface
ELEMENT, ATTLIST, and parsed ENTITY declarations from the DTD, all of which are reported through the DeclHandler interface
Validity errors and other nonfatal errors, which are reported through the ErrorHandler interface
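As a rough illustration of the list above, the following sketch registers a single handler object for several of these extra interfaces. It assumes the JAXP parser bundled with the JDK, which recognizes the standard lexical-handler and declaration-handler property URIs; support for these properties is optional in SAX2, so another parser may throw SAXNotRecognizedException. The class name ExtraCallbacksDemo, its SAMPLE document, and the log format are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class ExtraCallbacksDemo {

    // Hypothetical sample document with an internal DTD subset,
    // a comment, and a CDATA section.
    public static final String SAMPLE =
          "<!DOCTYPE doc [\n"
        + "  <!ELEMENT doc (#PCDATA)>\n"
        + "  <!NOTATION gif PUBLIC \"-//Example//NOTATION GIF//EN\">\n"
        + "]>\n"
        + "<doc><!-- hi --><![CDATA[a<b]]></doc>";

    /** Parses xml and logs events that ContentHandler alone would not report. */
    public static List<String> collect(String xml) throws Exception {
        final List<String> log = new ArrayList<>();

        // DefaultHandler2 supplies do-nothing implementations of ContentHandler,
        // DTDHandler, ErrorHandler, LexicalHandler, and DeclHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void comment(char[] ch, int start, int length) {
                log.add("comment:" + new String(ch, start, length)); // LexicalHandler
            }
            @Override public void startCDATA() { log.add("startCDATA"); } // LexicalHandler
            @Override public void endCDATA() { log.add("endCDATA"); }     // LexicalHandler
            @Override public void notationDecl(String name, String publicId, String systemId) {
                log.add("notation:" + name);                              // DTDHandler
            }
            @Override public void elementDecl(String name, String model) {
                log.add("elementDecl:" + name);                           // DeclHandler
            }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setDTDHandler(handler);
        reader.setErrorHandler(handler);
        // LexicalHandler and DeclHandler have no setter methods; they are
        // installed as properties identified by these standard URIs.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);

        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : collect(SAMPLE)) {
            System.out.println(event);
        }
    }
}
```

Note that the text inside the CDATA section still arrives through the ordinary characters() method of ContentHandler; the LexicalHandler only marks where the section begins and ends.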
The only things that truly aren't available in SAX2, even after all optional extensions are included, are
The version, encoding, and standalone attributes from the XML declaration (scheduled to be added in SAX 2.1)
Insignificant white space in tags and before and after the root element
The order of attributes
The type of quotes that surround attributes
Character references
Prenormalized attribute values
Whether an attribute was specified in the instance document or defaulted in from the DTD or schema
Whether empty elements are represented as <name></name> or <name />
Skipped entities in attribute values
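To make a few of these omissions concrete, the sketch below parses two documents that differ only in quote style, empty-element syntax, white space inside tags and around the root element, and the use of a character reference. It then compares the ContentHandler events each one produces. The class name LostSyntaxDemo, the two sample documents, and the trace format are assumptions made for this example. Text is buffered before it is recorded because a parser may deliver a single run of text in several characters() calls.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class LostSyntaxDemo {

    // Hypothetical documents: same content, different surface syntax.
    public static final String DOC_A =
        "<doc><item id=\"1\"></item><p>&#65;BC</p></doc>";
    public static final String DOC_B =
        "\n\n<doc ><item   id='1' /><p>ABC</p></doc >\n";

    /** Returns a textual trace of the ContentHandler events for xml. */
    public static String trace(String xml) throws Exception {
        final StringBuilder out = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            // Emit buffered text as one event, however many chunks it arrived in.
            private void flushText() {
                if (text.length() > 0) {
                    out.append("text[").append(text).append("] ");
                    text.setLength(0);
                }
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                flushText();
                out.append("start[").append(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    out.append(' ').append(atts.getQName(i))
                       .append('=').append(atts.getValue(i));
                }
                out.append("] ");
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                out.append("end[").append(qName).append("] ");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The two traces are expected to be identical.
        System.out.println(trace(DOC_A).equals(trace(DOC_B)));
    }
}
```

Because the two traces match, a program that sees only these callbacks has no way to reconstruct which of the two source forms it was given.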
The only common use case for most of this information would be an XML editor. Editors are quite strange beasts compared with most client applications, and they really require a custom parser and API. None of the standard APIs or parsers provide all of the information an editor needs.