Parsing with StAX in JDK 6.0
by Deepak Vohra
J2EE/XML developers commonly parse XML documents with the Document Object Model (DOM) API or the Simple API for XML (SAX). However, both APIs have disadvantages. The DOM API is memory intensive, because an in-memory structure of the complete XML document is created before the document can be navigated. The SAX API is a push parsing API, in which the parser generates parsing events and pushes them to the application. StAX, in comparison, is based on a pull parsing model. In this article, you'll first create your own XML document and then learn the various ways to parse it, using the StAX pull method of event generation.
Push Parsing vs. Pull Parsing
Pull parsing has these general advantages over push parsing:
- In pull parsing, the parsing application requests events, so the client, rather than the parser, regulates the parsing.
- Pull parsing code is simpler and requires fewer libraries than push parsing.
- Pull parsing clients can read multiple XML documents simultaneously.
- Pull parsing allows you to filter XML documents and skip parsing events.
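As a rough illustration of reading more than one document at a time, the sketch below alternates between two XMLStreamReader cursors, advancing each only when the application asks. The class name, method names, and sample documents are hypothetical and are not taken from this article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class InterleavedReadSketch {

    /** Advances the cursor to the next start tag; returns its name, or null at end of document. */
    static String nextStartElement(XMLStreamReader reader) throws XMLStreamException {
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                return reader.getLocalName();
            }
        }
        return null;
    }

    /** Alternates between two documents, recording start tags in the order they are pulled. */
    static List<String> interleave(String xmlA, String xmlB) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader a = factory.createXMLStreamReader(new StringReader(xmlA));
        XMLStreamReader b = factory.createXMLStreamReader(new StringReader(xmlB));
        List<String> seen = new ArrayList<String>();
        try {
            String nameA = nextStartElement(a);
            String nameB = nextStartElement(b);
            while (nameA != null || nameB != null) {
                if (nameA != null) {
                    seen.add("A:" + nameA);
                    nameA = nextStartElement(a);   // pull from document A only when we choose to
                }
                if (nameB != null) {
                    seen.add("B:" + nameB);
                    nameB = nextStartElement(b);   // document B waits until this call
                }
            }
        } finally {
            a.close();
            b.close();
        }
        return seen;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample documents.
        System.out.println(interleave("<x><y/></x>", "<p/>"));   // prints [A:x, B:p, A:y]
    }
}
```

Neither parser runs ahead of the other, because each one produces an event only when next() is called on it.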
Enter StAX
Streaming API for XML (StAX), introduced in JSR 173 in March 2004, is a streaming, pull-parsing API for XML. StAX is a new feature in JDK 6.0, the beta version of which is available here.
A push model parser generates events until the XML document is completely parsed. Pull parsing, by contrast, is regulated by the application, so parse events are generated only when the application requests them. This means that with StAX you can suspend parsing, skip elements while parsing, and parse multiple documents. With the DOM API, you end up parsing the complete XML document into a DOM structure, which reduces parsing efficiency. With StAX, parsing events are generated as the XML document is parsed. Click here to see a performance comparison of StAX with other parsers.
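To make the idea of skipping elements concrete, here is a minimal sketch using the cursor-based XMLStreamReader. The sample document, the class name, and the helper names (articleTexts, skipElement) are assumptions made for illustration; they are not the document used later in this article.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PullParseSketch {

    // Hypothetical sample document.
    static final String XML =
        "<catalog>"
      + "<journal title=\"A\"><article>one</article></journal>"
      + "<journal title=\"B\"><article>two</article></journal>"
      + "</catalog>";

    /** Collects article text, skipping every journal whose title equals skipTitle. */
    static List<String> articleTexts(String xml, String skipTitle) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> texts = new ArrayList<String>();
        try {
            while (reader.hasNext()) {
                // The application asks for the next event; the parser never runs ahead.
                int event = reader.next();
                if (event != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if ("journal".equals(name)
                        && skipTitle.equals(reader.getAttributeValue(null, "title"))) {
                    skipElement(reader);                  // ignore this whole subtree
                } else if ("article".equals(name)) {
                    texts.add(reader.getElementText());   // text-only element content
                }
            }
        } finally {
            reader.close();
        }
        return texts;
    }

    /** Moves the cursor to the end tag that matches the current start tag. */
    static void skipElement(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(articleTexts(XML, "A"));   // prints [two]
    }
}
```

Suspending parsing needs no special API in this model: the application simply stops calling next() and resumes later with the same reader.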
The StAX API, located in the javax.xml.stream package, is also implemented in Java Web Services Developer Pack (JWSDP) 1.6 by the Sun Java Streaming XML Parser (SJSXP). The XMLStreamReader interface is used to parse an XML document. The XMLStreamWriter interface is used to generate an XML document. The XMLEventReader interface parses XML events with an event object iterator, as opposed to the cursor mechanism used by XMLStreamReader. This tutorial parses an XML document with the StAX implementation in JDK 6.0.
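The following sketch shows the other two interfaces named above side by side: XMLStreamWriter generates a small document, and XMLEventReader then iterates over it as a sequence of event objects. The element names, class name, and method names are hypothetical, chosen only for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.stream.events.XMLEvent;

public class EventIteratorSketch {

    /** Generates a small, hypothetical catalog document with XMLStreamWriter. */
    static String writeCatalog() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartDocument();
        writer.writeStartElement("catalog");
        writer.writeStartElement("journal");
        writer.writeAttribute("title", "A");
        writer.writeCharacters("one");
        writer.writeEndElement();   // closes journal
        writer.writeEndElement();   // closes catalog
        writer.writeEndDocument();
        writer.close();
        return out.toString();
    }

    /** Iterates over event objects and returns the names of all start tags, in order. */
    static List<String> startElementNames(String xml) throws XMLStreamException {
        XMLEventReader events =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> names = new ArrayList<String>();
        try {
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();   // each event is a standalone object
                if (event.isStartElement()) {
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
        } finally {
            events.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = writeCatalog();
        System.out.println(xml);
        System.out.println(startElementNames(xml));   // prints [catalog, journal]
    }
}
```

Because each XMLEvent is a separate object, it can be stored or passed to other code after the reader has moved on. The XMLStreamReader cursor, in contrast, only exposes the event it is currently positioned on.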
StAX is just one in a long list of XML features found in JDK 6.0. There's also support for Java API for XML Web Services (JAX-WS) 2.0, Java Architecture for XML Binding (JAXB) 2.0, XML Digital Signature APIs, and the SQL:2003 'XML' data type.
Preliminary Setup
If you're using JDK 6.0, the StAX API is on the classpath by default. If you're using JWSDP 1.6, install it in the <jwsdp-1.6> directory, then add <jwsdp-1.6>\sjsxp\lib\jsr173_api.jar and <jwsdp-1.6>\sjsxp\lib\sjsxp.jar to the CLASSPATH variable. jsr173_api.jar is the JSR 173 API JAR, and sjsxp.jar is the SJSXP implementation JAR.