JAXP DOM I/O
In many cases, the fastest form of transformation available is to
feed an instance of org.w3c.dom.Document directly into
JAXP. Although the transformation is fast, it does take time to
generate the DOM; DOM is also memory intensive, and may not be the
best choice for large documents. In most cases, the DOM data will be
generated dynamically as the result of a database query or some other
operation (see Chapter 1). Once the DOM is generated, simply wrap the
Document object in a DOMSource as follows:
org.w3c.dom.Document domDoc = createDomDocument();
Source xmlSource =
    new javax.xml.transform.dom.DOMSource(domDoc);
Note: Color coded lines have been broken for display purposes.
The remainder of the transformation looks identical to the file-based
transformation shown in Example
5-4. JAXP needs only the alternate input Source object
shown here to read from DOM.
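As a minimal sketch of the whole DOM-based path, the class below builds a small DOM in memory and runs it through JAXP. Several things here are assumptions made for illustration: the body of createDomDocument() (the text only names it), the element names, and the use of the identity transformation with a StringWriter result instead of the stylesheet and output used in Example 5-4.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomSourceExample {

    // Hypothetical stand-in for createDomDocument(): a real application
    // would typically build this tree from a database query or similar.
    static Document createDomDocument() throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("person");
        Element firstName = doc.createElement("firstName");
        firstName.appendChild(doc.createTextNode("Eric"));
        root.appendChild(firstName);
        doc.appendChild(root);
        return doc;
    }

    // Wraps the DOM in a DOMSource and serializes it with the identity
    // transformation (no stylesheet), returning the result as a String.
    static String transform(Document domDoc) throws TransformerException {
        Source xmlSource = new DOMSource(domDoc);
        Transformer trans = TransformerFactory.newInstance().newTransformer();
        trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        trans.transform(xmlSource, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(createDomDocument()));
    }
}
```

To apply a real stylesheet, the only change should be to pass a stylesheet Source to newTransformer; the DOMSource itself stays the same.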
JAXP SAX I/O
XSLT is designed to transform well-formed XML data into another
format, typically HTML. But wouldn't it be nice if we could also
use XSLT stylesheets to transform non-XML data into HTML? For
example, most spreadsheets have the ability to export their data
into Comma Separated Values (CSV) format, as shown here:
Burke,Eric,M
Burke,Jennifer,L
Burke,Aidan,G
One approach is parsing the file into memory, using DOM to create an
XML representation of the data, and then feeding that information into
JAXP for transformation. This approach works but requires an
intermediate programming step to convert the CSV file into a DOM tree.
A better option is to write a custom SAX parser, feeding its output
directly into JAXP. This avoids the overhead of constructing the DOM
tree, offering better memory utilization and performance.
The approach
It turns out that writing a SAX parser is quite easy.[4] All
a SAX parser does is read an XML file top to bottom and fire event
notifications as various elements are encountered. In our custom
parser, we will read the CSV file top to bottom, firing SAX events
as we read the file. A program listening to those SAX events will
not realize that the data file is CSV rather than XML; it sees only
the events. Figure 5-4
illustrates the conceptual model.
Figure 5-4. Custom SAX parser
In this model, the XSLT processor interprets the SAX events as XML
data and uses a normal stylesheet to perform the transformation. The
interesting aspect of this model is that we can easily write custom
SAX parsers for other file formats, making XSLT a useful transformation
language for just about any legacy application data.
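The event-firing idea can be sketched in a few lines. This is not the complete parser, only an illustration of the core loop: it reads CSV text line by line and calls the ContentHandler methods a real XML parser would call. The element names csvFile, line, and value are assumptions chosen for this sketch, and the naive split on commas ignores quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class CsvSaxEmitter {
    private static final Attributes NO_ATTRS = new AttributesImpl();

    // Reads CSV top to bottom and fires SAX events at the handler.
    // Element names are hypothetical; quoted fields are not handled.
    static void emit(BufferedReader csv, ContentHandler handler)
            throws IOException, SAXException {
        handler.startDocument();
        handler.startElement("", "csvFile", "csvFile", NO_ATTRS);
        String line;
        while ((line = csv.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            handler.startElement("", "line", "line", NO_ATTRS);
            for (String field : line.split(",", -1)) {
                handler.startElement("", "value", "value", NO_ATTRS);
                char[] chars = field.toCharArray();
                handler.characters(chars, 0, chars.length);
                handler.endElement("", "value", "value");
            }
            handler.endElement("", "line", "line");
        }
        handler.endElement("", "csvFile", "csvFile");
        handler.endDocument();
    }

    // Helper for experimentation: records start-element and text events
    // so you can see what a listener receives.
    static List<String> traceEvents(String csvText)
            throws IOException, SAXException {
        final List<String> events = new ArrayList<>();
        emit(new BufferedReader(new StringReader(csvText)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("text:" + new String(ch, start, length));
            }
        });
        return events;
    }

    public static void main(String[] args) throws Exception {
        traceEvents("Burke,Eric,M\nBurke,Jennifer,L\n").forEach(System.out::println);
    }
}
```

The listener in traceEvents never sees the commas or line breaks, only the events, which is the point of the model: any ContentHandler, including an XSLT processor, could be passed to emit in its place.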
In SAX, org.xml.sax.XMLReader is a standard interface
that parsers must implement. It works in conjunction with
org.xml.sax.ContentHandler, which is the interface that
listens to SAX events. For this model to work, your XSLT processor
must implement the ContentHandler interface so it can
listen to the SAX events that the XMLReader generates.
In the case of JAXP, javax.xml.transform.sax.TransformerHandler
is used for this purpose.
Obtaining an instance of TransformerHandler requires a
few extra programming steps. First, create a
TransformerFactory as usual:
TransformerFactory transFact =
    TransformerFactory.newInstance();
As before, the TransformerFactory is the JAXP abstraction
to some underlying XSLT processor. This underlying processor may not
support SAX features, so you have to query it to determine if you can
proceed:
if (transFact.getFeature(SAXTransformerFactory.FEATURE)) {
If this returns false, you are out of luck. Otherwise,
you can safely downcast to a SAXTransformerFactory and
construct the TransformerHandler instance:
SAXTransformerFactory saxTransFact =
    (SAXTransformerFactory) transFact;
// create a ContentHandler, don't specify a stylesheet. Without
// a stylesheet, raw XML is sent to the output.
TransformerHandler transHand =
    saxTransFact.newTransformerHandler();
In the code shown here, a stylesheet was not specified. JAXP defaults
to the identity transformation stylesheet, which means that the SAX
events will be "transformed" into raw XML output. To
specify a stylesheet that performs an actual transformation, pass
a Source to the method as follows:
Source xsltSource = new StreamSource(myXsltSystemId);
TransformerHandler transHand =
    saxTransFact.newTransformerHandler(xsltSource);
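To see the identity case in action, the following sketch puts the steps together: it checks the feature, downcasts, creates a TransformerHandler without a stylesheet, and then fires a few SAX events at it by hand. The greeting element, the StringWriter result, and the choice to omit the XML declaration are assumptions made for illustration.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

class IdentityHandlerExample {

    // Fires hand-written SAX events into an identity TransformerHandler
    // and returns whatever XML the processor serializes.
    static String run() throws Exception {
        TransformerFactory transFact = TransformerFactory.newInstance();
        if (!transFact.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new UnsupportedOperationException(
                    "This XSLT processor does not support SAX input");
        }
        SAXTransformerFactory saxTransFact = (SAXTransformerFactory) transFact;

        // No stylesheet: the handler performs the identity transformation.
        TransformerHandler transHand = saxTransFact.newTransformerHandler();
        transHand.getTransformer()
                .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transHand.setResult(new StreamResult(out));

        // Hypothetical events; a SAX parser would normally generate these.
        AttributesImpl noAttrs = new AttributesImpl();
        transHand.startDocument();
        transHand.startElement("", "greeting", "greeting", noAttrs);
        char[] text = "hello".toCharArray();
        transHand.characters(text, 0, text.length);
        transHand.endElement("", "greeting", "greeting");
        transHand.endDocument();

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

Passing a stylesheet Source to newTransformerHandler would leave the rest of this code unchanged; only the output would differ.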
Detailed CSV to SAX design
Before delving into the complete example program, let's step back
and look at a more detailed design diagram. The conceptual model is
straightforward, but quite a few classes and interfaces come into
play. Figure 5-5 shows the pieces
necessary for SAX-based transformations.
Figure 5-5. SAX and XSLT transformations
This diagram certainly appears to be more complex than previous
approaches, but is similar in many ways. In previous approaches,
we used the TransformerFactory to create instances
of Transformer; in the SAX approach, we start with a
subclass of TransformerFactory. Before any work can be
done, you must verify that your particular implementation supports
SAX-based transformations. The reference implementation of JAXP does
support this, although other implementations are not required to do
so. In the following code fragment, the getFeature
method of TransformerFactory will return true
if you can safely downcast to a SAXTransformerFactory
instance:
TransformerFactory transFact =
    TransformerFactory.newInstance();
if (transFact.getFeature(SAXTransformerFactory.FEATURE)) {
    // downcast is allowed
    SAXTransformerFactory saxTransFact =
        (SAXTransformerFactory) transFact;
If getFeature returns false, your only
option is to look for an implementation that does support SAX-based
transformations. Otherwise, you can proceed to create an instance of
TransformerHandler:
TransformerHandler transHand =
    saxTransFact.newTransformerHandler(myXsltSource);
This object now represents your XSLT stylesheet. As
Figure 5-5 shows,
TransformerHandler extends
org.xml.sax.ContentHandler, so it knows how to listen
to events from a SAX parser. The series of SAX events will provide
the "fake XML" data, so the only remaining piece of the
puzzle is to set the Result and tell the SAX parser to
begin parsing. The TransformerHandler also provides a
reference to a Transformer, which allows you to set
output properties such as the character encoding, whether to indent
the output, or any other attribute of <xsl:output>.
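The remaining steps might look like the sketch below, which wires a TransformerHandler between a SAX parser and a Result. To keep it self-contained, it uses the JDK's standard XML parser as the XMLReader rather than a custom CSV parser, and the inline stylesheet, the people/person input vocabulary, and the StringWriter result are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class SaxPipelineExample {

    // Hypothetical stylesheet: copies each person's text into a <name> element.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'><names>"
      + "<xsl:for-each select='people/person'>"
      + "<name><xsl:value-of select='.'/></name>"
      + "</xsl:for-each>"
      + "</names></xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws Exception {
        TransformerFactory transFact = TransformerFactory.newInstance();
        if (!transFact.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new UnsupportedOperationException(
                    "This XSLT processor does not support SAX input");
        }
        SAXTransformerFactory saxTransFact = (SAXTransformerFactory) transFact;
        TransformerHandler transHand = saxTransFact.newTransformerHandler(
                new StreamSource(new StringReader(XSLT)));

        // Output properties are set through the handler's Transformer.
        Transformer trans = transHand.getTransformer();
        trans.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        trans.setOutputProperty(OutputKeys.INDENT, "yes");

        // Set the Result before any events arrive.
        StringWriter out = new StringWriter();
        transHand.setResult(new StreamResult(out));

        // Any XMLReader works here; a custom parser would take this place.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(transHand);

        // Tell the parser to begin; the handler transforms the events.
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(
            "<people><person>Eric</person><person>Jennifer</person></people>"));
    }
}
```

The exact whitespace produced by the indent property varies between XSLT processors, so code that inspects the output should not depend on it.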
Footnote:
4. Our examples use SAX 2.