Putting SAX to work
Now we're ready to write a class that reads an XML file. The class should
extend "DefaultHandler" and must also:
- Define the methods "startElement", "endElement", and "characters"
- Instantiate a SAX parser
- Register itself as an event handler for this parser
- Invoke the parser
- Do something with the parsed results
To illustrate the techniques we'll make a Java program with a main method
that receives the name of the XML file we want to process. The skeleton for such
a program looks like this (the 5 tasks listed above are shown as comments like
this: ***3***):
import java.util.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class MySAXParser extends DefaultHandler {
    static final String PARSER = "org.apache.xerces.parsers.SAXParser";
    StringBuffer b = new StringBuffer(); // collects text
    List list; // holds parsed results
    . . .

    public static void main(String[] args) {
        // main receives the name of an XML file
        if (args.length > 0) {
            MySAXParser mp = new MySAXParser();
            try {
                mp.processFile(args[0]);
                mp.listData();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public void processFile(String file) throws Exception {
        // Parse an XML file
        list = new LinkedList();
        XMLReader parser =
            XMLReaderFactory.createXMLReader(PARSER); // ***2***
        parser.setContentHandler(this); // ***3***
        parser.parse(file); // ***4***
    }

    // ***1***
    public void startElement(String uri, String localName, String qname,
                             Attributes attributes) {
        b.setLength(0); // empty character buffer
        . . .
    }

    // ***1***
    public void endElement(String uri, String localName, String qname) {
        . . . // ***5***
        b.setLength(0); // empty character buffer after it has been used
    }

    // ***1***
    public void characters(char[] chars, int start, int length) {
        // collect the characters
        b.append(chars, start, length); // ***5***
    }

    public void listData() {
        // List data
        . . .
    }
}
The SAX parser we use in the examples is Apache's Xerces parser. Note that its
class name is given as a constant at the start of the program. If you want to be
able to choose another parser without recompiling your program, you can supply
the parser's class name in the system property "org.xml.sax.driver" and leave
out the parameter to "XMLReaderFactory.createXMLReader()".
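As a minimal sketch of that alternative, the class below calls "createXMLReader()" without an argument, so the parser is chosen at run time. The class name ParserSelectionDemo and the method countElements are made up for this illustration and are not part of the article's program. It reads the XML from a string so that it can be run without a file. Note that recent JDKs mark XMLReaderFactory as deprecated in favor of javax.xml.parsers.SAXParserFactory; the call still works but the compiler prints a deprecation note.

```java
import java.io.StringReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical demo class, not part of the article's program.
class ParserSelectionDemo extends DefaultHandler {
    private int elements = 0; // number of start tags seen

    @Override
    public void startElement(String uri, String localName, String qname,
                             Attributes attributes) {
        elements++;
    }

    // Parses an XML string with whatever parser the environment selects.
    static int countElements(String xml) {
        ParserSelectionDemo handler = new ParserSelectionDemo();
        try {
            // No class name here: the factory first consults the system
            // property "org.xml.sax.driver" and otherwise falls back to a
            // default parser.
            XMLReader parser = XMLReaderFactory.createXMLReader();
            parser.setContentHandler(handler);
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.elements;
    }

    public static void main(String[] args) {
        // prints 3
        System.out.println(countElements("<collection><dvd/><dvd/></collection>"));
    }
}
```

To pick Xerces explicitly, the property can be set on the command line, for example "java -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser ParserSelectionDemo", provided the Xerces jar is on the classpath.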
To show you how the technique works we'll parse an XML file containing a set
of DVDs:
<?xml version = "1.0" ?>
<collection>
    <dvd>
        <title>Lord of the Rings: The Fellowship of the Ring</title>
        <length>178</length>
        <actor>Ian Holm</actor>
        <actor>Elijah Wood</actor>
        <actor>Ian McKellen</actor>
    </dvd>
    <dvd>
        <title>The Matrix</title>
        <length>136</length>
        <actor>Keanu Reeves</actor>
        <actor>Laurence Fishburne</actor>
    </dvd>
    <dvd>
        <title>Amadeus</title>
        <length>158</length>
        <actor>F. Murray Abraham</actor>
        <actor>Tom Hulce</actor>
        <actor>Elizabeth Berridge</actor>
    </dvd>
</collection>
Since XML demands exactly one root element, we wrap the DVDs inside a
"collection" element. For every DVD we have the title of the movie, the length
in minutes, and a list of actors. In our program, we'll build a list of DVDs,
each containing the title, the length, and a list of actors. Since we're storing
all the data in memory, the DOM API might actually be a better choice for this
case if this weren't a teaching model for SAX.
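Before building full DVD objects, it may help to see the three callbacks from the skeleton cooperate on this kind of data. The sketch below is an illustration under assumptions, not the article's program: the class TitleCollector and the method parseTitles are hypothetical, only the "title" elements are collected, the XML is passed in as a string, and the parser is obtained with the no-argument "createXMLReader()" so that it runs on the JDK's built-in parser rather than Xerces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical demo class (not from the article): collects <title> texts only.
class TitleCollector extends DefaultHandler {
    private final StringBuffer b = new StringBuffer(); // collects text
    private final List<String> titles = new ArrayList<String>();

    @Override
    public void startElement(String uri, String localName, String qname,
                             Attributes attributes) {
        b.setLength(0); // discard whitespace collected between tags
    }

    @Override
    public void endElement(String uri, String localName, String qname) {
        // An XMLReader is namespace-aware by default, so localName is set.
        if ("title".equals(localName)) {
            titles.add(b.toString().trim());
        }
        b.setLength(0); // empty the buffer only after it has been used
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        b.append(chars, start, length); // text may arrive in several chunks
    }

    // Parses an XML string and returns the titles in document order.
    static List<String> parseTitles(String xml) {
        TitleCollector handler = new TitleCollector();
        try {
            XMLReader parser = XMLReaderFactory.createXMLReader();
            parser.setContentHandler(handler);
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.titles;
    }

    public static void main(String[] args) {
        String xml =
            "<collection>"
            + "<dvd><title>The Matrix</title><length>136</length>"
            + "<actor>Keanu Reeves</actor></dvd>"
            + "<dvd><title>Amadeus</title><length>158</length>"
            + "<actor>Tom Hulce</actor></dvd>"
            + "</collection>";
        System.out.println(parseTitles(xml)); // prints [The Matrix, Amadeus]
    }
}
```

The buffer is cleared in startElement and again at the end of endElement, so the text seen in endElement is always the text of the element that is just closing.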
A DVD class
First we'll create a DVD class:
package hansen.playground;

import java.util.*;

public class DVD {
    String title;
    int length;
    List actors;

    public DVD(String title, int length, List actors) {
        this.title = title;
        this.length = length;
        this.actors = actors;
    }

    public String getTitle() {return title;}

    public int getLength() {return length;}

    public String getActors() {
        // Return a comma-separated list of actors
        StringBuffer s = new StringBuffer();
        for (Iterator i = actors.iterator(); i.hasNext();) {
            if (s.length() != 0) s.append(", ");
            s.append((String)i.next());
        }
        return s.toString();
    }
}