Java and XML: putting SAX to work

Contents
Why use XML?
Reading an XML file
Putting SAX to work
A complete event handler program for SAX
Sorting the data

Putting SAX to work

Now we're ready to write a class that reads an XML file. The class should extend "DefaultHandler" and must also:

  1. Override the methods "startElement", "endElement", and "characters"
  2. Instantiate a SAX parser
  3. Register itself as an event handler for this parser
  4. Invoke the parser
  5. Do something with the parsed results

To illustrate the technique, we'll write a Java program whose main method receives the name of the XML file we want to process. The skeleton of such a program looks like this (the five tasks listed above are marked with comments like this: ***3***):

	import java.util.*;
	import org.xml.sax.*;
	import org.xml.sax.helpers.*;
	
	public class MySAXParser extends DefaultHandler {
	  static final String PARSER = "org.apache.xerces.parsers.SAXParser";
	  StringBuffer b = new StringBuffer(); // collects text
	  List list; // holds parsed results
	  . . .
	  public static void main(String[] args) {
	    // main receives the name of an XML file 
	    if (args.length > 0) {
	      MySAXParser mp = new MySAXParser();
	      try {
	        mp.processFile(args[0]);
	        mp.listData();
	      } catch (Exception e) { e.printStackTrace(); }
	    }
	  }  
	  
	  public void processFile(String file) throws Exception {
	    // Parse an XML file 
	    list = new LinkedList();
	    XMLReader parser = 
	              XMLReaderFactory.createXMLReader(PARSER); // ***2***
	    parser.setContentHandler(this); // ***3***
	    parser.parse(file); // ***4***
	  }
	  
	  // ***1*** 
	  public void startElement(String uri, String localName, String qname, 
	                           Attributes attributes) {
	    b.setLength(0); // empty character buffer
	    . . .
	  } 
	  
	  // ***1*** 
	  public void endElement(String uri, String localName, String qname) { 
	    . . .   // ***5*** use the text collected in b
	    b.setLength(0); // then empty the character buffer
	  }
	
	  // ***1*** 
	  public void characters(char[] chars, int start, int length) { 
	    // collect the characters
	    b.append(chars, start, length); // ***5***
	  }
	  
	  public void listData() {
	    // List data
	    . . .
	  }  
	}        

The SAX parser we use in the examples is Apache's Xerces parser. Note that its class name is given as a constant at the start of the program. If you want to be able to choose another parser without recompiling your program, you can leave out the parameter to "XMLReaderFactory.createXMLReader()" and supply the parser's class name in the system property "org.xml.sax.driver" instead.
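As a minimal sketch of that second option, the class below asks the factory for a reader without naming a parser class in the source. The class name ReaderLookup and the method newReader are made up for this illustration. It assumes a JDK where the no-argument call falls back to a built-in parser when the property is not set, and note that newer JDKs (Java 9 onwards) mark XMLReaderFactory as deprecated in favour of SAXParserFactory.

```java
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical helper, not part of the tutorial's program.
public class ReaderLookup {

  // XMLReaderFactory is deprecated on Java 9+, hence the annotation.
  @SuppressWarnings("deprecation")
  public static XMLReader newReader() throws Exception {
    // No class name is passed: the factory consults the system property
    // "org.xml.sax.driver" first. Assumption: if the property is unset,
    // the JDK supplies a default parser.
    return XMLReaderFactory.createXMLReader();
  }

  public static void main(String[] args) throws Exception {
    XMLReader reader = newReader();
    System.out.println("Using parser: " + reader.getClass().getName());
  }
}
```

With Xerces on the classpath, the parser could then be chosen at launch time, for example: java -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser ReaderLookup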

To show you how the technique works, we'll parse an XML file containing a set of DVDs:

	<?xml version = "1.0" ?>
	<collection>
	  <dvd>
	    <title>Lord of the Rings: The Fellowship of the Ring</title>
	    <length>178</length>
	    <actor>Ian Holm</actor>
	    <actor>Elijah Wood</actor>
	    <actor>Ian McKellen</actor>
	  </dvd>
	  <dvd>
	    <title>The Matrix</title>
	    <length>136</length>
	    <actor>Keanu Reeves</actor>
	    <actor>Laurence Fishburne</actor>
	  </dvd>
	  <dvd>
	    <title>Amadeus</title>
	    <length>158</length>
	    <actor>F. Murray Abraham</actor>
	    <actor>Tom Hulce</actor>
	    <actor>Elizabeth Berridge</actor>
	  </dvd>
	</collection>

Since XML demands exactly one root element, we wrap the DVDs inside a "collection" tag. For every DVD we have the title of the movie, the length in minutes, and a list of actors. In our program, we'll build a list of DVDs, each containing the title, the length, and a list of actors. Since we're storing all the data in memory, the DOM API might actually be a better choice for this case if this weren't a teaching model for SAX.
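To get a feel for the order in which the parser reports a file like this, here is a small tracing sketch. It is not the tutorial's program: the class EventTrace and its trace method are invented for this illustration, and it uses the JDK's built-in parser through SAXParserFactory rather than Xerces so that it runs without extra libraries. It follows the same buffer pattern as the skeleton: clear the buffer at each start tag, append in "characters" (which a parser may call more than once for a single piece of text), and read the buffer at the end tag.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical tracing handler: records each SAX event as a line of text.
public class EventTrace extends DefaultHandler {
  private final List<String> events = new ArrayList<>();
  private final StringBuilder text = new StringBuilder();

  @Override
  public void startElement(String uri, String localName, String qname,
                           Attributes attributes) {
    text.setLength(0); // forget whitespace seen before this tag
    events.add("start " + qname);
  }

  @Override
  public void characters(char[] chars, int start, int length) {
    text.append(chars, start, length); // may arrive in several chunks
  }

  @Override
  public void endElement(String uri, String localName, String qname) {
    // Read the buffer first, then clear it for the next element.
    String value = text.toString().trim();
    events.add(value.isEmpty() ? "end " + qname
                               : "end " + qname + " = " + value);
    text.setLength(0);
  }

  // Parses the given XML text and returns the recorded events in order.
  public static List<String> trace(String xml) throws Exception {
    EventTrace handler = new EventTrace();
    SAXParserFactory.newInstance().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), handler);
    return handler.events;
  }

  public static void main(String[] args) throws Exception {
    String xml = "<collection><dvd><title>The Matrix</title>"
               + "<length>136</length></dvd></collection>";
    for (String event : trace(xml)) {
      System.out.println(event);
    }
  }
}
```

Container elements such as "dvd" and "collection" end with an empty buffer, because the buffer was cleared at the end tag of the last child and only whitespace (if any) has arrived since. That is why only the leaf elements carry a value in the trace.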

A DVD class

First we'll create a DVD class:

	package hansen.playground;
	
	import java.util.*;
	
	public class DVD {
	  String title;
	  int length;
	  List actors;
	  
	  public DVD(String title, int length, List actors) {
	    this.title = title;
	    this.length = length;
	    this.actors = actors;
	  }
	  
	  public String getTitle() {return title;}
	  
	  public int getLength() {return length;}
	  
	  public String getActors() {
	    // Return a comma-separated list of actors
	    StringBuffer s = new StringBuffer();
	    for (Iterator i = actors.iterator(); i.hasNext();) {
	      if (s.length() != 0) s.append(", ");
	      s.append((String)i.next());
	    }  
	    return s.toString();
	  }
	
	} 
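A short usage sketch shows what a caller gets back from the accessors. So that it compiles on its own, it carries a condensed stand-in for the DVD class (nested inside a made-up DvdDemo class, and using String.join instead of the explicit loop above); the describe method is likewise only an illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the tutorial's code.
public class DvdDemo {

  // Condensed stand-in for the DVD class, repeated only to keep this
  // sketch self-contained. It joins actors with String.join.
  public static class DVD {
    private final String title;
    private final int length;
    private final List<String> actors;

    public DVD(String title, int length, List<String> actors) {
      this.title = title;
      this.length = length;
      this.actors = actors;
    }

    public String getTitle() { return title; }
    public int getLength() { return length; }
    public String getActors() { return String.join(", ", actors); }
  }

  // Formats one DVD as a single line of text.
  public static String describe(DVD dvd) {
    return dvd.getTitle() + " (" + dvd.getLength() + " min): "
         + dvd.getActors();
  }

  public static void main(String[] args) {
    DVD matrix = new DVD("The Matrix", 136,
        Arrays.asList("Keanu Reeves", "Laurence Fishburne"));
    // prints: The Matrix (136 min): Keanu Reeves, Laurence Fishburne
    System.out.println(describe(matrix));
  }
}
```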
