Title: Java and XSLT
ISBN: 0-596-00143-6, Order Number: 143-6
US Price: $39.95
© O'Reilly & Associates, Inc.

Writing the custom parser

Writing the actual SAX parser sounds harder than it really is. The process basically involves implementing the org.xml.sax.XMLReader interface, which provides numerous methods you can safely ignore for most applications. For example, when parsing a CSV file, it is probably not necessary to deal with namespaces or validation. The code for AbstractXMLReader.java is shown in Example 5-5. This is an abstract class that provides basic implementations of every method in the XMLReader interface except for the parse( ) method. This means that all you need to do to write a parser is create a subclass and override this single method.

Example 5-5: AbstractXMLReader.java

package com.oreilly.javaxslt.util;

import java.io.IOException;
import java.util.*;
import org.xml.sax.*;

/**
 * An abstract class that implements the SAX2 XMLReader 
 * interface. The intent of this class is to make it easy 
 * for subclasses to act as SAX2 XMLReader implementations. 
 * This makes it possible, for example, for them to emit SAX2 
 * events that can be fed into an XSLT processor for 
 * transformation.
 */
public abstract class AbstractXMLReader implements org.xml.sax.XMLReader {
    private Map featureMap = new HashMap(  );
    private Map propertyMap = new HashMap(  );
    private EntityResolver entityResolver;
    private DTDHandler dtdHandler;
    private ContentHandler contentHandler;
    private ErrorHandler errorHandler;

    /**
     * The only abstract method in this class. Derived classes
     * can parse any source of data and emit SAX2 events to the
     * ContentHandler.
     */
    public abstract void parse(InputSource input)
            throws IOException, SAXException;

    public boolean getFeature(String name)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        Boolean featureValue = (Boolean) this.featureMap.get(name);
        return (featureValue == null) ? false
                : featureValue.booleanValue(  );
    }

    public void setFeature(String name, boolean value)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        // Boolean.valueOf reuses the cached instances; new Boolean(...) is deprecated.
        this.featureMap.put(name, Boolean.valueOf(value));
    }

    public Object getProperty(String name)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        return this.propertyMap.get(name);
    }

    public void setProperty(String name, Object value)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        this.propertyMap.put(name, value);
    }

    public void setEntityResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    public EntityResolver getEntityResolver(  ) {
        return this.entityResolver;
    }

    public void setDTDHandler(DTDHandler dtdHandler) {
        this.dtdHandler = dtdHandler;
    }

    public DTDHandler getDTDHandler(  ) {
        return this.dtdHandler;
    }

    public void setContentHandler(ContentHandler contentHandler) {
        this.contentHandler = contentHandler;
    }

    public ContentHandler getContentHandler(  ) {
        return this.contentHandler;
    }

    public void setErrorHandler(ErrorHandler errorHandler) {
        this.errorHandler = errorHandler;
    }

    public ErrorHandler getErrorHandler(  ) {
        return this.errorHandler;
    }

    public void parse(String systemId)
            throws IOException, SAXException {
        parse(new InputSource(systemId));
    }
}
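
To make the "override a single method" idea concrete, here is a minimal sketch of a reader whose parse( ) method emits a fixed sequence of SAX events. It is an illustration only, not the book's code: the class names GreetingReader and EventLog are hypothetical, and so that the snippet compiles on its own it extends the JDK's org.xml.sax.helpers.XMLFilterImpl as a stand-in for AbstractXMLReader. A real subclass would extend AbstractXMLReader from Example 5-5 in the same way.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/**
 * Hypothetical reader that ignores its input and emits the events for
 * <greeting>hello</greeting>. XMLFilterImpl is an assumption used here
 * in place of AbstractXMLReader so the sketch is self-contained.
 */
class GreetingReader extends XMLFilterImpl {
    @Override
    public void parse(InputSource input) throws IOException, SAXException {
        ContentHandler handler = getContentHandler();
        if (handler == null) {
            return; // nobody is listening, so there is nothing to emit
        }
        Attributes noAttributes = new AttributesImpl();
        char[] text = "hello".toCharArray();

        handler.startDocument();
        handler.startElement("", "greeting", "greeting", noAttributes);
        handler.characters(text, 0, text.length);
        handler.endElement("", "greeting", "greeting");
        handler.endDocument();
    }

    /** Runs the reader against a handler that records what it receives. */
    static List<String> demo() throws IOException, SAXException {
        EventLog log = new EventLog();
        GreetingReader reader = new GreetingReader();
        reader.setContentHandler(log);
        reader.parse(new InputSource()); // the input is unused in this sketch
        return log.events;
    }
}

/** Hypothetical handler that records element and text events as strings. */
class EventLog extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        events.add("start:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add("text:" + new String(ch, start, length));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }
}
```

Calling GreetingReader.demo( ) returns the recorded events start:greeting, text:hello, and end:greeting, in that order. Any ContentHandler, including the one an XSLT processor installs, receives the same calls.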

Creating the subclass, CSVXMLReader, involves overriding the parse( ) method and actually scanning through the CSV file, emitting SAX events as elements in the file are encountered. While the SAX portion is very easy, parsing the CSV file is a little more challenging. To make this class as flexible as possible, it was designed to parse through any CSV file that a spreadsheet such as Microsoft Excel can export. For simple data, your CSV file might look like this:

Burke,Eric,M
Burke,Jennifer,L
Burke,Aidan,G

The XML representation of this file is shown in Example 5-6. The only real drawback here is that CSV files are strictly positional, meaning that names are not assigned to each column of data. This means that the XML output merely contains a sequence of three <value> elements for each line, so your stylesheet will have to select items based on position.

Example 5-6: Example XML output from CSV parser

<?xml version="1.0" encoding="UTF-8"?>
<csvFile>
  <line>
    <value>Burke</value>
    <value>Eric</value>
    <value>M</value>
  </line>
  <line>
    <value>Burke</value>
    <value>Jennifer</value>
    <value>L</value>
  </line>
  <line>
    <value>Burke</value>
    <value>Aidan</value>
    <value>G</value>
  </line>
</csvFile>
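
Because the output carries no column names, a stylesheet has to address each field by its position within the line. The following sketch shows one way to do that. It is not taken from the book: the class name PositionalSelectDemo and the embedded stylesheet are hypothetical, and the XML is the content of Example 5-6 written without indentation. The stylesheet selects value[2] (first name) and value[1] (last name) for every line element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo: selecting CSV fields by position with XSLT. */
class PositionalSelectDemo {
    // Same structure as Example 5-6, without whitespace between elements.
    static final String XML =
          "<csvFile>"
        + "<line><value>Burke</value><value>Eric</value><value>M</value></line>"
        + "<line><value>Burke</value><value>Jennifer</value><value>L</value></line>"
        + "<line><value>Burke</value><value>Aidan</value><value>G</value></line>"
        + "</csvFile>";

    // Illustrative stylesheet: value[2] is the first name, value[1] the last name.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/csvFile'>"
        + "<xsl:for-each select='line'>"
        + "<xsl:value-of select='value[2]'/>"
        + "<xsl:text> </xsl:text>"
        + "<xsl:value-of select='value[1]'/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the result as text. */
    static String transform(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.print(transform(XML, XSLT));
    }
}
```

Run against the data above, this should produce one line per person with the first name ahead of the last name, for example "Eric Burke". If a column is ever inserted or reordered in the CSV file, every positional index in the stylesheet has to change with it, which is the drawback described above.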

One enhancement would be to design the CSV parser so it could accept a list of meaningful column names as parameters, and these could be used in the XML that is generated. Another option would be to write an XSLT stylesheet that transformed this initial output into another form of XML that used meaningful column names. To keep the code example relatively manageable, these features were omitted from this implementation. But there are some complexities to the CSV file format that have to be considered. For example, fields that contain commas must be surrounded with quotes:

"Consultant,Author,Teacher",Burke,Eric,M
Teacher,Burke,Jennifer,L
None,Burke,Aidan,G

To further complicate matters, fields may also contain double-quote characters ("). In this case, each quote is doubled, much as you write a double backslash (\\) in Java to represent a single backslash. In the following example, the first column contains one double-quote character, so the entire field is quoted and the embedded quote is doubled:

"test""quote",Teacher,Burke,Jennifer,L

This would be interpreted as:

test"quote,Teacher,Burke,Jennifer,L
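
The two quoting rules can be sketched as a small state machine that walks one line of text and tracks whether it is currently inside a quoted field. This is a minimal illustration under stated assumptions, not the book's Example 5-7: the class name CsvLineSplitter and its split method are hypothetical, and the sketch handles a single line only, so quoted fields that span line breaks are out of scope.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits one CSV line into fields, honoring quotes. */
final class CsvLineSplitter {
    private CsvLineSplitter() {
    }

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    current.append(c);      // commas are literal inside quotes
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"');    // doubled quote means one literal quote
                    i++;                    // skip the second quote of the pair
                } else {
                    inQuotes = false;       // closing quote of the field
                }
            } else if (c == '"') {
                inQuotes = true;            // opening quote of a quoted field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);       // start collecting the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());     // the last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("\"Consultant,Author,Teacher\",Burke,Eric,M"));
        System.out.println(split("\"test\"\"quote\",Teacher,Burke,Jennifer,L"));
    }
}
```

For the first sample line, the quoted commas stay inside a single field, giving four fields in total. For the second, the doubled quote collapses to one, so the first field is test"quote.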

The code in Example 5-7 shows the complete implementation of the CSV parser.
