Digesting XML documents

by Keld H. Hansen

Introduction

When it comes to parsing XML documents, there are several ways to proceed. One is SAX, an event-driven API that lets you catch precisely the data you need. SAX is fairly low-level, however, so you'd often prefer a tool like JDOM, which builds an in-memory tree of the XML document. You can then manipulate the tree through ordinary method calls.
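
To make the contrast between the two styles concrete, here is a minimal sketch that uses only the parsers bundled with the JDK. The JDK's DOM API stands in for JDOM here (JDOM's own classes are different), and the class name ParsingStyles, its method names and the sample document are my own choices for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: one method per parsing style.
class ParsingStyles {
    static final String XML = "<A><B>data for B</B><C><D>data for D</D></C></A>";

    // Event-driven (SAX): the parser calls us back for every start tag, text run
    // and end tag; we keep only the text seen while inside the wanted element.
    // Assumes the element does not contain another element of the same name.
    static String textOfViaSax(String xml, String elementName) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inside;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (qName.equals(elementName)) inside = true;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inside) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (qName.equals(elementName)) inside = false;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return text.toString();
    }

    // Tree-based (DOM): the whole document is loaded first, then we navigate it.
    // Assumes at least one element with the given name exists.
    static String textOfViaDom(String xml, String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(elementName).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOfViaSax(XML, "B"));
        System.out.println(textOfViaDom(XML, "D"));
    }
}
```

Even for a single value, the SAX version needs three callbacks and a flag, which is the kind of bookkeeping the higher-level tools take off your hands.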

Another approach is to let either Castor or XMLBeans convert the XML document into a linked structure of Java beans, which are easy to access from your program. 

If you're only interested in parts of an XML document, or you don't care about fancy tree structures, Digester from Jakarta Commons could be an option. It allows you to extract the parts of the XML document you need, and puts few restrictions on the way you store data in your program. In general it's simpler to use than the other tools I mentioned.

It resembles SAX in that it links XML elements to methods in your program, but it's much simpler to use. Digester has a programming API, and it can also read an XML configuration file that describes how processing should be done. Its architecture is very open as well: you can define your own processing rules by coding separate plug-ins.

If you'd like another XML tool for your toolbox, read on. First I'll tell you how to install Digester, then we'll look at a few basic examples, and finally we'll build a Struts web application that processes and shows data from an RSS (Really Simple Syndication, or Rich Site Summary) feed.

Installing Digester

To run Digester you'll need the Digester JAR file plus three additional JAR files from other Jakarta Commons projects: BeanUtils, Collections and Logging. All four can be downloaded from the same Jakarta download page. After the download, place the JAR files in your classpath and you're ready to run.
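
If you want to confirm that all four libraries really are on the classpath, a small check like the following can help. It is a sketch under some assumptions: the class ClasspathCheck is my own, and the class names it probes use the package names of the 1.x-era releases (later major versions of some of these libraries moved to different packages).

```java
// Hypothetical helper: probes for one well-known class from each required library.
class ClasspathCheck {
    // Package names assume the 1.x-era releases of each Commons project.
    static final String[] REQUIRED = {
        "org.apache.commons.digester.Digester",
        "org.apache.commons.beanutils.BeanUtils",
        "org.apache.commons.collections.ArrayStack",
        "org.apache.commons.logging.LogFactory"
    };

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : REQUIRED) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

Run it with the same classpath you intend to use for your Digester programs; any line marked MISSING points at a JAR file that still needs to be added.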

The Digester design

When using Digester there are some simple rules you must know and follow:

Determine what data you'll need from the XML document
You may choose to extract anything from a single data value to all data
 
Create Java classes (if you don't have them already) to hold the data you extract
You must also have methods in these classes that can be used for storing the data. The standard bean setter methods are fine for non-Collection data. Collections can be handled by "add-methods", each of which adds one element to a Collection. The examples that follow show how this works.
 
Digester identifies each XML element by a simple string pattern
Let's take a simple XML document like this:

<A>
   <B>data for B</B>
   <C>
      <D>data for D</D>
   </C>
</A>

To identify the B element, Digester uses the string "A/B".
The data in D is referred to by "A/C/D", since D is nested inside C rather than B.
The strings are called "element patterns". More on this in the examples below.
 
Rules for element matching
When an XML element matches a pattern, you must specify what should happen in your program. You do this by stating which Java objects should be created or which Java methods should be called.
 
If possible, use the same names for XML elements and attributes as for bean properties
This just makes coding simpler.
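
Here is how these rules fit together, sketched without Digester at all. This is not Digester's API or implementation: it is a toy built on the JDK's SAX parser, and the names PatternRules, onText, parse and Holder are all made up for the illustration. The handler keeps a list of the currently open elements, joins them with "/" to form the element pattern, and runs whatever action was registered for that pattern. The Holder bean has a standard setter for the B value and an add-method for the D values.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Toy imitation of pattern-based rules; NOT Digester's API or implementation.
class PatternRules extends DefaultHandler {
    static final String SAMPLE =
            "<A>\n   <B>data for B</B>\n   <C>\n      <D>data for D</D>\n   </C>\n</A>";

    private final Map<String, Consumer<String>> rules = new HashMap<>();
    private final Deque<String> open = new ArrayDeque<>(); // root first, current element last
    private final StringBuilder text = new StringBuilder();

    // Register an action that receives the element's text when the pattern matches.
    PatternRules onText(String pattern, Consumer<String> action) {
        rules.put(pattern, action);
        return this;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        open.addLast(qName);
        text.setLength(0); // simplification: text preceding a child element is discarded
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The open elements joined by "/" form the pattern, e.g. "A/C/D".
        Consumer<String> action = rules.get(String.join("/", open));
        if (action != null) action.accept(text.toString().trim());
        open.removeLast();
        text.setLength(0);
    }

    void parse(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), this);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        new PatternRules()
                .onText("A/B", holder::setB)   // setter for a single value
                .onText("A/C/D", holder::addD) // add-method for a collection
                .parse(SAMPLE);
        System.out.println(holder.getB() + " / " + holder.getDs());
        // prints: data for B / [data for D]
    }
}

// Hypothetical bean: a standard setter plus an "add-method" for a collection.
class Holder {
    private String b;
    private final List<String> ds = new ArrayList<>();

    public String getB() { return b; }
    public void setB(String b) { this.b = b; }
    public void addD(String d) { ds.add(d); }
    public List<String> getDs() { return ds; }
}
```

The point of the sketch is the division of labour: the bean knows nothing about XML, and the rules know nothing about the bean beyond the method to call for a given pattern.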
