Converting XML to JavaBeans with XMLBeans

Introduction

A month or so ago one of my colleagues told me about a new XML-to-JavaBeans conversion tool called XMLBeans that he had had success with. He even compared it to Castor-XML, a conversion tool many programmers have been using for a long time and have been extremely satisfied with. A couple of weeks ago I had the chance to look more closely at this new tool, and in this article I'll try to sum up my findings.

XMLBeans was developed by BEA Systems, and was donated as open source software to the Apache Software Foundation in December 2003. Like Castor, XMLBeans takes an XML schema file and produces a set of Java classes that can be used to convert ("unmarshal") an XML instance that conforms to the schema into Java objects. Reading the specs for XMLBeans, the first thing that strikes one is its ability to handle any (!) XML schema. The other thing that made me really interested was the fact that XMLBeans preserves the underlying XML document and schema structure, thus making it possible to use query languages like XQuery (see however my discouraging findings described later) on the document structure. Finally, I really liked that the code generation tool would make me a jar file ready for use in my programs.

If you're not interested yet, then I suggest you read the XMLBeans FAQ from BEA. It gives a lot of good, relevant information that I'm sure will make you curious about this new tool.

This article will show you how to:

Finally I'll show how you could wrap the XMLBeans tool so it can be replaced by any other marshalling tool. This is always a nice thing to do, and the technique I'll show is quite general, and not related to XML marshalling.

In the resources section you'll find pointers to newsgroups and some interesting discussions about XMLBeans and its relations to other tools and standards.

Install XMLBeans

Here's a link to BEA Systems from where you may download XMLBeans. It's not available yet from the official Apache site because at the time of this writing it's still in the "incubation process" at Apache. Nonetheless, you may find zip files on several Apache mirror sites if you search for "apache xmlbeans" on the web. I've not compared the BEA and Apache downloads in detail, and my experience with XMLBeans comes from using the version from BEA. By the time you read this article, XMLBeans might be available from the official Apache XMLBeans site.

The XMLBeans.zip from BEA contains all you need: documentation, examples, utilities and a jar file with the general classes. I assume that you, the reader, have a JDK 1.4.x available. The steps to follow to install XMLBeans are these:

  1. Unzip the download in a directory, for example c:\xmlbeans,
  2. Set the environment variable XMLBEANDIR to point to the directory containing the xbean.jar file, e.g. on Windows:
    set XMLBEANDIR=c:\xmlbeans\xkit\lib
  3. To generate the Java classes for your schema run the scomp utility located in the bin folder. For convenience add this folder to your PATH:
    set PATH=%PATH%;c:\xmlbeans\xkit\bin 
  4. Try to enter "scomp". This will show you the usage information for scomp

This completes the installation!

The First Try

For our first example we'll use an XML Schema file I used earlier in an article about Castor. It describes a simplified deployment descriptor for a web application. Let's look at a picture of this schema:

Figure 1. The structure of the web.xml schema (webapp.xsd)
 

To open a new window with a listing of the schema file click here.

Here's an example of a deployment descriptor file using this schema, namely the one used by Struts' "blank application":

Listing 1. web.xml for the Struts-blank application

<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE web-app
  PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
  "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">

<web-app>

  <!-- Standard Action Servlet Configuration (with debugging) -->
  <servlet>
    <servlet-name>action</servlet-name>
    <servlet-class>org.apache.struts.action.ActionServlet</servlet-class>
    <init-param>
      <param-name>config</param-name>
      <param-value>/WEB-INF/struts-config.xml</param-value>
    </init-param>
    <init-param>
      <param-name>debug</param-name>
      <param-value>2</param-value>
    </init-param>
    <init-param>
      <param-name>detail</param-name>
      <param-value>2</param-value>
    </init-param>
    <load-on-startup>2</load-on-startup>
  </servlet>

  <!-- Standard Action Servlet Mapping -->
  <servlet-mapping>
    <servlet-name>action</servlet-name>
    <url-pattern>*.do</url-pattern>
  </servlet-mapping>

  <!-- The Usual Welcome File List -->
  <welcome-file-list>
    <welcome-file>index.jsp</welcome-file>
  </welcome-file-list>

  <!-- Struts Tag Library Descriptors -->
  <taglib>
    <taglib-uri>/tags/struts-bean</taglib-uri>
    <taglib-location>/WEB-INF/struts-bean.tld</taglib-location>
  </taglib>

  . . . (several other taglibs follow) . . .
</web-app>

Our first program will read and parse this file using XMLBeans. The first step is to generate the specific XMLBeans classes from the schema.

Go to the directory where your webapp.xsd file is located and enter:

    scomp -out webapp.jar webapp.xsd

The -out parameter is used to name the output jar file. This is the result you should see:

[There's a known bug in scomp. If you receive a "CreateProcess exception", then the scomp utility is trying to find the Java compiler (the javac.exe file) in the Java runtime directory, instead of looking in the JDK bin directory. The fix is to put the location of the JDK bin directory at the start of the PATH environment variable. You may find more information in the newsgroup (see the resources section).]

Look into the generated webapp.jar file to see what was generated. You'll notice that the original schema file is there, along with several files that contain the element names from the xsd: WebApp, ServletMapping, Taglib etc. There are many files, and all in all it's a somewhat confusing picture.

If you want to see the Java code that was generated there is a "-src" option that saves the source to a specified directory.

Before we start coding the first program we'll need to add webapp.jar and the xbean.jar file to our classpath. Our code starts out like this:

Listing 2-1. The FirstTry class

package dk.hansen.playground;

import java.io.File;
import java.io.IOException;

import com.bea.xml.XmlException;

import noNamespace.WebAppDocument;
import noNamespace.WebAppDocument.WebApp;

public class FirstTry {

  public static void main(String[] args) 
    throws XmlException, IOException {

    File file = new File(args[0]);
    WebAppDocument webappDoc =
      WebAppDocument.Factory.parse(file);

    WebApp webapp = webappDoc.getWebApp();
    . . .

  }
}

The program takes the name of the web.xml file as a parameter, and from this name we create a File instance. To parse the file we use the static parse method of the Factory class nested in WebAppDocument, a class named after our root element, web-app. This gives us our first object: a "document" instance. The document contains our XML data, and we get a handle to it by using the getWebApp method. The WebApp class has these getters:

Method name                                  XML element
Servlet[] getServletArray()                  <servlet>
ServletMapping[] getServletMappingArray()    <servlet-mapping>
Taglib[] getTaglibArray()                    <taglib>
WelcomeFileList getWelcomeFileList()         <welcome-file-list>

The mapping from the XML element name to the method name is obvious. If an element can be repeated the getter method name has an "Array" appended to it.
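The naming rule can be illustrated with a few lines of plain Java. This is only a sketch of the convention as it appears in the table above, not the code scomp actually uses; the class AccessorNames and its methods are made up for the illustration.

```java
class AccessorNames {

  // Turns an element name such as "servlet-mapping" into "ServletMapping".
  static String toClassName(String elementName) {
    StringBuilder sb = new StringBuilder();
    for (String part : elementName.split("-")) {
      if (part.isEmpty()) {
        continue; // ignore empty pieces, e.g. from a doubled hyphen
      }
      sb.append(Character.toUpperCase(part.charAt(0)));
      sb.append(part.substring(1));
    }
    return sb.toString();
  }

  // Builds the getter name; repeatable elements get the "Array" suffix.
  static String toGetterName(String elementName, boolean repeatable) {
    return "get" + toClassName(elementName) + (repeatable ? "Array" : "");
  }

  public static void main(String[] args) {
    System.out.println(toGetterName("servlet-mapping", true));
    System.out.println(toGetterName("welcome-file-list", false));
  }
}
```

Run against the element names of our schema, the sketch reproduces the method names listed in the table.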

It's now rather simple to continue with the coding:

Listing 2-2. The FirstTry class

    . . .
    System.out.println("Servlets:");

    Servlet[] servlet = webapp.getServletArray();
    for (int i = 0; i < servlet.length; i++) {
      System.out.println(" Name/class: "  
        + servlet[i].getServletName() + "/" 
        + servlet[i].getServletClass());
      InitParam[] init = servlet[i].getInitParamArray();
      System.out.println(" Init parameters:");
      for (int j = 0; j < init.length; j++) {
        System.out.println(" Name/value: " + 
          init[j].getParamName() + "/" 
          + init[j].getParamValue());
      }
      BigInteger load = servlet[i].getLoadOnStartup();
      if (load != null) {
        System.out.println(" Load on startup: " + load);
      }
    }

    System.out.println("Servlet Mappings:");
    ServletMapping[] mapping = webapp.getServletMappingArray();
    for (int j = 0; j < mapping.length; j++) {
      System.out.println(" Name/URL: " 
        + mapping[j].getServletName() + "/" 
        + mapping[j].getUrlPattern());
    }

    System.out.println("Welcome File List:");

    WelcomeFileList fileList = webapp.getWelcomeFileList();
    String[] s = fileList.getWelcomeFileArray();
    for (int i = 0; i < s.length; i++) {
      System.out.println(" Filename: " + s[i]);
    }

    System.out.println("Taglibs:");

    Taglib[] tags = webapp.getTaglibArray(); 
    for (int j = 0; j < tags.length; j++) {
      System.out.println(" URI/location: " 
        + tags[j].getTaglibUri() + "/" 
        + tags[j].getTaglibLocation());
    }
  }

The complete program can be viewed here and may be downloaded from the resources section at the end of the article.

If we run the program with web-struts-blank.xml as argument we'll get this listed:

*** Data in web-struts-blank.xml
Servlets:
  Name/class: action/org.apache.struts.action.ActionServlet
  Init parameters:
    Name/value: config//WEB-INF/struts-config.xml
    Name/value: debug/2
    Name/value: detail/2
  Load on startup: 2
Servlet Mappings:
  Name/URL: action/*.do
Welcome File List:
  Filename: index.jsp
Taglibs:
  URI/location: /tags/struts-bean//WEB-INF/struts-bean.tld
  URI/location: /tags/struts-html//WEB-INF/struts-html.tld
  URI/location: /tags/struts-logic//WEB-INF/struts-logic.tld
  URI/location: /tags/struts-nested//WEB-INF/struts-nested.tld
  URI/location: /tags/struts-tiles//WEB-INF/struts-tiles.tld

If you compare this output with the output shown in the article about Castor you'll see that they're identical, which they should be!

Saving and modifying the XML document

What if you want to write the document back to the file system? That's easy: add this line to your program:

webappDoc.save(new File("c:/webapp.xml")); 

If you look in the output file you'll be impressed to see that it's an exact copy of your original document. Even the comments and formatting are preserved! Adding elements to your XML document is also very easy. If we want to add another servlet-mapping we could code it like this:

    
ServletMapping newSM = webapp.addNewServletMapping();
newSM.setServletName("action2"); 
newSM.setUrlPattern("done.*");

You may control where the new servlet-mapping element is inserted. Use an index to show the position, zero means "at the beginning":

    
    // Insert new element before the others:
    ServletMapping newSM = webapp.insertNewServletMapping(0);
    newSM.setServletName("action2");
    newSM.setUrlPattern("done.*");

If you want to update some of the data in the document you use the setter-methods. To change the name of the welcome-file we use this code:

    
    WelcomeFileList fileList = webapp.getWelcomeFileList();
    fileList.setWelcomeFileArray(new String[]{"welcome.htm"});

To keep the XML document pretty-printed you have several options to apply to the save method, e.g.:

    
    XmlOptions opt = new XmlOptions();
    opt.setSavePrettyPrint();
    opt.setSavePrettyPrintIndent(3);

    webappDoc.save(new File("c:/webapp.xml"), opt);

I encourage you to use the JavaDoc to see the many methods that are available. If you use a tool like Eclipse, the code-completion feature may also help you inspect the methods that are at hand.

Package names

In the FirstTry program you may have noticed the name of the package in the generated jar file: noNamespace. XMLBeans takes the package name from the namespace definition in the schema file, so let's make a new schema element like this:

<xsd:schema 
  targetNamespace="http://playground.hansen.dk" 
  elementFormDefault="qualified" 
  xmlns:xsd="http://www.w3.org/2001/XMLSchema">

In the XML instance file we must add the same namespace:

<web-app xmlns="http://playground.hansen.dk"> 

When we generate the jar file this time the package name will be dk.hansen.playground, so the only thing that needs to be changed in the FirstTry program is the import statements.
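To make the relation between namespace and package name concrete, here is a small sketch that reverses the host part of a namespace URI. It only covers the simple case seen above (http://playground.hansen.dk becomes dk.hansen.playground) and the noNamespace fallback; the rules XMLBeans really applies presumably handle more cases, so treat the class NamespaceToPackage as a hypothetical illustration.

```java
import java.net.URI;

class NamespaceToPackage {

  // Derives a package name from the host part of a namespace URI.
  static String toPackageName(String namespaceUri) {
    if (namespaceUri == null || namespaceUri.isEmpty()) {
      return "noNamespace"; // the package seen when the schema has no target namespace
    }
    String host = URI.create(namespaceUri).getHost();
    if (host == null) {
      throw new IllegalArgumentException("No host in " + namespaceUri);
    }
    // Reverse the dot-separated parts: playground.hansen.dk -> dk.hansen.playground
    String[] parts = host.split("\\.");
    StringBuilder sb = new StringBuilder();
    for (int i = parts.length - 1; i >= 0; i--) {
      sb.append(parts[i]);
      if (i > 0) {
        sb.append('.');
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(toPackageName("http://playground.hansen.dk"));
  }
}
```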

Schema validation

If you read an XML file that does not conform to the schema, you may get an error. As an example we'll change the load-on-startup value to a non-integer, for example "xyz". When the program is run you get this message:

com.bea.xbean.values.XmlValueOutOfRangeException: Not a valid 
integer: xyz   

This is OK, but the error occurs a little late: it shows up only when you execute the getLoadOnStartup() method call. Note, by the way, that the exception comes from a BEA package. I assume that such packages will be renamed when the Apache version becomes available.

Generally you'd like to have a validation when you parse the XML file, and to do that you use the XmlOptions class. To use this class you instantiate it and set the properties you want. You may then use the instance in several method calls, for example the validate method:

  
. . .
  Collection errorList = new ArrayList();
  XmlOptions xo = new XmlOptions();
  // Collect errors in "errorList":
  xo.setErrorListener(errorList);

  File file = new File(args[0]);
  WebAppDocument webappDoc =
    WebAppDocument.Factory.parse(file);

  if (!webappDoc.validate(xo)) {
    System.out.println("Errors: " + errorList.size());
    for (Iterator it = errorList.iterator(); it.hasNext();) {
      System.out.println("Error: " + it.next());
    }
    System.exit(-1);
  }
  . . .

If we run the program again we get these messages:

Errors: 1
Error: C:\eclipse3.04\workspace\XMLBeans\web-struts-blank.xml:0: error: 
Illegal decimal, unexpected char: 120

ASCII character 120 is the character "x"--not the most informative error message. Let me show you a few other errors that will be detected.

Attempt to use the welcome-file-list element twice:

Errors: 1
Error: C:\eclipse3.04\workspace\XMLBeans\web-struts-blank.xml:0: error: 
Element not allowed: welcome-file-list@http://playground.hansen.dk 
in element web-app@http://playground.hansen.dk 

Missing element url-pattern:

Errors: 1
Error: C:\eclipse3.04\workspace\XMLBeans\web-struts-blank.xml:0: error: 
Expected element(s) in element servlet-mapping@http://playground.hansen.dk 

This message is not as precise as we would like it to be. Hopefully this will improve before the first Apache release.

Syntax error, the element name servlet-mapping misspelled:

com.bea.xml.XmlException: 
C:\eclipse3.04\workspace\XMLBeans\web-struts-blank.xml:32: 
error: </servlet-mapping> does not close tag <xservlet-mapping>.
at com.bea.xbean.store.Root.loadXml(Root.java:719)
. . .
at dk.hansen.playground.WebAppDocument$Factory.parse(Unknown Source)
at dk.hansen.playground.Try1.main(Try1.java:33)
Caused by: org.xml.sax.SAXParseException: 
</servlet-mapping> does not close tag <xservlet-mapping>.
at com.bluecast.xml.Piccolo.reportFatalError(Piccolo.java:1003)
at com.bluecast.xml.Piccolo.parse(Piccolo.java:705)
at com.bea.xbean.store.Root.loadXml(Root.java:695)
... 6 more
Exception in thread "main"  

This time an exception is thrown, so to catch errors like this we'll have to set up a try-catch block to handle the error in the program. Line number 32 is precise; it points to the erroneous line in web-struts-blank.
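Such a try-catch block might look like the following sketch. Note that it does not use XMLBeans: the JDK's own JAXP parser stands in for WebAppDocument.Factory.parse, so the exception type is SAXParseException rather than XmlException, and the class WellFormedCheck with its method firstErrorLine is invented for the illustration. The pattern is the same, though: catch the parse exception, report the line number, and decide how to continue.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

  // Returns the line of the first syntax error, or -1 if the text is well-formed.
  static int firstErrorLine(String xml) {
    try {
      DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // DefaultHandler rethrows fatal errors instead of printing them to stderr.
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new InputSource(new StringReader(xml)));
      return -1;
    } catch (SAXParseException e) {
      // The parser tells us where the document is broken.
      System.out.println("Syntax error at line " + e.getLineNumber()
        + ": " + e.getMessage());
      return e.getLineNumber();
    } catch (SAXException | IOException | ParserConfigurationException e) {
      throw new IllegalStateException("Could not parse document", e);
    }
  }

  public static void main(String[] args) {
    String broken = "<web-app>\n"
      + "  <xservlet-mapping>\n"
      + "  </servlet-mapping>\n"
      + "</web-app>";
    System.out.println("First error on line: " + firstErrorLine(broken));
  }
}
```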

Validation of an XML file against a schema may also be done with a utility located in the XMLBeans bin folder. It's called "validate", and the syntax is this:

validate schema.xsd instance.xml  

Here's an example of its usage:

Types

In our schema we only use two simple schema types: string and integer, but the XML schema specification defines 46 different types. How does XMLBeans map those to Java types? The answer is:

  1. First of all these 46 types are mapped one-to-one to XMLBeans Java classes starting with "Xml". A string is, for example, mapped to the class XmlString. You get an instance of such a class by putting an "x" in front of your getters, e.g.
        XmlString xs = servlet[i].xgetServletName();   
  2. The schema types are also mapped to well-known simple types or classes. You use the normal getters, as in the program above, to work with the known Java types.

To see a picture of all the mappings click here. The schema type "string" looks like this in the picture:

Normally you'll work with the "natural" Java types, but the XMLBeans types give you some possibilities that I'll only touch upon here. All XMLBeans types inherit from XmlObject, which has several interesting characteristics. You may, for example, validate the object. Also, if you assign a value using an "x"-setter method, and there are schema restrictions on the element (like an integer range), these restrictions will be checked.

More documentation of the relation between Schema types, XMLBeans types and Java types may be found here.

Queries

One of the features that will make XMLBeans really interesting is the capability to use a query language on the parsed document. Here's an example of how you'd search for a specific tag library in the XML file:

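// The trailing space keeps the declaration and the path apart when the
// strings are joined (XQuery 1.0 syntax ends the declaration with ';').
// The schema uses elementFormDefault="qualified", so taglib-uri also
// needs the xq prefix.
String queryExpression =
"declare namespace xq='http://playground.hansen.dk' " +
"$this/xq:web-app/xq:taglib[xq:taglib-uri='/tags/struts-html']";

Taglib[] tl = (Taglib[])webappDoc.selectPath(queryExpression);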

Unfortunately this feature is not available as open source, probably because it's bound to a BEA owned query engine. Time will show when this feature becomes available in the Apache project.
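Until then, a similar lookup can be done outside XMLBeans. The sketch below is a stand-in, not the XMLBeans selectPath feature: it uses the XPath API in javax.xml.xpath, which assumes a JDK newer than the 1.4 used elsewhere in this article, and it works on the version of the document without a namespace. The class TaglibLookup and its method are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class TaglibLookup {

  // Returns the taglib-location for the given taglib-uri, or "" if none matches.
  static String findLocation(String xml, String taglibUri) {
    if (taglibUri.indexOf('\'') >= 0) {
      // The value is pasted into a quoted XPath literal below.
      throw new IllegalArgumentException("URI must not contain an apostrophe");
    }
    String expr = "/web-app/taglib[taglib-uri='" + taglibUri
      + "']/taglib-location";
    try {
      XPath xpath = XPathFactory.newInstance().newXPath();
      // Parses the source and returns the string value of the first match.
      return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
    } catch (XPathExpressionException e) {
      throw new IllegalStateException("Query failed: " + expr, e);
    }
  }

  public static void main(String[] args) {
    String xml = "<web-app><taglib>"
      + "<taglib-uri>/tags/struts-html</taglib-uri>"
      + "<taglib-location>/WEB-INF/struts-html.tld</taglib-location>"
      + "</taglib></web-app>";
    System.out.println(findLocation(xml, "/tags/struts-html"));
  }
}
```

The obvious drawback is that the result is a plain string or DOM node rather than a typed Taglib object, which is exactly what a query facility integrated into XMLBeans would give us.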

Shielding the actual implementation

XMLBeans is not the only choice when it comes to converting XML files into Java objects. Castor and JAXB are other alternatives, and if you don't want to be bound to the tool you select, you could consider "wrapping" the implementation. A general interface to the data read from a web.xml file could be these implementation-neutral Java interfaces:

Listing 3. Interfaces

public interface MyWebApp {
  public void init(File configFile) throws Exception; 
  public MyServlet[] getServlets();
  public HashMap getMappings();
  public String[] getWelcomeFiles();
  public HashMap getTaglibs(); 
}
   
public interface MyServlet {
  public String getName();
  public String getClassName();
  public HashMap getInitParams();
  public Integer getLoadOnStartup();
}

The meaning of the methods should be straightforward. The init method is used to receive the web.xml file.

An XMLBeans implementation of the MyWebApp Interface could look like this:

Listing 4. The XMLBeansImpl class

package dk.hansen.playground;

import java.io.File;
import java.util.HashMap;

import dk.hansen.playground.WebAppDocument.*;
import dk.hansen.playground.WebAppDocument.WebApp.*;

public class XMLBeansImpl implements MyWebApp {

  private WebApp webapp;
  
  public void init(File configFile) throws Exception {
    WebAppDocument webappDoc =
      WebAppDocument.Factory.parse(configFile);
    webapp = webappDoc.getWebApp();
  }

  public MyServlet[] getServlets() {
    Servlet[] servlet = webapp.getServletArray();
    MyServlet[] servlets = new XMLBeansServletImpl[servlet.length];
    for (int i = 0; i < servlet.length; i++) {
      servlets[i] = new XMLBeansServletImpl(servlet[i]);
    }
    return servlets;
  }

  public HashMap getMappings() {
    ServletMapping[] mapping = webapp.getServletMappingArray();
    HashMap map = new HashMap();
    for (int i = 0; i < mapping.length; i++) {
      map.put(mapping[i].getServletName(), mapping[i].getUrlPattern());
    }
    return map;
  }

  public String[] getWelcomeFiles() {
    WelcomeFileList fileList = webapp.getWelcomeFileList();
    return fileList.getWelcomeFileArray();
  }

  public HashMap getTaglibs() {
    Taglib[] tags = webapp.getTaglibArray();
    HashMap map = new HashMap();
    for (int i = 0; i < tags.length; i++) {
      map.put(tags[i].getTaglibUri(), tags[i].getTaglibLocation());
    }
    return map;
  }
}

The implementation of the MyServlet Interface is even simpler. You may find it in the download in the resources section.

A simple main program that will read the contents of a web.xml file could now look like this:

Listing 5. The SimpleWebAppReader class

package dk.hansen.playground;

import java.io.File;
import java.util.HashMap;

public class SimpleWebAppReader {

public static void main(String[] args) throws Exception {

  String file = "try1.xml";
  MyWebApp webapp = new XMLBeansImpl(); 
  webapp.init(new File(file));

  System.out.println("*** Data in " + file);
  System.out.println("Servlets:");

  MyServlet[] servlets = webapp.getServlets();
  for (int i = 0; i < servlets.length; i++) {
    MyServlet servlet = servlets[i];
    System.out.println(
      " Name/class: "
      + servlet.getName()
      + "/"
      + servlet.getClassName());
    HashMap init = servlet.getInitParams();
    System.out.println(" Init parameters:");
    System.out.println(" " + init);
    Integer load = servlet.getLoadOnStartup();
    System.out.println(
      " Load on startup: " + load);
  }

  System.out.println("Servlet Mappings:");
  HashMap maps = webapp.getMappings();
  System.out.println(" " + maps);

  System.out.println("Welcome File List:");

  String[] files = webapp.getWelcomeFiles();
  for (int i = 0; i < files.length; i++) {
    System.out.println(" Filename: " + files[i]);
  }

  HashMap taglibs = webapp.getTaglibs();
  System.out.println("Taglibs:");
  System.out.println(" " + taglibs);

  }
}  

The only binding here to the XMLBeans implementation is the statement

  
MyWebApp webapp = new XMLBeansImpl();

To remove this last sign of the implementation tool you could place the name of the implementing class in a properties file. Let's finish by doing that. We create a file, webapp.properties, with these contents:

RECEIVER=dk.hansen.playground.XMLBeansImpl  

The snippet of code to read this line and instantiate the proper class is below:

  
. . .
  String pFile   = "webapp.properties";
  // Load the property file from the root of the classpath
  ClassLoader cl = SimpleWebAppReader.class.getClassLoader();
  InputStream in = cl.getResourceAsStream(pFile);
  Properties p   = new Properties();
  p.load(in);
  String className = p.getProperty("RECEIVER");
  MyWebApp webapp  = (MyWebApp)Class.forName(className).newInstance();

  String file = "try1.xml";
  webapp.init(new File(file));
  . . . (continues like above) . . .

Exception handling has been totally ignored. You shouldn't do that in your applications! A somewhat more elegantly coded class, WebAppReader, may be found in the download.
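To give an idea of what the missing exception handling could look like, here is a minimal sketch of the instantiation step. It is not the WebAppReader class from the download. To stay self-contained it uses a reduced, one-method stand-in for the MyWebApp interface and a made-up StubWebApp implementation, and it uses language features newer than JDK 1.4 (try-with-resources, multi-exception types).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Reduced stand-in for the MyWebApp interface of Listing 3.
interface MyWebApp {
  String[] getWelcomeFiles();
}

// Hypothetical implementation, present only to make the sketch runnable.
class StubWebApp implements MyWebApp {
  public String[] getWelcomeFiles() {
    return new String[] {"index.jsp"};
  }
}

class WebAppFactory {

  // Reads a properties file from the classpath, failing with a clear message.
  static Properties loadProperties(String resourceName) {
    Properties p = new Properties();
    try (InputStream in =
        WebAppFactory.class.getClassLoader().getResourceAsStream(resourceName)) {
      if (in == null) {
        throw new IllegalStateException("Not found on classpath: " + resourceName);
      }
      p.load(in);
    } catch (IOException e) {
      throw new IllegalStateException("Cannot read " + resourceName, e);
    }
    return p;
  }

  // Instantiates the class named by the RECEIVER property.
  static MyWebApp create(Properties p) {
    String className = p.getProperty("RECEIVER");
    if (className == null) {
      throw new IllegalStateException("Property RECEIVER is missing");
    }
    try {
      return Class.forName(className)
        .asSubclass(MyWebApp.class)   // rejects classes of the wrong type
        .getDeclaredConstructor()
        .newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalStateException("Cannot instantiate " + className, e);
    }
  }

  public static void main(String[] args) {
    Properties p = new Properties();
    p.setProperty("RECEIVER", StubWebApp.class.getName());
    System.out.println(create(p).getWelcomeFiles()[0]);
  }
}
```

Wrapping every failure in one unchecked exception with a descriptive message is just one possible design; a real application might prefer its own checked exception type.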

Conclusion

My experience is that XMLBeans is easy to use, has a simple, understandable API, and actually handles a lot of schema files without problems. Just to test it, I tried to give the scomp tool some very large schema files from my j2ee v1.4 installation, and none of them were rejected. I got a jar file created for all of them. The largest file had more than 2000 lines of schema definitions. I also wrote a small program based upon the largest schema file, parsed an xml file containing 1800 lines, validated it (OK), and saved it back to disk--all without problems. Finally I also validated the saved xml file in XMLSpy. It was valid.

I measured the following computing times in milliseconds on my 1800 MHz Win2K computer:

The indications are that XMLBeans is a quality product that I personally look forward to using in real projects. Happy coding!

Interested in reading more about the performance of XMLBeans versus other tools like Castor, JAXB and Xerces? If so, then check out the links in the resources section.

Resources