Java and XML: putting SAX to work
by Keld H. Hansen
Why use XML?
XML is becoming increasingly popular. Why is that? After all, XML is just a
data format, isn't it? To understand XML's popularity, take a look at this
file:
458345-3456,4367.12,6.5
436122-3456,18.30,5
419023-4523,-43218.45,9.5
What do you think it is? Let's take another file--with the same numbers:
<banking>
<account>
<number>458345-3456</number>
<balance>4367.12</balance>
<interest>6.5</interest>
</account>
<account>
<number>436122-3456</number>
<balance>18.30</balance>
<interest>5</interest>
</account>
<account>
<number>419023-4523</number>
<balance>-43218.45</balance>
<interest>9.5</interest>
</account>
</banking>
Suddenly the numbers make more sense, right? So XML is not just a data
format; it also adds information about what your data means. You might argue
that these files are probably read by a computer program, which really doesn't
prefer one format over the other, and that you'll still need to know what
"number", "balance", and "interest" really mean in order to use the data in a
real application. You're right, but programmers still spend a lot of time
looking at data files--especially when their programs aren't working--and most
humans would prefer the second file format.
But the real power of XML is that it is a standardized format. If you give
the above XML file to Internet Explorer, for example, it will actually
understand it and display it in a nicely formatted way.
In the olden days, when computing was done on mainframes (if you don't know
what a mainframe is, just think of it as a very large server), every program
used its own hand-made file formats--often for good reasons. If you compare the
sizes of the two files above, the second one contains more than four times
as many characters as the first one. If you have slow network connections and
expensive storage media, a factor of four means a lot. Numbers can also be
packed much more economically in binary formats. For example, a Java "int"
always takes four bytes, while the same number in text format can take up to
ten digits, plus one more character for a minus sign.
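The size difference for numbers is easy to check. The following is a small sketch, not part of the original programs in this article; the class name IntSizeDemo and the method textBytes are made up for illustration. It compares the fixed binary size of an int with the number of bytes the same value needs as ASCII text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares binary and text sizes of int values.
final class IntSizeDemo {

    /** Returns the number of bytes needed to store the value as decimal ASCII text. */
    static int textBytes(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        int[] samples = {5, 436122, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int v : samples) {
            // Integer.BYTES is the fixed size of an int in binary form (4).
            System.out.println(v + ": binary " + Integer.BYTES
                    + " bytes, text " + textBytes(v) + " bytes");
        }
    }
}
```

Small values such as 5 are actually shorter as text (one byte), while the largest int needs ten bytes and the smallest, with its minus sign, needs eleven. The binary form is always four bytes, so the saving depends on how large the numbers in your data typically are.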
But today disk storage is cheap and networks are fast, so unless you have
very large amounts of data you can afford the XML format. Again, the important
factor that makes XML a good choice is that it is standardized. Today's vast
global communication network includes an array of computer brands and many
regional dialects, so it is imperative to have a common language that ties
everything together and makes that communication possible. In this way XML
simplifies the client-server world we live in.
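One practical consequence of a standardized format is that a general-purpose parser can read the banking file without any format-specific parsing code. The following is a minimal sketch of that idea using the SAX parser that ships with the JDK. It is only a preview, not the event handler developed later in this article, and the names AccountNumberReader and accountNumbers are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of every <number> element.
final class AccountNumberReader {

    static List<String> accountNumbers(String xml) {
        List<String> numbers = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <number>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("number".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver the text in several chunks.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("number".equals(qName)) {
                    numbers.add(text.toString().trim());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return numbers;
    }

    public static void main(String[] args) {
        String xml = "<banking>"
                + "<account><number>458345-3456</number>"
                + "<balance>4367.12</balance><interest>6.5</interest></account>"
                + "<account><number>436122-3456</number>"
                + "<balance>18.30</balance><interest>5</interest></account>"
                + "</banking>";
        System.out.println(accountNumbers(xml)); // prints [458345-3456, 436122-3456]
    }
}
```

The code never splits lines on commas or counts field positions. It only names the element it is interested in, and the parser takes care of the rest.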
But no standard survives without backing from users and vendors. XML
is backed by lots of tools, APIs, and committees.
We'll start our journey into the XML world by solving a simple task.
Contents:
Reading an XML file
Putting SAX to work
A DVD class
A complete event handler program for SAX
Sorting the data
Presenting the results in a browser
Conclusion
Resources