17-Jun-03 (Created: 17-Jun-03) | More in 'Aspire Product News'

Use Aspire and Tomcat to retrieve XML declaratively from any relational database

Aspire in combination with Tomcat presents a very viable and cost-effective option for generating XML directly and declaratively from relational databases. Developers will start by choosing a set of SQL statements or stored procedures. Developers will then arrange these assets into a calling hierarchy in a configuration file. Aspire will execute this hierarchical data definition and returns a java based hierarchical data set. Aspire has pre-defined transformations that can convert this hierarchical data set into varieties of XML formats. Aspire also allows you to transform the resulting XML using XSLT via JAXP. The final XML or HTML can then be sent to the browser or consumed programmatically.

In addition to SQL and Stored-Procedures, you can also use File readers, java classes, and potentially others in their place. These assets are called relational adapters producing relational data sets. New relational adapters can be written quite easily for other enterprise data sources when needed.

The main focus of this article to allow programmers to quickly download Aspire and start generating XML from data bases. This is a cook-book/tutorial like approach that tells you everything you need to get started and be on your way to generate XML with very little coding. This article also encourages see-before-you-paint approach to web page development, meaning you can see your data on the web page as XML or text first before actually writing code for that page. This is very useful for debugging and discovering available fields in the data.

How to get your first XML declaratively with no coding from a database

Without further ado let us look into the steps for generating an XML document with out using any procedural coding. These steps are

  1. Install and configure Aspire and Tomcat
  2. Understand how a hierarchical data set look like
  3. Prepare your SQL or Stored Procs or FileReaders or any other adapters
  4. Declare a hierarchical data set in a configuration file utilizing the above adapters
  5. Define a URL to access this XML
  6. Fire off that URL to see the XML in a classical format
  7. Fire off the same URL to see the XML in an object like typed format

Install and configure Aspire and tomcat

You can download Aspire either as a simple jar file or a complete demo webapp that can be directly configured under tomcat as a webapp.  Downloading the webapp is better for working with examples outlined in this article.  The quick install section at Aspire's website outlines the downloading and installing Aspire. If you run into any problems please email me at [email protected]. Most common complaint setting up Aspire has been the differences in the web.xml between different Tomcat versions. The download includes web.xml files for release 3 and release 4 of Tomcat. For other releases adjust the web.xml accordingly. Additional information on configuring Aspire for Tomcat can also be found at the following two previous O'Reilly articles on Aspire

  1. For Tomcat Developers, Aspire Comes in a JAR
  2. Improve Your Career with Tomcat and Aspire

Understand how a hierarchical data set look like

XML is a by product of Aspire's reliance on hierarchical data sets. When you understand hierarchical data sets, it is easy to conclude the ensuing XML support. A hierarchical data set is a conceptual data structure that looks like the following:


<AspireDataSet>
    <!-- A set of key value pairs at the root level -->
    <key1>val1</key1>   
    <key2>val2</key2>

    <!-- A set of named loops -->
    <loop name="loop">
    </loop>
    <loop name="loop2">
    </loop>
</AspireDataSet>

So a Hierarchical Data Set is a set of key, value pairs. A given set of key/value pairs could yield 'n' number of independent loops. Each loop is essentially a table worth of data. The name "loop" is synonymous with "table". I haven't used "table" because people might literally take "table" to mean to hold data ONLY from a relational table. A "loop" can hold data from any data source and not just a relational table.  Let me now introduce the internal structure of a loop:


<loop name="loopname">
    <row>
        <!-- a set of key value pairs -->
        <key1>val1</key1>
        <key2>val2</key2>

        <!-- a set of named loops -->
        <loop name="loopname1">
        </loop>

        <!-- a set of named loops -->
        <loop name="loopname2">
        </loop>
    </row>
    <row>
    </row>
</loop>

The only odd thing here is the structure of a "row". A "row" is expectedly a collection of key/value pairs. But here a "row" is extended to include not only key/value pairs but also another recursive set of 'n' number independent loops. This extension gives rise to trees with any amount of depth (should I say height!).

The moment I have shown the hierarchical data as XML, there is a possibility that people might take a hierarchical data set to be literally XML and hence literally DOM and hence lot of memory inside of the JVM. No need to panic. The Hierarchical Data Set has its own Java API and need not be represented as a DOM and majority of the time is a forward-only-traversing-cursor-like-lazy-loading tree. Please refer to the "Hierarchical Data Sets Article" for examining this java API.

Prepare your SQL or Stored Procs or FileReaders or any other adapters

So a Hierarchical Data Set is a  nested collection of nodes where each node is a collection of rows. The declarative goal of Hierarchical Data Set is to associate a relational adapter to each of these nodes. The output of that relational adapter will constitute the contents for that node. Let us say you are trying to retrieve a collection of reports that are organized into categories. So your hierarchy is:


A set of Categories
    A set of Reports for each category

This means that you will have (potentially) the following two separate SQL statements


1). Select * from categories_table
2). Select * from reports_table where category = {category_name}

The first select will result in a collection of categories with no input. The second select will retrieve a collection of reports with in that category. The category name or id becomes the input that query.

Declare a hierarchical data set in a configuration file

Given the above two SQL statements let us see how we can use those two statements to declaratively construct a hierarchical data set.  The declarative definition follows:


# section1 - reportsHDS data definition
request.reportsHDS.className=com.ai.htmlgen.DBHashTableFormHandler1
request.reportsHDS.loopNames=categoriesLoop

#section2 - loop data for categoriesloop
request.reportsHDS.categoriesLoop.class_request.className=com.ai.htmlgen.GenericTableHandler6
request.reportsHDS.categoriesLoop.loopNames=reportsLoop
request.reportsHDS.categoriesLoop.query_request.className=com.ai.db.DBRequestExecutor2
request.reportsHDS.categoriesLoop.query_request.db=<database_alias>
request.reportsHDS.categoriesLoop.query_request.stmt=Select * from categories_table

#section3 - sub-loop data for reports loops
request.reportsLoop.class_request.classname=com.ai.htmlgen.GenericTableHandler6
request.reportsLoop.query_request.classname=com.ai.db.DBRequestExecutor2
request.reportsLoop.query_request.db=<database_alias>
request.reportsLoop.query_request.stmt=Select * from reports_table where category = {category_name}

There are three sections in this definition. The data set is named "reportsHDS". The first section is telling Aspire that the java class "com.ai.htmlgen.DBHashTableFormHandler1" is responsible for returning an object implementing "ihds". Unless you code your own implementation of "ihds" this is the class that you use again and again in every data set definition as this class is the one that is pre-fabricated and knows how to compose relational assets into hierarchical assets. The line2 of section1 is telling to "DBHashTableFormHandler1" that this main data set has one loop called "categoriesLoop".

Section2 is a definition for the loop "categoriesLoop". A loop structure in Aspire uses two java classes a class request (GenericTableHandler6) and a query request (com.ai.db.DBRequestExecutor2). A "com.ai.db.DBRequestExecutor2" reads a set of records from a database and makes them look like a collection of rows and columns. GenericTableHandler6 is responsible for taking this collection and apply such value added features as aggregate values, row numbers etc and also implements "ihds" interface at the loop level. As with "DBHashtableFormHandler1", GenericTableHandler6 is reused with most of your definitions. com.ai.db.DBRequestExecutor2 might change depending on your data sources.

For example the following parts exist in this category


    RowFileReader
    DBRequestExecutor2 ( For reading SQL)
    StoredProcedureExecutor2 (For  reading from Stored Procedures)
    XMLReader (For reading xml files)
    Or you can write your own reader that implements IDataCollection

If you were to have a sibling for "categoriesLoop" you would have repeated this Section2 for that sibling with identical structure except for the name of the loop and the sql statement.

Section2 has also indicated that "categoriesLoop" has a child called "reportsLoop". GenericTableHandler6 will take this cue and look for section3 identified by "reportsLoop".

Section3 defines "reportsLoop". The definition is identical to Section2 except that "reportsLoop" has no children. Both section2 and section3 uses com.ai.db.DBRequestExecutor2. But in practice they can use any combination of data reader parts.

Finally let me call this file where these definitions exist "reportsHDS.properties". You include this file in Aspire's master aspire.properties as follows


application.includeFiles=aspire:\\samples\\hello-world\\properties\\hello-world.properties,\
aspire:\\samples\\reports\\reportsHDS.properties,\
aspire:\\samples\\xml-reader\\xml-reader.properties

For the sake of completeness I have included a couple of lines above and below that inclusion process.

Define a URL to access the data definition as XML

Once the Hierarchical Data Set is declaratively defined, you need to link this definition to a URL so that you can invoke your data definition from a web page. You will write the following two lines to accomplish this:


    reportsHDSURL.formHandlerName=reportsHDS
    request.reportsHDS.form_handler.class_request.className=com.ai.htmlgen.DBHashTableFormHandler1

The first line is basically linking a URL name to the data definition. The second line is needed because of a backward compatibility quirk in the product. Now we are ready to see the data as XML.

Fire off a URL to see the XML in a classical format

To see the data as Classic-XML invoke the URL as follows:

http://localhost/aspire/servlet/DisplayServlet?url=reportsHDSURL&aspire_output_format=Classic-xml

The above URL string assumes you have installed the aspire demo application under tomcat and you have included your definition properties file in this webapp. If the above URL results in a userid/password challenge use your first name as the user-id and a password of "aspire".

The displayed output on the page will look like the following

<AspireDataSet>
    <key name="aspire_output_format">Classic-xml</key>
    <loop name="CategoriesLoop">
        <row>
            <key name="category_name">Java Howto Reports</key>
            <loop name="reportsLoop">
                <row>
                    <key name="report_name">servlets</key>
                </row>
                <row>
                    <key name="report_name">jdbc</key>
                </row>
            </loop>
        </row>
        <row>
            <key name="category_name">.Net Howto Reports</key>
            <loop name="reportsLoop">
                <row>
                    <key name="report_name">c#</key>
                </row>
                <row>
                    <key name="report_name">vb</key>
                </row>
            </loop>
        </row>
    </loop>
</AspireDataSet>

This output xml format is called Classic-xml, because the keys are named as actually "key" elements. It is possible to define an XSD for this quite easily as the only elements you are looking at are "loop", "row", "key" and "AspireDataSet". When you contrast this with the Object-xml you will know why I have called this Classic-xml.

Fire off a URL to see the XML in an object like typed format

The URL to retrieve the Object-XML using the same data definition is as follows:

http://localhost/aspire/servlet/DisplayServlet?url=reportsHDSURL&aspire_output_format=Object-xml

The produced Object XML looks like this


<AspireDataSet>
    <aspire_output_format>Classic-xml</aspire_output_format>
    <CategoriesLoop>
        <row>
            <category_name>Java Howto Reports</category_name>
            <reportsLoop>
                <row>
                    <report_name>servlets</report_name>
                </row>
                <row>
                    <report_name>jdbc</report_name>
                </row>
            </reportsLoop>
        </row>
        <row>
            <category_name>Java Howto Reports</category_name>
            <reportsLoop>
                <row>
                    <report_name>c#</report_name>
                </row>
                <row>
                    <report_name">vb</report_name>
                </row>
            </reportsLoop>
        </row>
    </CategoriesLoop>
</AspireDataSet>

This format for XML is more suitable for binding to objects in java and other languages as the column names translate to field names. Even the word "row" can be replaced by an individual object name if desired.

Supply an XSLT to transform the classic XML

Now that you know how to get the XML out, let us see how you can drop in an XSLT style sheet to transform the XML to a target HTML. For this you will slightly modify the URL linking process as follows:


The previous two lines that link a URL to a data definition
    reportsHDSURL.formHandlerName=reportsHDS
    request.reportsHDS.form_handler.class_request.className=com.ai.htmlgen.DBHashTableFormHandler1

To the previously discussed above two lines add the following:

    request.reportsHDSURL.transform.className=com.ai.xml.CommonXSLTransform
    request.reportsHDSURL.transform.domConverterAbsoluteRequestName=generictransform.classic-xml
    reportsHDSURL=<your xslt filename path>

The first line is pointing to a java class that is going to perform the actual XSL transform. The second line is pointing to a java class (indirectly through a request name) that is responsible for creating a suitable DOM for the XSLT you are looking for. At the time of this writing there are two formats available. Classic-xml and Object-xml. The third line is pointing to the actual XSLT filename that has the code to transform the XML to HTML.

Once the configuration is in place, you can call the following URL to see the transformed HTML page:

http://localhost/aspire/servlet/DisplayServlet?url=reportsHDSURL

Notice that you have simply dropped the "aspire_output_format" argument from the URL, This is an imporant observation. Because every URL in your page can have a link with this additional argument that can retrieve the page data as XML, or Excel, or Text based on the format type.

Supply an XSLT to transform the object XML

If you would like to use the Object-xml format for your XSLT transformation then change the following line as such


old
    request.reportsHDSURL.transform.domConverterAbsoluteRequestName=generictransform.classic-xml
new
    request.reportsHDSURL.transform.domConverterAbsoluteRequestName=generictransform.object-xml

How to have Aspire generate your own specific base XML structure

Say after the above two choices, you are not happy with both the formats. It is fairly simple to generate a format of your own. Refer to "Hierarchical Data Sets Article" to see how this can be done in detail. Other wise the key lies in the definition "transform.domConverterAbsoluteRequestName=generictransform.object-xml". The "generictransform.object-xml" is pre-defined in the properties to identify a java class name that is responsible for generating the DOM that you are interested in. Here are those pre-defined generic transformations:


    GenericTransform.Classic-xml.classname=com.ai.generictransforms.ClassicXMLGenericTransform
    GenericTransform.Object-xml.classname=com.ai.generictransforms.ObjectXMLGenericTransform

    #Excel transform
    GenericTransform.Excel.classname=com.ai.generictransforms.ExcelGenericTransform
    GenericTransform.Excel.headers=Content-Type=application/vnd.ms-excel|Content-Disposition=filename=aspire-hierarchical-dataset.xls

    #Text
    GenericTransform.Text.classname=com.ai.generictransforms.DebugTextTransform

You can take a look at the source files for each of these classes to see how they are implemented in the distribution.

Role of generic transformation in XML generation

The above specified transformations are called generic transformations. These transformations are responsible for converting a hierarchical data set to any one of output formats identified by aspire_output_format

The available formats at the time of writing are

  1. text
  2. excel
  3. Classic-xml
  4. Object-xml

Programmatically obtaining XML

So far you have obtained XML from a URL. Some times you may want to get the XML stream programmatically. Read the "Hierarchical Data Sets Article" for a complete details on how to access this XML programmatically. But briefly here is the process

  1. Obtain a hierarchical data set identified by an interface called ihds
  2. Obtain a proper XML transformation you need
  3. Call the XML transformation to receive either an XML stream or DOM by passing in the "ihds" interface

Let me go into each of these steps in some detail:

Obtain a hierarchical data set identified by an interface called ihds


    Hashtable args = new Hashtable();
    args.put("key1".toLowerCase(), "value1");

    ihds hds = (ihds)AppObjects.getObject("reportsHDS",args);

The above sample code is very typical of programming in Aspire. First you prepare your arguments as a hashtable with keys converted to lowercase. Then you will invoke a factory service to call a symbolic name such as "reportsHDS" in this case. This is similar to calling a SOAP service called "reportsHDS". Aspire will locate a java class responsible for this name and instantiate the class and call its well known method to return the results. In this case the class happens to be returning a hierarchical dataset allowing us to typecast it to "ihds".

Obtain a proper XML transformation you need


Sample code
    IGenericTransform gt = 
        (IGenericTransform)
        AppObjects.getObjectAbsolute("generictransforms.classic-xml",null);
    PrintWriter out;
    gt.transform(ihds,out);
    out.close();

The sample code here is a repeat of the same pattern. Essentially "generictransforms.classic-xml" is a symbolic name pointing to a java class that will return a java object supporting an interface IGenericTransform. The source code in package com.ai.generictransforms can show you the APIs available on this interface. The method "transform" will write the XML to the print-writer that is passed in. You can close the print-writer when you are done with it.

The only difference between "getObject(..)" and "getObjectAbsolute(..)" is that "getObject(some_name,...)" is equivalent to "getObjectAbsolute("request." + some_name,..). So getObject() is a shortcut and assumes that all symbolic names are prefixed by "request." in the configuration file.

Look into the O'Reilly article For Tomcat Developers, Aspire Comes in a JAR for understanding configuration and factory service in Aspire in detail.

Call the XML transformation to receive either an XML stream or DOM

Programmatically you may not want a print-writer worth of XML but a DOM. The classes that are responsible for converting an "ihds" to a DOM support an interface called IhdsDOMConverter. As it happens the XML generictransforms classes that are introduced before also implement this interface making them the ideal candidates not only to generate the textual XML but also DOM. Let us look at the sample code to get the DOM for our data definition:

Sample code
    IhdsDOMConverter domConverter = 
        (IhdsDOMConverter)
        AppObjects.getObjectAbsolute("generictransforms.classic-xml",null);
        Document = domConverter.convert(hds);

Conclusion

Aspire gives you lot of flexibility for obtaining and transforming XML. The resulting XML facilities can be used to participate in B2B, develop SOAP, or even to develop discoverable middle tier solutions. The unique thing is how fast one can develop the XML from relational databases.

Implication of the above to Tomcat development community

I believe the implication of these facilities that are documented here are quite exciting to the Tomcat developer community. The development community don't have to depend on expensive or database-specific solutions for generating XML for making their sites participate in a B2B network. Aspire gives them lot of flexibility for obtaining and transforming XML. The resulting XML facilities can be used to participate either in B2B, or develop SOAP, or even to develop discoverable middle tier solutions. The unique thing is how fast one can develop the XML from relational databases.

Additional References

  1. For Tomcat Developers, Aspire Comes in a JAR - Read this article for understanding the basic services of Aspire: Configuration, Factory, Data Access.
  2. Improve Your Career with Tomcat and Aspire - Read this article for rapidly developing web applications under Tomcat with little or no java coding for relational databases using SQL and Stored Procedures for any relational database
  3. Is it not time to invite the cathedral to the bazaar - Understand the focus of Aspire as a tool for bringing J2EE to a broader audience
  4. Transparent data pipelines  - Read this article for understanding the architectural arguments for declarative abstractions, hierarchical datasets and pluggable transformations
  5. A JSP architecture for Oracle stored procedures, July 2001 Java Report
  6. Generating/consuming XML documents from relational databases, Vol2, Issue 4, XML Journal - Read this 2001 article outlining the same basic principles from an architectural perspective while the current article in O'Reilly presents an implementation that the users can immediately put to use.
  7. Aspire home page - Visit this page for downloading, installing and reading more about Aspire