20-Jan-06 (Created: 20-Jan-06)

How to add data retrieval parts to Aspire

Aspire depends on two data abstractions. One is a relational abstraction representing a collection of rows and columns. The other is a hierarchical abstraction, which groups relational collections together with a context. The central type in the relational abstraction is IDataCollection. The hierarchical abstraction is represented by a type called "ihds".

Aspire also uses configuration files for declaring both types of data. Consider the following web page and its data definition:


testpageURL=/jsps/testpage.jsp
testpageURL.transformType=JSP
testpageURL.transform.className=com.ai.jsp.JSPTransform
testpageURL.dataRequestName=testpageForm

#main data
request.testpageForm.className=com.ai.htmlgen.DBHashTableFormHandler1
request.testpageForm.mainDataRequest.className = com.ai.db.DBRequestExecutor2
request.testpageForm.mainDataRequest.db =kendev
request.testpageForm.mainDataRequest.stmt =select * from projects

#a child of main data 
request.testpageForm.sqlloop.class_request.className=com.ai.htmlgen.RandomTableHandler7
request.testpageForm.sqlloop.query_request.className=com.ai.db.DBRequestExecutor2
request.testpageForm.sqlloop.query_request.db=kendev
request.testpageForm.sqlloop.query_request.stmt=select onesqlstatement from sqlstatements

In the example above, the "ihds" is represented by DBHashTableFormHandler1 and the relational abstraction by DBRequestExecutor2. A table handler such as RandomTableHandler7 adapts the relational abstraction to the hierarchical abstraction.

So adding data to an ihds is simply a matter of specializing the relational model represented by DBRequestExecutor2. You can also do this at the RandomTableHandler7 level, but it is much simpler to do at the relational level.
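
In configuration terms, specializing at the relational level would presumably mean pointing the query_request.className key at your own class. The fragment below is only a sketch: com.example.MyRelationalPart is a hypothetical class name, and it assumes the db and stmt keys can be dropped when the class does not run SQL.

```properties
#a child of main data, fed by a custom relational part instead of SQL
#(com.example.MyRelationalPart is a hypothetical class name)
request.testpageForm.sqlloop.class_request.className=com.ai.htmlgen.RandomTableHandler7
request.testpageForm.sqlloop.query_request.className=com.example.MyRelationalPart
```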

Writing a relational adapter

In the example above, DBRequestExecutor2 is a class with an execute method that is expected to return an IDataCollection reference. Any class that implements a similar interface can be substituted in its place.
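
To make the substitution idea concrete, here is a minimal, self-contained sketch. It does not use Aspire's classes: RowProducer, CannedSqlProducer, CannedSearchProducer and RowCounter are hypothetical stand-ins, and rows are plain lists of strings rather than an IDataCollection. It only illustrates that a consumer written against the contract works with either producer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the "execute and return rows" contract.
// This is NOT Aspire's real interface; rows are plain lists of strings here.
interface RowProducer {
    List<List<String>> execute(String requestName, Map<String, Object> args);
}

// Stand-in for a SQL-backed executor (returns canned rows instead of querying).
final class CannedSqlProducer implements RowProducer {
    @Override
    public List<List<String>> execute(String requestName, Map<String, Object> args) {
        return Arrays.asList(
                Arrays.asList("1", "project-a"),
                Arrays.asList("2", "project-b"));
    }
}

// Stand-in for a custom, non-SQL source such as a search index.
final class CannedSearchProducer implements RowProducer {
    @Override
    public List<List<String>> execute(String requestName, Map<String, Object> args) {
        return Collections.singletonList(
                Arrays.asList("doc-7", "A similar document"));
    }
}

// A consumer that depends only on the contract, so either producer can be plugged in.
final class RowCounter {
    private RowCounter() {}

    static int countRows(RowProducer producer, String requestName, Map<String, Object> args) {
        return producer.execute(requestName, args).size();
    }
}
```

Calling RowCounter.countRows with either producer and the same request name goes through the same code path; only the source of the rows differs.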

I am going to show sample code of a class I wrote for retrieving matching document URLs from a Lucene search collection. This example demonstrates some of the important features needed for similar work.

Sample code


package com.indent.lucene.similarity;

import com.ai.application.interfaces.*;
import java.io.*;
import java.util.*;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.search.Query;

import com.ai.application.utils.*;
import com.ai.data.*;
import com.ai.common.*;

/**
 * LocateSimilarDocumentsPart2
 * ******************************
 * 1. Searches for similar documents based on input
 * 2. Collects term frequency vector from description to search for similar documents 
 * 3. The returned documents are packaged as an IDataCollection
 * 4. This will allow for using this part in page design directly
 *
 * Expected input args
 * ******************************
 * app: Indent app name
 * id:  Indent lucene document id
 * numofdocs: Maximum number of similar docs to be returned
 * 
 * Output/Behaviour
 * 1. Returns an IDataCollection with one row per similar document
 * 2. Columns: IndentLuceneIndex.FIELD_ID, FIELD_APP, FIELD_DOC, FIELD_TITLE, FIELD_DESCRIPTION
 * 3. Will write debug messages to the log
 *
 */

//It is a good idea to extend AFactoryPart
public class LocateSimilarDocumentsPart2 extends AFactoryPart
{
   //The return object must be of type IDataCollection
    protected Object executeRequestForPart(String requestName, Map inArgs)
            throws RequestExecutionException
    {
       
       IndentLuceneIndex li = null; 
       try
      {
          //You can use Aspire factory instantiations for singletons
          li = (IndentLuceneIndex)AppObjects.getObject("indentluceneindex",null);
          
          //Collect input args
          //You can throw an exception if the args are not there
          String app = (String)inArgs.get("app");
          String id = (String)inArgs.get("id");
          String numOfDocs = (String)inArgs.get("numofdocs");
          int iNumOfDocs = Integer.parseInt(numOfDocs);
          
          //Get a list of documents
          List documentList = 
             getSimilarDocuments(li,app,id,iNumOfDocs);
          
           AppObjects.log("Number of documents found:" + documentList.size());
           AppObjects.log("Number of documents requested:" + iNumOfDocs);

           //Print the documents in a log for debugging purposes
           Iterator itr = documentList.iterator();
           int stopCount = 0;
           while(itr.hasNext())
           {
              Document doc = (Document)itr.next();
              li.printDocDetails(doc);
              stopCount++;
              if (stopCount >= iNumOfDocs)
              {
                  AppObjects.log("Number of documents quota fulfilled:" + iNumOfDocs);
                 break;
              }
           }
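         //Package at most iNumOfDocs documents as an Aspire data collection,
         //since numofdocs is documented as the maximum number returned
          int limit = Math.max(0, Math.min(iNumOfDocs, documentList.size()));
          return getDocumentCollection(documentList.subList(0, limit));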
      }
       catch(IOException x)
      {
          throw new RequestExecutionException("Error:Getting similar documents from IndentLuceneIndex",x);
      }
       finally
      {
          if (li != null)
          {
             li.closeSessionQuietly();
          }
      }
    }//eof-function
    
    /**
     * Construct an IDataCollection from the 
     * lucene document list and return.
     * 
     * @param documentList
     * @return
     */
    private IDataCollection getDocumentCollection(List documentList)
    {
      //The column names of the collection go into a vector as strings
       Vector columnNamesVector = new Vector();
       columnNamesVector.add(IndentLuceneIndex.FIELD_ID);
       columnNamesVector.add(IndentLuceneIndex.FIELD_APP);
       columnNamesVector.add(IndentLuceneIndex.FIELD_DOC);
       columnNamesVector.add(IndentLuceneIndex.FIELD_TITLE);
       columnNamesVector.add(IndentLuceneIndex.FIELD_DESCRIPTION);


      //ListDataCollection is an implementation of IDataCollection
      //It requires the column names in a vector
       
       ListDataCollection luceneDocumentCollection 
         = new ListDataCollection(columnNamesVector);

      //Fill it up with rows
       Iterator luceneDocItr = documentList.iterator();
       while(luceneDocItr.hasNext())
       {
          Document doc = (Document)luceneDocItr.next();
          IDataRow collectionRow = getDataRow(doc
                               ,new VectorMetaData(columnNamesVector));
          luceneDocumentCollection.addDataRow(collectionRow);
       }
       return luceneDocumentCollection;
    }
    
   //Convert the fields of one Lucene document into a row
    private IDataRow getDataRow(Document luceneDoc, IMetaData columnMetaData)
    {
      //Collect a list of string column values
       List columnValues = new ArrayList();
       columnValues.add(luceneDoc.get(IndentLuceneIndex.FIELD_ID));
       columnValues.add(luceneDoc.get(IndentLuceneIndex.FIELD_APP));
       columnValues.add(luceneDoc.get(IndentLuceneIndex.FIELD_DOC));
       columnValues.add(luceneDoc.get(IndentLuceneIndex.FIELD_TITLE));
       columnValues.add(luceneDoc.get(IndentLuceneIndex.FIELD_DESCRIPTION));
       
      //Construct a list data row which is an implementation of IDataRow
       return new ListDataRow(columnMetaData,columnValues);
    }
    
    public List getSimilarDocuments(IndentLuceneIndex li, String app, String id, int numOfDocs)
    throws IOException
    {
       
       int docnum = li.searchForDocumentNumber(app,id);
       logSearchWords(li,docnum,IndentLuceneIndex.FIELD_TITLE);
       logSearchWords(li,docnum,IndentLuceneIndex.FIELD_DESCRIPTION);
       logSearchWords(li,docnum,IndentLuceneIndex.FIELD_CONTENTS);
       
       Document doc = li.searchForDocument(app,id);
       Query q = RelevanceUtils.getRelevanceQuerySimple("contents",getSearchWords(li,doc,docnum));
       return li.searchForDocsUsingQuery(q);
    }

    protected List getSearchWords(IndentLuceneIndex li, Document doc,int docnum)
    throws IOException
    {
       List sampleList = new ArrayList();
       String titleWords[] = li.getTermVectors(docnum,IndentLuceneIndex.FIELD_TITLE).getTerms();
       String descWords[] = li.getTermVectors(docnum,IndentLuceneIndex.FIELD_DESCRIPTION).getTerms();
       for(int i=0;i<titleWords.length;i++)
       {
          sampleList.add(titleWords[i]);
          AppObjects.info(this,"Adding searchword:" + titleWords[i]);
       }
       for(int i=0;i<descWords.length;i++)
       {
          sampleList.add(descWords[i]);
          AppObjects.info(this,"Adding searchword:" + descWords[i]);
       }
       
       return sampleList;
    }
    
    private void logSearchWords(IndentLuceneIndex li, int  docnum, String fieldName)
    throws IOException
    {
       TermFreqVector tfv = li.getTermVectors(docnum,fieldName);
      String words[] = tfv.getTerms();
      AppObjects.log("Number of terms:" + words.length + " in " + fieldName);
    }
}//eof-class
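
The sample reads its input args with plain casts and Integer.parseInt, so a missing or malformed numofdocs surfaces as an unchecked NumberFormatException. The following is a minimal sketch of how the args could be checked up front. PartArgs and its methods are hypothetical helpers, not part of Aspire, and the sketch throws IllegalArgumentException only to stay self-contained; inside a real part you would presumably report the problem through RequestExecutionException instead.

```java
import java.util.Map;

// Hypothetical helper for validating a part's input arguments (not an Aspire class).
final class PartArgs {
    private PartArgs() {}

    // Returns the named argument as a trimmed, non-empty string,
    // or fails with a message that names the missing argument.
    static String requireString(Map<String, ?> args, String name) {
        Object value = args.get(name);
        String text = (value == null) ? "" : value.toString().trim();
        if (text.isEmpty()) {
            throw new IllegalArgumentException("Required argument missing: " + name);
        }
        return text;
    }

    // Returns the named argument parsed as a positive int.
    static int requirePositiveInt(Map<String, ?> args, String name) {
        String text = requireString(args, name);
        int parsed;
        try {
            parsed = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "Argument " + name + " is not an integer: " + text, e);
        }
        if (parsed <= 0) {
            throw new IllegalArgumentException(
                    "Argument " + name + " must be positive: " + text);
        }
        return parsed;
    }
}
```

With such a helper, the three reads in executeRequestForPart could become PartArgs.requireString(inArgs, "app"), PartArgs.requireString(inArgs, "id") and PartArgs.requirePositiveInt(inArgs, "numofdocs").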

Important points

1. Derive from AFactoryPart, as it does some of the groundwork for you.

2. You can also derive from ADataCollectionProducer if you want a type-safe approach.

3. You have to return an IDataCollection.

4. ListDataCollection is a simple implementation that uses a list of lists to represent rows and columns.

5. You can use AppObjects.getValue(...) to read additional parameters your class may need from the config file.
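
The "list of lists" idea from point 4 can be sketched as follows. ListOfListsTable is a hypothetical stand-in written for illustration only; it is not Aspire's ListDataCollection and makes no claim about how that class is implemented internally. It keeps one list of column names and one list per row, and resolves a cell by looking up the column's position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a rows-and-columns table stored as a list of lists.
// NOT Aspire's ListDataCollection.
final class ListOfListsTable {
    private final List<String> columnNames;
    private final List<List<String>> rows = new ArrayList<>();

    ListOfListsTable(List<String> columnNames) {
        this.columnNames = new ArrayList<>(columnNames);
    }

    // Appends one row; the values must line up with the column names by position.
    void addRow(List<String> values) {
        if (values.size() != columnNames.size()) {
            throw new IllegalArgumentException(
                    "Expected " + columnNames.size() + " values but got " + values.size());
        }
        rows.add(new ArrayList<>(values));
    }

    int rowCount() {
        return rows.size();
    }

    // Looks up a cell by row index and column name.
    String getValue(int rowIndex, String columnName) {
        int columnIndex = columnNames.indexOf(columnName);
        if (columnIndex < 0) {
            throw new IllegalArgumentException("Unknown column: " + columnName);
        }
        return rows.get(rowIndex).get(columnIndex);
    }
}
```

This mirrors the shape of the sample above, where the column names are collected once and each document contributes one list of values in the same order.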

References

  1. Understanding data options in Aspire