Knowledge Folders is a web application that holds and displays content for multiple users. Lately I have been wondering whether I could expose the content from this single web application as multiple web sites, each with its own domain name. Could I use virtual hosts to do this? Or do I need to use reverse proxies? How and where am I going to register domain names? What entries do I need to make in the Tomcat configuration files? How do I handle email for these independent domains? What else do I need to do in my web application? What will the end result look like?

After a few weeks of effort, I was able to expose Knowledge Folders as multiple web sites with their own domain names. It turned out I didn't need reverse proxies for now and could use virtual hosts. I registered my domain names at godaddy.com. I used Tomcat's host/alias settings to route traffic from all these domains to the same web app, and I used the web app's index.jsp to separate the content between the different domains. After all this effort I ended up with a way to publish web sites very quickly and expose them under their own domains. The resulting web sites have a number of features that static web sites cannot easily offer.

Background

I wrote Knowledge Folders a few years ago as a workaround for keeping my notes online. I used to keep these "rapid notes" using Microsoft Outlook and a few macros. The ability to file these notes into classified folders especially appealed to me. When I took that application to the web it was natural to make it a multi-user system, allowing a number of users to manage their own notes and perhaps also share them.

Initially these notes were various SQL scripts that run on a database. The initial release even had an execution engine to run these notes against a target database and return the results. I later abandoned this to focus on the morphing that was to come.

At about the same time I was in search of a tool to document my open source tool Aspire/J2EE. I was looking along the lines of wikis and web logs. Not being happy with what I found, I changed Knowledge Folders to focus on documenting open source software. At this time Knowledge Folders was basically a collection of accounts (or users), files, and folders. This gave Knowledge Folders the semblance of "Knowledge Folders", where knowledge is created and classified into various folders.

Later that year I introduced the idea of master pages (background html, similar to Tiles) to give the content a face lift and a proper presentation. This took Knowledge Folders towards a content management system, where content can be presented with proper backgrounds.

Later I added some collaboration features and task management for individual users.

During all this time there was a single domain through which users accessed their accounts. Although not difficult, it was awkward to pass the account urls of individual users to one's friends or any intended audience. I wanted to expose each user under his or her own domain.

My original thought involving reverse proxy servers

Initially the problem sounded like a case where the individual domains would point to a "piece of code" on the server that would in turn read the content from the single web app. This intermediate piece of code would somehow associate the incoming domain name with an account in Knowledge Folders and use Knowledge Folders as a source/sink to read and write web pages. In essence it would work like a proxy to the actual Knowledge Folders.

Reading the definition of a reverse proxy on Wikipedia, it seemed like it could be used for the purpose. In fact this may turn out to be a good solution some time in the future if I need to segregate content further than what is currently achieved. Perhaps the Reverse Proxy Patterns PDF by Peter Sommerlad might throw some light on the possibilities.

I was also hoping to use the reverse proxy facility of Apache to accomplish this.

Although I have mentioned the links on reverse proxies here for further research, I will mention the key elements briefly.

What are reverse proxies?

Reverse proxies are web servers that sit in front of other web servers, possibly ones internal to a corporate network. This indirection turns out to be useful in a number of situations. These proxy servers typically intercept communications from a browser and relay them to the back end servers. Users are usually exposed only to the domain names of the reverse proxy servers and not to those of the back end servers. The reverse proxy servers in turn call some internal servers to fulfill the request. They typically terminate the incoming connection and open a separate connection to the target servers. As a result, reliably implementing a proxy server is not easy: it has to behave exactly like the target server while still intercepting all of the data and http headers.

What are reverse proxies used for?

Reverse proxies are routinely used to offload ssl. In this scenario https traffic is routed to a reverse proxy server. The reverse proxy server converts the traffic from https to http and then forwards the request to an internal http server. With this approach a single reverse proxy server can offload ssl, which saves installing certificates on multiple back end servers. Nevertheless, this approach sometimes poses issues for sendRedirect on the target server. When "sendRedirect" is used, a relative url is sometimes translated into an absolute url using the wrong scheme (http instead of https). Fortunately this can be dealt with by rewriting sendRedirect.
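
To make the redirect problem concrete, here is a minimal sketch in plain Java. It assumes the proxy passes the browser's original scheme to the back end in a request header such as X-Forwarded-Proto; that header, the class name RedirectUrls, and the host www.example.com are illustrative assumptions and not part of the setup described in this article.

```java
// Hypothetical helper: builds the absolute url for a redirect using the
// scheme the browser actually used, not the scheme the back end saw.
final class RedirectUrls {

    // forwardedProto: value of a header such as X-Forwarded-Proto set by the
    //                 proxy (an assumption); null or empty when absent.
    // backendScheme:  scheme of the request as the back end server saw it.
    // host:           host name the browser asked for.
    // path:           path to redirect to, for example "/akc/akchome.html".
    static String absoluteRedirect(String forwardedProto, String backendScheme,
                                   String host, String path) {
        String scheme = (forwardedProto == null || forwardedProto.isEmpty())
                ? backendScheme
                : forwardedProto;
        String normalizedPath = path.startsWith("/") ? path : "/" + path;
        return scheme + "://" + host + normalizedPath;
    }

    public static void main(String[] args) {
        // Behind an ssl-offloading proxy the back end sees plain http,
        // but the redirect should still go out as https.
        System.out.println(absoluteRedirect("https", "http",
                "www.example.com", "/akc/akchome.html"));
    }
}
```

In a servlet, the resulting string could be handed to sendRedirect in place of a relative url, so the container never has to guess the scheme.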

Reverse proxies can also be used to expose a single domain for multiple web applications on the back end. Each back end server can be mapped to a path under the main domain.

There are also approaches that provide role based security by using proxy servers as gatekeepers that monitor every url.

Implications to web application development in the face of reverse proxies

It is imperative that all the urls be relative for reverse proxies to work well. This is because the reverse proxy serves the page under a different, typically external, name, and the internal names are not known to the outside world. So urls on web pages delivered by back end servers should typically read:


/webapp/resource1.html
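
As a rough illustration of that rule, the following sketch uses java.net.URI to test whether a link names a scheme or a host. The class name LinkCheck and the host name internal-server are made up for the example.

```java
import java.net.URI;

// Hypothetical check for links that are safe to serve through a proxy.
final class LinkCheck {

    // True when the link names neither a scheme nor a host, so the browser
    // resolves it against whatever host name it used to reach the page.
    static boolean isHostIndependent(String link) {
        URI uri = URI.create(link);
        return !uri.isAbsolute() && uri.getRawAuthority() == null;
    }

    public static void main(String[] args) {
        System.out.println(isHostIndependent("/webapp/resource1.html"));
        System.out.println(isHostIndependent(
                "http://internal-server:8080/webapp/resource1.html"));
    }
}
```

The first link passes the check; the second leaks an internal host name that a browser outside the network could not resolve.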

What are virtual hosts?

Although a solution involving reverse proxies seemed possible, I found out that the hosting facility at Indent hosts my web app on Tomcat and not Apache. On some initial research I couldn't find out whether Tomcat supports reverse proxies. So my exploration led towards virtual hosts, to see if they would solve the problem as well.

A virtual host allows multiple domain names for a given ip address. In other words, a given ip address can have any number of host names. When requests are received on behalf of these host names, a web server can decide to deliver content from different root directories or, in the case of Tomcat, from different web apps.

For example


www.host1.com  points to /webapp1
www.host2.com points to /webapp1
www.host3.com points to /webapp2

See how the host names and web apps are bound in a many-to-many relationship. There will be one host entry for each host. When multiple host names are bound to the same web app, one can use Tomcat's alias facility.
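
The binding above can be pictured as a simple lookup table. The sketch below is only a model of the idea, not how Tomcat implements virtual hosts; the class VirtualHostTable and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of a virtual host table: host name -> web app.
final class VirtualHostTable {

    private final Map<String, String> hostToWebapp = new HashMap<>();

    // Host names are case-insensitive, so they are stored in lower case.
    void bind(String hostName, String webapp) {
        hostToWebapp.put(hostName.toLowerCase(Locale.ROOT), webapp);
    }

    // Returns the web app bound to the requested host, or defaultWebapp
    // when the host is unknown. An optional ":port" suffix is ignored.
    String webappFor(String hostHeader, String defaultWebapp) {
        String host = hostHeader;
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon);
        }
        return hostToWebapp.getOrDefault(host.toLowerCase(Locale.ROOT),
                defaultWebapp);
    }

    public static void main(String[] args) {
        VirtualHostTable table = new VirtualHostTable();
        table.bind("www.host1.com", "/webapp1");
        table.bind("www.host2.com", "/webapp1");
        table.bind("www.host3.com", "/webapp2");
        System.out.println(table.webappFor("www.host2.com:8080", "/"));
    }
}
```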

Examples of virtual hosts in Tomcat

Based on this, here is a sample setup for Knowledge Folders:


     <Host name="www.knowledgefolders.com" 
             appBase="D:/webpage_demos/akc"
              unpackWARs="true" 
              autoDeploy="true" 
              xmlValidation="false" 
              xmlNamespaceAware="false">
           
               <Alias>knowledgefolders.com</Alias>

               <Alias>www.knowledgefolders.net</Alias>
               <Alias>knowledgefolders.net</Alias>

               <Alias>www.knowledgefolders.org</Alias>
               <Alias>knowledgefolders.org</Alias>
            
               <Alias>www.satyakomatineni.com</Alias>
               <Alias>www.kavithakomatineni.com</Alias>

               <Context path="" docBase="D:/webpage_demos/akc" 
                   debug="0" reloadable="false"/>
               <Context path="/akc" docBase="D:/webpage_demos/akc" 
                  debug="0" reloadable="false"/>
     </Host>

See how all of the following host names point to the same web app "akc" (the previous name for Knowledge Folders):

knowledgefolders.com
www.knowledgefolders.com

knowledgefolders.net
www.knowledgefolders.net

knowledgefolders.org
www.knowledgefolders.org

www.satyakomatineni.com
www.kavithakomatineni.com

Registering a domain name

Originally Knowledge Folders was hosted on a static ip at Indent, Inc. With potential changes to the internal ip, I decided to get a proper domain name for Knowledge Folders.

I went to godaddy.com on the advice of a friend. Godaddy.com is very good. They have excellent support and the prices seemed very cheap. I registered three domains in the process:

knowledgefolders.org
knowledgefolders.net
knowledgefolders.com

Registering these domains is quite simple at godaddy, but setting up the rest took some work. Knowledge Folders is physically hosted with Indent at Peak 10, a hosting facility, on a dedicated Windows server, whereas the domain names are registered at godaddy.

Securing the name servers

To make the domains work, the first thing I needed to know from Peak 10 was the name servers that would be used to resolve the host names. I needed two name servers. For instance, the name servers for Peak 10 are

NS1.JAX.PEAK-10.COM
NS1.CLT.PEAK-10.COM

Setting up the ip address association to name servers

The next step was to tell the Peak 10 staff the domain names I had registered and the physical ip address to which the host names should point.

With these changes I was able to access Knowledge Folders with all of the domain names.

Changes to Knowledge Folders

So I was able to take multiple domain names and point them to the same web app on a given physical ip. For instance, when I accessed


http://www.satyakomatineni.com

I was taken to the home page of Knowledge Folders.

But my intention was to go to the home page of the account identified by the userid "satya". This needed some changes to the following in Knowledge Folders:

index.jsp
some new definitions in the properties files

The general idea is to have index.jsp identify the incoming host name. Provided there is a way to associate the domain name with an account, index.jsp can then transfer control to the home page of that account.

Example index.jsp

The source code of index.jsp that accomplishes this is as follows


<!--
*************************************************************
* Sample code for knowing the Knowledge Folders url:
* Standard aspire libraries
*************************************************************
-->
<%@ page import="com.ai.htmlgen.*" %>
<%@ page import="com.ai.application.utils.*" %>
<%@ page import="com.ai.common.*" %>

<!--
*************************************************************
* html header
*************************************************************
-->
<html><head>
<title>Welcome to Aspire Knowledge Center</title>
<link rel="stylesheet" type="text/css" href="/akc/style/style.css">
<script src="/akc/js/genericedits1.js"></script>

<!--
*************************************************************
* Figure out home page, 
* if not found use the main home page of Knowledge Folders
*************************************************************
-->
<%
   String hostname = request.getServerName();
   String homepageurl = AppObjects.getValue("aspire.multiweb." 
                        + hostname + ".homepageurl",null);
   String targeturl = "";
   if (homepageurl == null)
   {
      targeturl = "/akc/akchome.html";
   }
   else
   {
      //hostuserid exists
      targeturl = homepageurl;
   }
   String debug = request.getParameter("debug");
%>
<script>

/*
 *************************************************************
 * gotoHomePage() on load
 *************************************************************
 */
function gotoHomePage()
{
   debugAlert("gethost on the client side:" + getHost());
   debugAlert("<%=hostname%>:<%=homepageurl%>");
   var targeturl = "<%=targeturl%>";
   debugAlert(targeturl);
   document.location.replace(targeturl);
}
/*
 *************************************************************
 * some debugging support
 *************************************************************
 */
function debugAlert(message)
{
   var debug = "<%=debug%>";
   if (debug == "true")
   {
      alert(message);
   }
}
</script>
</head>
<!--
*************************************************************
* onload
*************************************************************
-->
<body onload="gotoHomePage()">
</body></html>

Example properties files

Here is the Aspire/J2EE configuration file to support the "domain name to account" translation or mapping


aspire.multiweb.www.satyakomatineni.com.userid=satya
aspire.multiweb.www.satyakomatineni.com.homepageurl=\
/akc/update?request_name=GotoHomepageURL&ownerUserId=satya
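
The lookup that index.jsp performs against these entries can be sketched in plain Java. This version uses java.util.Properties as a stand-in for the Aspire configuration API, on the assumption that the file follows the standard properties syntax (including the backslash line continuation); the class HomePageResolver is hypothetical and is not the Aspire implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical stand-in for the host-to-home-page lookup in index.jsp.
final class HomePageResolver {

    // Same fallback page that index.jsp uses when no mapping exists.
    static final String DEFAULT_HOME = "/akc/akchome.html";

    // Parses configuration text written in java.util.Properties syntax.
    static Properties load(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    // Returns the home page mapped to the host, or the default page.
    static String homePageFor(Properties config, String hostName) {
        return config.getProperty(
                "aspire.multiweb." + hostName + ".homepageurl", DEFAULT_HOME);
    }

    public static void main(String[] args) {
        Properties config = load(
                "aspire.multiweb.www.satyakomatineni.com.userid=satya\n"
              + "aspire.multiweb.www.satyakomatineni.com.homepageurl=\\\n"
              + "/akc/update?request_name=GotoHomepageURL&ownerUserId=satya\n");
        System.out.println(homePageFor(config, "www.satyakomatineni.com"));
        System.out.println(homePageFor(config, "www.knowledgefolders.com"));
    }
}
```

A host with a mapping goes to its account's home page; any other host falls back to the main Knowledge Folders page.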

Summary of the setup procedure for creating a new web site in Knowledge Folders

  1. Register a domain with godaddy or any domain name registrar
  2. Provide name servers for the domain
  3. Email the hosting provider to associate the domain with the server's ip address at the name servers
  4. Add an alias to tomcat server.xml under the host corresponding to the web app
  5. Make changes to aspire configuration to tie the domain name to an individual account
  6. Optionally set up an email account for the domain
  7. Write down the passwords for all of the accounts you have set up

The email option

As it exists today, Knowledge Folders is quite flexible and convenient for creating online web sites without any additional tools. This is a great advantage for small companies that want a web presence quickly without having to buy any hosting space. Nevertheless, these small companies also want a basic email setup addressed at the domain so that they can use it on business cards or in general advertising.

This can be done in two ways.

Setting it up with Indent

You see, there are three players in this solution. The domains are registered at godaddy. The Windows server on which the software runs sits on the Peak 10 network, which requires that I use their name servers. The actual Windows server is owned and operated by Indent.

So the first option involves asking Indent to create email accounts. Indent uses the James mail server. Indent usually uses a manual process to either create a full-fledged email account or provide email forwarding for that account.

Using godaddy's email accounts

As the registrar of the domain, godaddy allows a free email account for each registered domain. You can also purchase additional email accounts if needed. godaddy also offers email forwarding and has online tools to manage these email accounts.

But it is tricky to use these email accounts at godaddy if the original mail server for your domain is at a hosting facility. You have to set up mx records and cname records at the hosting facility's name servers to accomplish this.

godaddy recommends adding the following to the domain name system manager:

MX 0 - smtp.secureserver.net 
MX 10 - mailstore1.secureserver.net   

What are mx records and how do they work

According to an MX faq, a mail sender program checks the domain name system to see if the domain has an mx record pointing to another mail server. If it has, the sender uses that server as the target server. It may even be recursive. This is one way to redirect the mail. This is how the mx records at Peak 10 reroute the mail to godaddy so that I can use the email accounts at godaddy. It is sufficient to set the mx records just for the root domain name and not for the cnames. For instance, to set up mx records for www.knowledgefolders.com it is sufficient to set them for "knowledgefolders.com", with no need to set one for "www.knowledgefolders.com", because the email is always going to be addressed to "somemail@knowledgefolders.com".
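
When a domain has more than one mx record, as in the two godaddy entries above, a sender tries the record with the lowest preference number first. The sketch below shows only that ordering step on records that have already been looked up; it does no dns query, and the class MxSelection is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of how a sender orders mx records.
final class MxSelection {

    // One mx record: a preference number and the mail server it names.
    static final class MxRecord {
        final int preference;
        final String mailServer;

        MxRecord(int preference, String mailServer) {
            this.preference = preference;
            this.mailServer = mailServer;
        }
    }

    // Returns the mail servers in the order a sender would try them:
    // lowest preference number first.
    static List<String> deliveryOrder(List<MxRecord> records) {
        List<MxRecord> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.comparingInt((MxRecord r) -> r.preference));
        List<String> servers = new ArrayList<>();
        for (MxRecord record : sorted) {
            servers.add(record.mailServer);
        }
        return servers;
    }

    public static void main(String[] args) {
        List<MxRecord> records = Arrays.asList(
                new MxRecord(10, "mailstore1.secureserver.net"),
                new MxRecord(0, "smtp.secureserver.net"));
        System.out.println(deliveryOrder(records));
    }
}
```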

Adding cnames to further tune the email solution

According to some documentation at http://www.simpledns.com/help/index.html?rec_cname.htm cname records are aliases at the domain name server (in this case Peak 10) that redirect the traffic. For example, I can set up a cname record at Peak 10 for pop.knowledgefolders.com pointing to pop.godaddy.com. This allows a mail client such as Microsoft Outlook to specify a host name in your own domain as the pop server. The same thing can be done with a cname for smtp, and also for the webmail at godaddy if needed.

Limitations of cnames

cnames are aliases to host names. They also introduce new names into the domain name space. For example, if I have a domain registered as "knowledgefolders.com" then a cname record can introduce another host into the domain name space called "myhost.knowledgefolders.com". This only works as long as the new host names you are introducing are all suffixed with "knowledgefolders.com". For example, you cannot introduce a cname called "somehost.some-domain.com" when you don't own "some-domain.com".

But you are entitled nevertheless to point "somehost.your-domain.com" to "some-other-host.someoneelsesdomain.com". This is how the indirection of smtp and pop mail servers is achieved.
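
The rule in the two paragraphs above comes down to a suffix test on the name being introduced; the target of the alias is not restricted. Here is a minimal sketch of that test, with the hypothetical class CnameRules.

```java
import java.util.Locale;

// Hypothetical check for which cname records a domain owner may create.
final class CnameRules {

    // True when hostName is a name underneath the given domain, for
    // example "pop.knowledgefolders.com" under "knowledgefolders.com".
    static boolean isUnderDomain(String hostName, String domain) {
        String host = hostName.toLowerCase(Locale.ROOT);
        String suffix = "." + domain.toLowerCase(Locale.ROOT);
        return host.endsWith(suffix);
    }

    // The new name must sit under your own domain. The target may be any
    // host, including one in someone else's domain, so it is not checked.
    static boolean canCreateCname(String newName, String target,
                                  String myDomain) {
        return isUnderDomain(newName, myDomain);
    }

    public static void main(String[] args) {
        System.out.println(canCreateCname("pop.knowledgefolders.com",
                "pop.godaddy.com", "knowledgefolders.com"));
        System.out.println(canCreateCname("somehost.some-domain.com",
                "pop.godaddy.com", "knowledgefolders.com"));
    }
}
```

The leading dot in the suffix matters: without it, a name such as "notknowledgefolders.com" would wrongly pass the test.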

End result

At the end of all of this I was able to publish the following distinct web sites using various accounts in Knowledge Folders. Let us take a look:

  1. www.knowledgefolders.com points to the original home page of multi-accounted Knowledge Folders
  2. www.knowledgefolders.org points to the documentation web site for Knowledge Folders
  3. www.knowledgefolders.net points to my personal account in Knowledge Folders
  4. www.satyakomatineni.com points to my personal account in Knowledge Folders as well
  5. www.kavithakomatineni.com points to the website I manage for my daughter

I use Knowledge Folders for a number of things

  1. I manage my web logs
  2. I manage documentation for Aspire/J2EE
  3. I support Aspire/J2EE using feedback
  4. I manage documentation for Knowledge Folders
  5. I collaborate with teams to develop web sites using a project portal concept where documentation about that project is maintained
  6. I create static web sites for small companies
  7. I manage my daily, weekly, and monthly tasks, todos
  8. I run tutorials
  9. I do my research
  10. I publish articles

Future possibilities for web hosting: webos!!

Currently the process of creating online web sites is very disjointed. There are numerous steps one has to follow to get a web presence. The trend is certainly towards simplifying this process. With something like Knowledge Folders it is possible to imagine a consumer visiting the site, creating an account, and being on the web with their content right away. The back end details can be automated. Especially when you can manage your tasks and schedules, publish, and collaborate from the same site, we are heading towards something like a "webos".