Installing Cocoon 2 and Tomcat 4 on Apache Web Server (on Linux)

Preface

Common problems for web developers are:

A very flexible way of doing that is to do the site in XML and transform the XML using XSLT stylesheets to generate the HTML, WML or other final output.

Cocoon from the Apache Project provides a very flexible way of doing this in a Java servlet environment.

In order to install Cocoon you will need the following

I assume you have Java installed. I'd recommend JDK 1.3.1 from Sun.

The Installation

This installation was carried out on RH6.2 (kernel 2.2.14) and RH7.1 (kernel 2.4.2-2) Linux boxes. The version of Apache is 1.3.12. During the setup I used several resources, including the Cocoon AJP documentation and other parts of the Cocoon documentation. Another very useful page was Daniel Schneider's (TEFCA) mod_jk on Solaris page.

After some research I elected to go with mod_jk rather than mod_webapp. Mod_jk allows load balancing. It also allows some requests to be routed to Apache instead of them all going to Tomcat. I have also been using mod_jk with Tomcat for nearly a year.

Downloads

NB: The latest version of Tomcat is 4.0.2. However, there seems to be an issue with Cocoon 2 and Tomcat 4.0.2 according to the cocoon-users@xml.apache.org mailing list.

Installation

Apache

Apache comes with most Linux distributions. I'll assume you have Apache installed or are able to install it from your distribution CDs. Alternatively, you can download it from apache.org. The most important setting you might need to change in your /etc/httpd/httpd.conf is the ServerName. You might also want to add a virtual host definition. You can add extra interfaces on your loopback or ethernet devices and add virtual hosts for each of those if you wish. In RedHat, you can do that using 'netcfg', selecting 'interfaces' and going from there.

Tomcat

I unpacked the tarball to /opt and then created a symbolic link, /opt/tomcat, pointing to jakarta-tomcat-4.0.1. Tomcat 4 uses CATALINA_HOME. You'll also need to set JAVA_HOME.

export JAVA_HOME=/usr/lib/jdk
export CATALINA_HOME=/opt/tomcat
ln -sf /opt/jakarta-tomcat-4.0.1 /opt/tomcat
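
Before starting Tomcat it can be worth confirming that both variables point where you expect. What follows is only a minimal sketch, not something shipped with Tomcat: check_home is a made-up helper name, and all it does is test that a given file exists and is executable beneath a directory.

```shell
# check_home NAME DIR FILE
# Succeeds if DIR/FILE exists and is executable; NAME is used only in messages.
# (check_home is a hypothetical helper, not something Tomcat provides.)
check_home() {
    if [ -n "$2" ] && [ -x "$2/$3" ]; then
        echo "$1 looks ok: $2"
    else
        echo "$1 problem: cannot execute '$2/$3'" >&2
        return 1
    fi
}
```

For the setup above you would call it as check_home JAVA_HOME "$JAVA_HOME" bin/java and check_home CATALINA_HOME "$CATALINA_HOME" bin/startup.sh.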

Then start up Tomcat

$CATALINA_HOME/bin/startup.sh

You should now be able to access the Tomcat welcome page.

http://yourhost:8080/index.html

Install ant

Just unbundle the ant tarball somewhere and set ANT_HOME accordingly. I put mine in /opt.

export ANT_HOME=/opt/ant
ln -sf /opt/jakarta-ant-1.4.1 /opt/ant

I also installed the optional ant jar to $ANT_HOME/lib so I wouldn't be bitten later on. (That is, in case it was required ;-))

Build mod_jk

Unpack the jakarta-tomcat-connectors tarball.

cd jakarta-tomcat-connectors-4.0.2-01-src/jk/

Set up the build.properties file

cp -p build.properties build.properties.orig
cp -p build.properties.sample build.properties

I set it up so that it looks like this

#
# sample build.properties for ajp connector.
# edit to taste...
#

# Directory where catalina is installed. It can 
# be either 4.0 or 4.1
tomcat40.home=/opt/tomcat

# If you want to build/install on  both 4.0 
# and 4.1, set this to point to 4.0 and 'catalina.home'
# to point to 4.0
# ( most people need only the first, but developers should
# have both )
#tomcat41.home=../../jakarta-tomcat-4.1/build

# Directory where tomcat3.3 is installed
#tomcat33.home= ../../jakarta-tomcat/build/tomcat

# Location of Apache2, Apache1.3, Netscape, IIS
#apache2.home=/opt/apache2
apache13.home=/
#iplanet.home=/opt/iplanet6
# iplanet.home=d:/tools/sdk/netscape


# APR location - by default the version included in Apache2 is used.
# Don't edit unless you install 'standalone' apr.
apr.include=${apache2.home}/include
apr.lib=${apache2.home}/lib

# Compile-time options for native code
so.debug=true
so.optimize=false
so.profile=false

# Settings for building NetWare binaries.  Uncomment these and modify for your
# environment to build NetWare binaries.
#
# novellndk.dir=d:/tools/nwsdk
# build.compiler.base=d:/tools/mw/6.0
# build.compiler.cc=${build.compiler.base}/bin/mwccnlm
# build.compiler.ld=${build.compiler.base}/bin/mwldnlm

# Settings for building Windows binaries.  Uncomment these and modify for your
# environment to build Windows binaries.
#
# build.compiler.base=c:/Program Files/Microsoft Visual Studio/VC98
# build.compiler.cc=${build.compiler.base}/bin/cl
# build.compiler.ld=${build.compiler.base}/bin/link

Do the build.

$ANT_HOME/bin/ant
$ANT_HOME/bin/ant install 

Build the apache module

cd native/apache-1.3/

Edit the Makefile for your OS, if necessary. I think I just made sure that APXS was set up correctly. Then do the build and copy the resulting Apache module to wherever your Apache modules reside.

make -f Makefile.linux all
cp -p mod_jk.so /usr/lib/apache/

Now you need to set up Apache for mod_jk so that it can connect to Tomcat. Add the following to your httpd.conf. While we're at it we'll also put in the cocoon bits and pieces. There is a way of getting this included from $CATALINA_HOME/conf/auto/mod_jk.conf (by adding Include /opt/tomcat/conf/auto/mod_jk.conf to httpd.conf), but I've never managed to get it working to my satisfaction. I still need to investigate how to get *.jsp JkMount'd and so on with that approach.

LoadModule jk_module modules/mod_jk.so
AddModule mod_jk.c

JkWorkersFile /opt/tomcat/conf/jk/workers.properties
JkLogFile /opt/tomcat/logs/mod_jk.log

#
# Log level to be used by mod_jk
#
JkLogLevel info
# Root context mounts for Tomcat
#
#JkMount /*.jsp ajp12
#JkMount /servlet/* ajp12
JkMount /*.jsp ajp13
JkMount /servlet/* ajp13

JkMount /cocoon/* ajp13

#########################################################################
# The following line makes apache aware of the location of the /examples context
#
Alias /examples "/opt/tomcat/webapps/examples"
<Directory "/opt/tomcat/webapps/examples">
    Options Indexes FollowSymLinks
</Directory>

#
# The following line mounts all JSP files and the /servlet/ uri to tomcat
#
JkMount /examples/servlet/* ajp13
JkMount /examples/*.jsp ajp13

#
# The following line prohibits users from directly accessing WEB-INF
#
<Location "/examples/WEB-INF/">
    AllowOverride None
    deny from all
</Location>

#
# The following line prohibits users from directly accessing META-INF
#
<Location "/examples/META-INF/">
    AllowOverride None
    deny from all
</Location>

If you have any virtual hosts which will also be accessing cocoon you'll need to add a JkMount for those. You might also want to add some aliases for directories outside of cocoon for static files such as images and javascript.

<VirtualHost myVirtualHost>
        ...
   Alias /html someDirOutsideCocoon/html
   Alias /images someDirOutsideCocoon/html/images
   Alias /javascript someDirOutsideCocoon/html/javascript

   JkMount /*.jsp ajp13
   JkMount /servlet/* ajp13
   JkMount /cocoon/* ajp13
</VirtualHost>
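
Each JkMount line names a worker (ajp13 here), and that worker has to be declared in the worker.list property of the file given by JkWorkersFile. A rough way to cross-check the two files is sketched below. The function names are invented for this example, and the parsing is deliberately simple: it assumes one directive per line and a worker.list written on a single line.

```shell
# Print the distinct worker names used by active (uncommented) JkMount lines.
# $1 = path to httpd.conf
jk_mount_workers() {
    awk '$1 == "JkMount" { print $3 }' "$1" | sort -u
}

# Print the names in worker.list, one per line, with surrounding blanks removed.
# $1 = path to workers.properties
jk_worker_list() {
    sed -n 's/^worker\.list=//p' "$1" | tr ',' '\n' | sed 's/^ *//; s/ *$//'
}

# Report every JkMount worker that is missing from worker.list.
# $1 = httpd.conf, $2 = workers.properties
# Returns non-zero if at least one worker is missing.
check_jk_workers() {
    missing=0
    for w in $(jk_mount_workers "$1"); do
        if ! jk_worker_list "$2" | grep -Fx "$w" >/dev/null; then
            echo "JkMount uses worker '$w' but worker.list does not declare it" >&2
            missing=1
        fi
    done
    return $missing
}
```

With the paths used in this article the call would be check_jk_workers /etc/httpd/httpd.conf /opt/tomcat/conf/jk/workers.properties.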

Now set up Tomcat for mod_jk. Edit $CATALINA_HOME/conf/server.xml. After

<Server port="8005" shutdown="SHUTDOWN" debug="0">

add the ApacheConfig listener, so that it reads

<Server port="8005" shutdown="SHUTDOWN" debug="0">
   <Listener className="org.apache.ajp.tomcat4.config.ApacheConfig"
   modJk="/usr/lib/apache/mod_jk.so" jkDebug="info"/>

The arguments modJk and jkDebug are there if you want to use the Include /opt/tomcat/conf/auto/mod_jk.conf mechanism in httpd.conf. I will include the config for that here for future reference (i.e. when I get that going)

Make sure an ajp13 connector is defined in the section

  <Service name="Tomcat-Standalone">

You'll see a whole lot of <Connector definitions. I added mine after the SSL HTTP/1.1 Connector.

    <!-- Define an AJP 1.3 Connector on port 8009 -->
    <Connector className="org.apache.ajp.tomcat4.Ajp13Connector"
               port="8009" minProcessors="5" maxProcessors="75"
               acceptCount="10" debug="0"/>

After the default virtual host definition

     <Host name="localhost" debug="0" appBase="webapps"
           unpackWARs="true">

add the ApacheConfig listener (<Host name="localhost"...)

      <!-- Define the default virtual host -->
      <Host name="localhost" debug="0" appBase="webapps" unpackWARs="true">
        <Listener className="org.apache.ajp.tomcat4.config.ApacheConfig"
                  append="true" />

If you have any virtual hosts, add listeners for those too. That is, after the default virtual host entry (which finishes with a </Host>)

      </Host>
      <Host name="yourHost" >
           <Listener className="org.apache.ajp.tomcat4.config.ApacheConfig" 
                        append="true"  />
           <Context path="/cocoon" 
                docBase="webapps/cocoon" debug="0" reloadable="true" >
           </Context>
           <Context path="" 
                    docBase="yourDocBase" />
           <Context path="/examples" 
                 docBase="webapps/examples" 
                 crossContext="false"
                 debug="0" 
                 reloadable="true" > 

           </Context>
        </Host>

Again the listeners are for generating the ApacheConfig for Tomcat 4. I intend to get that working properly some day.

You will need a workers.properties file, which goes in $CATALINA_HOME/conf/jk/workers.properties. Here is one I had lying around. I haven't checked it specifically against Tomcat 4, but it worked.

#
# workers.properties -
#
# This file provides jk derived plugins with the needed information to
# connect to the different tomcat workers.
#
# As a general note, the characters $( and ) are used internally to define
# macros. Do not use them in your own configuration!!!
#
# Whenever you see a set of lines such as:
# x=value
# y=$(x)\something
#
# the final value for y will be value\something
#
# Normally all you will need to modify is the first properties, i.e.
# workers.tomcat_home, workers.java_home and ps. Most of the configuration
# is derived from these.
#
# When you are done updating workers.tomcat_home, workers.java_home and ps
# you should have 3 workers configured:
#
# - An ajp12 worker that connects to localhost:8007
# - An ajp13 worker that connects to localhost:8009
# - A jni inprocess worker.
# - A load balancer worker
#
# However by default the plugins will only use the ajp12 worker. To have
# the plugins use other workers you should modify the worker.list property.
#
#

#
# workers.tomcat_home should point to the location where you
# installed tomcat. This is where you have your conf, webapps and lib
# directories.
#
workers.tomcat_home=/opt/tomcat

#
# workers.java_home should point to your Java installation. Normally
# you should have a bin and lib directories beneath it.
#
workers.java_home=/usr/lib/jdk

#
# You should configure your environment slash... ps=\ on NT and / on UNIX
# and maybe something different elsewhere.
#
# ps=\
ps=/

#
#------ ADVANCED MODE ------------------------------------------------
#---------------------------------------------------------------------
#

#
#------ DEFAULT worker list ------------------------------------------
#---------------------------------------------------------------------
#
#
# The workers that your plugins should create and work with
#
worker.list=ajp12, ajp13

#
#------ DEFAULT ajp12 WORKER DEFINITION ------------------------------
#---------------------------------------------------------------------
#

#
# Defining a worker named ajp12 and of type ajp12
# Note that the name and the type do not have to match.
#
worker.ajp12.port=8007
worker.ajp12.host=corp2.bluetoad.com.au
worker.ajp12.type=ajp12
#
# Specifies the load balance factor when used with
# a load balancing worker.
# Note:
#  ----> lbfactor must be > 0
#  ----> Low lbfactor means less work done by the worker.
worker.ajp12.lbfactor=1

#
#------ DEFAULT ajp13 WORKER DEFINITION ------------------------------
#---------------------------------------------------------------------
#

#
# Defining a worker named ajp13 and of type ajp13
# Note that the name and the type do not have to match.
#
worker.ajp13.port=8009
worker.ajp13.host=corp2.bluetoad.com.au
worker.ajp13.type=ajp13
#
# Specifies the load balance factor when used with
# a load balancing worker.
# Note:
#  ----> lbfactor must be > 0
#  ----> Low lbfactor means less work done by the worker.
worker.ajp13.lbfactor=1

#
# Specify the size of the open connection cache.
#worker.ajp13.cachesize

#
#------ DEFAULT LOAD BALANCER WORKER DEFINITION ----------------------
#---------------------------------------------------------------------
#

#
# The loadbalancer (type lb) workers perform weighted round-robin
# load balancing with sticky sessions.
# Note:
#  ----> If a worker dies, the load balancer will check its state
#        once in a while. Until then all work is redirected to peer
#        workers.
worker.loadbalancer.type=lb
worker.loadbalancer.balanced_workers=ajp12, ajp13
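
The port given for the ajp13 worker has to agree with the port of the Ajp13Connector in server.xml. As a small sketch for reading a single value back out of the file, something like the following could be used. worker_prop is a hypothetical helper name, and printing the last definition when a property appears more than once is a choice made for this example, not a statement about how mod_jk resolves duplicates.

```shell
# worker_prop FILE WORKER PROPERTY
# Print the value of worker.WORKER.PROPERTY from FILE.
# Commented-out lines are skipped because they do not start with "worker.".
# If the property is defined more than once, the last definition is printed.
worker_prop() {
    sed -n "s/^worker\.$2\.$3=//p" "$1" | tail -n 1
}
```

For the file above, worker_prop $CATALINA_HOME/conf/jk/workers.properties ajp13 port prints 8009, which matches the connector defined earlier.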


Install Cocoon

Unbundle the cocoon tarball and copy cocoon.war to $CATALINA_HOME/webapps.

Restart Tomcat and Apache. You might find this script useful. I placed it in $CATALINA_HOME/bin.

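#!/bin/sh
# Restart Tomcat, wait for its JVM to exit, then restart Apache.
# Note: the check below matches any process with "java" in its listing.
BASEDIR=`dirname "$0"`
"$BASEDIR"/shutdown.sh
echo -n "Shutting down Tomcat ..."
# Test the ps/grep pipeline directly in the loop condition so that its
# exit status (not that of the echo above) decides whether to keep waiting.
while ps axlw | grep java | grep -v grep >/dev/null 2>&1
do
   echo -n .
   sleep 1
done

echo "Done."
echo -n "Starting up Tomcat ..."
"$BASEDIR"/startup.sh
echo "Done."
/etc/rc.d/init.d/httpd restart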

You should be able to access cocoon and get the welcome page

http://yourServerName/cocoon/

Configuring Cocoon

You really need to read the Cocoon Overview and concepts. There is also more cocoon documentation. I will outline some basics here.

Sitemap

The most important configuration file for Cocoon 2 is the sitemap ($CATALINA_HOME/webapps/cocoon/sitemap.xmap). It is what the name implies: a map of all of the files on your Cocoon site. When a request is passed to Cocoon by Apache and Tomcat, Cocoon matches the URL against a series of patterns in the sitemap. In the default setup, and as outlined here, when http://yourhost/cocoon/hello.html is accessed it is passed to Cocoon. Cocoon will then match "hello.html" against its patterns. If a match is found, the entry defines how the request is to be handled by Cocoon. Say there is the following entry in the sitemap.

   <map:match pattern="hello.html">
    <map:generate src="docs/samples/hello-page.xml"/>
    <map:transform src="stylesheets/page/simple-page2html.xsl"/>
    <map:serialize type="html"/>
   </map:match>

This example uses the default generator, the default transformer and the serializer of type "html". This means that Cocoon will take the XML file $CATALINA_HOME/webapps/cocoon/docs/samples/hello-page.xml as input and generate SAX events to pass on to the transformer. The transformer will use the XSLT stylesheet $CATALINA_HOME/webapps/cocoon/stylesheets/page/simple-page2html.xsl to transform the file. The resulting SAX events are passed as input to the serializer. The serializer, in this case, will generate HTML and send that back to the browser.

The output from each stage is passed as input to the next stage through this pipeline.
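
As a loose analogy only, the same shape can be seen in a Unix shell pipeline. This is not how Cocoon is implemented (Cocoon passes SAX events between stages, not text), and the three function names below are made up for the illustration, with sed standing in for an XSLT stylesheet.

```shell
# Stage 1, "generator": emit a tiny XML document.
generate() {
    echo '<page><title>Hello</title></page>'
}

# Stage 2, "transformer": rewrite the markup (sed stands in for a stylesheet).
transform() {
    sed -e 's|<page>|<html><body>|' -e 's|</page>|</body></html>|' \
        -e 's|<title>|<h1>|' -e 's|</title>|</h1>|'
}

# Stage 3, "serializer": hand the result to the client unchanged.
serialize() {
    cat
}

# Each stage reads the previous stage's output.
# Prints: <html><body><h1>Hello</h1></body></html>
generate | transform | serialize
```

Swapping the middle stage for a different one changes the output format without touching the other two, which is the property the sitemap relies on.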

The generators, transformers and serializers are all defined in the sitemap. Here are the definitions for those used in the above example.

  <map:generators default="file">

   <map:generator name="file"           logger="sitemap.generator.file"           label="content,data"
                  src="org.apache.cocoon.generation.FileGenerator"
                  pool-max="32" pool-min="16" pool-grow="4"/>

     ...

  <map:transformers default="xslt">

   <map:transformer name="xslt"            logger="sitemap.transformer.xslt"
                    src="org.apache.cocoon.transformation.TraxTransformer"
                    pool-max="32" pool-min="16" pool-grow="4">
    <use-request-parameters>false</use-request-parameters>
    <use-browser-capabilities-db>false</use-browser-capabilities-db>
    <use-deli>false</use-deli>
   </map:transformer>

      ...


  <map:serializers default="html">
         ...
   <map:serializer name="html"   mime-type="text/html"        logger="sitemap.serializer.html"
                   src="org.apache.cocoon.serialization.HTMLSerializer">
      <encoding>iso-8859-1</encoding>
   </map:serializer>

A Sample JSP Deployment

My goal was to have JSP pages which generated XML rather than HTML. The output of the JSP pages is transformed using an XSLT stylesheet. From this HTML is generated and sent back to the browser.

I added the jsp files in a new directory in $CATALINA_HOME/webapps/cocoon/docs/samples/.

Then I edited $CATALINA_HOME/webapps/cocoon/sitemap.xmap to include definitions for processing my JSPs.

My JSPs need to take request parameters and also need to use European characters. So I added a new generator definition.

...
<map:generator  name="jsp-with-params"         src="org.apache.cocoon.generation.JspGenerator">
    <use-request-parameters>true</use-request-parameters>
    <encoding>iso-8859-1</encoding>
</map:generator>

Then I added the definition for my new application.

  <map:match pattern="newcec/*">
    <map:generate type="jsp-with-params" src="/docs/samples/newcec/{1}.jsp"/>

    <map:select>
      <map:when test="wap">
        <map:transform src="stylesheets/wapcec.xsl"/>
      </map:when>
      <map:when test="lynx">
        <map:transform src="stylesheets/plaincec.xsl"/>
      </map:when>
      <map:when test="opera">
        <map:transform src="stylesheets/newcec.xsl"/>
      </map:when>
      <map:otherwise>
        <map:transform src="stylesheets/newcec.xsl"/>
      </map:otherwise>
    </map:select>
<!-- <map:transform src="stylesheets/newcec.xsl"/> -->
    <map:serialize type="html"/>
    <map:handle-errors>
      <map:transform src="stylesheets/system/error2html.xsl"/>
      <map:serialize status-code="500"/>
    </map:handle-errors>
   </map:match>

The interesting part here is that I can have different stylesheets for different browsers. For Lynx, I can have a plain text version.
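
To make the two decisions in that entry concrete, here is a rough shell imitation of them. It is a sketch for illustration and not Cocoon code: both function names are invented, and the second function assumes the browser name (wap, lynx, opera and so on) has already been worked out by the selector.

```shell
# jsp_for_request PATH
# Mimic the "newcec/*" match: print the JSP source for PATH, or fail if
# PATH does not match. As in the sitemap, "*" is treated as not spanning "/".
jsp_for_request() {
    case "$1" in
        newcec/*/*) return 1 ;;
        newcec/*)   echo "/docs/samples/newcec/${1#newcec/}.jsp" ;;
        *)          return 1 ;;
    esac
}

# stylesheet_for_browser NAME
# Mimic <map:select>: the first matching branch wins, anything else
# falls through to the default stylesheet.
stylesheet_for_browser() {
    case "$1" in
        wap)  echo "stylesheets/wapcec.xsl" ;;
        lynx) echo "stylesheets/plaincec.xsl" ;;
        *)    echo "stylesheets/newcec.xsl" ;;
    esac
}
```

For example, jsp_for_request newcec/index prints /docs/samples/newcec/index.jsp, and stylesheet_for_browser lynx prints stylesheets/plaincec.xsl.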

I also modified the HTMLSerializer definition so that European characters would work.

   <map:serializer name="html"   mime-type="text/html"        logger="sitemap.serializer.html"
                   src="org.apache.cocoon.serialization.HTMLSerializer">
      <encoding>iso-8859-1</encoding>
   </map:serializer>

MVC Architecture

If you would like to add additional servlets such as controller servlets to your cocoon context, you will need to edit the Cocoon deployment descriptor $CATALINA_HOME/webapps/cocoon/WEB-INF/web.xml. You will need to add the definitions for those servlets to that file before the Cocoon servlet definition. e.g.

...
   <servlet>
      <servlet-name>action</servlet-name>
      <servlet-class>AuthActionServlet</servlet-class>
      <init-param>
         <param-name>action-mappings</param-name>
         <param-value>actions</param-value>
      </init-param>
   </servlet>

   <servlet>
      <servlet-name>authenticate</servlet-name>
      <servlet-class>AuthenticateServlet</servlet-class>
   </servlet>
...
  <servlet>
    <servlet-name>Cocoon2</servlet-name>
    <display-name>Cocoon2</display-name>
    <description>The main Cocoon2 servlet</description>

Then you'll need to add the mappings from your URL patterns to the servlets.

   <servlet-mapping>
      <servlet-name>action</servlet-name>
      <url-pattern>*.do</url-pattern>
   </servlet-mapping>

   <servlet-mapping>
      <servlet-name>authenticate</servlet-name>
      <url-pattern>*.authenticate</url-pattern>
   </servlet-mapping>

  <!--
    Cocoon handles all the URL space assigned to the webapp using its
    sitemap.
...

A note on redirects

Because of the pipeline concept, you cannot put redirects in your JSP pages or in your controller servlets; that will break things. (It can be argued that redirects are bad design anyway.) You can use RequestDispatcher.forward() instead. The idea that the output from one stage is the input to the next stage must be kept intact. A quick perusal of the JspGenerator code will make this clear.

Where to get more information

The cocoon mailing list is a good resource. The volume is probably around 50 messages a day. In the interest of usual mailing list etiquette, please make a note of how you unsubscribe.

Before posting, please, please check the cocoon mailing list archive and FAQ first. Some questions (like using Tomcat 4.0.2) come up with monotonous regularity.