Caffeinated Bitstream

Bits, bytes, and words.

Taming Roller's URL strategy

When I decided to start this blog, I installed the Roller 4.0 weblog software. Many different blogs can run in one instance of Roller, and the URLs for the blogs are arranged as subdirectories of a master Roller URL. For instance, if you installed Roller to be /roller, then your blogs might have URLs like /roller/my_blog, /roller/potato_farming_in_pocatello, and /roller/i_like_lettuce. That's fine for many uses, but I prefer to have more concise URLs. I'd like my blog to be referenced from the root of the web site, like /my_blog. Why should I have to conform to how Roller thinks I should set up my web site?

Configuring a web server to remap incoming URLs is a simple matter that can be easily accomplished with tools like Apache's mod_rewrite. Such remapping techniques are well known, and I won't bother going into detail about them here. However, what about outgoing URLs? With the rewrite rules in place, users can easily access the blog at /my_blog, but all the links on the page still point back to the ugly URLs. It's a simple matter to redirect these in the web server so that they work, but it's nasty and wasteful to be constantly sending redirect messages, and besides... what would Googlebot think of such shenanigans?

I get stubborn about these sorts of things, so I decided to roll up my sleeves and see what was going on under the hood of Roller. It turns out that Roller provides a URLStrategy interface that can be swapped out programmatically with different implementations to provide different URL behaviors. Also, to my astonishment, Roller 4.0 is hooked together with Guice -- the lightweight dependency injection system developed at Google. I haven't worked with Guice, but I have used the Spring Framework's inversion of control library for dependency injection, so I know that these systems are built to make it easy to swap out components -- just what I'm looking for!

Unfortunately, there's no facility in Roller to configure the Guice dependency injection at runtime -- as far as I can tell, the Guice configuration is hard-wired in the code, rather than kept in an XML configuration file as is common in Spring applications. However, there is a runtime property to select the class that does the configuring -- the so-called Guice module. This means that to provide my own URLStrategy, I must supply at least two new classes: a custom replacement for Roller's JPAWebloggerModule class, which is used to configure Guice, and a replacement for the MultiWeblogURLStrategy class, which currently defines the URL behavior.

I start by configuring Roller to use my custom Guice module, by adding this line to my roller-custom.properties file:

guice.backend.module=com.davidsimmons.roller.CustomWebloggerModule

This property will configure Roller to use my CustomWebloggerModule class, which configures Guice with all the same bindings as Roller's own JPAWebloggerModule class does, except it binds my CustomURLStrategy class to the URLStrategy interface instead of the default MultiWeblogURLStrategy implementation. My CustomURLStrategy extends MultiWeblogURLStrategy, so it is almost the same, except that it knows about special weblogs that I want to be referenced from the root of the web site. For these weblogs, my custom class post-processes the URL to remove the first component of the URL path. Voilà! All my links now reflect the friendly version of the URL.
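
To give a rough idea of the post-processing step, here is a minimal, standalone sketch of "remove the first component of the URL path." It is not the code from my CustomURLStrategy, and it deliberately avoids Roller's classes: the RootUrlRewriter name, its methods, and the idea of passing in a set of weblog handles are all hypothetical, and it assumes it is handed a site-relative path like /roller/my_blog/entry/hello rather than a full URL.

```java
import java.util.Set;

// Hypothetical helper, not part of Roller: illustrates only the
// path post-processing described above.
final class RootUrlRewriter {

    // Handles of the weblogs that should appear at the site root (assumed input).
    private final Set<String> rootWeblogs;

    RootUrlRewriter(Set<String> rootWeblogs) {
        this.rootWeblogs = rootWeblogs;
    }

    // Removes the first path component, e.g.
    // "/roller/my_blog/entry/hello" becomes "/my_blog/entry/hello".
    // Paths with no second component (such as "/roller") are returned unchanged.
    static String stripFirstComponent(String path) {
        if (path == null || !path.startsWith("/")) {
            return path;
        }
        int secondSlash = path.indexOf('/', 1);
        return secondSlash < 0 ? path : path.substring(secondSlash);
    }

    // Rewrites the path only for weblogs that were registered as root-level.
    String rewrite(String path, String weblogHandle) {
        return rootWeblogs.contains(weblogHandle) ? stripFirstComponent(path) : path;
    }
}
```

In a real strategy class, something like rewrite() would be applied to the string returned by each of the superclass's URL-building methods, so that every link generated for a root-level weblog gets the same treatment while all other weblogs keep their default URLs.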

There is one gotcha -- when I reparented the blog, important cookies stopped working because they were tied to the /roller path. To solve this for the JSESSIONID cookie, I configured my Tomcat servlet container to use an empty session cookie path by adding emptySessionPath="true" to the attributes of the Connector element. There are a few more path-dependent cookies that Roller uses that may come back to bite me... we'll see.

The sample code is available here: simmons-customurl.tar.bz2