Sunday, May 23, 2010

Don't Host Crowd and Jira in the same Servlet Container

Figuring this out took up quite a bit of my Saturday, so I thought I'd add some pointers for other people to find.

Atlassian doesn't recommend hosting their applications in the same Tomcat instance; they encourage you to deploy each one in its own Tomcat instance and JVM through the "Standalone" distributions. However, recommendations never stopped me before, so at OpenGamma we have three different sets of Atlassian infrastructure:

  • One set of Confluence, Crowd and Jira for the FudgeMsg project running in one Tomcat container on one VM
  • One set of Crowd and Jira for our corporate use running in one Tomcat container on one VM
  • Bamboo and FishEye for OpenGamma corporate use running behind our firewall in their own standalone implementations in their own VMs.

Yesterday I tried to upgrade Crowd and Jira for the OpenGamma corporate installation. It wasn't pretty.

First of all, I ran into this KnowledgeBase issue, where Confluence, Crowd, and Jira all ship with different versions of the Felix jar for plugin management. This stopped the plugin system from starting for Jira (since Tomcat launches the Crowd application first), so Jira was pretty borked.

Then I ran into something far more pernicious, which other people should be aware of.

Reindexing Requires Crowd

If you're running your applications with Crowd, each application delegates user-related information to Crowd and loads that data through RESTful calls. So far, so good.

However, when Jira attempts an in-place upgrade of an installation (which, again, Atlassian doesn't recommend), it does its database wrangling and then reindexes the system for the new functionality (in our case, going from 4.0 to 4.1.1, which allows searches based on votes and watches on issues). It does all of this in the application startup logic, which runs in the Tomcat main thread.

When it gets to reindexing, it then attempts about 10 different RESTful calls to your Crowd instance, as it doesn't have any caches populated on user data. However, while Tomcat has opened up port 80 (or 8080 or whatever) for Crowd, it hasn't enabled application dispatch to the Crowd servlets yet.

This means that all 10 remote calls (which are actually local to the single Tomcat instance since both apps are co-hosted) hang, and the entire server startup process fails.
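To make the failure mode concrete, here is a minimal sketch of the same shape of problem using plain sockets rather than Tomcat itself. It assumes nothing about Tomcat's internals: the class name, the `/crowd/` path and the timeout are all made up for illustration. A port is bound and listening, so the connection succeeds, but nothing ever services it, so the caller's read blocks until its timeout fires. Without a timeout it would block indefinitely, which is roughly what the startup thread was doing.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class StartupHangDemo {

    /**
     * Opens a listening port that is never serviced, sends a request to it,
     * and reports whether the read timed out waiting for a response.
     */
    public static boolean requestTimesOut(int timeoutMillis) throws IOException {
        // The port is bound and listening, but nobody ever calls accept():
        // the OS completes the TCP handshake, yet no application code answers.
        try (ServerSocket listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort())) {

            // Without this, read() below would block forever.
            client.setSoTimeout(timeoutMillis);

            // Hypothetical request; the path is only for illustration.
            OutputStream out = client.getOutputStream();
            out.write("GET /crowd/ HTTP/1.0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            try {
                client.getInputStream().read();
                return false; // something answered (not expected here)
            } catch (SocketTimeoutException e) {
                return true;  // connected fine, but no response ever came
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + requestTimesOut(500));
    }
}
```

The point of the sketch is that "the port is open" and "something will answer" are separate conditions, and a caller that runs on the same thread that would eventually enable the responder can never get its answer.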

The only workaround is to launch Crowd in its own servlet container, change Jira's crowd.properties to point at the new instance, start Jira up so the upgrade can complete, and then undo what you've done.

The Moral Of The Story

When your software vendor recommends that you don't do something, don't do it unless you have an exceptional reason to do so.

If you're a software vendor and you support something but don't recommend it, still test it as there will be customers who ignore your recommendations and do it anyway.
