Tuesday, April 08, 2008

Interesting real-world Apache Problem

I'm working with a large client who has a number of web servers behind a load balancer. This morning Apache 1.3 had failed to come up on one of them. The client sends a SIGUSR1 to each Apache once an hour to force a graceful restart. This particular machine had run correctly, restarting Apache once per hour, for 54 hours (since a recent reboot of the machine), and then Apache died.

A quick look in the Apache error.log file showed the following:

module "mod_jk.c" could not be loaded, because the dynamic module limit was reached. Please increase DYNAMIC_MODULE_LIMIT and recompile.

Naturally I went looking for a problem with mod_jk, which was the wrong place to look. Scrolling through the log file, I noticed that every time Apache restarted we'd get the error:

Cannot remove module mod_include.c: not found in module list

This was where the real problem lay. A quick httpd -l showed that mod_include was compiled into the client's Apache, and a look at httpd.conf revealed that mod_include was also being loaded with LoadModule:

LoadModule includes_module modules/mod_include.so
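The check I did by hand can be sketched as a small Python script. This is only a rough sketch under a few assumptions: it takes the text printed by httpd -l and the text of httpd.conf as strings, and it guesses that a shared object called mod_foo.so corresponds to the compiled-in source name mod_foo.c. That naming rule is a heuristic of mine, not something Apache guarantees, and the function names are made up for illustration.

```python
import os
import re

# Matches active (uncommented) LoadModule directives:
#   LoadModule <module identifier> <path to shared object>
LOADMODULE_RE = re.compile(r"^\s*LoadModule\s+(\S+)\s+(\S+)", re.IGNORECASE)


def compiled_in_modules(httpd_l_output):
    """Collect the source names (e.g. 'mod_include.c') listed by `httpd -l`."""
    modules = set()
    for line in httpd_l_output.splitlines():
        name = line.strip()
        if name.endswith(".c"):
            modules.add(name)
    return modules


def loaded_modules(conf_text):
    """Map a guessed source name to the LoadModule line that loads it.

    Assumption: 'modules/mod_include.so' corresponds to 'mod_include.c'.
    """
    loaded = {}
    for line in conf_text.splitlines():
        match = LOADMODULE_RE.match(line)
        if match:
            shared_object = os.path.basename(match.group(2))
            stem = os.path.splitext(shared_object)[0]
            loaded[stem + ".c"] = line.strip()
    return loaded


def double_loaded(httpd_l_output, conf_text):
    """Return modules that are both compiled in and loaded with LoadModule."""
    static = compiled_in_modules(httpd_l_output)
    loaded = loaded_modules(conf_text)
    return sorted(name for name in loaded if name in static)
```

Feeding it the real httpd -l output and the contents of httpd.conf should flag mod_include.c in a setup like this one. Modules whose shared object does not follow the mod_foo.so pattern would slip past this heuristic.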

When a module is both statically linked into Apache and dynamically loaded, you run into a nasty problem: Apache doesn't complain when you start, but it will fail to unload the double-loaded module on each restart. So every SIGUSR1 used up one more slot of the DYNAMIC_MODULE_LIMIT. The default DYNAMIC_MODULE_LIMIT is 64, and with 10 real dynamic modules and a restart once per hour it took 54 hours (64 - 10 = 54) to consume every slot in the module limit.
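The arithmetic can be written out as a tiny Python sketch. It is a simplified model, assuming exactly one slot is leaked per restart and nothing else touches the module table; the function name and parameters are just for illustration.

```python
def restarts_until_full(limit, real_dynamic_modules, leaked_per_restart=1):
    """Number of restarts it takes for leaked slots to fill the module table.

    Simplified model: the real dynamic modules occupy their slots, and every
    restart permanently consumes `leaked_per_restart` additional slots.
    """
    free_slots = limit - real_dynamic_modules
    return free_slots // leaked_per_restart


# Numbers from this incident: limit of 64, 10 real dynamic modules,
# one graceful restart per hour.
hours = restarts_until_full(limit=64, real_dynamic_modules=10)
print(hours)  # 54
```

The same calculation shows why the failure can take a long time to surface: fewer real modules or a less frequent restart pushes the crash further out from the last clean start.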

Removing the erroneous LoadModule fixed the problem.
