Tuesday, April 22, 2008

Bookmark-based registration

Recently, I've been learning Ruby on Rails and I can never learn anything unless I build something with it. I also recently read Programming Collective Intelligence and had a desire to use some of those algorithms too.

I'll post more about the actual web site I created another time; currently it's in alpha form running on Heroku. The web site is for naming babies. The initial alpha release is the girls'-names-only site EmilyOrEmma?, which uses "Hot or Not" style voting and manages to incorporate in one page the Levenshtein distance, the Metaphone algorithm, and both item-based and user-based filtering.

But my biggest bugbear with baby naming web sites is the need to create an account. You can browse baby names all you want, but as soon as you want to do something like add a name to a list of favourites you are forced into registration. Or you don't have to register, but anything you do is ephemeral.

For my site, I came up with a better solution: bookmark-based registration. If you visit the site you'll see a link to bookmark at the bottom of every page. This link is unique to you. It contains your user id and a hash, which I use to prevent forgery.

Bookmark this link and you can return to your recommendations, saved names, etc. any time.
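The post doesn't say exactly how the hash is computed, so what follows is only a minimal sketch of one common way to build such a link: an HMAC of the user id under a server-side secret. The secret, the URL layout and the function names (`sign_user_id`, `bookmark_url`, `verify_bookmark`) are all assumptions made for illustration, not the site's actual code.

```python
import hashlib
import hmac

# Hypothetical server-side secret (an assumption for this sketch); a real
# site would load it from configuration rather than hard-coding it.
SECRET_KEY = b"replace-with-a-long-random-secret"


def sign_user_id(user_id):
    """Return a hex HMAC-SHA256 tag computed over the user id."""
    message = str(user_id).encode("ascii")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def bookmark_url(user_id, base="http://example.com/u"):
    """Build a per-user bookmark link; the URL layout is illustrative only."""
    return "{}/{}/{}".format(base, user_id, sign_user_id(user_id))


def verify_bookmark(user_id, tag):
    """Accept the link only if the tag matches the one we would have issued."""
    expected = sign_user_id(user_id).encode("ascii")
    # compare_digest does a constant-time comparison to avoid timing leaks.
    return hmac.compare_digest(expected, tag.encode("utf-8"))


if __name__ == "__main__":
    print(bookmark_url(42))
```

Because the tag depends on a secret only the server knows, changing the user id in the link without knowing the secret produces a tag that fails verification, which is what stops one visitor from guessing another visitor's bookmark.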

Tuesday, April 08, 2008

Interesting real-world Apache Problem

I'm working with a large client who has a number of web servers behind a load balancer. This morning Apache 1.3 had failed to come up on one of them. The client sends a SIGUSR1 to each Apache once an hour to force a graceful restart. This particular machine had operated correctly, restarting Apache once per hour for 54 hours (since a recent reboot of the machine), and then died.

A quick look in the Apache error.log file showed the following:

module "mod_jk.c" could not be loaded, because the dynamic module limit was reached. Please increase DYNAMIC_MODULE_LIMIT and recompile.

Naturally I went looking for a problem with mod_jk, which was the wrong place to look. Scrolling through the log file, I noticed that every time Apache restarted we'd get the error:

Cannot remove module mod_include.c: not found in module list

This was where the real problem lay. A quick httpd -l showed that mod_include was compiled into the client's Apache, and looking in the httpd.conf revealed that mod_include was also being loaded with LoadModule:

LoadModule includes_module modules/mod_include.so

When a module is both statically linked into Apache and dynamically loaded, you run into a nasty problem: Apache doesn't complain when you start, but it will fail to unload the doubly loaded module on exit. So every SIGUSR1 used up a single slot of the DYNAMIC_MODULE_LIMIT. The default DYNAMIC_MODULE_LIMIT is 64, and with 10 real dynamic modules and a restart once per hour it took 54 hours to consume every slot in the module limit.
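The arithmetic behind the 54 hours can be checked with a few lines of Python. This is a deliberately simplified model, assuming one slot leaks per restart and the real dynamic modules occupy a fixed 10 slots; Apache's internal bookkeeping may differ in detail, and the function name is made up for this sketch.

```python
DYNAMIC_MODULE_LIMIT = 64   # Apache 1.3 default, as stated in the post
REAL_DYNAMIC_MODULES = 10   # modules legitimately loaded via LoadModule


def hours_until_exhausted(limit, real_modules,
                          leaked_per_restart=1, restart_interval_hours=1):
    """Simulate hourly restarts until no free module slot remains.

    Simplified model (an assumption): every restart leaves
    `leaked_per_restart` slots permanently occupied.
    """
    used = real_modules
    hours = 0
    while used < limit:
        used += leaked_per_restart
        hours += restart_interval_hours
    return hours


if __name__ == "__main__":
    print(hours_until_exhausted(DYNAMIC_MODULE_LIMIT, REAL_DYNAMIC_MODULES))
```

With the defaults this gives 64 - 10 = 54 restarts, which matches the 54 hours the machine survived after its reboot.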

Removing the erroneous LoadModule fixed the problem.
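A check like this could be automated. The sketch below compares the text printed by httpd -l with the LoadModule lines in httpd.conf and reports modules that appear in both. It relies on a naming heuristic that is only an assumption (that modules/mod_include.so corresponds to mod_include.c), which will not hold for every module, and the sample inputs are invented for illustration.

```python
import re
from pathlib import PurePosixPath

# Matches uncommented directives such as:
#   LoadModule includes_module modules/mod_include.so
LOADMODULE_RE = re.compile(r"^[ \t]*LoadModule[ \t]+(\S+)[ \t]+(\S+)",
                           re.IGNORECASE | re.MULTILINE)


def compiled_in_modules(httpd_l_output):
    """Collect names like 'mod_include.c' from the output of `httpd -l`."""
    names = set()
    for line in httpd_l_output.splitlines():
        token = line.strip()
        if token.endswith(".c") and " " not in token:
            names.add(token)
    return names


def loaded_modules(conf_text):
    """Map a guessed source name to the path given in each LoadModule line.

    Heuristic (an assumption): 'modules/mod_foo.so' maps to 'mod_foo.c'.
    """
    guessed = {}
    for _identifier, path in LOADMODULE_RE.findall(conf_text):
        guessed[PurePosixPath(path).stem + ".c"] = path
    return guessed


def doubly_loaded(httpd_l_output, conf_text):
    """Return modules that are both compiled in and loaded dynamically."""
    static = compiled_in_modules(httpd_l_output)
    dynamic = loaded_modules(conf_text)
    return sorted(static & set(dynamic))


if __name__ == "__main__":
    # Invented sample inputs, shaped like the real ones.
    sample_l = "Compiled-in modules:\n  http_core.c\n  mod_so.c\n  mod_include.c\n"
    sample_conf = ("LoadModule includes_module modules/mod_include.so\n"
                   "LoadModule jk_module modules/mod_jk.so\n")
    print(doubly_loaded(sample_l, sample_conf))
```

In practice the two inputs would come from running httpd -l and reading the server's httpd.conf; any module the function reports is a candidate for having its LoadModule line removed.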

Friday, April 04, 2008

Digg 3 Million

A quick update on my previous estimate of Digg users shows that, as predicted, Digg passed the 3 million user mark during March, 2008.

Comparing this estimated data and data from the Digg API shows that around 20% of the 3m accounts are not active. I speculated before that these accounts had been banned for spamming or other activities (that's around 600,000 bad accounts).

Growth appears to be the same as before, adding around 150,000 accounts per month.

Thursday, April 03, 2008


It's 3am and there's a crisis somewhere in the world

If you follow US politics then you'll know that Hillary Clinton has a couple of ads that start with a 3am phone call to the White House. The first ad was intended as a slam against Barack Obama, implying that he didn't have the experience to deal with such a crisis. The second goes up against John McCain, claiming he doesn't want to do anything about the housing finance problem in the US.

You know if it's 3am and there's a crisis in the world there's only one place and only one man to call.

CTU and Jack Bauer.

First of all, he's already up. 3am is nothing to Jack. Hillary and McCain both look like they could use the sleep and Obama looks like he gets his beauty sleep every night. So, Jack's ready to go before any of them.

As much as you might think Obama is David Palmer (safe pair of hands), he's more like a Wayne Palmer (a slick little fighter) and you know what that means: gets blown up within five minutes of being president and then pops a brain vein and the evil VP has to take over. The only good thing to say about Obama is that he would call Jack.

Now, McCain might look like a Bauer type with his military background and heroic time spent as a PoW. But here's the difference. McCain spent 5 years as a PoW and no one came to get him out, so he can't be that valuable. Also, if Jack had been held hostage in North Vietnam for 5 years there wouldn't be a North Vietnam now. Because you can bet the life of the next random CTU cast member that Jack would have (a) escaped and (b) annihilated everyone involved.

So, that leaves Hillary. She can't tell sniper fire from a little girl with flowers. In fact she reminds me more and more of the evil Vice Presidents that pop up in every 24 trying to take power from the real president. Hell, she's probably even got backing from Phillip Bauer.

Only one word of warning: make sure it's a man that calls Jack. If it's a woman he's bound to have been involved with her, she'll turn out to be a double agent or her father will be evil, and Jack'll be distracted.

If it has to be a woman, make sure it's Chloe.