Thursday, August 19, 2010

Solving Kevin Rose's email problem

Yesterday, Kevin Rose wrote about his email problem.

My stats:
938 unread work emails.
1002 unread personal emails.

The madness has to stop. What was once a 30 minute annoyance is now my full-time job.

And he followed up with suggestions on how to deal with this problem. I faced a similar problem about 10 years ago when I was receiving a ton of varied email that needed sorting. My solution was to create POPFile.

POPFile became popular as a spam filter, but what I made it for was to sort my flood of mail into arbitrary categories of my choosing, using the same technology (Naive Bayesian text classification) that became popular for spam filtering. It's POPFile that got me invited by Paul Graham to the MIT Spam Conference.
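For readers who haven't seen Naive Bayesian text classification applied to more than a spam/not-spam split, here is a minimal sketch in Python. It is not POPFile's code: the class name `NaiveBayesSorter`, the tokenizer, and the add-one smoothing are all assumptions chosen to keep the illustration short. The idea is simply to count words per category during training, then pick the category whose word counts make a new message most probable.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesSorter:
    """Toy multi-category Naive Bayes text classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.message_counts = Counter()          # category -> messages trained

    @staticmethod
    def tokenize(text):
        # Very crude tokenizer: lowercase runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, category, text):
        """Record the words of one message as belonging to `category`."""
        self.word_counts[category].update(self.tokenize(text))
        self.message_counts[category] += 1

    def classify(self, text):
        """Return the most probable category, or None if nothing was trained."""
        if not self.message_counts:
            return None
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_messages = sum(self.message_counts.values())
        words = self.tokenize(text)

        best_category, best_score = None, -math.inf
        for category, n_messages in self.message_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior: how much mail lands in this category overall.
            score = math.log(n_messages / total_messages)
            for word in words:
                # Add-one smoothing; the extra +1 reserves mass for unseen words.
                score += math.log((counts[word] + 1) / (total_words + len(vocabulary) + 1))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


sorter = NaiveBayesSorter()
sorter.train("work", "meeting agenda for the quarterly budget review")
sorter.train("work", "please review the project budget before the meeting")
sorter.train("personal", "dinner with family this weekend")
sorter.train("personal", "photos from the family holiday weekend")
sorter.train("spam", "cheap pills buy now limited offer")

print(sorter.classify("budget meeting moved to friday"))
print(sorter.classify("family dinner on sunday"))
```

Correcting a mistake is just another call to `train` with the right category, which is why the categories can be anything you like rather than only "spam" and "not spam".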

Here's what the POPFile site has to say:

POPFile is an automatic mail classification tool. Once properly set up and trained, it will scan all email as it arrives and classify it based on your training. You can give it a simple job, like separating out junk e-mail, or a complicated one, like filing mail into a dozen folders. Think of it as a personal assistant for your inbox.

I've been using it for years. Most recently I have it connect to my GMail account using IMAP and automatically label my messages. By changing labels I can teach POPFile when it makes mistakes. I've toyed with the idea of making this a paid service (especially since I have a really fast version of POPFile written in C that I sell to people), but I think the market's too small.

Just how many people have a POPFile-sized problem?


martijn said...

I guess something like POPFile would work well for large helpdesks that get 100s of emails in every day. Pre-sorting them and sending them to the right people could save them quite a bit of time.

Anonymous said...

Actually, I had a horrible accident setting up a disconnected-IMAP client with my GMail and lost all the categorisation I'd set up over the years. One of these days I keep meaning to re-process it all through a lot of filters, but a Bayesian classifier would be nice.

pqs said...

Please, create this service. POPFile is not good for me, as I use three computers and a mobile phone to read my email and I don't want to have one of them running all the time just to classify my mail. I guess more people than you think need this service.