But there's a real problem with most FOSS: for something that prides itself on the source being readable by everyone, and even comes up with 'laws' like 'given enough eyeballs, all bugs are shallow', the actual source code of most FOSS is horrible, unreadable garbage. Actually, I wonder if 'Linus' Law' shouldn't really be something like 'Linus' Necessity': given that the source is so horrible, we need lots of people so that one of them will be able to figure out what the hell it was we wrote.
When I started my well-known open source project I decided that I'd better make the code readable for two reasons: firstly, I was sure that I wouldn't get to work on it often, so I'd have to come back and read old code, and comments and coding standards would make that easier; secondly, I was sure that other people were going to read my code.
The second thing turned out to be really important for two reasons. Firstly, other people were able to read my code and contribute, and I kept them to a similar coding standard and style, and hence the code is (reasonably, I'm not claiming I'm perfect) readable. Secondly, and perhaps more importantly, one day I was being interviewed for a job and the interviewer said: "Yes, we've all read your code". They'd downloaded my project and checked me out. (I got the job.)
Now, I'm not trying to slam all FOSS here and for the purposes of this entry I have not examined some of the most famous projects (e.g. Linux kernel, Apache, Firefox, ...), but I decided to take a look at the top 50 most downloaded projects of all time on SourceForge.
For each project I picked two source files at random (each source file had to be fairly large, i.e. more than 100 lines of code) and scored them, being as generous as possible, in the following categories. I weighted the scores heavily towards doing simple things that have a high benefit (for example, describing the purpose of a file or function):
- File Description (FD): did the file I opened have some sort of description (near the top) of what the file was for? I wasn't asking for a detailed explanation, just a little helper so that a new reader could get going. Score: +5 (if present), -5 (if not)
- Function/Interface Description (FID): did any of the functions, or interfaces, in the file have a description? I would have liked to have seen all the arguments specified and the return codes and caveats explained, but I was extremely generous: even if just one function had a little header with a minimal description, the file got into this category. Score: +5 (if present), -5 (if not)
- Useful Comments (UC): did the file contain at least one useful comment? A useful comment points out something that isn't obvious to the reader, or some trap for the unwary. Score: +1 (if present), -1 (if not)
- Stupid Comments (SC): did the file contain at least one stupid comment like 'increment i' or 'loop through records'? Score: -1 (if present), +1 (if not)
- Understandable (U): did I feel like I would be able to understand most of the code given 30 minutes of reading the file and browsing the rest of the source? This was very subjective, but it was used to take into account things like clearly named functions or really well named member variables. Score: +5 (if understandable), -5 (if not)
- Commented out code (COC): people, we have source code control systems. Don't // out your code, or #if 0 it. OK? Score: -1 (if present), +1 (if not)
- Bonus (B): I had a special bonus category which I could hand out if I felt like it. A positive score here was for particularly well documented and well written code, neutral for most code, and negative for really hideous stuff. Score: +10 (loved it), -10 (yuck), 0 (in general)
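
To make the arithmetic concrete, here's a minimal Python sketch of how one review could be totalled using the weights listed above. The names (`FileReview`, `WEIGHTS`, `score`) are made up for illustration, and the sketch assumes each category is a simple yes/no answer plus the bonus. It scores a single review only; it doesn't say anything about how the two files per project were combined.

```python
from dataclasses import dataclass

# (points if yes, points if no), copied from the category list above.
# Note that the two "bad habit" categories score +1 when ABSENT.
WEIGHTS = {
    "file_description": (5, -5),       # FD
    "function_description": (5, -5),   # FID
    "useful_comments": (1, -1),        # UC
    "stupid_comments": (-1, 1),        # SC
    "understandable": (5, -5),         # U
    "commented_out_code": (-1, 1),     # COC
}


@dataclass
class FileReview:
    """Hypothetical record of one review (yes/no per category)."""
    file_description: bool
    function_description: bool
    useful_comments: bool
    stupid_comments: bool
    understandable: bool
    commented_out_code: bool
    bonus: int = 0  # B: +10 (loved it), -10 (yuck), 0 (in general)


def score(review: FileReview) -> int:
    """Add up the category weights and the bonus for one review."""
    if review.bonus not in (-10, 0, 10):
        raise ValueError("bonus must be -10, 0 or +10")
    total = review.bonus
    for name, (if_yes, if_no) in WEIGHTS.items():
        total += if_yes if getattr(review, name) else if_no
    return total


# Everything done right, plus the bonus: 5 + 5 + 1 + 1 + 5 + 1 + 10
print(score(FileReview(True, True, True, False, True, False, bonus=10)))  # 28
```

Under these assumptions the possible range runs from -28 to +28, and simply describing the file and one function is already worth 20 points compared with describing neither.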
What I found was not a pretty picture:
- 65% don't bother describing, even in the most minimal way, a single one of the functions I saw
- 60% of the projects don't bother with describing the purpose of a file
- 59% of the projects scored negatively using my system
- 53% contained useless comments
- 40% looked incomprehensible to me without major effort
- 33% contained commented out or #if 0 code
The best projects were (in order of score): GNUWin32 (thanks GNU Project!), GTK+ and The GIMP installers for Windows, NASA World Wind, Ghostscript, WINE, Miranda, MinGW (thanks GNU Project!), Erases, and DC++.
Come on, FOSS people. Have some pride in your work! Remember, writing some decent comments is a gift you are giving to the people who read your code, and to yourself.
(Note that if you are the author of one of the projects above it's possible that I made a mistake and just happened to pick the wrong files to read. Send me examples of how great your code is and I'll publish a rebuttal here.)
Here's a table with all the data:
| Project | FD | FID | UC | SC | U | COC | B | Score | Rank |
|---|---|---|---|---|---|---|---|---|---|
| GTK+ and The GIMP installers for Windows | 1 | 1 | 1 | -1 | 1 | -1 | 1 | 28 | 10 |
| NASA World Wind | 1 | 1 | 1 | -1 | 1 | -1 | 0 | 18 | 20 |
| ABC [Yet Another Bittorrent Client] | -1 | 1 | 1 | 1 | 1 | -1 | 0 | 6 | 21 |
| MinGW - Minimalist GNU for Windows | 1 | 1 | 1 | 1 | 1 | -1 | 0 | 16 | 33 |
| The CvsGui project | 1 | 1 | -1 | 1 | 1 | -1 | 0 | 14 | 36 |
| DOSBox DOS Emulator | -1 | -1 | 1 | 1 | 1 | -1 | 0 | -4 | 41 |
| XOOPS Dynamic Web CMS | -1 | -1 | 1 | -1 | 1 | -1 | 0 | -2 | 46 |
| Wine Is Not an Emulator | 1 | 1 | 1 | 1 | 1 | -1 | 0 | 16 | 49 |