Thursday, April 19, 2007

UseTheSource (re-)launches: social code snippet web site

Almost 8 years ago I registered the domain name (as in Use the Source, Luke!) and for a long time ran it as a blog (back when people were first starting to say the word 'weblog'). It even got me an appearance with Leo Laporte on The Screen Savers (if you are old enough to remember that show :-)

Well, UseTheSource is back, though not as a blog. It's now a Reddit/Digg-style social news web site with a twist: it's only intended to accept links to pieces of code. It's unashamedly for programmers who want to see working code.

So if you see a cool hack, a great explanation of an API, or a neat algorithm, think about submitting it to UseTheSource. You can also submit articles that are relevant to the site, but bear in mind that I'm trying to encourage people to submit things with working code!

The code snippets are categorized by language (I'll add languages on demand), and the site supports Digg-style voting plus user-defined tags. To get things going I've submitted a bunch of suitable items (most from my own writing).

Of course, it's very much in a 'beta' state and I'll be happy to receive feedback on bugs you encounter or suggestions for improvements.

Tuesday, April 17, 2007

No newsletter for April 15, 2007

This blog post is really for people who read my spam and anti-spam newsletter. There won't be one this week (it was due on April 15) because I wanted to spend time in the newsletter reviewing the MIT Spam Conference 2007.

Unfortunately, the videos of the presentations on the web site are either of very poor quality or have no sound at all. This makes reviewing the presentations very hard. Bill Yerazunis has promised better videos 'coming soon', and I'm waiting for them.

While I'm complaining, I also find the 'download an ISO' approach to getting the papers ridiculous. I've asked Bill to provide individual links to each of the papers and presentations so that people can just click and download what they want. He says...

We *could* have individual links. I assume the
individual authors would do so in any case.

And I'll probably do that later.

But right now, I'm sorta trying to get people to
at least glance at all the papers, and put the CDROMs into
their local libraries; that's the motivation.

So, my next newsletter will be out on April 30 with a review of the MIT Spam Conference and other regular news.

Monday, April 16, 2007

Debugging: Solaris bus error caused by taking pointer to structure member

Here's a sample program that fails horribly when compiled on Solaris using gcc (I haven't tried other compilers, and I'm not pointing my finger at gcc here; this is a Sun gotcha).

It's simplified from something much more complex that I was debugging, and it illustrates how memory alignment on SPARC systems can bite you if you are doing low-level things in C. The program allocates space for a thing structure that is prefixed with a header. The header structure ends with a dummy byte array called data, which is used to reference the start of the thing.
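#include <stdlib.h>

struct thing {
  int an_int;
};

struct header {
  short id;
  char data[0];  /* zero-length marker: the thing starts here */
};

/* Allocate a header followed by size bytes for the thing. */
struct header * maker( int size ) {
  return (struct header *)malloc( sizeof( struct header ) + size );
}

int main( void ) {
  struct header * a_headered_thing = maker( sizeof( struct thing ) );

  /* Treat the bytes that follow the header as a struct thing. */
  struct thing * a_thing = (struct thing *)&(a_headered_thing->data[0]);

  a_thing->an_int  = 42;  /* this store is what fails on SPARC */
}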

If you build this on a SPARC machine you'll get the following error when you run it:
Bus Error (core dumped)

Annoyingly, if you build a debugging version of this program the problem magically goes away and it doesn't dump core in the debugger. So you either resort to printf-style debugging or go into gdb and look at the assembly output.

Here's what happens when you run this in gdb (non-debug code):
(gdb) run

Program received signal SIGSEGV, Segmentation fault.
0x000106d8 in main ()

Since you can't get back to the source we're forced to do a little disassembly:
(gdb) disassemble
Dump of assembler code for function main:
0x000106b0: save %sp, -120, %sp
0x000106b4: mov 4, %o0
0x000106b8: call 0x10688
0x000106bc: nop
0x000106c0: st %o0, [ %fp + -20 ]
0x000106c4: ld [ %fp + -20 ], %o0
0x000106c8: add %o0, 2, %o0
0x000106cc: st %o0, [ %fp + -24 ]
0x000106d0: ld [ %fp + -24 ], %o1
0x000106d4: mov 0x2a, %o0
0x000106d8: st %o0, [ %o1 ]
0x000106dc: mov %o0, %i0
0x000106e0: nop
0x000106e4: ret
0x000106e8: restore
0x000106ec: retl
0x000106f0: add %o7, %l7, %l7
End of assembler dump.

The offending instruction is the st %o0, [ %o1 ] at 0x000106d8, the address gdb reported when the program stopped. From the code you can clearly see that the o0 register contains the value 0x2a (which is, of course, 42), so we are looking at the code for the line a_thing->an_int = 42;. The st instruction is about to write the 42 into the an_int field of thing, and the address of an_int is held in o1.

Asking gdb for o1's value shows us:
(gdb) info registers o1
o1             0x2094a  133450

An int is 4 bytes and you can easily see that the address of an_int stored in o1 is not 4 byte aligned (133450 mod 4 = 2, or just stare at the bottom nybble). The SPARC architecture insists that data accesses be correctly aligned for the size of the access. In this case we need 4 byte alignment (note that malloc will make sure that things are correctly aligned and the compiler will pack structures to the correct alignment while minimizing space).
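The arithmetic above can be sketched in a few lines of C. This is only an illustration: the helper name misalignment is made up here, and it assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes addr lies past the previous
   alignment-byte boundary. 0 means the address is correctly aligned.
   Assumes alignment is a non-zero power of two. */
static unsigned misalignment( uintptr_t addr, unsigned alignment ) {
  assert( alignment != 0 && ( alignment & ( alignment - 1 ) ) == 0 );
  return (unsigned)( addr & (uintptr_t)( alignment - 1 ) );
}
```

For the value gdb showed in o1, misalignment( 0x2094a, 4 ) is 2, which is the offset of data inside the header.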

In this case, the code fails because the data member is byte aligned (since we declared it as a character array), but then we take a pointer to it and treat it as a structure with an integer member. Oops. Bus error.

(Note you could have discovered this with printf and %p to get the pointer values without going into the debugger and poking around in the assembly code).
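A minimal sketch of that printf approach, assuming the same thing and header structures as the sample program. The function name report_alignment is invented for this example; it prints the pointers with %p and returns how far the thing sits from an int-sized boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct thing {
  int an_int;
};

struct header {
  short id;
  char data[0];
};

/* Hypothetical helper: print where the header and its data member live,
   and return the data address modulo sizeof(int) (0 means an int store
   through that address is safely aligned). */
static unsigned report_alignment( const struct header * h ) {
  assert( h != NULL );
  const void * thing_addr = &h->data[0];
  unsigned off = (unsigned)( (uintptr_t)thing_addr % sizeof( int ) );

  printf( "header at %p\n", (const void *)h );
  printf( "data   at %p (offset %lu in header)\n", thing_addr,
          (unsigned long)offsetof( struct header, data ) );
  printf( "data address mod %lu = %u\n",
          (unsigned long)sizeof( int ), off );
  return off;
}
```

Called on a pointer allocated the way maker does it, a non-zero return value flags the problem without any trip into the disassembly.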

There are a couple of ways to fix it. The first is to pad the header structure so that data is correctly aligned: adding 2 bytes of padding in the form of a short will make the problem go away:
struct header {
  short id;
  short padding;  /* pushes data to offset 4 */
  char data[0];
};

That's ugly and requires careful commenting and could be a maintenance problem if maker is used to make things requiring a different alignment, or the header structure is modified.
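One way to reduce that maintenance risk is to let the compiler check the layout. The sketch below assumes a C11 compiler (static_assert from <assert.h> and _Alignof are C11 features that the original program does not use) and checks the padded header against struct thing only.

```c
#include <assert.h>
#include <stddef.h>

struct thing {
  int an_int;
};

struct header {
  short id;
  short padding;  /* keeps data on a 4 byte boundary */
  char data[0];
};

/* Compilation fails if someone changes header so that data is no longer
   aligned well enough to hold a struct thing. */
static_assert( offsetof( struct header, data ) % _Alignof( struct thing ) == 0,
               "header.data is misaligned for struct thing" );
```

If maker is later used for things with a stricter alignment, each such type would need its own check.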

It's slightly cleaner to not have padding but change the type of data to something like the alignment you want:
struct header {
  short id;
  int data[0];  /* int element type forces 4 byte alignment */
};

(Or even double data[0] to get 8 byte alignment). With gcc you could even make this really clear by using the aligned attribute to create a special type:
typedef char aligned_data __attribute__ ((aligned (8)));

struct header {
  short id;
  aligned_data data[0];
};

I think that's the clearest option of all. With a little documentation around this it should be maintainable.
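To compare the options, here is a small sketch that prints where data ends up in each variant. The struct names are invented for the comparison. One substitution to note: some newer gcc releases reject an array whose element type is aligned more strictly than its size, so this sketch puts the aligned attribute on the member itself rather than on a char typedef.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names, one struct per variant discussed above. */
struct header_original { short id; char data[0]; };
struct header_padded   { short id; short padding; char data[0]; };
struct header_int      { short id; int data[0]; };
struct header_aligned  { short id; char data[0] __attribute__ ((aligned (8))); };

/* The attribute guarantees this, whatever the platform's int size. */
static_assert( offsetof( struct header_aligned, data ) % 8 == 0,
               "aligned member should sit on an 8 byte boundary" );

/* Print the offset of data within each header variant. */
static void print_data_offsets( void ) {
  printf( "char data[0]         : %lu\n",
          (unsigned long)offsetof( struct header_original, data ) );
  printf( "short padding + char : %lu\n",
          (unsigned long)offsetof( struct header_padded, data ) );
  printf( "int data[0]          : %lu\n",
          (unsigned long)offsetof( struct header_int, data ) );
  printf( "aligned (8) member   : %lu\n",
          (unsigned long)offsetof( struct header_aligned, data ) );
}
```

Only the first variant leaves data at an offset that is not a multiple of 4 on a platform with 2 byte shorts and 4 byte ints.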

Friday, April 06, 2007

Ego Blogging: My Portrait

I was looking for a portrait to place on my web site as part of the overhaul and after having unsuccessfully worked with various providers at Elance I finally asked Leo Laporte who had done his picture.

Turns out it's Snaggy at GeekCulture so I put a small number of $ on the table and received the following:

The first person to see this said, "It makes you look about 20!". Even if I do look a little younger than I am, it's a very good likeness. Now, I just need to find a way to insert it into my home page.