Thursday, September 16, 2010

What happened to the Open Graph Protocol?

Back in April, Facebook announced a new 'protocol' (actually metadata markup for web pages) called the Open Graph Protocol. The 'protocol' was yet another set of metadata for marking up web pages (akin to RDFa, microformats and HTML5 microdata) to give them some machine-readable semantic information. (BTW, how long before the term 'semantic web' goes the way of 'artificial intelligence'?)

In its initial incarnation the Open Graph Protocol had one major use: enabling the sprinkling of Facebook Like buttons all over the web. And the metadata proposed in the 'protocol' was very Facebook-specific. For example, a person could be identified as one of 'actor, athlete, author, director, musician, politician, or public_figure'. So, no butcher, baker or candlestick maker.
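To make the markup concrete, here is a minimal sketch in Python of reading Open Graph style meta tags from a page and checking the declared type against the person types listed above. The sample page, the 'baker' type, and the names OpenGraphParser and parse_open_graph are made up for illustration; only the og: property prefix and the seven person types come from the protocol as described in this post.

```python
from html.parser import HTMLParser

# Person types quoted in the post; anything else is not a recognised person.
PERSON_TYPES = {
    "actor", "athlete", "author", "director",
    "musician", "politician", "public_figure",
}


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            self.properties[prop] = content


def parse_open_graph(html):
    """Return a dict of og: properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Hypothetical page for a baker, a type the protocol has no slot for.
SAMPLE_PAGE = """
<html><head>
<title>Example Baker</title>
<meta property="og:title" content="Example Baker" />
<meta property="og:type" content="baker" />
<meta property="og:url" content="http://example.com/baker" />
</head><body></body></html>
"""

props = parse_open_graph(SAMPLE_PAGE)
is_known_person_type = props.get("og:type") in PERSON_TYPES
```

Reading the tags takes a couple of dozen lines; deciding which values og:type should be allowed to hold is the part no parser can help with.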

And here's where every single metadata/semantic web system runs into trouble: how do you define a schema for the meaning of things? How do you decide what's meaningful and what the right taxonomy or ontology is? Whilst Open Graph Protocol proposed a simple, easy-to-implement markup syntax for web pages, it didn't attack the real problem: what is the metadata itself? In fact, any fool can come up with a markup language; that's the easy part.

One really well-developed metadata standard is the Dublin Core (which Open Graph Protocol is 'inspired by'). It's been worked on since 1994. 16 years into Dublin Core there's a workable set of metadata for describing things like the title of a page, or the author of a book. The lesson is that if you mess with metadata you're in for a long haul: getting it right is hard.

Another metadata standard is the Dewey Decimal System, which tries to give a numerical meaning to any book. Here are the subcategories under '200 Religion':

210 Philosophy & theory of religion
220 The Bible
230 Christianity & Christian theology
240 Christian practice & observance
250 Christian pastoral practice & religious orders
260 Christian organization, social work & worship
270 History of Christianity
280 Christian denominations
290 Other religions

This is precisely the problem that Open Graph Protocol has today: its metadata is heavily influenced by the thinking of its creators. In the Dewey Decimal System it's clear that Christianity was important and everything else was an "Other religion". For Facebook, metadata that matters to Facebook is what's important.
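The skew in the list above can be tallied with a few lines of Python. This is only a rough sketch: the DEWEY_200 dictionary simply copies the nine subcategories quoted in this post, and the keyword test (a title mentioning 'Christian' or 'Bible') is a crude assumption of mine, not part of the Dewey system.

```python
# The nine subdivisions of '200 Religion' as listed in the post.
DEWEY_200 = {
    210: "Philosophy & theory of religion",
    220: "The Bible",
    230: "Christianity & Christian theology",
    240: "Christian practice & observance",
    250: "Christian pastoral practice & religious orders",
    260: "Christian organization, social work & worship",
    270: "History of Christianity",
    280: "Christian denominations",
    290: "Other religions",
}

# Crude keyword test (an assumption for illustration only).
christian_focused = [
    code for code, title in DEWEY_200.items()
    if "Christian" in title or "Bible" in title
]

share = len(christian_focused) / len(DEWEY_200)
print(f"{len(christian_focused)} of {len(DEWEY_200)} subdivisions ({share:.0%})")
```

By this count seven of the nine slots go to one religion, and every other religion shares a single one.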

Unfortunately, a read of the Open Graph Protocol mailing list (which contains just 390 messages in total) shows that the problem of defining good metadata isn't being attacked. And the mailing list's traffic is faltering.

Admittedly, halfway through September there have been 13 messages (more than the 8 in August).

My take is that Open Graph Protocol hasn't been widely adopted because its underlying real goal is not better semantic data for web sites, but better semantic data for Facebook.

