Thursday, September 16, 2010

What happened to the Open Graph Protocol?

Back in April, Facebook announced a new 'protocol' called the Open Graph Protocol. Despite the name, it is really yet another set of metadata for marking up web pages (akin to RDFa, microformats and HTML5 microdata) to give them some machine-readable semantic information (BTW, how long before the term 'semantic web' goes the way of 'artificial intelligence'?).

In its initial incarnation the Open Graph Protocol had one major use: enabling the sprinkling of Facebook Like buttons all over the web. And the metadata proposed in the 'protocol' was very Facebook-specific. For example, a person could be identified as one of 'actor, athlete, author, director, musician, politician, or public_figure'. So, no butcher, baker or candlestick maker.
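To make the limitation concrete, here is a minimal Python sketch of what reading this markup might look like. It assumes the `<meta property="og:..." content="...">` form used by the protocol; the sample page, the `baker` type, and the helper names (`parse_open_graph`, `is_known_person_type`) are hypothetical and only there for illustration. The set of person types is the one quoted above.

```python
from html.parser import HTMLParser

# Person types quoted in the post; anything outside this set is "unknown" here.
PERSON_TYPES = {
    "actor", "athlete", "author", "director",
    "musician", "politician", "public_figure",
}


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:") and attr.get("content") is not None:
            self.properties[prop] = attr["content"]


def parse_open_graph(html):
    """Return a dict of og:* properties found in the HTML (hypothetical helper)."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


def is_known_person_type(og_type):
    """True only if the type is one of the person types listed in the post."""
    return og_type in PERSON_TYPES


# Hypothetical page for a baker, a profession the type list has no slot for.
SAMPLE_PAGE = """
<html>
<head>
<title>Jane the Baker</title>
<meta property="og:title" content="Jane the Baker" />
<meta property="og:type" content="baker" />
<meta property="og:url" content="http://example.com/jane" />
</head>
<body></body>
</html>
"""

props = parse_open_graph(SAMPLE_PAGE)
print(props["og:type"], is_known_person_type(props["og:type"]))
```

Parsing the tags takes a couple of dozen lines; deciding what belongs in `PERSON_TYPES` is the part that no parser can help with.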

And here's where every single metadata/semantic web system runs into trouble: how do you define a schema for the meaning of things? How do you decide what's meaningful and what the right taxonomy or ontology is? Whilst Open Graph Protocol proposed a simple and easy-to-implement markup syntax for web pages, it didn't attack the real problem: what is the metadata itself? In fact, any fool can come up with a markup language; that's the easy part.

One really well-developed metadata standard is the Dublin Core (which Open Graph Protocol is 'inspired by'). It's been worked on since 1994. 16 years into Dublin Core there's a workable set of metadata for describing things like the title of a page, or the author of a book. The lesson is that if you mess with metadata, you're in for a long haul: getting it right is hard.

Another metadata standard is the Dewey Decimal System, which tries to give a numerical meaning to any book. Here are the subcategories under '200 Religion':

210 Philosophy & theory of religion
220 The Bible
230 Christianity & Christian theology
240 Christian practice & observance
250 Christian pastoral practice & religious orders
260 Christian organization, social work & worship
270 History of Christianity
280 Christian denominations
290 Other religions

This is precisely the problem that Open Graph Protocol has today: its metadata is heavily influenced by the thinking of its creators. In the Dewey Decimal System it's clear that Christianity was important and everything else was an "Other religion". For Facebook, metadata that matters to Facebook is what's important.

Unfortunately, a read of the Open Graph Protocol mailing list (which contains just 390 messages in total) shows that the problem of defining good metadata isn't being attacked. And the mailing list's traffic is faltering.


Admittedly, halfway through September there have been 13 messages (more than the 8 in August).

My take is that Open Graph Protocol hasn't been widely adopted because its underlying real goal is not better semantic data for web sites, but better semantic data for Facebook.

4 comments:

zenkat said...

If you like metadata and schema, you should check out Freebase:

http://www.freebase.com

It's graph data -- done right. It's free'n'open, too.

devonianfarm said...

One thing I like about the Open Graph Protocol is that it takes a logical approach to RDFa embedding... By putting RDFa triples in the HEAD in a straightforward manner, we can effectively extend the META and LINK elements which have languished for more than a decade.

Now, the ontology of things that Facebook supports is both smart and stupid.

It's smart in that it is simple; because it's small, people are going to use it correctly and it will be possible for people to interpret it correctly.

Freebase and DBpedia provide us with a huge "ABox" of named entities, which really are a superior way to refer to a named entity like "Megan Fox". A "TBox" for things in general is a much tougher problem, so I can't blame Facebook for taking the low road. For my project at

http://ookaboo.com/

I'm going to have to develop my own feature type ontology, because none of the existing ones are good enough. Other people, who've got different goals, will probably think my ontology sucks, but that's the whole point -- a "good" ontology is "good" for a particular purpose, not "good" in general.


inkdroid.org said...

Maybe the lack of discussion means that it's easy enough that people don't have to argue about it. I don't know about you, but I've seen quite a few of those "like" buttons in my travels around the web.

Theah said...

Totally get what you mean. The first thing I thought of when reading the "type" element was how biased it was. It was obviously designed by Facebook.

I do like that they suggest you extend it, and they do mention that if they see a lot of people using an extension they will adopt it as a type.

In my opinion, its only current value is controlling what text, title, and image appear when a user shares your link on Facebook.