Monday, December 03, 2007

Transitive decay in social networks

When I was doing research in computer security we often used to say "trust isn't transitive". What we meant was that if Alice trusts Bob and Bob trusts Charlie then we can't assume that Alice trusts Charlie. Another way to think of this is to look at your own friends and friends of friends; do you trust the friends of friends? It's likely that you do not (if you did then it's likely that they would actually already be a friend and not a FOAF).

Clearly, trust is not a constant value across all friends, so each of your N friends will have a trust value ("how much you trust them"), which I'll call Ti, assigned to them. A friend you'd trust with your life has Ti = 1 (perhaps they're a candidate for a BFF), and a friend you don't trust at all has Ti = 0. (I'll ignore the question of why you even have friends with Ti = 0, but in the context of computer social networks you probably do have some).

In social situations we are only exposed to this FOAF trust problem occasionally, but with 'social networking' a current web buzzword we see social networks, or social graphs, and can traverse them. Many web sites are trying to use this graph traversal to build services (e.g. LinkedIn allows you to send requests across the network, or ask questions; Facebook is hoping that graph traversal will be the new application distribution method).

But any graph traversal suffers from the FOAF trust problem. In a social network online this gets expressed by statements like "Just because Alice likes the Werewolf application and shares it with Bob and Charlie is friends with Bob, that doesn't mean that Charlie wants to be a Werewolf", or "A message crossing between more than one hop won't get passed all the time".

I dare say that LinkedIn, Facebook and others could actually characterize the rate at which the FOAF attenuates messages (be they actual messages, application installations, or any other meme) passing through the network.

I'm going to posit that the amount of trust a user would place in a FOAF (and a FOAFOAF, a FOAFOAFOAF, ... ) decays rapidly with the number of FOAF hops traversed.

Intuitively, if Alice trusts Bob with Talice,bob and Bob trusts Charlie with Tbob,charlie then how much does Alice trust Charlie? Less than she trusts Bob because Charlie is not her friend, and she can only evaluate Charlie based on Bob's recommendation. The more she trusts Bob the more she should trust Charlie. So some sort of estimate Talice,charlie is created from Talice,bob and Tbob,charlie taking into account these trust estimates.

A simple combination would be Talice,charlie = Talice,bob * Tbob,charlie (this assumes quite the opposite of the original declaration above: here trust is transitive to a certain degree).

The problem with this is that it treats all trust relationships as having equal weight, no matter how far they are from the original person (in this case, Alice). Imagine the case where Alice trusts Bob with Talice,bob = 1, Bob trusts Charlie Tbob,charlie = 1. This formula gives Talice,charlie = 1 which would probably not reflect most people's intuitive grasp of trust. If in addition, Charlie trusts Dave with Tcharlie,dave = 1 then we get Talice,dave = 1 which seems even more unlikely.

What's needed is a way to decay trust the further apart people are.

One way to to this is for each person to have their own damping factor the encodes how much they trust another person's trust. So Alice might trust other people's recommendations with factor Dalice (in the range [0,1]). The formula would be updated to have

Talice,charlie = Dalice * Talice,bob * Tbob,charlie

Talice,dave = Dalice * Talice,charlie = Dalice * Talice,bob * Dbob * Tbob,charlie * Tcharlie,dave

But that's still essentially linear. I think trust looks more like an inverse square law so that distance is explicitly encoded. With that

Talice,charlie = Dalice * Talice,bob * Tbob,charlie / 1^2

Talice,dave = Dalice * Talice,charlie / 2^2 = Dalice * Talice,bob * Dbob * Tbob,charlie * Tcharlie,dave / 4

This seems to fit better intuition because trust of distant people drops away very rapidly. Now, since this is only a hypothesis it would be interesting to measure the reach of messages passing inside a social network to look at the actual 'pass it on' rates to see if they match intuition.

Anyone out there got lots of social network data I could analyze? Perhaps there's a Facebook application developer who's tracked enough invite/install data that this could be verified.


Bart said...

I think you're going to have to account for FOAF webs rather than chains. E.g. if Talice,ed is non-zero and Ted,dave is non-zero, then that needs to be included in the formula for Talice,dave.

Bob Jonkman said...

John, you may want to analyze some of the links in the PGP key servers. Encoded in each public signature are (should be) a number of signatures, each of which is granted a level of validity. You can traverse from one signature to another, merely by following the links on the HTML page, or by importing the keys into GnuPG and checking their links recursively with the --list-keys command.

The signatures also encode how much the person holding the key trusts the keysigner, but I don't think this is made available from public keys. Of course, not having the actual trust relationships may invalidate this idea completely...