Mendelian Randomization: getting genes to run randomized trials for you

One of the core elements of my day job is dealing with causal relations: we try to understand what caused an effect. An area where much work has been done on understanding causal relationships is medicine, where randomized controlled trials are used to understand the relationship between taking a medicine and some outcome.

But some things are hard to perform a trial on. It's all very well if you have a medicine to try out, but what if you want to know if, for example, having low serum cholesterol is associated with an increased risk of cancer?

That's not an idle question: the Framingham Heart Study was thought to have shown a relationship between serum cholesterol and cancer (e.g. The serum cholesterol-cancer relationship: an analysis of time trends in the Framingham Study). But the question is: given that there appears to be a relationship, is it causal? Does low serum cholesterol cause cancer?

It could be that it's the other way around (called reverse causation: Low Cholesterol May Be Marker of Undiagnosed Cancer): if you are likely to get cancer you are likely to have low serum cholesterol. Or it could be that there's a confounding factor: something causes both low serum cholesterol and cancer.

It turns out that genetics can help, specifically the fact that genes are randomly assigned during meiosis (in humans, for example, half the genes come from the mother and half from the father). Gregor Mendel's law of independent assortment says that the alleles of genes are chosen randomly from the possible alleles when a baby is being formed from the genetic material of mother and father.

This Mendelian Randomization means that it's possible to have nature perform a randomized trial for you. If you can find an allele that affects the trait you are trying to understand you can use it to sample the population to look for a cause and effect relationship.

In the case of low serum cholesterol there's a specific allele associated with Apolipoprotein E. The variant Apo E2 is associated with low serum cholesterol. And because of Mendel's law of independent assortment it will be assigned randomly in the population.

In 1986 Martijn B. Katan published a letter in The Lancet pointing out that Apo E2 causes a rare disease where patients have almost zero serum cholesterol.

Since Apo E2 is randomly assigned by Mendel's laws, it's enough to look at the population and examine cancer rates and their relationship to the presence of the Apo E2 allele. So a 'trial' can be run by selecting a control group from the population and examining the rate of Apo E2 in that control. Then a group with cancer is tested for Apo E2.

If there's really a connection between low serum cholesterol and cancer then the cancer group should have a higher prevalence of Apo E2 than the control. You can think of the presence of Apo E2 as being random across the population; if its frequency in the cancer group departs from that (i.e. there's more or less than expected) then a causal relationship can be inferred. One way to see that is to look at a causal diagram of the relationships.


The arrows in the diagram represent causal relationships.

1. There's an arrow from Apo E2 to serum cholesterol because it is known that this allele causes low serum cholesterol.

2. The hypothesis is expressed in the arrow from low serum cholesterol to cancer. It's that arrow that's being determined.

3. There are other factors (age, diet, location, illnesses) which could affect both serum cholesterol and cancer.

4. There's no arrow leading to Apo E2 because it is completely determined by Mendel's laws. There's also no arrow from Apo E2 directly to the other factors because they are not affected by Apo E2.

5. There's no arrow directly from Apo E2 to cancer because there's no known direct relationship between the two.

(Note that these assumptions have to be justified. For example, #1 needs biological justification, as does #5).
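
To make the diagram concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the 15% carrier frequency, the effect sizes, the cholesterol scale, and the function names `simulate` and `summarize` are all invented, and a single number called `other` stands in for the other factors (age, diet, location, illnesses). The parameter `causal_effect` is the strength of the arrow from low serum cholesterol to cancer, so setting it to zero gives a world where the hypothesis is false.

```python
import random


def simulate(n, causal_effect, seed=0):
    """Simulate people under the causal diagram; all numbers are made up."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        # No arrow into the allele: it is handed out at random (assumed 15%).
        carrier = rng.random() < 0.15
        # Hypothetical stand-in for the other factors (age, diet, illness...).
        other = rng.gauss(0, 1)
        # Arrow 1: the allele lowers cholesterol. Arrow 3: other factors do too.
        cholesterol = 5.0 - 0.5 * carrier - 0.5 * other + rng.gauss(0, 0.5)
        # Arrow 3 again: other factors raise cancer risk.
        # Arrow 2 (the hypothesis): low cholesterol raises risk if causal_effect > 0.
        risk = 0.10 + 0.05 * other + causal_effect * (5.0 - cholesterol)
        cancer = rng.random() < min(max(risk, 0.0), 1.0)
        people.append((carrier, cholesterol, cancer))
    return people


def summarize(people):
    """Compare the cancer group with the control group."""
    cases = [p for p in people if p[2]]
    controls = [p for p in people if not p[2]]

    def carrier_rate(group):
        return sum(p[0] for p in group) / len(group)

    def mean_cholesterol(group):
        return sum(p[1] for p in group) / len(group)

    return {
        "case_carrier_rate": carrier_rate(cases),
        "control_carrier_rate": carrier_rate(controls),
        "case_cholesterol": mean_cholesterol(cases),
        "control_cholesterol": mean_cholesterol(controls),
    }


for effect in (0.0, 0.1):
    print("causal_effect =", effect, summarize(simulate(100_000, effect, seed=1)))
```

In this toy world the cancer group has lower average cholesterol even when `causal_effect` is zero, because `other` drives both. The carrier rate, by contrast, should only differ between the two groups when the arrow from cholesterol to cancer really exists, and the comparison never needs `other` to be measured.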

With those relationships in place it's just a matter of performing the statistical test on the control group and cancer group to see if more Apo E2 is present in the cancer group (there's more on that in Mendelian randomization as an instrumental variable approach to causal inference).
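
As a rough idea of what such a test could look like, here is a sketch of a Pearson chi-squared test (one degree of freedom, no continuity correction) on a 2x2 table of Apo E2 carriers and non-carriers in the two groups. The counts are invented and the function name `compare_carrier_rates` is hypothetical. This is a simpler stand-in, not the instrumental variable estimator described in the linked paper.

```python
import math


def compare_carrier_rates(case_carriers, case_total, control_carriers, control_total):
    """Return (odds ratio, chi-squared statistic, p-value) for a 2x2 table."""
    a = case_carriers                      # cancer group, carries Apo E2
    b = case_total - case_carriers         # cancer group, no Apo E2
    c = control_carriers                   # control group, carries Apo E2
    d = control_total - control_carriers   # control group, no Apo E2
    n = a + b + c + d
    # Pearson chi-squared statistic for a 2x2 table.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, chi2, p_value


# Invented counts: 200 carriers among 1000 cancer patients,
# 300 carriers among 2000 controls.
print(compare_carrier_rates(200, 1000, 300, 2000))
```

A small p-value would say that the carrier rate in the cancer group is unlikely to match the control group by chance. Under the assumptions in the diagram, that difference is what would point to a causal arrow from low serum cholesterol to cancer.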

This technique has been used to show a causal relationship between alcohol intake and blood pressure (see Alcohol Intake and Blood Pressure: A Systematic Review Implementing a Mendelian Randomization Approach) and to show no causal relationship between a mother's BMI and the fatness of her offspring (see Exploring the Developmental Overnutrition Hypothesis Using Parental–Offspring Associations and FTO as an Instrumental Variable).

And what of low serum cholesterol and cancer? A study (Apolipoprotein E Genotype, Plasma Cholesterol, and Cancer: A Mendelian Randomization Study) from 2009 concludes: "These findings suggest that low cholesterol levels are not causally related to increased cancer risk."

Thanks, Mendel!

Comments

Trevor Burnham said…
If I understand you correctly, you only have a population with random assignment if every person in it has one parent with allele A and one parent without it. Do you only recruit such people for your Apo E2 work?
Jonathan Histed said…
Fascinating: thank you for that :)
