Thursday, January 14, 2010

Mendelian Randomization: getting genes to run randomized trials for you

One of the core elements of my day job is dealing with causal relations: we try to understand what cause caused an effect. An area where much work has been done in understanding causal relationships is medicine where randomized controlled trials are used to understand the relationship between taking a medicine and some outcome.

But some things are hard to perform a trial on. It's all very well if you have a medicine to try out, but what if you want to know if, for example, having low serum cholesterol is associated with an increased risk of cancer?

That's not an idle question, the Framingham Heart Study, was thought to have shown a relationship between serum cholesterol and cancer (e.g. The serum cholesterol-cancer relationship: an analysis of time trends in the Framingham Study). But the question is: given that there appears to be a relationship, is it causal? Does low serum cholesterol cause cancer?

It could be that it's the other way around (called reverse causation: Low Cholesterol May Be Marker of Undiagnosed Cancer): if you are likely to get cancer you are likely to have low serum cholesterol. Or it could be that there's a confounding factor: something causes both low serum cholesterol and cancer.

It turns out that genetics, and specifically the fact that genes are randomly assigned during meiosis (in humans, for example, half the genes come from the mother and half from the father). Gregor Mendel's law of independent assortment says that the alleles of genes are chosen randomly from the possible alleles when a baby is being formed from the genetic material of mother and father.

This Mendelian Randomization means that it's possible to have nature perform a randomized trial for you. If you can find an allele that affects the trait you are trying to understand you can use it to sample the population to look for a cause and effect relationship.

In the case of low serum cholesterol there's a specific allele associated with Apolipoprotein E. The variant Apo E2 is associated with low serum cholesterol. And because of Mendel's law of independent assortment it will be assigned randomly in the population.

In 1986 Martijn B. Katan published a letter in The Lancet pointing out that Apo E2 causes a rare disease where patients have almost zero serum cholesterol.

Since Apo E2 is randomly assigned by Mendel's laws it's enough to look at the population and examine cancer rates and their relationship to the presence of the Apo E2 gene. So a 'trial' can be run by selecting a control group from the population and examining the rate of Apo E2 in that control. Then a group with cancer is tested for Apo E2.

If there's really a connection between low serum cholesterol and cancer then the cancer group should have a higher prevalence of Apo E2 than the control. You can think of the presence of Apo E2 being random across the population, if it's less than random in the cancer group (i.e. there's more or less than expected) then a causal relationship can be inferred. One way to see that is to look at a causal diagram of the relationships.

The arrows in the diagram represent causal relationships.

1. There's an arrow from Apo E2 to serum cholesterol because it is known that this allele causes low serum cholesterol.

2. The hypothesis is expressed in the arrow from low serum cholesterol to cancer. It's that arrow that's being determined.

3. There are other factors (age, diet, location, illnesses) which could affect both serum cholesterol and cancer.

4. There's no arrow leading to Apo E2 because it is completely determined by Mendel's laws. There's also no arrow from Apo E2 directly to the other factors because they are not affected by Apo E2.

5. There's no arrow directly from Apo E2 to cancer because there's no known direct relationship between the two.

(Note that these assumptions have to be justified. For example, #1 needs biological justification, as does #5).

With those relationships in place it's just a matter of performing the statistical test on the control group and cancer group to see if more Apo E2 is present in the cancer group (there's more on that in Mendelian randomization as an instrumental variable approach to causal inference).

This technique has been used to show a causal relationship between alcohol intake and blood pressure (see Alcohol Intake and Blood Pressure: A Systematic Review Implementing a Mendelian Randomization Approach) and to show no causal relationship between a mother's BMI and the fatness of her offspring (see Exploring the Developmental Overnutrition Hypothesis Using Parental–Offspring Associations and FTO as an Instrumental Variable).

And what of low serum cholesterol and cancer? A study (Apolipoprotein E Genotype, Plasma Cholesterol, and Cancer: A Mendelian Randomization Study) from 2009 concludes: "These findings suggest that low cholesterol levels are not causally related to increased cancer risk."

Thanks, Mendel!


Trevor Burnham said...

If I understand you correctly, you only have a population with random assignment if every person in it has one parent with allele A and one parent without it. Do you only recruit such people for your Apo E2 work?

Jonathan Histed said...

Fascinating: thank you for that :)