…what metaphysics is to physics. The old joke came to mind when a reporter asked me yesterday to comment on a paper published in the BMJ: “Intake of saturated and trans unsaturated fatty acids and risk of all cause mortality, cardiovascular disease, and type 2 diabetes: systematic review and meta-analysis of observational studies” by de Souza et al.
Now the title, “Intake of saturated and trans unsaturated fatty acids…”, tells you right off that this is not good news; lumping together saturated fat and trans-fat is a clear indication of bias. A stand-by of Atkins-bashers, it is a way of vilifying saturated fat when the data won’t fit. In the study that the reporter asked about, the BMJ provided a summary:
“There was no association between saturated fats and health outcomes in studies where saturated fat generally replaced refined carbohydrates, but there was a positive association between total trans fatty acids and health outcomes. Dietary guidelines for saturated and trans fatty acids must carefully consider the effect of replacement nutrients.”
“But?” The two statements are not really connected. In any case, the message is clear: saturated fat is OK; trans-fat is not OK. So, somehow, we have to be concerned about both saturated fat and trans-fat. Sad.
And “systematic” means whatever system the author wants to use. This usually means “we searched the database of…” followed by a meta-analysis. The method is explained by the overly optimistic “What is …?” series, which promises a “precise estimate” of an effect from “combining…independent studies.”
The jarring notes are “precise estimate” in combination with “combining…independent studies.” In practice, you usually repeat an experiment exactly only if you suspect that something was wrong with the original study, or if the result is sufficiently outside expected values that you want to check it. Such an examination sensibly involves a fine-grained analysis of the experimental details. The idea underlying meta-analysis, however, usually unstated, is that the larger the number of subjects in a study, the more compelling the conclusion. One might argue instead that if you have two or more studies which are imperfect, combining them is likely to lead to greater uncertainty and more error, not less. I am one who would make such an argument. So where did meta-analysis come from and what, if anything, is it good for?
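First, though, a quick illustration of the “more error, not less” claim: a minimal Python simulation with invented numbers, assuming the simplest failure mode, namely that all of the studies share the same systematic error (confounding, selection bias, what have you). Standard fixed-effect, inverse-variance pooling, the usual meta-analytic combination, then delivers an ever narrower confidence interval centered on the wrong value.

```python
import numpy as np

rng = np.random.default_rng(0)

true_effect = 0.0   # the real effect size
shared_bias = 0.3   # systematic error common to every study (assumed, for illustration)

# Five imperfect studies: each estimate = true effect + shared bias + sampling noise
n_studies, n_subjects = 5, 100
se = 1 / np.sqrt(n_subjects)          # standard error of each individual study
estimates = true_effect + shared_bias + rng.normal(0, se, n_studies)

# Fixed-effect (inverse-variance) pooling, the standard meta-analytic combination
weights = np.full(n_studies, 1 / se**2)
pooled = np.sum(weights * estimates) / np.sum(weights)
pooled_se = np.sqrt(1 / np.sum(weights))

print(f"pooled estimate: {pooled:.3f} ± {1.96 * pooled_se:.3f} (95% CI)")
print(f"true effect:     {true_effect}")
# The CI tightens as studies are added, but it tightens around the shared
# bias (~0.3), not around the true effect (0.0): precision without accuracy.
```

Pooling divides the random error by the number of studies but leaves the shared bias untouched, and the bias is exactly what made the individual studies imperfect in the first place.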
I am trained in enzyme and protein chemistry, but I have worked in a number of different fields, including invertebrate animal behavior. I never heard of meta-analysis until very recently, that is, until I started doing research in nutrition. In fact, in 1970 there weren’t any meta-analyses, at least not with that phrase in the title, or at least not as determined by my systematic PubMed search. By 1990 there were about 100, and by 2014 there were close to 10,000 (Figure 1).
Figure 1. Logarithm of the number of papers in a PubMed search with titles containing “meta-analysis” vs. year of publication.
This exponential growth suggests that the technique grew by reproducing itself. It suggests, in fact, that its origins lie in spontaneous generation. In other words, it is popular because it is popular. (It does have one obvious advantage: you don’t have to do any experiments.) But does it give any useful information?
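For the record, the arithmetic behind “exponential”: a back-of-the-envelope sketch, taking the rough counts above (about 100 titles in 1990, close to 10,000 in 2014) at face value:

```python
import math

# Approximate counts from the PubMed title search described above
n_1990, n_2014 = 100, 10_000
years = 2014 - 1990

# For exponential growth, n(t) = n0 * 2**(t / doubling_time)
doubling_time = years * math.log(2) / math.log(n_2014 / n_1990)
print(f"doubling time ≈ {doubling_time:.1f} years")   # ≈ 3.6 years
```

A hundred-fold increase in 24 years works out to the number of meta-analyses doubling roughly every three and a half years.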
Meta-analysis
If you have a study that is under-powered, that is, if you only have a small number of subjects, and you find a degree of variability in the outcome, then combining the results from your experiment with another small study may point you to a consistent pattern. As such, it is a last-ditch, Hail-Mary kind of method. Applying it to large studies that have statistically meaningful results, however, doesn’t make sense, because:
1. If all of the studies go in the same direction, you are unlikely to learn anything from combining them. In fact, if you come out with a pooled value that is different from the values in the individual studies, in science you are usually required to explain why your analysis improved things. Just citing a larger n won’t cut it, especially if it is my study that you are trying to improve on.
2. In the special case where all of the studies show no effect and you come up with a number that is statistically significant, you are, in essence, saying that many wrongs can make a right, as described in a previous blog post on the abuse of meta-analyses. In that post, I reiterated the statistical rule that if the 95% CI bar crosses the line for hazard ratio = 1.0, this is taken as an indication that there is no significant difference between the two conditions being compared. The example that I gave was the meta-analysis by Jakobsen et al. on the effects of SFAs, or their replacement, on CVD outcomes (Figure 2). Amazingly, in the list of 15 different studies that she used, all but one cross the hazard ratio = 1.0 line. In other words, only one study found that keeping SFAs in the diet provides a lower risk than replacement with carbohydrate; for all the others there was no significant difference. The question is why an analysis was done at all. What could we hope to find? How could 15 studies that show nothing add up to a new piece of information? (A numerical sketch of this pooling arithmetic appears after the list.) Most amazing is that some of the studies are more than 20 years old. How could these have had so little impact on our opinion of saturated fat? Why did we keep believing that it was bad?
Figure 2. Hazard ratios and 95% confidence intervals for coronary events and deaths in the different studies in a meta-analysis from Jakobsen et al. Major types of dietary fat and risk of coronary heart disease: a pooled analysis of 11 cohort studies. Am J Clin Nutr 2009, 89(5):1425-1432.
3. Finally, suppose that you are doing a meta-analysis on several studies and that they have very different outcomes, showing statistically significant associations in different directions: for example, some studies show that substituting saturated fat for carbohydrate increases risk while others show that it decreases risk. What will you gain by averaging them? I don’t know about you, but it doesn’t sound good to me. It makes me think of the old story of the emerging nation that was planning to build a railroad and didn’t know whether to use a gauge that matched the country to the north or the gauge of the country to the south. The parliament voted to use a gauge that was the average of the two.
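Here is the numerical sketch promised under point 2: a minimal Python example, using invented numbers rather than the Jakobsen data, of fixed-effect, inverse-variance pooling on the log hazard ratio scale, which is the standard way such meta-analyses combine studies. Every toy study’s 95% CI crosses HR = 1.0, so none is significant on its own, yet the pooled CI excludes 1.0:

```python
import numpy as np

# Invented data: 15 studies, each with a hazard ratio whose 95% CI
# crosses 1.0 (individually, "no significant difference")
rng = np.random.default_rng(1)
log_hr = rng.normal(np.log(1.08), 0.02, 15)  # small effects near HR ~ 1.08
se = np.full(15, 0.08)                       # standard errors on the log scale

# Each study alone: 95% CI = exp(log_hr ± 1.96*se)
lo, hi = np.exp(log_hr - 1.96 * se), np.exp(log_hr + 1.96 * se)
print("studies whose CI crosses HR = 1.0:", int(np.sum((lo < 1) & (hi > 1))), "of 15")

# Fixed-effect, inverse-variance pooling on the log scale
w = 1 / se**2
pooled = np.sum(w * log_hr) / np.sum(w)
pooled_se = np.sqrt(1 / np.sum(w))
ci = np.exp([pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se])
print(f"pooled HR = {np.exp(pooled):.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
# Fifteen individually null results pool to a "significant" hazard ratio
# whose CI excludes 1.0 -- many wrongs adding up to a right.
```

Whether you read that pooled number as a real discovery or as an artifact of the arithmetic is exactly the question raised under point 2.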