Archive for the ‘Evidence Based Medicine’ Category

Baseball is like church. Many attend. Few understand.

— Leo Durocher.

The movie Moneyball provides an affirmative answer to an important question in literature and drama: can you present a scene and bring out the character of a subject that is boring while, at the same time, not making the presentation boring?  The movie, and the Michael Lewis book that it is based on, are about baseball and statistics!  For fans, baseball is not boring so much as incredibly slow, providing a soothing effect like fishing, interspersed with an occasional big catch. The movie stars Brad Pitt as Billy Beane, General Manager of the Oakland Athletics beginning in the late 1990s.  A remarkably talented high school athlete, Beane, for unknown reasons, was never able to live up to his potential as an MLB player but, in the end, he had a decisive effect on the game at the managerial level. The question is how the A’s, with one-third of the budget of the Yankees, could make the play-offs three years in a row and, in 2001, win 102 games.  The movie is more or less faithful to the book and both are as much about organizations and psychology as about sports. The story was “an example of how an unscientific culture responds, or fails to respond, to the scientific method” and the science is substantially statistical.

In America, baseball is a metaphor for just about everything. Probably because it is an experience of childhood and adolescence, lessons learned from baseball stay with us. Baby-boomers who grew up in Brooklyn were taught by Bobby Thomson’s 1951 home-run, as by nothing later, that life isn’t fair. The talking heads in Ken Burns’s Baseball who found profound meaning in the sport are good examples. Former New York Governor Mario Cuomo’s comments were quite philosophical, although he did add the observation that getting hit in the head with a pitched ball led him to go into politics.

One aspect of baseball that is surprising, especially when you consider the money involved, is the extent to which strategy and scouting practices have generally ignored hard scientific data in favor of tradition and lore. Moneyball tells us about group-think, self-deception and adherence to habit in the face of science. For those of us who are trying to make sense of the field of nutrition, where people’s lives are at stake and where numerous professionals who must know better insist on dogma — low fat, no red meat — in the face of contradictory evidence, baseball provides some excellent analogies.

The real stars of the story are the statistics and the computer or, more precisely, the statistics and computer guys: Bill James, an amateur analyzer of baseball statistics, and Paul DePodesta, assistant General Manager of the A’s, who provided information about the real nature of the game and how to use it. James self-published a photocopied book called 1977 Baseball Abstract: Featuring 18 categories of statistical information that you just can’t find anywhere else. The book was not just about statistics but was in fact a critique of traditional statistics, pointing out, for example, that the concept of an “error” was antiquated, deriving from the early days of gloveless fielders and ungroomed playing fields of the 1850s. In modern baseball, “you have to do something right to get an error; even if the ball is hit right at you, and you were standing in the right place to begin with.” Evolving rapidly, the Abstracts became a fixture of baseball life and are currently the premier (and expensive) way to obtain baseball information.

It is the emphasis on statistics that made people doubt that Moneyball could be made into a movie and is probably why they stopped shooting the first time around, a couple of years ago. Also, although the real Paul DePodesta is handsome and athletic, Hollywood cast him as an overweight geek type, played by Jonah Hill. All of the characters in the film have the names of the real people except for DePodesta, whose name was changed “for legal reasons,” he says. Paul must have no sense of humor.

The important analogy with nutrition research, and the continuing thread in this blog, is the real meaning of statistics. Lewis recognized that the thing that James thought was wrong with the statistics was that they

“made sense only as numbers, not as a language. Language, not numbers, is what interested him. Words, and the meaning they were designed to convey. ‘When the numbers acquire the significance of language,’ he later wrote, ‘they acquire the power to do all the things which language can do: to become fiction and drama and poetry…. And it is not just baseball that these numbers, through a fractured mirror, describe. It is character. It is psychology, it is history, it is power and it is grace, glory, consistency….’”

By analogy, it is the tedious comparison of quintiles from the Harvard School of Public Health proving that white rice will give you diabetes but brown rice won’t, or that red meat is bad but white meat is not, odds ratio = 1.32. It is the bloodless, mindless idea that if the computer says so, it must be true, regardless of what common sense tells you. What Bill James and Paul DePodesta brought to the problem was the understanding that the computer will only give you a meaningful answer if you ask the right question: asking what behaviors accumulated runs and won ball games, not which physical characteristics — runs fast, looks muscular — seem to go with being a ball player… the direct analog of “you are what you eat,” or the relative importance of lowering your cholesterol vs whether you actually live or die.

As early as the seventies, the computer had crunched baseball stats and come up with clear recommendations for strategy. The one I remember, since it was consistent with my own intuition, was that a sacrifice bunt was a poor play; sometimes it worked but you were much better off, statistically, having every batter simply try to get a hit. I remember my amazement at how little effect the computer results had on the frequency of sacrifice bunts in the game. Did science not count? What player or manager did not care whether you actually won or lost a baseball game? The themes that are played out in Moneyball are that tradition dies hard and that we don’t like to change our minds, even for our own benefit. We invent ways to justify our stubbornness, we focus on superficial indicators rather than real performance and sometimes we are just not real smart.
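To see what kind of computation was involved, here is a minimal sketch in Python. The run-expectancy values are rough league-average figures of the sort the analysts tabulated; the exact numbers drift by era and are assumptions here, not data from any particular study:

```python
# Why the computer disliked the sacrifice bunt: compare the expected runs
# scored in the rest of the inning before and after a "successful" bunt.
# The values are approximate league-average run expectancies (assumed here
# for illustration; the real tables are compiled from play-by-play data).
run_expectancy = {
    ("runner on 1st", 0): 0.86,  # runner on first, no outs
    ("runner on 2nd", 1): 0.66,  # runner on second, one out
}

before = run_expectancy[("runner on 1st", 0)]
after = run_expectancy[("runner on 2nd", 1)]
print(f"Expected runs: {before} before, {after} after the bunt "
      f"({after - before:+.2f})")
# Trading an out for a base lowers expected runs even when the bunt "works".
```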

Among the old ideas, still current, was that the batting average is the main indicator of a batter’s strength. The batting average is computed by considering that a base-on-balls is not an official at-bat, whereas a moment’s thought tells you that the ability to avoid bad pitches is an essential part of the batter’s skill. Early on, even before he was hired by Billy Beane, Paul DePodesta had run the statistics from every twentieth-century baseball team. There were only two offensive statistics that were important for a team’s winning percentage: on-base percentage (which includes walks) and slugging percentage. “Everything else was far less important.” These numbers are now part of baseball although I am not enough of a fan to know the extent to which they are still secondary to the batting average.
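The walk argument is easy to make concrete. Here is a short sketch using the standard definitions of the two statistics; the two batters are hypothetical:

```python
# Batting average ignores walks; on-base percentage counts them.
def batting_average(hits, at_bats):
    return hits / at_bats

def on_base_percentage(hits, walks, hbp, at_bats, sac_flies):
    # Standard definition: times on base divided by plate appearances.
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)

# Two hypothetical batters: identical hit totals, very different walk totals.
patient = dict(hits=150, walks=90, hbp=5, at_bats=500, sac_flies=5)
swinger = dict(hits=150, walks=20, hbp=5, at_bats=500, sac_flies=5)

print(f"batting average, both:     {batting_average(150, 500):.3f}")      # 0.300
print(f"on-base %, patient hitter: {on_base_percentage(**patient):.3f}")  # 0.408
print(f"on-base %, free swinger:   {on_base_percentage(**swinger):.3f}")  # 0.330
```

The batting average cannot distinguish these two hitters; on-base percentage can, and it is the one that correlates with winning.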

One of the early examples of the conflict between tradition and science was the scouts’ refusal to follow up on the computer’s recommendation to look at a fat college kid named Kevin Youkilis who would soon have the second-highest on-base percentage after Barry Bonds. “To Paul, he’d become Euclis: the Greek god of walks.”

The big question in nutrition is how the cholesterol-diet-heart paradigm can persist in the face of the consistent failure of experimental and clinical trials to provide support. The story of these failures and the usurpation of the general field by ideologues has been told many times. Gary Taubes’s Good Calories, Bad Calories is the most compelling and, as I pointed out in a previous post, there seems to have been only one rebuttal, Steinberg’s Cholesterol Wars. The Skeptics vs. the Preponderance of Evidence. At least within the past ten years, a small group has tried to introduce new ideas, in particular that it is excessive consumption of dietary carbohydrate, not dietary fat, that is the metabolic component of the problems in obesity, diabetes and heart disease, and has provided extensive, if generally unreferenced, experimental support. An analogous group tried to influence baseball in the years before Billy Beane. Larry Lucchino, an executive of the San Diego Padres, described the group in baseball as being perceived as something of a cult and therefore easily dismissed. “There was a profusion of new knowledge and it was ignored.” As described in Moneyball, “you didn’t have to look at big-league baseball very closely to see its fierce unwillingness to rethink anything; it was as if it had been inoculated against outside ideas.”

“Grady Fuson, the A’s soon-to-be former head of scouting, had taken a high school pitcher named Jeremy Bonderman and the kid had a 94 mile-per-hour fastball, a clean delivery, and a body that looked as if it had been created to wear a baseball uniform. He was, in short, precisely the kind of pitcher Billy thought he had trained the scouting department to avoid…. Taking a high school pitcher in the first round — and spending 1.2 million bucks to sign him — that was exactly the sort of thing that happened when you let scouts have their way. It defied the odds; it defied reason. Reason, even science, was what Billy Beane was intent on bringing to baseball.”

The analogy is to the deeply ingrained nutritional tradition, the continued insistence on cholesterol and dietary fat, which are assumed to have evolved in human history in order to cause heart disease. The analogy is the persistence of the lipophobes in the face of scientific results showing, at every turn, that these were bad ideas, that, in fact, dietary saturated fat does not cause heart disease. It leads, in the end, to things like Steinberg’s description of the Multiple Risk Factor Intervention Trial (MRFIT; it’s better not to be too clever with acronyms lest the study really bomb out): “Mortality from CHD was 17.9 deaths per 1,000 in the [intervention] group and 19.3 per 1,000 in the [control] group, a statistically nonsignificant difference of 7.1%.” Steinberg’s take on MRFIT:

“The study failed to show a significant decrease in coronary heart disease and is often cited as a negative study that challenges the validity of the lipid hypothesis. However, the difference in cholesterol level between the controls and those on the lipid-lowering diet was only about 2 per cent. This was clearly not a meaningful test of the lipid hypothesis.”

In other words, cholesterol is more important than outcome; or, at least, a “diet designed to lower cholesterol levels, along with advice to stop smoking and advice on exercise” may still be a good thing.

Similarly, the Framingham study, which found a strong association between serum cholesterol and heart disease, found no effect of dietary fat, saturated fat or cholesterol on cardiovascular disease.  Again, a marker for risk is more important than whether you get sick.  “Scouts” who continue to look for superficial signs and to ignore seemingly counter-intuitive conclusions from the computer still hold sway on the nutritional team.

“Grady had no way of knowing how much Billy disapproved of Grady’s most deeply ingrained attitude — or that Billy had come to believe that baseball scouting was at roughly the same stage of development in the twenty-first century as professional medicine had been in the eighteenth.”

Professional medicine? Maybe not the best example.

What is going on here? Physicians, like all of us, are subject to many reinforcers but, for humans, power and control are usually predominant and, in medicine, that plays out most clearly in curing the patient. Defeating disease shines through even the most cynical analysis of physicians’ motivations. And who doesn’t play baseball to win? “The game itself is a ruthless competition. Unless you’re very good, you don’t survive in it.”

Moneyball describes a “stark contrast between the field of play and the uneasy space just off it, where the executives and the scouts make their livings.” For the latter, read the expert panels of the American Heart Association and the Dietary Guidelines committee, the Robert Eckels who don’t even want to study low-carbohydrate diets (unless it can be done in their own laboratory with NIH money). In this

“space just off the field of play there really is no level of incompetence that won’t be tolerated. There are many reasons for this, but the big one is that baseball has structured itself less as a business than as a social club. The club includes not only the people who manage the team but also, in a kind of women’s auxiliary, many of the writers and commentators who follow it and purport to explain it. The club is selective, but the criteria for admission and retention are nebulous. There are many ways to embarrass the club, but being bad at your job isn’t one of them. The greatest offense a club member can commit is not ineptitude but disloyalty.”

The vast NIH-USDA-AHA social club does not tolerate dissent. And the media, WebMD, Heart.org and all the networks from ABC News to the Huffington Post will be there to support the club. The Huffington Post, which will come down on the President of the United States at a moment’s notice, toes the mark when it comes to a low-carbohydrate story.

The lessons from Moneyball are primarily in providing yet another precedent for human error, stubbornness and possibly even stupidity in an area where the stakes are high. In other words, the nutrition mess is not in our imagination. The positive message is that there is, as they say in political science, validator satisfaction: science must win out. The current threat is that the nutritional establishment is, as I describe it, slouching toward low-carb, doing small experiments and easing into a position where they will say that they never were opposed to the therapeutic value of carbohydrate restriction. It is a threat because they will try to get their friends funded to repeat, poorly, studies that have already been done well. But that is another story, part of the strange story of Medicineball.

“Doctors prefer large studies that are bad to small studies that are good.”

— anon.

The paper by Foster and coworkers entitled Weight and Metabolic Outcomes After 2 Years on a Low-Carbohydrate Versus Low-Fat Diet, published in 2010, had a surprisingly limited impact, especially given the effect of their first paper, in 2003, on a one-year study.  I have described the first low-carbohydrate revolution as taking place around that time and, if Gary Taubes’s article in the New York Times Magazine was the analog of Thomas Paine’s Common Sense, Foster’s 2003 paper was the shot heard ’round the world.

The paper showed that the Atkins diet, admittedly good for weight loss, was not the cardiovascular risk it was widely accepted to be.  The 2003 Abstract said “The low-carbohydrate diet was associated with a greater improvement in some risk factors for coronary heart disease.” The publication generated an explosive popularity of the Atkins diet, ironic in that Foster had said publicly that he undertook the study in order to get rid of the Atkins diet “once and for all.”  The 2010 paper, by extending the study to two years, would seem to be very newsworthy.  So what was wrong?  Why is the new paper more or less forgotten?  Two things.  First, the paper was highly biased and its methods were so obviously flawed — obvious even to the popular press — that it may have been a bit much even for the media. Second, it remains to be seen whether it will really be cited, but I will suggest here that it is a classic in misleading research and in the foolishness of intention-to-treat (ITT).


Asher Peres was a physicist, an expert in quantum information theory, who died in 2005 and was remembered for his scientific contributions as well as for his iconoclastic wit and ironic aphorisms. One of his witticisms was that “unperformed experiments have no results.”  Peres had undoubtedly never heard of intention-to-treat (ITT), the strange statistical method that has appeared recently, primarily in the medical literature.  According to ITT, the data from a subject assigned at random to an experimental group must be included in the reported outcome data for that group even if the subject does not follow the protocol, or even if they drop out of the experiment.  At first hearing, the idea is counter-intuitive if not completely idiotic (why would you include people who are not in the experiment in your data?), suggesting that a substantial burden of proof rests with those who want to employ it.  No such obligation is usually met; particularly in nutrition studies, such as comparisons of isocaloric weight-loss diets, ITT is frequently used with no justification and is sometimes demanded by reviewers.   Not surprisingly, there is a good deal of controversy on this subject.  Physiologists or chemists, hearing this description, usually walk away shaking their heads or immediately come up with one or another obvious reductio ad absurdum, e.g. “You mean, if nobody takes the pill, you report whether or not they got better anyway?” That’s exactly what it means.

On the naive assumption that some people really didn’t understand what was wrong with ITT — I’ve been known to make a few elementary mistakes in my life — I wrote a paper on the subject.  It received negative, actually hostile, reviews from two public health journals — I include an amusing example at the end of this post.  I even got substantial grief from Nutrition & Metabolism, where I was the editor at the time, but where it was finally published.  The current post will be based on that paper and I will provide a couple of interesting cases from the medical literature.  In the next post I will discuss a quite remarkable new instance — Foster’s two-year study of low-carbohydrate diets — of the abuse of the common sense that is the major alternative to ITT.

To put a moderate spin on the problem, there is nothing wrong with ITT if you explicitly say what the method shows — the effect of assigning subjects to an experimental protocol; the title of my paper was “Intention-to-treat.  What is the question?”  If you are very circumspect about that question, then there is little problem.  It is common, however, for the Abstract of a paper to correctly state that patients “were assigned to a diet” but, by the time the Results are presented, the independent variable has become not “assignment to the diet” but “the diet,” which most people would assume meant what people ate rather than what they were told to eat. Caveat lector.  My paper was a kind of overkill and I made several different arguments, but the common sense argument gets to the heart of the problem in a practical way.  I’ll describe that argument and also give a couple of real examples.

Common sense argument against intention-to-treat

Consider an experimental comparison of two diets in which there is a simple, discrete outcome, e.g. a threshold amount of weight lost or remission of an identifiable symptom. Patients are randomly assigned to two different diets, diet group A or diet group B, and a target of, say, 5 kg weight loss is considered success. As shown in the table below, in group A, half of the subjects are able to stay on the diet but, for whatever reason, half are not. The half of the patients in group A who did stay on the diet, however, were all able to lose the target 5 kg.  In group B, on the other hand, everybody is able to stay on the diet but only half are able to lose the required amount of weight. An ITT analysis shows no difference in the two outcomes, while just looking at those people who followed the diet shows 100% success for diet A.  This is one of the characteristics of ITT: it always makes the better diet look worse than it is.

                                    Diet A           Diet B
Compliance (of 100 patients)        50               100
Success (reached target)            50               50
ITT success                         50/100 = 50%     50/100 = 50%
“Per protocol” (followed diet)      50/50 = 100%     50/100 = 50%

Now, you are the doctor.  With such data in hand should you advise a patient: “well, the diets are pretty much the same. It’s largely up to you which you choose,” or, looking at the raw data (both compliance and success), should the recommendation be: “Diet A is much more effective than diet B but people have trouble staying on it. If you can stay on diet A, it will be much better for you so I would encourage you to see if you could find a way to do so.” Which makes more sense? You’re the doctor.
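The arithmetic of the table can be written out as a minimal Python sketch (the numbers are the hypothetical ones from the table, not data from any trial):

```python
# ITT vs per-protocol success rates for the hypothetical diets above.
def itt_success(successes, assigned):
    # Intention-to-treat: successes divided by everyone assigned.
    return successes / assigned

def per_protocol_success(successes, compliant):
    # Per-protocol: successes divided by those who stayed on the diet.
    return successes / compliant

# Diet A: 100 assigned, 50 stayed on it, all 50 reached the 5 kg target.
# Diet B: 100 assigned, all stayed on it, 50 reached the target.
for name, assigned, compliant, successes in [("A", 100, 50, 50), ("B", 100, 100, 50)]:
    print(f"Diet {name}: ITT = {itt_success(successes, assigned):.0%}, "
          f"per-protocol = {per_protocol_success(successes, compliant):.0%}")
# Diet A: ITT = 50%, per-protocol = 100%
# Diet B: ITT = 50%, per-protocol = 50%
```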

I made several arguments trying to explain that there are two factors, only one of which (whether it works) is clearly due to the diet. The other (whether you follow the diet) is under the control of other factors (whether WebMD tells you that one diet or the other will kill you, whether the evening news makes you lose your appetite, etc.).  I even dragged in a geometric argument because Newton had used one in the Principia: “a 2-dimensional outcome space where the length of a vector tells how every subject did…. ITT represents a projection of the vector onto one axis, in other words collapses a two-dimensional vector to a one-dimensional vector, thereby losing part of the information.” Pretentious? Moi?

Why you should care.  Case I. Surgery or Medicine?

Does your doctor actually read these academic studies using ITT?  One can only hope not.  Consider the analysis by Newell  of the Coronary Artery Bypass Surgery (CABS) trial.  This paper is astounding for its blanket, tendentious insistence on what is correct without any logical argument.  Newell considers that the method of

“the CABS research team was impeccable. They refused to do an ‘as treated’ analysis: ‘We have refrained from comparing all patients actually operated on with all not operated on: this does not provide a measure of the value of surgery.’”

Translation: results of surgery do not provide a measure of the value of surgery.  So, in the CABS trial, patients were assigned to Medicine or Surgery. The treatments actually used and the outcomes are shown in the table below. Intention-to-treat analysis was, as described by Newell, “used, correctly.” Looking at the table, you can see that a 7.8% mortality was found in those assigned to receive medical treatment (29 people out of 373 died) and a 5.3% mortality (21 deaths out of 395) for assignment to surgery.  If you look at the outcomes of each modality as actually used, it turns out that medical treatment had a 9.5% (33/349) mortality rate compared with 4.1% (17/419) for surgery, an analysis that Newell says “would have wildly exaggerated the apparent value of surgery.”

Survivors and deaths after allocation to surgery or medical treatment

                     Allocated medicine              Allocated surgery
                  Received     Received           Received     Received
                  surgery      medicine           surgery      medicine
Survived 2 years     48          296                 354          20
Died                  2           27                  15           6
Total                50          323                 369          26

Common sense suggests that appearances are not deceiving. If you were one of the 33-17 = 16 people who were still alive, you would think that it was the potential report of your death that had been exaggerated.  The thing that is under the control of the patient and the physician, and which is not a feature of the particular modality, is getting the surgery implemented. Common sense dictates that a patient is interested in surgery, not the effect of being told that surgery is good.  The patient has a right to expect that if they comply, the physician would avoid conditions where, as stated by Hollis,  “most types of deviations from protocol would continue to occur in routine practice.” The idea that “Intention to treat analysis is … most suitable for pragmatic trials of effectiveness rather than for explanatory investigations of efficacy” assumes that practical considerations are the same everywhere and that any practitioner is locked into the same abilities or lack of abilities as the original experimenter.
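The two analyses can be checked directly from the table. A short sketch with the numbers above:

```python
# CABS mortality computed both ways from the table above.
# Each cell: (survived, died) for an (allocated, received) pair.
cells = {
    ("medicine", "surgery"): (48, 2),
    ("medicine", "medicine"): (296, 27),
    ("surgery", "surgery"): (354, 15),
    ("surgery", "medicine"): (20, 6),
}

def mortality(selected):
    died = sum(d for s, d in selected)
    total = sum(s + d for s, d in selected)
    return f"{died}/{total} = {died / total:.1%}"

for arm in ("medicine", "surgery"):
    itt = [v for (alloc, _), v in cells.items() if alloc == arm]
    as_treated = [v for (_, recd), v in cells.items() if recd == arm]
    print(f"{arm:8}  ITT: {mortality(itt):16}  as treated: {mortality(as_treated)}")
# medicine  ITT: 29/373 = 7.8%    as treated: 33/349 = 9.5%
# surgery   ITT: 21/395 = 5.3%    as treated: 17/419 = 4.1%
```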

What is the take-home message?  One general piece of advice that I would give based on this discussion in the medical literature: don’t get sick.

Why you should care.  Case II. The effect of vitamin E supplementation

A clear-cut case of how off-the-mark ITT can be is a report on the value of antioxidant supplements. The Abstract of the paper concluded that “there were no overall effects of ascorbic acid, vitamin E, or beta carotene on cardiovascular events among women at high risk for CVD.” The study was based on an ITT analysis but, on the fourth page of the paper, it turns out that removing subjects due to

“noncompliance led to a significant 13% reduction in the combined end point of CVD morbidity and mortality… with a 22% reduction in MI…, a 27% reduction in stroke…, a 23% reduction in the combination of MI, stroke, or CVD death (RR (risk ratio), 0.77; 95% CI, 0.64–0.92 [P = .005]).”

The media universally reported the conclusion from the Abstract, namely that there was no effect of vitamin E. This conclusion is correct if you think that you can measure the effect of vitamin E without taking the pill out of the bottle.  Does this mean that vitamin E is really of value? The data would certainly be accepted as valuable if the statistics were applied to a study of the value of replacing barbecued pork with whole grain cereal. Again, “no effect” was the answer to the question “what happens if you are told to take vitamin E?” but it still seems reasonable that the effect of a vitamin means the effect of actually taking the vitamin.

The ITT controversy

Advocates of ITT see its principles as established and may dismiss a common sense approach as naïve. The issue is not easily resolved; statistics is not axiomatic: there is no F=ma, there is no zeroth law.  A good statistics book will tell you in the Introduction that what we do in statistics is to try to find a way to quantify our intuitions. If this is not appreciated, and you do not go back to consideration of exactly what the question is that you are asking, it is easy to develop a dogmatic approach and insist on a particular statistic because it has become standard.

As I mentioned above, I had a good deal of trouble getting my original paper published and one  anonymous reviewer said that “the arguments presented by the author may have applied, maybe, ten or fifteen years ago.” This criticism reminded me of Molière’s Doctor in Spite of Himself:

Sganarelle is disguised as a doctor and spouts medical double-talk with phony Latin, Greek and Hebrew to impress the client, Geronte, who is pretty dumb and mostly falls for it but:

Geronte: …there is only one thing that bothers me: the location of the liver and the heart. It seemed to me that you had them in the wrong place: the heart is on the left side but the liver is on the right side.

Sganarelle: Yes. That used to be true but we have changed all that and medicine uses an entirely new approach.

Geronte: I didn’t know that and I beg your pardon for my ignorance.

 In the end, it is reasonable that scientific knowledge be based on real observations. This has never before been thought to include data that was not actually in the experiment. I doubt that nous avons changé tout cela.

“These results suggest that there is no superior long-term metabolic benefit of a high-protein diet over a high-carbohydrate in the management of type 2 diabetes.”  The conclusion is from a paper by Larsen, et al. [1] which, based on that statement in the Abstract, I would not normally bother to read; it is good that you have to register trials and report failures but, from a broader perspective, finding nothing is not great news and just because Larsen couldn’t do it doesn’t mean it can’t be done.  However, in this case, I received an email from International Diabetes, published bilingually in Beijing: “Each month we run a monthly column where we choose a hot-topic article… and invite expert commentary opinion about that article,” so I agreed to write an opinion. The following is my commentary:

“…no superior long-term metabolic benefit of a high-protein diet over a high-carbohydrate ….” A slightly more positive conclusion might have been that “a high-protein diet is as good as a high carbohydrate diet.”  After all, equal is equal. The article is, according to the authors, about “high-protein, low-carbohydrate” so rather than describing a comparison of apples and pears, the conclusion should emphasize low carbohydrate vs high carbohydrate.   It is carbohydrate, not protein, that is the key question in diabetes but clarity was probably not the idea. The paper by Larsen, et al. [1] represents a kind of classic example of the numerous studies in the literature whose goal is to discourage people with diabetes from trying a diet based on carbohydrate restriction, despite its intuitive sense (diabetes is a disease of carbohydrate intolerance) and despite its established efficacy and foundations in basic biochemistry.  The paper is characterized by blatant bias, poor experimental design and mind-numbing statistics rather than clear graphic presentation of the data. I usually try to take a collegial approach in these things but this article does have a unique and surprising feature, a “smoking gun” that suggests that the authors were actually aware of the correct way to perform the experiment or at least to report the data.

Right off, the title tells you that we are in trouble. “The effect of high-protein, low-carbohydrate diets in the treatment…” implies that all such diets are the same even though there are several different versions, some of which (by virtue of better design) will turn out to have had much better performance than the diet studied here and almost all of which are not “high protein.” Protein is one of the more stable features of most diets — the controls in this experiment, for example, did not substantially lower their protein even though advised to do so — and most low-carbohydrate diets advise only carbohydrate restriction.  While low-carbohydrate diets do not counsel against increased protein, they do not necessarily recommend it.  In practice, most carbohydrate-restricted diets are hypocaloric and the actual behavior of dieters shows that they generally do not add back either protein or fat, an observation first made by LaRosa in 1980.

Atkins-bashing is not as easy as it used to be when there was less data and one could run on “concerns.” As low-fat diets continue to fail in both long-term and short-term trials — think Women’s Health Initiative [2] — and carbohydrate restriction continues to show success and continues to bear out the predictions from the basic biochemistry of the insulin-glucose axis [3], it becomes harder to find fault.  One strategy is to take advantage of the lack of formal definitions of low-carbohydrate diets to set up a straw man.  The trick is to test a moderately high-carbohydrate diet and show that, on average, as here, there is no difference in hemoglobin A1c, triglycerides, total cholesterol, etc. when compared to a higher-carbohydrate diet as control — the implication being that in a draw, the higher-carbohydrate diet wins.  So, Larsen’s low-carbohydrate diet contains 40% of energy as carbohydrate.  Now, none of the researchers who have demonstrated the potential of carbohydrate restriction would consider 40% carbohydrate, as used in this study, to be a low-carbohydrate diet. In fact, 40% is close to what the American population consumed before the epidemic of obesity and diabetes. Were we all on a low-carbohydrate diet before Ancel Keys?

What happened?  As you might guess, there weren’t notable differences in most outcomes but, like other such studies in the literature, the authors report only group statistics, so you don’t really know who ate what, and they use an intention-to-treat (ITT) analysis. According to ITT, a research report should include data from those subjects who dropped out of the study (here, about 19% of each group). You read that correctly.  The idea is based on the assumption (insofar as it has any justification at all) that compliance is an inherent feature of the diet (“without carbs, I get very dizzy”) rather than a consequence of bias transmitted from the experimenter, or distance from the hospital, or any of a thousand other things.  While ITT has been defended vehemently, the practice is totally counter-intuitive and has been strongly attacked on any number of grounds, the most important of which is that, in diet experiments, it makes the better diet look worse.  Whatever the case that can be made for it, however, there is no justification for reporting only intention-to-treat data, especially since, in this paper, the authors consider as one of the “strengths of the study … the measurement of dietary compliance.”

The reason that this is all more than technical statistical detail is that the actual reported data show great variability (technically, the 95% confidence intervals are large).  To most people, a diet experiment is supposed to give a prospective dieter information about outcome.  Most patients would like to know: if I stay on this diet, how will I do?  It is not hard to understand that if you don’t stay on the diet, you can’t expect good results.  Nobody knows what 81% staying on the diet could mean.  In the same way, nobody loses an average amount of weight. If you look at the spread in performance and in what was consumed by individuals on this diet, you can see that there is big individual variation.  Also, being “on a diet” or being “assigned to a diet” is very different from actually carrying out dieting behavior, that is, eating a particular collection of food.  When there is wide variation, a person in the low-carb group may be eating more carbs than some person in the high-carb group.  It may be worth testing the effect of having the doctor tell you to eat fewer carbs, but if you are trying to lose weight, you want them to test the effect of actually eating fewer carbs.

When I review papers like this for a journal, I insist that the authors present individual data in graphic form.  The question in low-carbohydrate diets is the effect of the amount of carbohydrate consumed on the outcomes.  Making a good case to the reader involves showing individual data.  As a reviewer, I would have had the authors plot each individual’s consumption of carbohydrate vs, for example, that individual’s changes in triglyceride and especially HbA1c.  Both of these are expected to be dependent on carbohydrate consumption.  In fact, this is the single most common criticism I make as a reviewer, or that I made when I was co-editor-in-chief at Nutrition & Metabolism.
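As a sketch of the figure I would ask for (with invented data, since Larsen et al. published only group statistics, which is exactly the problem), one might plot each completer’s change in carbohydrate intake against the change in HbA1c:

```python
# Sketch of the individual-data plot a reviewer should demand.
# The data are simulated for illustration; the paper reports only group means.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 40  # hypothetical completers, pooled from both diet groups
delta_carb = rng.uniform(-150, 25, n)  # change in carbohydrate intake, g/day
delta_hba1c = 0.004 * delta_carb + rng.normal(0, 0.3, n)  # assumed relationship

slope, intercept = np.polyfit(delta_carb, delta_hba1c, 1)
xs = np.array([delta_carb.min(), delta_carb.max()])

plt.scatter(delta_carb, delta_hba1c)
plt.plot(xs, slope * xs + intercept)
plt.xlabel("Change in carbohydrate intake (g/day)")
plt.ylabel("Change in HbA1c (%)")
plt.title("Per-protocol subjects: outcome vs what was actually eaten")
plt.show()
```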

So what is the big deal?  This is not the best presentation of the data and it is really hard to tell what the real effect of carbohydrate restriction is. Everybody makes mistakes and few of my own papers are without some fault or other. But there’s something else here.  In reading a paper like this, unless you suspect that something wasn’t done correctly, you don’t tend to read the Statistical analysis section of the Methods very carefully (computers have usually done most of the work).  In this paper, however, the following remarkable paragraph jumps out at you.  A real smoking gun:

  • “As this study involved changes to a number of dietary variables (i.e. intakes of calories, protein and carbohydrate), subsidiary correlation analyses were performed to identify whether study endpoints were a function of the change in specific dietary variables. The regression analysis was performed for the per protocol population after pooling data from both groups.”

What?  This is exactly what I would have told them to do.  (I’m trying to think back. I don’t think I reviewed this paper).  The authors actually must have plotted the true independent variable, dietary intake — carbohydrate, calories, etc. — against the outcomes, leaving out the people who dropped out of the study.  So what’s the answer?

  • “These tests were interpreted marginally as there was no formal adjustment of the overall type 1 error rate and the p values serve principally to generate hypotheses for validation in future studies.”

Huh?  They’re not going to tell us?  “Interpreted marginally”?  What the hell does that mean?  A type 1 error refers to a false positive, that is, they must have found a correlation between diet and outcome, in distinction to what the conclusion of the paper is.  They “did not formally adjust for” the main conclusion?  And “p values serve principally to generate hypotheses”?  This is the catch-phrase that physicians are taught to use to dismiss experimental results that they don’t like.  Whether it means anything or not, in this case there was a hypothesis, stated right at the beginning of the paper, in the Abstract: “…to determine whether high-protein diets are superior to high-carbohydrate diets for improving glycaemic control in individuals with type 2 diabetes.”

So somebody — presumably a reviewer — told them what to do, but they buried the results.  My experience as an editor was, in fact, that there are people in nutrition who think that they are beyond peer review, and I had many fights with authors.  In this case, it looks like the actual outcome of the experiment may have been the opposite of what they say in the paper.  How can we find out?  Like most countries, Australia has what are called “sunshine laws” that require government agencies to explain their actions.  There is an Australian federal Freedom of Information Act (1982) and one for the state of Victoria (1982). One of the authors is supported by an NHMRC (National Health and Medical Research Council) Fellowship, so it may be that they are obligated to share this marginal information with us.  Somebody should drop the government a line.

Bibliography

1. Larsen RN, Mann NJ, Maclean E, Shaw JE: The effect of high-protein, low-carbohydrate diets in the treatment of type 2 diabetes: a 12 month randomised controlled trial. Diabetologia 2011, 54(4):731-740.

2. Tinker LF, Bonds DE, Margolis KL, Manson JE, Howard BV, Larson J, Perri MG, Beresford SA, Robinson JG, Rodriguez B et al: Low-fat dietary pattern and risk of treated diabetes mellitus in postmenopausal women: the Women’s Health Initiative randomized controlled dietary modification trial. Arch Intern Med 2008, 168(14):1500-1511.

3. Volek JS, Phinney SD, Forsythe CE, Quann EE, Wood RJ, Puglisi MJ, Kraemer WJ, Bibus DM, Fernandez ML, Feinman RD: Carbohydrate Restriction has a More Favorable Impact on the Metabolic Syndrome than a Low Fat Diet. Lipids 2009, 44(4):297-309.

…the association has to be strong and the causality has to be plausible and consistent. And you have to have some reason to make the observation; you can’t look at everything.  And, experimentally, observation may be all that you have — almost all of astronomy is observational.  Of course, the great deconstructions of crazy nutritional science — several from Mike Eades’s blog and Tom Naughton’s hysterically funny-but-true course in how to be a scientist — are still right on but, strictly speaking, it is the faulty logic of the studies and the whacko observations that are the problem, not simply that they are observational.  It is the strength and reliability of the association that tells you whether causality is implied.  Reducing carbohydrates lowers triglycerides.  There is a causal link.  You have to be capable of the state of mind of the low-fat politburo not to see this (for example, Circulation, May 24, 2011; 123(20): 2292–2333).

It is frequently said that observational studies are only good for generating hypotheses but it is really the other way around.  All studies are generated by hypotheses.  As Einstein put it: your theory determines what you measure.  I ran my post on the red meat story past April Smith and her reaction was “why red meat? Why not pancakes?” which is exactly right.  Any number of things can be observed. Once you pick, you have a hypothesis.

Where did the first law of thermodynamics come from?

Thermodynamics is an interesting case.  The history of the second law involves a complicated interplay of observation and theory.  The idea that there was an absolute limit to how efficient you could make a machine, and by extension that all real processes were necessarily inefficient, largely comes from the brain power of Carnot. He saw that you could not extract as work all of the heat you put into a machine. Clausius encapsulated it into the idea of entropy, as in my YouTube video.


The origins of the first law, the conservation of energy, are a little stranger.  It turns out that it was described more than twenty years after the second law and it has been attributed to several people, for a while to the German physicist von Helmholtz.  These days, credit is given to a somewhat eccentric German physician named Robert Julius Mayer. Although trained as a doctor, Mayer did not like to deal with patients and was instead more interested in physics and religion, which he seemed to think were the same thing.  He took a job as a shipboard physician on an expedition to the South Seas since that would give him time to work on his main interests.  It was in Jakarta where, while treating an epidemic with the then-current practice of bloodletting, he noticed that the venous blood of the sailors was much brighter than it had been in colder climates, as if “I had struck an artery.” He attributed this to a reduced need for the sailors to use oxygen for heat and, from this observation, he somehow leapt to the grand principle of conservation of energy: that the total amount of heat and work and any other forms of energy does not change but can only be interconverted. It is still unknown what kind of connections in his brain led him to this conclusion.  The period (1848) corresponds to the point at which science separated from philosophy. Mayer seems to have had one foot in each world and described things in the following incomprehensible way:

  • If two bodies find themselves in a given difference, then they could remain  in a state of rest after the annihilation of [that] difference if the  forces that were communicated to them as a result of the leveling of  the difference could cease to exist; but if they are assumed to be indestructible,  then the still persisting forces, as causes of changes in relationship,  will again reestablish the original present difference.

(I have not looked for it but one can only imagine what the original German was like.) Warmth Disperses and Time Passes: The History of Heat, von Baeyer’s popular book on thermodynamics, describes the ups and downs of Mayer’s life, including the death of three of his children which, in combination with rejection of his ideas, led to hospitalization but ultimate recognition and knighthood.  Surely this was a great observational study although, as von Baeyer put it, it did require “the jumbled flashes of insight in that sweltering ship’s cabin on the other side of the world.”

It is also true that association does imply causation but, again, the association has to have some impact and the proposed causality has to make sense.  In some ways, purely observational experiments are rare.  As Pasteur pointed out, even serendipity is favored by preparation.  Most observational experiments must be a reflection of some hypothesis. Otherwise you’re wasting taxpayers’ money; a kiss of death on a grant application is to imply that “it would be good to look at.…”  You always have to have something in mind.  The great observational studies like the Framingham study are bad because they have no null hypothesis. When the Framingham study first showed that there was no association of dietary total and saturated fat, or dietary cholesterol, with heart disease, the hypothesis was quickly defended. The investigators were so tied to a preconceived hypothesis that there was hardly any point in making the observations.

In fact, a negative result is always stronger than one showing consistency; consistent sunrises will go by the wayside if the sun fails to come up once.  It is the lack of an association between the decrease in fat consumption during the epidemic of obesity and diabetes that is so striking.  The figure above shows that the  increase in carbohydrate consumption is consistent with the causal role of dietary carbohydrate in creating anabolic hormonal effects and with the poor satiating effects of carbohydrates — almost all of the increase of calories during the epidemic of obesity and diabetes has been due to carbohydrates.  However, this observation is not as strong as the lack of an identifiable association of obesity and diabetes with fat consumption.  It is the 14 % decrease in the absolute amount of saturated fat for men that is the problem.  If the decrease in fat were associated with decrease in obesity, diabetes and cardiovascular disease, there is little doubt that the USDA would be quick to identify causality.  In fact, whereas you can find the occasional low-fat trial that succeeds, if the diet-heart hypothesis were as described, they should not fail. There should not be a single Women’s Health Initiative, there should not be a single Framingham study, not one.

Sometimes more association would be better.  Take intention-to-treat. Please. In this strange statistical idea, if you assign a person to a particular intervention, diet or drug, then you must include the outcome data (weight loss, change in blood pressure) for that person even if they do not comply with the protocol (go off the diet, stop taking the pills).  Why would anybody propose such a thing, never mind actually insist on it as some medical journals or granting agencies do?  When you actually ask people who support ITT, you don’t get coherent answers.  They say that if you just look at per-protocol data (only from people who stayed in the experiment), then, by excluding the drop-outs, you would introduce bias, but when you ask them to explain that, you get something along the lines of Darwin and the peas growing on the wrong side of the pod. The basic idea, if there is one, is that the reason that people gave up on their diet or stopped taking the pills was an inherent feature of the intervention: it made them sick, drowsy or something like that.  While this is one possible hypothesis and should be tested, there are millions of others — the doctor was subtly discouraging about the diet, or the participants were like some of my relatives who can’t remember where they put their pills, or the diet book was written in Russian, or the diet book was not written in Russian, etc. I will discuss ITT in a future post but, for the issue at hand: if you do a per-protocol analysis, you will observe what happens to people when they stay on their diet and you will have an association between the content of the diet and performance.  With an ITT analysis, you will be able to observe what happens when people are told to follow a diet and you will have an association between assignment to a diet and performance.  Both are observational experiments with an association between variables but they have different likelihoods of providing a sense of causality.

“In the Viking era, they were already using skis…and over the centuries, the Norwegians have proved themselves good at little else.”

–John Cleese, Norway, Home of Giants.

With the 3-foot bookshelf of popular attacks on the low-fat-diet-heart idea, it is pretty remarkable that there is only one defense.  Daniel Steinberg’s Cholesterol Wars. The Skeptics vs. The Preponderance of Evidence is probably more accurately called a witness for the prosecution since low-fat, in some way or other, is still the law of the land.


The book is very informative, if biased, and it provides an historical perspective describing the difficulty of establishing the cholesterol hypothesis. Oddly, though,  it still appears to be very defensive for a witness for the prosecution.  In any case, Steinberg introduces into evidence the Oslo Diet-Heart Study [2] with a serious complaint:

“Here was a carefully conducted study reported in 1966 with a statistically significant reduction in reinfarction [recurrence of heart attack] rate.  Why did it not receive the attention it deserved?”

“The key element,” he says, “was a sharp reduction in saturated fat and cholesterol intake and an increase in polyunsaturated fat intake. In fact, each experimental subject had to consume a pint of soybean oil every week, adding it to salad dressing or using it in cooking or, if necessary, just gulping it down!”

Whatever it deserved, the Oslo Diet-Heart Study did receive a good deal of attention.  The Women’s Health Initiative (WHI) liked it.  The WHI was the most expensive failure to date. It found that “over a mean of 8.1 years, a dietary intervention that reduced total fat intake and increased intakes of vegetables, fruits, and grains did not significantly reduce the risk of CHD, stroke, or CVD in postmenopausal women.” [3]

The WHI adopted a “win a few, lose a few” attitude, comparing its results to the literature, where some studies showed an effect of reducing dietary fat and some did not — this made me wonder: if the case is so clear, why are there any failures?  Anyway, it cited the Oslo Diet-Heart Study as one of the winners and attributed the outcome to the substantial lowering of plasma cholesterol.

So, “cross-examination” should tell us why, if there was “a statistically significant reduction in reinfarction rate,” the study did “not receive the attention it deserved.”

First, the effect of diet on cholesterol over five years:

The results look good although, since all the numbers are considered fairly high, and since the range of values is not shown, it is hard to tell just how impressive the results really are. But we will stipulate that you can lower cholesterol on a low-fat diet. But what about the payoff? What about the outcomes?

The results are shown in Table 5 of the original paper. Steinberg described how, in the first 5 years, “58 patients of the 206 in the control group (28%) had a second heart attack” but only “… 32 of the 206 in the diet [group] (16%)…”, which does sound pretty good.

In the end, though, it is really the total deaths from cardiac disease that matter.  Table 5 shows the two final outcomes: 94 CHD deaths among the 206 controls and 79 among the 206 in the diet group.  How should we compare these?

1. The odds ratio or relative risk is just the ratio of the two outcomes (since there are the same number of subjects in each group): CHD mortality (control)/CHD mortality (diet) = 94/79 = 1.19.  This seems strikingly close to 1.0, that is, a flip of a coin.  These days the media, or the report itself, would report it as a 19% higher risk in the controls, or about a 16% relative reduction in total CHD mortality on the diet.

2. If you look at the absolute values, however, the mortality in the controls is 94/206 = 45.6% but the diet group had reduced this to 79/206 = 38.3%, so the change in absolute risk is 45.6% – 38.3%, or only 7.3%, which is less impressive but still not too bad.

3. So for every 206 people, we save 94 – 79 = 15 lives or, dividing, 206/15 = about 14 people needed to treat to save one life (usually abbreviated NNT). That doesn’t sound too bad.  Not penicillin but could be beneficial. I think…
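The three calculations fit in a few lines of Python (numbers from Table 5; the same function applies to the smoking comparison in the next section):

```python
# Relative risk, absolute risk reduction and NNT for the Oslo data (Table 5).
def risk_summary(deaths_control, deaths_treated, n_per_group):
    risk_c = deaths_control / n_per_group
    risk_t = deaths_treated / n_per_group
    rr = risk_c / risk_t        # ratio of the two mortality rates
    arr = risk_c - risk_t       # absolute risk reduction
    nnt = 1 / arr               # number needed to treat to save one life
    return rr, arr, nnt

rr, arr, nnt = risk_summary(deaths_control=94, deaths_treated=79, n_per_group=206)
print(f"RR = {rr:.2f}, ARR = {arr:.1%}, NNT = {nnt:.0f}")
# RR = 1.19, ARR = 7.3%, NNT = 14
```

Applied to the smoking numbers below (119 vs 54 deaths), the same function gives RR = 2.2, an absolute difference of 31.6% and an NNT of about 3.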

Smoke and mirrors.

It’s what comes next that is so distressing.  Table 10 pools the two groups, the diet and the control group, and now compares the effect of smoking: in the whole population, the ratio of CHD deaths in smokers vs non-smokers is 119/54 = 2.2, which is somewhat more impressive than the 1.19 effect we just saw.  Now,

1. The absolute difference in risk is (119-54)/206 = 31.6 % which sounds like a meaningful number.

2. The number needed to treat is 206/65 = 3.17, or only about 3 people need to quit smoking to see one less death.

In fact, in some sense, the Oslo Diet-Heart Study provides smoking-CHD risk as an example of a meaningful association that one can take seriously. If only such a significant change had actually been found for the diet effect.

So what do the authors make of this? Their conclusion is that “When combining data from both groups, a three-fold greater CHD mortality rate is demonstrable among the hypercholesterolemic, hypertensive smokers than among those in whom these factors were low or absent.”  Clever but sneaky. The “hypercholesterolemic, hypertensive” part is irrelevant since you combined the groups. In other words, what started out as a diet study has become a “lifestyle study.”  They might as well have said “When combining data from fish and birds, a significant number of wings were evident.” Members of the jury are shaking their heads.

Logistic regression. What is it? Can it help?

So they have mixed up smoking and diet. Isn’t there a way to tell which was more important?  Well, of course, there are several ways.  By coincidence, while I was writing this post, April Smith posted on Facebook the following challenge: “The first person to explain logistic regression to me wins admission to SUNY Downstate Medical School!” I won, although I am already at Downstate.  Logistic regression is, in fact, a statistical method that asks what the relative contributions of different inputs would have to be to fit the outcome, and this could have been done here, but in this case I would use my favorite statistical method, the Eyeball Test.  Looking at the data in Tables 5 and 10 for CHD deaths, you can see immediately what’s going on. Smoking is a bigger risk than diet.
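For what the formal version would look like, here is a minimal logistic-regression sketch in Python with simulated patients (the 1966 paper gives only marginal tables, so the individual-level data and the assumed effect sizes below are invented for illustration):

```python
# Minimal logistic-regression sketch: estimate the relative contributions
# of smoking and diet to the odds of CHD death. All data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 412
smoker = rng.integers(0, 2, n)  # 1 = smoker
diet = rng.integers(0, 2, n)    # 1 = cholesterol-lowering diet
# Assumed effects, mimicking the eyeball result: smoking raises the
# log-odds of death far more than the diet lowers them.
log_odds = -0.6 + 0.8 * smoker - 0.17 * diet
death = rng.random(n) < 1 / (1 + np.exp(-log_odds))

X = sm.add_constant(np.column_stack([smoker, diet]))
result = sm.Logit(death.astype(float), X).fit(disp=0)
print(np.exp(result.params))  # odds ratios: [baseline, smoking, diet]
```

Fitting recovers a large odds ratio for smoking and a near-1 odds ratio for diet, which is just the Eyeball Test with error bars.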

If you really want a number, we calculated relative risk above. Again, we found for mortality, CHD (control)/CHD (diet) = 94/79 = 1.19. But what happens if you took up smoking? Table 10 shows that your chance of dying of heart disease would be increased by 119/54 = 2.2, or more than twice the risk.  Bottom line: if you decided to add saturated fat to your diet, your risk would be 1.19 times what it was before, which might be a chance you could take faced with authentic foie gras.

Daniel Steinberg’s question:

“Here was a carefully conducted study reported in 1966 with a statistically significant reduction in reinfarction  rate.  Why did it not receive the attention it deserved?”

Well, it did. This is not the first critique.  Uffe Ravnskov described how the confusion of smoking and diet led to a new Oslo trial in which reductions in both were specifically recommended and, again, the outcomes made diet look bad [4].  Ravnskov gave it the attention it deserved. But what about researchers writing in the scientific literature? Why do they not give the study the attention it deserves? Why do they not point out its status as a classic case of a saturated-fat risk study with no null hypothesis?  It certainly deserves attention for its devious style. Of course, putting that in print would guarantee that your grant is never funded and your papers will be hard to publish.  So, why do researchers not give the Oslo Diet-Heart Study the attention it deserves?  Good question, Dan.

Bibliography

1. Steinberg D: The Cholesterol Wars: The Skeptics vs. the Preponderance of Evidence, 1st edn. San Diego, Calif.: Academic Press; 2007.

2. Leren P: The Oslo diet-heart study. Eleven-year report. Circulation 1970, 42(5):935-942.

3. Howard BV, Van Horn L, Hsia J, Manson JE, Stefanick ML, Wassertheil-Smoller S, Kuller LH, LaCroix AZ, Langer RD, Lasser NL et al: Low-fat dietary pattern and risk of cardiovascular disease: the Women’s Health Initiative Randomized Controlled Dietary Modification Trial. JAMA 2006, 295(6):655-666.

4. Ravnskov U: The Cholesterol Myths: Exposing the Fallacy that Cholesterol and Saturated Fat Cause Heart Disease. Washington, DC: NewTrends Publishing, Inc.; 2000.

In 1985 an NIH Consensus Conference was able to “establish beyond any reasonable doubt the close relationship between elevated blood cholesterol levels (as measured in serum or plasma) and coronary heart disease” (JAMA 1985, 253:2080-2086).

I have been making an analogy between scientific behavior and the activities of the legal system and, following that idea, the wording of the conference conclusion suggests a criminal indictment. Since the time of the NIH conference, however, data on the role of cholesterol fractions, the so-called “good (HDL)” and “bad (LDL)” cholesterols and, most recently, the apparent differences in the atherogenicity of different LDL sub-fractions, would seem to have provided some reasonable doubt. What has actually happened is that the nutrition establishment, the lipophobes as Michael Pollan calls them, has extended the indictment to include dietary fat, especially saturated fat, at least as an accessory on the grounds that, as the Illinois Criminal Code puts it, one “before or during the commission of an offense, and with the intent to promote or facilitate such commission, … solicits, aids, abets, agrees or attempts to aid… in the planning or commission of the offense….”

A major strategy in the indictment of saturated fat has been guilt by association.  The American Heart Association (AHA), which had long recommended margarine (the major source of trans-fats), has gone all out in condemning saturated fatty acids by linking them with trans-fats.  The AHA website has a truly deranged cartoon film of the evil brothers: “They’re a charming pair, Sat and Trans.  But that doesn’t mean they make good friends.  Read on to learn how they clog arteries and break hearts — and how to limit your time with them by avoiding the foods they’re in.” While the risk of trans-fats is probably exaggerated — they are a small part of the diet — they have no benefit and nobody wants to defend them; dietary saturated fat, however, is a normal part of the diet, is made in your body and is less important than dietary carbohydrate in determining the saturated fatty acids in the blood.  Guilt by association is a tricky business in courts of law — just having a roommate who sells marijuana can get you into a good deal of trouble — but it takes more than somebody saying that you and the perpetrator make a charming pair.

The failure of the diet-cholesterol-heart hypothesis in clinical trials has been documented by numerous scientific articles and especially in popular books that document the original scientific sources. It is unknown what the reaction of the public is to these books.  However, amazingly, there is only one book I know of that takes the side of the lipophobes and that is Daniel Steinberg’s Cholesterol Wars. The Skeptics vs. the Preponderance of Evidence. A serious book with careful if slightly biased documentation and an uncommon willingness to answer the critics, it is worth reading.  I will try to discuss it in detail in this and future posts.  First, the title indicates a step down from criminal prosecution.  “Preponderance of the evidence” is the standard for conviction in a civil court and is obviously a far weaker criterion.  One has to wonder why it is that the skeptics have the preponderance of the popular publications — if the scientific evidence is there and health agencies are so determined that the public know about it, why are there so few — maybe only this one — rebutting the critics?


In any case, what is Steinberg’s case?  The indictment on page 1 is somewhat different from what one would have thought.

“….the [lipid] hypothesis relates to blood lipids not dietary lipids as the putative directly causative factor. Although diet, especially dietary lipid, is an important determinant of blood lipid levels, many other factors play important roles. Moreover, there is a great deal of variability in the response of individuals to dietary manipulations. Thus, it is essential to distinguish between the indirect “diet-heart” connection and the direct “blood lipid-heart” connection. Failure to make this distinction has been a frequent source of confusion. (his italics)”

What?  Are we really supposed to believe that diet is an incidental part of the lipid hypothesis?  Are we supposed to believe that our cholesterol is just a question of the variability of our response to diet?  Has the message really been that diet is not critical and that heart disease is just the luck of the draw (until we start taking statins)?  This is certainly the source of confusion in my mind.  Of course, by page 5, we are confronted with this:

“In 1966, Paul Leren published his classic five-year study of 412 patients who had had a prior myocardial infarction. He showed that substitution of polyunsaturated fat for the saturated fat-rich butter-cream-venison diet favored by the Norwegians reduced their blood cholesterol by about 17 per cent and kept it down.  The number of secondary coronary events in the treated group was reduced by about one-third and the result was significant at the p < 0.03 level.”

In a future post, I will describe Paul Leren’s classic five-year study which, by 1970, had been followed up to eleven years, and the results will turn out not to be as compelling as described by Steinberg.  For the moment, it is worth considering that, given the strong message from the AHA, from the American Diabetes Association and from the NIH guidelines for Americans, the criterion really should be beyond a reasonable doubt. There shouldn’t be even a single failure like the Framingham study or the Women’s Health Initiative. In fact, the preponderance of the evidence, when you add them all up, isn’t there.