Applicable not just to medicine

Apr 3 2007

I’m playing catch-up after a delivery logjam for my favorite weekly newspaper, the Economist. While catching up, I bookmarked a story from the end of February entitled “Signs of the times” (subscription req’d). The title was only mildly interesting - it was the subtitle that grabbed me:

Why so much medical research is rot

The opening premise?

PEOPLE born under the astrological sign of Leo are 15% more likely to be admitted to hospital with gastric bleeding than those born under the other 11 signs. Sagittarians are 38% more likely than others to land up there because of a broken arm. Those are the conclusions that many medical researchers would be forced to make from a set of data presented to the American Association for the Advancement of Science by Peter Austin of the Institute for Clinical Evaluative Sciences in Toronto. At least, they would be forced to draw them if they applied the lax statistical methods of their own work to the records of hospital admissions in Ontario, Canada, used by Dr Austin.

Statistics is a subject that I recall a lot of classmates having agonized over. I’m willing to stipulate that doctors are, as a group, noticeably smarter than average. Perhaps I simply hope that to be the truth, but I don’t think so. They’re certainly more educated than average, and in any event, statistics can’t be as hard as, say, the art of medical diagnostics, pharmaceuticals mastery, or remembering which bone is connected to the knee bone.

As I recall, the most important thing that came out of my several university statistics classes was the notion that correlation does not imply causation.

Dr. Austin wasn’t railing against any particular proclaimed result in medicine - he was talking about ignorance of one important tenet of statistical interpretations: complexity matters.

He also wanted to explain why so many health claims that look important when they are first made are not substantiated in later studies.

The confusion arises because each result is tested separately to see how likely, in statistical terms, it was to have happened by chance. If that likelihood is below a certain threshold, typically 5%, then the convention is that an effect is “real”. And that is fine if only one hypothesis is being tested. But if, say, 20 are being tested at the same time, then on average one of them will be accepted as provisionally true, even though it is not.
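The arithmetic behind that "one in 20" is easy to check. Here is a minimal sketch, assuming the tests are independent and that none of the 20 hypotheses reflects a real effect. Neither assumption comes from the article, and the function names are mine, purely for illustration.

```python
def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Average number of true-null hypotheses wrongly accepted as 'real'."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of the tests passes the threshold by luck alone.

    Assumes the tests are independent (an illustrative assumption).
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    n = 20  # the article's example: 20 hypotheses tested at once
    print(f"Expected false positives: {expected_false_positives(n):.2f}")
    print(f"P(at least one false positive): {prob_at_least_one_false_positive(n):.3f}")
```

Under those assumptions, the odds of at least one spurious "finding" among 20 tests are closer to two in three than to one in 20.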

From another presentation given at the same meeting:

Unfortunately, many researchers looking for risk factors for diseases are not aware that they need to modify their statistics when they test multiple hypotheses. The consequence of that mistake, as John Ioannidis of the University of Ioannina School of Medicine, in Greece, explained to the meeting, is that a lot of observational health studies—those that go trawling through databases, rather than relying on controlled experiments—cannot be reproduced by other researchers.

Net result? Observational studies, particularly those that involve backtesting events which have already occurred, are susceptible to problematic statistical bias, specifically because “controlled experiments” aren’t possible after the fact.

Luckily, like so much else in statistics, there’s a way to factor out the “fuzz” generated by backtesting. The root problem is that not all researchers are aware of the need for such factoring-out.
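The article doesn't name the method, so treat what follows as my assumption rather than Dr. Austin's approach: one of the simplest such adjustments is the Bonferroni correction, which divides the significance threshold by the number of hypotheses tested. The p-values below are invented for illustration, one per astrological sign.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return one True/False flag per p-value after a Bonferroni correction.

    Each p-value is compared against alpha divided by the number of tests,
    rather than against alpha itself.
    """
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Made-up p-values for 12 hypothetical tests (NOT data from the study).
p_values = [0.04, 0.001, 0.20, 0.03, 0.50, 0.75,
            0.12, 0.33, 0.61, 0.09, 0.88, 0.45]

naive_hits = sum(p <= 0.05 for p in p_values)
corrected_hits = sum(bonferroni_significant(p_values))

print(f"Significant without correction: {naive_hits}")
print(f"Significant with Bonferroni:    {corrected_hits}")
```

Bonferroni is deliberately conservative; other corrections exist that give up less statistical power, but the principle is the same: the more hypotheses you test, the higher the bar each one has to clear.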

The article concludes:

So, the next time a newspaper headline declares that something is bad for you, read the small print. If the scientists used the wrong statistical method, you may do just as well believing your horoscope.

Truer words were never spoken, and note well that the final reference is to “scientists”, not just to medical scientists. Beware statistical innumeracy!

