In recent years, Glantz has been a key player in the efforts to persuade the public that secondhand smoke is not merely dangerous but extraordinarily lethal. His written output in this decade has included such papers as 'Even a little secondhand smoke is dangerous' [PDF] - in which it was claimed that just 30 minutes of exposure to smoke could cause a heart attack. He also co-authored the infamous Helena heart attack study, which purported to demonstrate that exposure to smoke caused 40% of heart attacks in one Montana town. He is perhaps the only person on earth who believes that passive smoking causes breast cancer, despite a mountain of evidence to the contrary and despite the fact that active smoking does not cause the disease (1).
Glantz has been accused of twisting data and disregarding crucial facts to promote his extreme anti-smoking agenda (he is unashamedly prohibitionist). It comes as a surprise, then, to find that one of his earliest published papers was an appeal for greater scientific rigour in epidemiological studies.
Written in 1978, and published in the American Heart Association's journal Circulation two years later, 'Biostatistics: How to detect, correct and prevent errors in the medical literature' is one of the least known works in Glantz's extensive canon. It does not get a mention on his lengthy online CV.
There is a reason for that.
The thrust of Stanton Glantz's 1980 Circulation paper was that too many researchers use statistical methods incorrectly and, therefore, many studies produce erroneous results. As a consequence, he wrote, "readers often conclude that statistical analyses are maneuverable to one's needs, meaningless, or too difficult to understand." One of the most striking sentences in the paper comes when he criticises researchers for making basic errors in study design and interpretation. He makes the rather obvious point that flaws in a study's design can result in an association being found between A and B when, in truth, none exists. Such errors in study design, he says, are often so basic as to be inexcusable:
"Ironically, these errors rarely involve sophisticated issues that provoke debate among professional statisticians, but are simple mistakes, such as neglecting to include a control group."
He stated that 44% of studies made the error of omitting a control group. It is interesting that Glantz specifically identified the failure to include a control group as a "simple mistake" since many of his own studies have been criticised for that very reason. Between 1994 and 2007, Glantz co-authored six papers which purported to show that smoking bans did not significantly damage the hospitality industry. Half of them did not include a control group. Nor was there a control group in a study Glantz championed which claimed that the New York smoking ban resulted in 8% fewer heart attacks.
His own study of heart attack admissions (in Helena, Montana) did have the virtue of having a control group but, as I have written previously, he made a very "simple mistake" when he claimed that it was scientifically feasible for 60% of all heart attacks to be caused by secondhand smoke (to say nothing of the error of calculating a 60% drop, which later had to be corrected to 40%).
But the main complaint the Stanton Glantz of 1978 had to make about epidemiological studies involved statistical significance.
The significance of significance
Epidemiologists have wrestled for years with the question of whether the statistics they unearth represent a useful line of enquiry or are the result of chance. To deal with this problem, they have long used a statistical test to distinguish the random from the real, and it is one with which many readers of this website will be familiar: statistical significance.
Statistical significance is not a difficult concept to define and the following quote, from Glantz's Circulation article, succinctly explains why a measure of statistical significance is necessary and how it works. First he explains why a standard of significance is necessary:
"In an experiment, an investigator rarely studies all possible members of the population, but only a small, representative sample. The mean value computed from such a sample is an estimate of the true mean that would be computed if it were possible to observe all members of the population. Because the sample used to compute the mean consists of individuals drawn at random from the population being studied, there is nothing special about this sample or its mean. In particular, had the luck of the draw been different, the investigator would have drawn a sample containing different individuals and computed a mean value."
He then explains how this can be done, using the example of drug testing:
"Traditionally, when the chances of observing the computed test statistic when the intervention has no effect are below 5%, one rejects the working assumption that the drug has no effect. There is, of course, about a 5% chance that this assertion is wrong. This 5% is the p value, or 'significance level'."
In other words, even if the drug being tested has no effect, there will inevitably be a small variation between the two randomly selected groups of people it is tested on. Because of chance, the difference between two random variables is almost never zero. The standard 95% significance test Glantz described is designed to distinguish small and meaningless associations from those that are significant. (In epidemiology, 'significant' does not mean 'substantial' or 'serious' as it does in normal parlance; it simply means that the association is probably not the result of chance*.)
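Glantz's point about chance differences is easy to demonstrate. The sketch below (plain Python; the function name and sample sizes are my own, not Glantz's) simulates many trials of a "drug" with no effect whatsoever and counts how often a two-sided test still comes out significant at the 5% level. By construction, the answer is roughly one time in twenty, which is exactly what the convention implies.

```python
import math
import random

def two_sample_p_value(a, b):
    """Two-sided z-test p-value for a difference in means
    (a reasonable approximation for samples this large)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    # P(|Z| > z) from the standard normal CDF
    return 1 - math.erf(abs(z) / math.sqrt(2))

random.seed(0)
trials, false_positives = 2000, 0
for _ in range(trials):
    # "Treated" and "control" are drawn from the SAME population,
    # so any difference between them is pure chance.
    treated = [random.gauss(0, 1) for _ in range(100)]
    control = [random.gauss(0, 1) for _ in range(100)]
    if two_sample_p_value(treated, control) < 0.05:
        false_positives += 1

print(false_positives / trials)  # close to 0.05
```

A z-test is used here for simplicity; with samples of 100 it is practically indistinguishable from a t-test.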
When Glantz explained statistical significance in his Circulation paper, he did not pretend that there was anything groundbreaking or controversial about its usage, and any professional epidemiologist would have been well aware of the importance of testing for significance. Indeed, nothing in the article was in the least bit radical. Testing for significance, like avoiding "simple errors", was the bare minimum a researcher could do to avoid creating bogus results. (The young Glantz made no great claims for his paper, saying: "This article presents a few basic ideas and rules of thumb.")
And, it seems, the reminder was needed. Glantz bemoaned the fact that "approximately half the articles published in medical journals that use statistical methods use them incorrectly" and recommended that "journal editors should insist that statistical methods be used correctly." In closing, he insisted that "students and research fellows should receive formal training in applied statistics, if only to increase the skepticism with which they approach the literature."
When this budding academic wrote his assault on faulty epidemiology, he was still a fairly obscure mechanical engineer-turned-junior faculty member at the University of California, San Francisco. Although he was, even then, a prominent anti-smoking activist, Glantz had no particular reason to question standard epidemiological practice. The first epidemiological study on passive smoking was still three years away from being published. Anti-smoking activists like Glantz expected it to be only a matter of time before science 'proved' that secondhand smoke was a cause of cancer and other diseases. The first step towards finding this proof would be showing statistically significant associations in epidemiological studies.
But epidemiology failed to do so. The first two studies were encouraging for the activists (Hirayama, 1981 & Trichopoulos, 1981), but Lawrence Garfinkel's large American study put a spanner in the works when he found a relative risk of 1.17 (0.85-1.61) for nonsmoking women married to smokers. The 17% increase in risk was statistically insignificant and neither Garfinkel nor the American Cancer Society (whose data he had used) pretended that the study lent any support to the passive smoking theory.
Over the next ten years a succession of studies showed no statistically significant elevation in risk (Gao (1987), Brownson (1990), Janerich (1990), Garfinkel (1985), Dalager (1986), Kabat (1984), Shimizu (1988), Nyberg (1989), Koo (1987) etc.) and one even showed a statistically significant fall in risk - ie. passive smokers were less likely to contract lung cancer (Wu-Williams (1987)).
It was at this point that the assault on epidemiological standards began. With the great majority of studies failing to show any tangible association between secondhand smoke and lung cancer, the anti-smoking movement began to belittle and disparage the very concept of statistical significance. At the forefront of this new assault was Stanton Glantz.
In 1993, the Environmental Protection Agency was caught playing fast and loose with epidemiological data in its effort to find a link between secondhand smoke and lung cancer. Even after cherry-picking the data, the EPA was only able to show the slimmest of risks by dropping the measure of statistical significance from 95% to 90%. This doubled the EPA's chances of finding a significant result and understandably raised eyebrows but Stanton Glantz was surprisingly relaxed about it. Fifteen years earlier, he had stressed how important the significance test was, but now he described it as "hairsplitting that only professors care about" and said:
"I know that scientifically it's widely used, but there is a strong body of thought that people are too slavishly tied to 95 percent."(2)
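The arithmetic behind the "doubled chances" claim is worth spelling out. A relative risk is declared significantly raised when the lower bound of its confidence interval exceeds 1, and under the null hypothesis pure noise achieves that whenever the estimate lands more than the critical value's worth of standard errors above 1. A back-of-the-envelope check in Python (the critical values below are standard normal quantiles; nothing else is assumed):

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable, via the error function."""
    return 0.5 * (1 - math.erf(z / math.sqrt(2)))

# Standard normal critical values for two-sided 95% and 90% intervals
z95, z90 = 1.959964, 1.644854

# Chance that noise alone lifts the LOWER bound of the interval above 1,
# producing a spurious "significant" increase in risk:
print(round(upper_tail(z95), 3))  # 0.025 with a 95% interval
print(round(upper_tail(z90), 3))  # 0.05 with a 90% interval -- doubled
```

In other words, switching from a 95% to a 90% interval raises the one-sided false-alarm rate from 2.5% to 5%.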
Soon afterwards, Philip Morris began to fund The Advancement of Sound Science Coalition. The organisation aimed to set basic standards for epidemiologists and, in the wake of the EPA report, Philip Morris had an obvious motive for supporting such a body. There was nothing in the Advancement of Sound Science Coalition's guidelines that would have been out of place in Austin Bradford Hill's famous criteria of causation or in Stanton Glantz's article of 1980, including this remark about significance:
"Two-sided hypothesis tests are encouraged. If a one-sided test is employed, this should be noted and the rationale for using it provided. The presentation of confidence intervals for the estimate of risk gives more information than a single point value with an associated p value. Generally, 95% confidence intervals are preferred."
This was a statement of fact. 95% was indeed the standard confidence level in epidemiology. The EPA's use of a lowered 90% interval was almost unheard of and, in his 1980 paper, Stanton Glantz had firmly promoted the 95% confidence interval. But when The Advancement of Sound Science Coalition did the same, Glantz sang a very different tune. He accused the organisation of "attempting to change the scientific standards of proof" and complained that its recommendations "would make it impossible to conclude that secondhand smoke - and thus other environmental toxins - caused diseases." (3) That may have been true, but it was hardly the fault of epidemiology if it was unable to back up Glantz's beliefs.
Faced with a slew of studies that did not meet the test of significance, the anti-smokers preferred to dismiss the significance test itself rather than dismiss the studies. In truth, it was Glantz who was "attempting to change the scientific standards of proof" by abandoning his faith in a crucial criterion; one which could not be met by the null results that continued to appear in studies on secondhand smoke.
Things got worse for the anti-smokers in 2000, when the World Health Organisation's IARC found no statistically significant association between passive smoking and lung cancer. Glantz was incensed and he launched an all-out attack on the 95% confidence interval and, as ever, sought to blame the tobacco industry:
"The [tobacco] industry imposes a one-sided interpretation of confidence intervals, focusing the entire discussion on whether the lower bound of the 95% CI [confidence interval] for a relative risk includes 1. By definition, if the lower bound exceeds 1, then the risk is statistically significantly raised (with p=0.05).
Whether or not there is anything magic about 95%, the true risk is equally likely to be anywhere inside the 95% CI, including values above the point estimate. In environmental and health and safety regulation, it is common to take the health-protective approach of basing public policy on the upper 95% confidence limit...The industry has represented the fact that the increase in risk observed did not reach statistical significance as indicating that the study did not find any increased risk." (4)
This was an extraordinary turn-around. It was the very opposite of what he had written in 1978. Glantz gives an accurate definition of what constitutes a statistically significant risk ("if the lower bound exceeds 1") but then immediately claims that this is a "one-sided interpretation" dreamed up by the tobacco industry. He then suggests that since the true risk falls somewhere between the two confidence limits, it is reasonable to pick any figure between the two. Furthermore, it is apparently acceptable for anti-smoking advocates to select the highest figure within the confidence interval as being the true risk. What is this if not a "one-sided interpretation" of confidence intervals?!
The troubling implications of such thinking can scarcely be overstated. Glantz is explicitly stating that environmental and health legislation should be based on the top end of the confidence interval ie. the highest level of risk, even when there is no statistically significant association to begin with and, therefore, no demonstrated risk. Such an assumption makes it possible to 'prove' anything. Even if an epidemiological study finds a nonsignificant reduction in risk (eg. 0.9 (0.7-1.2)), policy makers will only see the 1.2 upper limit and assume a 20% increase in risk. Nothing studied can ever have a neutral effect; everything is harmful; nothing is safe.
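The convention Glantz himself states ("if the lower bound exceeds 1") can be written down in a few lines. This is a minimal sketch, with a function name of my own invention, of how a confidence interval for a relative risk is conventionally read, applied to the numbers quoted in this article:

```python
def significant(lower, upper):
    """Conventional reading: a relative risk is statistically significant
    only if its confidence interval excludes 1 entirely (a lower bound
    above 1 means a raised risk; an upper bound below 1, a lowered one)."""
    return lower > 1 or upper < 1

# Garfinkel's result: RR 1.17, 95% CI (0.85-1.61) -- the interval spans 1
print(significant(0.85, 1.61))  # False: consistent with no effect

# The hypothetical null result above: RR 0.9, 95% CI (0.7-1.2)
print(significant(0.7, 1.2))    # False: yet an upper-bound-only reading
                                # would report a "20% increase in risk"
```

Reading only the upper limit, as Glantz recommends, discards the `lower > 1` half of the test and with it the entire distinction between signal and noise.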
Such a practice is not so much bad science as anti-science since it requires no evidence before a theory becomes a 'fact'. It gives the upper hand to pessimists, fanatics and hypochondriacs at the expense of science and reason. The power lies with anybody who can finance and successfully promote the research. Zero-evidence epidemiology can effectively manufacture health hazards at will and the precautionary principle requires politicians to legislate, regulate and abolish as if the risk was real and proven.
Glantz's transition from defender of scientific standards to assailant is an extraordinary one. That it came about through his need to 'prove' that secondhand smoke kills offers an explanation, but scant justification. It would be a laughable exaggeration to claim that Glantz is some sort of fallen angel. His Circulation article would be unexceptional were it not for its author's subsequent U-turn, and Glantz has done as much as anyone to discredit epidemiology as a serious science ever since. Since his training was in mechanical engineering rather than epidemiology, a charitable defence of the man's work might be that he simply does not know any better. I would suggest that his Circulation article, written at the dawn of his career, removes that defence. At one time, Glantz clearly did understand the principles of epidemiology and was prepared to defend them. The fact that he has since become a ringleader of junk science and debased epidemiology is all the more striking in the context of this forgotten article.
(1) 'Alcohol, tobacco and breast cancer', British Journal of Cancer, 2002, 87, pp. 1234-1245; doi: 10.1038/sj.bjc.6600596; see also 'Cigarette smoking and breast cancer', Field et al. International Journal of Epidemiology 1992; 21: 842-848.
(2) Quoted in Michael Fumento, 'Is EPA Blowing Its Own Smoke?', Investor's Business Daily, January 28, 1993. Republished by the American Smokers Alliance at www.smokers.org/research/articles/08-epa_blowing_smoke.html
(3) http://academicsenate.ucdavis.edu/forums/SoundScienceAJPH.pdf
(4) http://circ.ahajournals.org/cgi/reprint/116/16/1845.pdf
Incidentally, in his Circulation article, Glantz cites the work of the great epidemiologist Alvan Feinstein. Glantz would later dismiss Feinstein as an "industry consultant" when he cast doubt on the passive smoking theory.
* In reality, the idea that significant results are correct 95% of the time is a fallacy, as can easily be confirmed by reading about the health scares and miracle cures that are reported in every daily newspaper. To quote John Brignell in The Epidemiologists:
"In the comparison of two random variables the correlation coefficient is never zero. The first question to be determined is whether it is far enough from zero to be significant, which in the case of epidemiology means resorting to the one-in-twenty lottery. Even if the correlation is deemed significant, however, that is not sufficient evidence to warrant a claim of causation." (p. 199)
[Originally published at velvetgloveironfist.com]