Anyone who has played pool knows that you have to call your shots before you make them. This rule is intended to decrease the probability of "getting lucky" by simply hitting the cue ball as hard as you can, expecting that the more it bounces around the table, the more likely it is that one of your many balls will fall through chance alone. Sinking a ball without first calling it is referred to colloquially as "slop" or a "slop shot".
The underlying logic is that you know best which shot you're MOST likely to successfully make, so not only does that increase the prior probability of a skilled versus a lucky shot (especially if it is a complex shot, such as one "off the rail"), but also it effectively reduces the number of chances the cue ball has to sink one of your balls without you losing your turn. It reduces those multiple chances to one single chance.
Likewise, a clinical trialist must focus on one "primary outcome" for two reasons: 1.) because preliminary data (if available), background knowledge, and logic allow him to select the variable with the highest "pre-test probability" of causing the null hypothesis to be rejected, meaning that the post-test probability of the alternative hypothesis is enhanced; and 2.) because it reduces the probability of finding "significant" associations among multiple variables through chance alone. Today I came across a cute little experiment that drives this point home quite well. The abstract can be found here on PubMed: http://www.ncbi.nlm.nih.gov/pubmed/16895820?ordinalpos=4&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_DefaultReportPanel.Pubmed_RVDocSum .
In it, the authors describe "dredging" a Canadian database and looking for correlations between astrological signs and various diagnoses. Significant associations were found between the Leo sign and gastrointestinal hemorrhage, and the Sagittarius sign and humerus fracture. With this "analogy of extremes," as I like to call it, you can clearly see how the failure to define a prospective primary endpoint can lead to statistical slop. (Nobody would have been able to predict a priori that it would be THOSE two diagnoses associated with THOSE two signs!) Failure to PROSPECTIVELY identify ONE primary endpoint led to multiple chances for chance associations. Moreover, because there were no preliminary data upon which to base a primary hypothesis, the prior probability of any given alternative hypothesis is markedly reduced, and thus the posterior probability of the alternative hypothesis remains low IN SPITE OF the statistically significant result.
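The multiplicity problem the authors exploited is easy to quantify. A minimal sketch (the numbers of comparisons are illustrative, not the actual count of sign-diagnosis pairs tested in the study):

```python
# With k independent tests at alpha = 0.05 and every null hypothesis true,
# the chance of at least one spurious "significant" finding grows quickly.
ALPHA = 0.05

for k in (1, 10, 24, 100):
    p_any = 1 - (1 - ALPHA) ** k
    print(f"{k:3d} comparisons -> P(at least one false positive) = {p_any:.2f}")
```

At 24 comparisons the probability of at least one "slop shot" already exceeds 70%, which is exactly why a single prospectively declared primary endpoint matters.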
It is for this very reason that "positive" or significant associations among non-primary endpoint variables in clinical trials are considered "hypothesis generating" rather than hypothesis confirming. Requiring additional studies of these associations as primary endpoints is like telling your slop shot partner in the pool hall "that's great, but I need to see you do that double rail shot again to believe that it's skill rather than luck."
Reproducibility of results is indeed the hallmark of good science.
This is a discussion forum for physicians, researchers, and other healthcare professionals interested in the epistemology of medical knowledge, the limitations of the evidence, how clinical trials evidence is generated, disseminated, and incorporated into clinical practice, how the evidence should optimally be incorporated into practice, and what the value of the evidence is to science, individual patients, and society.
Saturday, March 14, 2009
Tuesday, March 10, 2009
PCI versus CABG - Superiority is in the heart of the angina sufferer
In the current issue of the NEJM, Serruys et al describe the results of a multicenter RCT comparing PCI with CABG for severe coronary artery disease: http://content.nejm.org/cgi/content/full/360/10/961. The trial, which was designed by the [profiteering] makers of drug-coated stents, was a non-inferiority trial intended to show the non-inferiority (NOT the equivalence) of PCI (new treatment) to CABG (standard treatment). Alas, the authors appear to misunderstand the design and reporting of non-inferiority trials, and mistakenly declare CABG as superior to PCI as a result of this study. This error will be the subject of a forthcoming letter to the editor of the NEJM.
The findings of the study can be summarized as follows: compared to PCI, CABG led to a 5.6% reduction in the combined endpoint of death from any cause, stroke, myocardial infarction, or repeat revascularization (P=0.002). The caveats regarding non-inferiority trials notwithstanding, there are other reasons to call into question the interpretation that CABG is superior to PCI, and I will enumerate some of these below.
1.) The study used a ONE-SIDED 95% confidence interval - shame, shame, shame. See: http://jama.ama-assn.org/cgi/content/abstract/295/10/1152 .
2.) Table 1 is conspicuous for the absence of cost data. The post-procedural hospital stay was 6 days longer for CABG than PCI, and the procedural time was twice as long - both highly statistically and clinically significant. I recognize that it would be somewhat specious to provide means for cost because it was a multinational study and there would likely be substantial dispersion of cost among countries, but it seems like neglecting the data altogether is a glaring omission of a very important variable if we are to rationally compare these two procedures.
3.) Numbers needed to treat are mentioned in the text for variables such as death and myocardial infarction that were not individually statistically significant. This is misleading. The significance of the composite endpoint does not allow one to infer that the individual components are significant (they were not) and I don't think it's conventional to report NNTs for non-significant outcomes.
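The arithmetic behind an NNT is trivially simple, which is part of why reporting one for a non-significant component is so seductive. A sketch using the trial's composite figure (the only one for which an NNT is arguably defensible here):

```python
def nnt(arr: float) -> int:
    """Number needed to treat = 1 / absolute risk reduction (ARR as a fraction).
    Only meaningful when the underlying difference is real (i.e., significant)."""
    if arr <= 0:
        raise ValueError("NNT is undefined for a non-positive ARR")
    return round(1.0 / arr)

# 5.6 percentage-point reduction in the composite endpoint reported in the trial:
print(nnt(0.056))  # -> 18
```

An NNT computed from a non-significant component difference is just a point estimate dressed up as a clinical quantity; its confidence interval would include infinity.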
4.) Table 2 lists significant deficiencies and discrepancies in pharmacological medical management at discharge between the groups, which are inadequately explained, as noted by the editorialist.
5.) Table 2 also demonstrates a five-fold increase in amiodarone use and a three-fold increase in warfarin use at discharge among patients in the CABG group. I infer this to represent an increase in the rate of atrial fibrillation in the CABG patients, but because the rates are not reported, I am kept wondering.
6.) Neurocognitive functioning and the incidence of deficits (if measured), known complications of bypass, are not reported.
7.) It is mentioned in the discussion that after consent, more patients randomized to CABG compared to PCI withdrew consent, a tacit admission of the wariness of patients to submit to this more invasive procedure.
In all, what this trial does for me is to remind me to be wary of an overly simplistic interpretation of complex data and a tendency toward dichotomous thinking - superior versus inferior, good versus bad, etc.
One interpretation of the data is that a 3.4-hour bypass surgery and 9 days in the hospital !MIGHT! save you from an extra 1.7-hour PCI and another 3 days in the hospital, on top of your initial commitment of 1.7 hours of PCI and 3 days in the hospital, if you wind up requiring revascularization, the primary [only] driver of the composite endpoint. And in payment for this dubiously useful exchange, you must submit to a ~2% increase in the risk of stroke, have a cracked chest, risk surgical wound infection (the rate of which is also not reported), pay an unknown (but probably large) increased financial cost, accept a probably large increased risk of atrial fibrillation and therefore be discharged on amiodarone and coumadin with their high rates of side effects and drug-drug interactions, all while risking discharge on inadequate pharmacological management.
Looked at from this perspective, one sees that beauty is truly in the eye of the beholder.
Monday, March 9, 2009
Money talks and Chantix (varenicline) walks - the role of financial incentives in inducing healthful behavior
I usually try to keep the posts current, but I missed a WONDERFUL article a few weeks ago in the NEJM, one that is pivotal in its own right, but especially in the context of good decision making about therapeutic choices and opportunity costs.
The article, by Volpp et al, entitled "A Randomized, Controlled Trial of Financial Incentives for Smoking Cessation," can be found here: http://content.nejm.org/cgi/content/abstract/360/7/699
In summary, smokers at a large US company, where a smoking cessation program existed before the research began, were randomized to receive additional information about the program versus the same information plus a financial incentive of up to $750 for successfully stopping smoking. At 9-12 months, the smoking cessation rate was roughly 10 percentage points higher in the financial incentive group (14.7% vs. 5.0%, P<0.001).
In the 2006 JAMA article on varenicline (Chantix) by Gonzales et al (http://jama.ama-assn.org/cgi/reprint/296/1/47.pdf ), the cessation rates at weeks 9-52 were 8.4% for placebo and 21.9% for varenicline, an absolute gain of 13.5%. (Similar results were reported in the study by Jorenby et al: http://jama.ama-assn.org/cgi/content/abstract/296/1/56?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&fulltext=varenicline&searchid=1&FIRSTINDEX=0&resourcetype=HWCIT ) Now, given that this branded pharmaceutical sells for ~$120 for a 30 day supply, and that, based on the article by Tonstad (http://jama.ama-assn.org/cgi/reprint/296/1/64.pdf ), many patients are continued on varenicline for 24 weeks or more, the cost of a course of treatment with the drug is approximately $720, just about the same as the financial incentives used in the index article.
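A crude back-of-envelope comparison (my arithmetic, not the articles'; it ignores the staged payout schedule, relapse, and drug-price variation) arguably makes the incentive look even better, because the $750 is paid out only to successful quitters, while the drug is paid for whether or not the patient quits:

```python
# Quit rates taken from the articles cited above (approximate):
quit_incentive, quit_control = 0.147, 0.050   # Volpp et al., months 9-12
quit_drug, quit_placebo = 0.219, 0.084        # Gonzales et al., weeks 9-52

expected_cost_incentive = 750 * quit_incentive  # paid only on success
expected_cost_drug = 720                        # paid regardless of outcome

print(f"incentive:   ${expected_cost_incentive / (quit_incentive - quit_control):,.0f} per additional quit")
print(f"varenicline: ${expected_cost_drug / (quit_drug - quit_placebo):,.0f} per additional quit")
```

On these rough assumptions, the expected cost per additional quitter is several-fold lower with the incentive, even before any accounting for side effects.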
And all of this begs the question: Is it better to pay $750 for 6 months of treatment with a drug that has [potentially serious] side effects to achieve a ~13.5% absolute gain in smoking cessation, or to pay patients to quit smoking to achieve a ~10% gain without harmful side effects, and in fact with POSITIVE side effects (money to spend on pleasurable alternatives to smoking, or on other necessities)?
The choice is clear to me, and, having failed Chantix, I now consider whether I should offer my brother payment to quit smoking. (I expect to receive a call as soon as he reads this, especially since I haven't mentioned the cotinine tests yet.)
And all of this begs the more important question of why we seek drugs to solve behavioral problems, when good old-fashioned greenbacks will do the trick just fine. Why bother with Meridia and Rimonabant and all the other weight loss drugs when we might be able to pay people to lose weight? (See: http://jama.ama-assn.org/cgi/content/abstract/300/22/2631 .) Perhaps one part of Obama's stimulus bill can allocate funds to additional such experiments, or better yet, to such a social program.
One answer to this question is that the financial incentive to study financial incentives is not as great as the financial incentive to find another profitable pill to treat social ills. (There is after all a "pipeline deficiency" in a number of Big Pharma companies that has led to several mergers and proposed mergers, such as the announcement today of a possible merger of MRK and SGP, two of my personal favorites.) Yet this study sets the stage for more such research. If we are going to pay one way or another, I for one would rather that we be paying people to volitionally change their behavior, rather than paying via third party to reinforce the notion that there is "a pill for everything". As Ben Franklin said, "S/He is the best physician who knows the worthlessness of the most medicines."
Wednesday, March 4, 2009
The Normalization Heuristic: how an untested hypothesis may misguide medical decisions
Here is an article that may be of interest written by two perspicacious young fellows:
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WN2-4VP175C-1&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=0067dfb6094ecc27303ccd6939257200
In this article, we describe how the general clinical hypothesis that "normalizing" abnormal laboratory values and physiological parameters will improve patient outcomes is unreliably accurate, using historical examples such as hormone replacement therapy and the CAST trial to buttress the argument. We further suggest that many ongoing practices that rely on normalizing values should be called into question because the normalization hypothesis is a fragile one. We also operationally define the "normalization heuristic" and describe four general ways in which it can fail clinical decision makers. Lastly, we make suggestions for empirical testing of the existence of this heuristic and caution clinicians and medical educators to be wary of reliance on the normalization hypothesis and the normalization heuristic. This paper is an expansion of the idea of the normalization heuristic that was mentioned previously on this blog.
Tuesday, February 10, 2009
West's estimations of PaO2 on Everest Confirmed - but SaO2 remains an estimation
Recently, Grocott et al published results of an intriguing study in which they drew blood gas samples from climbers near the summit of Everest and analyzed them at one of the high camps with a modified blood gas analyzer. (See: http://content.nejm.org/cgi/content/abstract/360/2/140 ) This is no small feat, and the perhaps shocking results confirm earlier estimations of low arterial oxygen tension derived from samples of exhaled gas. The PaO2 of these climbers is often under 30 mm Hg - a difficult-to-believe number for clinicians accustomed to a danger zone represented by much higher numbers in clinical practice.
As intriguing as the numbers may be, the authors have made a crucial assumption in the estimation of arterial oxygen saturation (SaO2) that leads us to be circumspect about the accuracy of this estimated value. A letter written by me and my colleagues emphasizing several caveats in these estimations was not accepted for publication by the NEJM so I will post it below.
In the article by Grocott et al, an important limitation of using calculated SaO2 values for the estimation of arterial oxygen content is neglected. The equation used for the calculation of SaO2 in the article does not take into account changes in hemoglobin affinity induced by increased 2,3-DPG levels which are known to occur during acclimatization (1;2). Errors resulting from these estimations will be magnified for values of PaO2 on the steep portion of the oxyhemoglobin dissociation curve. The PaO2 values of the subjects studied are on this portion of the curve. Can the authors comment on 2,3-DPG levels in these climbers and how any resulting changes in hemoglobin affinity may have affected calculated values? Were the climbers taking acetazolamide, which has variably been demonstrated to affect the oxygen affinity of hemoglobin (3;4)? Is there any evidence that acclimatization induces increased production of fetal hemoglobin as occurs in some other species (5)? Because of such caveats and possibly other unknown variables, co-oximetry remains the gold standard for determination of arterial oxygen saturation.
Reference List
(1) Wagner PD, Wagner HE, Groves BM, Cymerman A, Houston CS. Hemoglobin P(50) during a simulated ascent of Mt. Everest, Operation Everest II. High Alt Med Biol 2007; 8(1):32-42.
(2) Winslow RM, Samaja M, West JB. Red cell function at extreme altitude on Mount Everest. J Appl Physiol 1984; 56(1):109-116.
(3) Gai X, Taki K, Kato H, Nagaishi H. Regulation of hemoglobin affinity for oxygen by carbonic anhydrase. J Lab Clin Med 2003; 142(6):414-420.
(4) Milles JJ, Chesner IM, Oldfield S, Bradwell AR. Effect of acetazolamide on blood gases and 2,3 DPG during ascent and acclimatization to high altitude. Postgrad Med J 1987; 63(737):183-184.
(5) Reynafarje C, Faura J, Villavicencio D, Curaca A, Reynafarje B, Oyola L et al. Oxygen transport of hemoglobin in high-altitude animals (Camelidae). J Appl Physiol 1975; 38(5):806-810.
Scott K Aberegg, MD, MPH
Leroy Essig, MD
Andrew Twehues, MD
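The steep-slope sensitivity the letter describes is easy to demonstrate with a standard curve approximation. A sketch using the Severinghaus (1979) equation, with an affinity change modeled crudely by rescaling PaO2 (this is illustrative only, and is not necessarily the equation Grocott et al used):

```python
def sao2_severinghaus(pao2: float, p50: float = 26.8) -> float:
    """Estimated SaO2 (%) from PaO2 (mm Hg) via the Severinghaus approximation.
    A shift in hemoglobin-oxygen affinity is modeled crudely by rescaling PaO2
    so that the curve's P50 moves from the standard 26.8 mm Hg."""
    p = pao2 * 26.8 / p50
    return 100.0 / (23400.0 / (p ** 3 + 150.0 * p) + 1.0)

# Near PaO2 = 30 mm Hg -- the steep part of the curve -- a modest P50 shift
# (e.g., from raised 2,3-DPG) moves the estimated saturation substantially:
for p50 in (24.0, 26.8, 30.0):
    print(f"P50 = {p50:4.1f} mm Hg -> estimated SaO2 = {sao2_severinghaus(30.0, p50):.1f}%")
```

A P50 change of only a few mm Hg moves the estimated SaO2 by roughly 15 percentage points at this PaO2, which is the crux of the letter's objection to a calculated (rather than measured) saturation.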
Monday, February 9, 2009
More Data on Dexmedetomidine - moving in the direction of a new standard
A follow-up study of dexmedetomidine (see previous blog: http://medicalevidence.blogspot.com/2007/12/dexmedetomidine-new-standard-in_16.html )
was published in last week's JAMA (http://jama.ama-assn.org/cgi/content/abstract/301/5/489 ) and hopefully serves as a prelude to future studies of this agent and indeed all studies in critical care. The recent study addresses one of my biggest concerns of the previous one, namely that routine interruptions of sedatives were not employed.
Ironically, it may be this difference between the studies that led to the failure to show a difference in the primary endpoint in the current study. The primary endpoint, namely the percentage of time within the target RASS, was presumably chosen not only on the basis of its pragmatic utility, but also because it was one of the most statistically significant differences found among secondary analyses in the previous study (percent of patients with a RASS [Richmond Agitation and Sedation Scale] score within one point of the physician goal; 67% versus 55%, p=0.008). It is possible, and I reason likely, that daily interruptions in the current study obliterated that difference which was found in the previous study.
But that failure does not undermine the usefulness of the current study which showed that sedation comparable to routinely used benzos can be achieved with dexmed, probably with less delirium, and perhaps with shorter time on the ventilator and fewer infections. What I would like to see now, and what is probably in the works, is a study of dexmed which shows shorter time on the ventilator and/or reductions in nosocomial infections as primary study endpoints.
But to show endpoints such as these, we are going to need to carefully standardize our ascertainment of infections (difficult, to say the least) and also to standardize our approach to discontinuation of mechanical ventilation. In regard to the latter, I propose that we challenge some of our current assumptions about liberation from mechanical ventilation - namely, that a patient must be fully awake and following commands prior to extubation. I think that a status quo bias is at work here. We have many a patient with delirium in the ICU who is not already intubated, and we do not intubate them for delirium alone. Why, then, should we fail to extubate a patient in whom all indicators show resolution of critical illness, but who remains delirious? Is it possible that this is the main player in the causal pathway between sedation and extubation, and perhaps even nosocomial infections and mortality? (The protocols or lack thereof for assessing extubation readiness were not described in the current study, unless I missed them.) It would certainly be interesting, and perhaps mandatory, to know the extubation practices in the centers involved in this study, especially if we are going to take great stock in this secondary outcome of this study.
Another thing I am interested in knowing is what PATIENT experiences are like in each group - whether there is greater recall or other differences in psychological outcomes between patients who receive different sedatives during their ICU experience.
I hope this study and others like it serve as a wake-up call to the critical care research community, which has heretofore been brainwashed into thinking that a therapy is only worthwhile if it improves mortality - a feat that is difficult to achieve not only because it is often unrealistic and because absurd power calculations and delta inflation run rampant in trial design, but also because of limitations in funding and logistical difficulties. This group has shown us repeatedly that useful therapies in critical care need not be predicated upon a mortality reduction. It's past time to start buying some stock in shorter times on the blower and in the ICU.
was published in last week's JAMA (http://jama.ama-assn.org/cgi/content/abstract/301/5/489 ) and hopefully serves as a prelude to future studies of this agent and indeed all studies in critical care. The recent study addresses one of my biggest concerns of the previous one, namely that routine interruptions of sedatives were not employed.
Ironically, it may be this difference between the studies that led to the failure to show a difference in the primary endpoint in the current study. The primary endpoint, namely the percentage of time within the target RASS, was presumably chosen not only on the basis of its pragmatic utility, but also because it was one of the most statistically significant differences found among secondary analyses in the previous study (percent of patients with a RASS [Richmond Agitation and Sedation Scale] score within one point of the physician goal; 67% versus 55%, p=0.008). It is possible, and I reason likely, that daily interruptions in the current study obliterated that difference which was found in the previous study.
But that failure does not undermine the usefulness of the current study which showed that sedation comparable to routinely used benzos can be achieved with dexmed, probably with less delirium, and perhaps with shorter time on the ventilator and fewer infections. What I would like to see now, and what is probably in the works, is a study of dexmed which shows shorter time on the ventilator and/or reductions in nosocomial infections as primary study endpoints.
But to show endpoints such as these, we are going to need to carefully standardize our ascertainment of infections (difficult, to say the least) and also to standardize our approach to discontinuation of mechanical ventilation. In regard to the latter, I propose that we challenge some of our current assumptions about liberation from mechanical ventilation - namely, that a patient must be fully awake and following commands prior to extubation. I think that a status quo bias is at work here. We have many a patient with delirium in the ICU who is not intubated, and we do not intubate for delirium alone. Why, then, should we fail to extubate a patient in whom all indicators show resolution of critical illness, but who remains delirious? Is it possible that this is the main player in the causal pathway between sedation and extubation, and perhaps even nosocomial infections and mortality? (The protocols, or lack thereof, for assessing extubation readiness were not described in the current study, unless I missed them.) It would certainly be interesting, and perhaps mandatory, to know the extubation practices in the centers involved in this study, especially if we are going to take great stock in its secondary outcomes.
Another thing I am interested in knowing is what PATIENT experiences are like in each group - whether there is greater recall or other differences in psychological outcomes between patients who receive different sedatives during their ICU experience.
I hope this study and others like it serve as a wake-up call to the critical care research community, which has heretofore been brainwashed into thinking that a therapy is only worthwhile if it improves mortality - a feat that is difficult to achieve not only because it is often unrealistic and because absurd power calculations and delta inflation run rampant in trial design, but also because of limitations in funding and logistical difficulties. This group has shown us repeatedly that useful therapies in critical care need not be predicated upon a mortality reduction. It's past time to start buying some stock in shorter times on the blower and in the ICU.
Tuesday, February 3, 2009
Cost: The neglected adverse event / side effect in trials of for-profit pharmaceuticals and devices
Amid press releases and conference calls today pertaining to the release of data on two trials of the investigational drug pirfenidone, one analyst's comments struck me as subtly profound. She was saying that in spite of conflicting data on and uncertainty about the efficacy of the drug (in the Capacity 1 and Capacity 2 trials, percent change in FVC [forced vital CAPACITY] at 72 weeks was the primary endpoint), IPF is a deadly and desperate disease for which no effective treatments exist (save for lung transplantation, if you're willing to consider that an effective treatment), and therefore any treatment with any positive effect, however small and however uncertain, should be given ample consideration, especially given the relative absence of side effects of pirfenidone in the Capacity trials.
And I thought to myself - "absence of side effects?" Here we have a drug that, over the course of about 1.5 years, reduces the decline in FVC by about 60 cc (maybe - it did so in Capacity 2 but not in Capacity 1) but does not improve survival, dyspnea scores, or any other outcome that a patient may notice. So, I'm picturing an IPF patient traipsing off to the drugstore to purchase pirfenidone, a branded drug, and I'm imagining that the cash outlay might be perceived by such a patient as an adverse event, a side effect of sorts of using this questionably effective drug to prevent an intangible decline in FVC. The analyst's argument distilled to: "why not, there's no drawback to using it and there are no alternatives" - but this utterly neglected the financial hardships that many patients endure when taking expensive branded drugs, and ignored alternative ways that patients with IPF might spend their income to benefit their health or general well-being.
This perspective is even more poignant when we consider the cases of "me-too" drugs that add marginally to the benefits or side effect profiles of existing drugs, and which are often approved on the basis of a trial comparing them to placebo rather than existing generic alternatives. One of the last posts on this blog detailed the case of Aliskiren, and I am reminded of the trial of Tiotropium published in the NEJM in October, among many other entire classes of drugs such as the proton pump inhibitors, antidepressants, antihistamines, inhaled corticosteroids, antihypertensives, ACE-inhibitors for congestive heart failure, and the list goes on.
Given today's economy, soaring healthcare costs, and the increasing financial burdens and co-pays shouldered by patients - especially those of limited economic means or those hit hardest by economic downturns - we can no longer afford (pun intended) to ignore the financial costs of "me too" medications as adverse events of the use of these drugs when cheaper alternatives exist.
In terms of trial design, we should demand that new agents be compared to existing alternatives when those exist, and we need to develop a system for evaluating the results of a trial that does not neglect the full range of adverse effects experienced by patients as a result of using expensive branded drugs. Marginally "better" is not better at all if it costs ridiculously more, and the uncertainty relating to the efficacy of a drug must be accounted for in terms of its value to patients, especially when costly.
Thursday, December 11, 2008
Monday, June 2, 2008
"Off-Label Promotion By Proxy": How the NEJM and Clinical Trials are Used as an Advertising Apparatus. The Case of Aliskiren
In the print edition of the June 5th NEJM (mine is delivered almost a week early sometimes), readers will see on the front cover the lead article entitled "Aliskiren Combined with Losartan in Type 2 Diabetes and Nephropathy," and on the back cover a sexy advertisement for Tekturna (aliskiren), an approved antihypertensive agent, which features "mercury-man", presumably a former hypertensive patient metamorphosed into an elite biker (and perhaps superhero) by the marvels of Tekturna. Readers who lay the open journal face down may experience the same irony I did when they see the front cover lead article juxtaposed with the back cover advertisement.
The article describes how aliskiren, in the AVOID trial, reduced the mean urinary albumin-to-creatinine ratio as compared to losartan alone. There are several important issues here. First, if one wants to use a combination of agents, s/he can use losartan with a generic ACE-inhibitor (ACEi). A more equitable comparison would have pitted aliskiren plus losartan against [generic] ACEi plus losartan. The authors would retort of course that losartan alone is a recommended agent for the condition studied, but that is circular logic. If we were not in need of more aggressive therapy for this condition, then why study aliskiren in combination for it at all? If you want to study a new aggressive combination, it seems only fair to compare it to existing aggressive combinations.
Which brings me to another point - should aliskiren be used for ANY condition? No, it should not. It is a novel [branded] agent which is expensive, with which there is little experience, and which may have important side effects that are only discovered after it is used in hundreds of thousands of patients; more importantly, alternative effective agents exist which are far less costly and with which far more experience exists. A common error in decision making occurs when decision makers focus only on the agent or choice at hand and fail to consider the range of alternatives and how the agent under consideration fares when compared to them. Because aliskiren has only been shown to lower blood pressure, a surrogate endpoint, we would do well to stick with cheaper agents for which there are more data and more experience, and reserve use of aliskiren until a study shows a long-term mortality or meaningful morbidity benefit.
But here's the real rub - after an agent like this gets approved for one [common] indication (hypertension), the company is free to conduct little studies like this one, for off-label uses, to promote its sale [albeit indirectly] in patients who do not need it for its approved indication (BP lowering). And what better advertising to bring the drug into the sight of physicians than a lead article in the NEJM, with a complementary full-page advertisement on the back cover? This subversive "off-label promotion by proxy", effected by the study of off-label indications for which FDA approval may or may not ultimately be sought, has the immediate benefit of misleading the unwary, who may increase prescriptions of this medication based on this study (which they are free to do) without considering the full range of alternatives.
My colleague David Majure, MD, MPH has commented to me about an equally insidious but perhaps more nefarious practice that he noticed may be occurring while attending this year's meeting of the American College of Cardiology (ACC). There, "investigators" and corporate cronies are free to present massive amounts of non-peer-reviewed data in the form of abstracts and presentations, much of which will not and should not withstand peer review, or will be relegated to the obscurity of low-tier journals (where it likely belongs). But eager audience members, lulled by the presumed credibility of data presented at a national meeting of [company-paid] experts, will likely never see the data in peer-reviewed form, and instead will carry away the messages as delivered: "Drug XYZ was found to do 1-2-3 to [surrogate endpoint/off-label indication] ABC." By sheer force of repetition alone, these abstracts and presentations serve to increase product recognition and, almost certainly, prescriptions. Whether the impact of the data presented is meaningful or not need not be considered, and probably cannot be considered without seeing the data in printed form - and this is just fine - for sales, that is.
(Added 6/11/2008: this pre-publication changing of practice patterns has been described before - see http://jama.ama-assn.org/cgi/content/abstract/284/22/2886 .)
The novel mechanism of action of this agent and the scientific validity of the AVOID trial notwithstanding, the editorialship of the NEJM and the medical community should realize that science and the profit motive are inextricably interwoven when companies study these branded agents. The full page advertisement on the back cover of this week's NEJM was just too much for me.
Thursday, May 29, 2008
Prucalopride: When Delivery is so Suspicious that the Entire Message Seems Corrupt
In this week's NEJM (http://content.nejm.org/cgi/content/short/358/22/2344), Camilleri (of the Mayo Clinic) and comrades from Movetis (a pharmaceutical company) report the results of a study of Prucalopride, a prokinetic agent, for the treatment of chronic constipation. What is striking about this study is not the agent's relation to Cisapride (Propulsid, an agent removed from the market a number of years ago because of QTc prolongation and associated cardiac risk) but rather the fact that this study was completed nearly a decade ago and was only just now published. Such a delay is certainly worthy of concern, as astutely pointed out by an editorialist (http://content.nejm.org/cgi/content/short/358/22/2402).
A colleague and I recently pointed out the unethical practice of withholding the results of negative trials from the scientific community (see http://ccmjournal.com/pt/re/ccm/fulltext.00003246-200803000-00060.htm;jsessionid=L2bQSl9ygT9BzlZq81qlnJGfyfG2Jh2f2qQvP4XTp0YqMQ1ZD3T1!195308708!181195628!8091!-1?index=1&database=ppvovft&results=1&count=10&searchid=2&nav=search#P6), but the Prucalopride trial takes the cake. Here, positive results were either intentionally withheld from that community or by happenstance were omitted from publication, delaying further study of this agent (if it is indeed even warranted) and undermining the altruistic basis of subjects' participation in the trial, which, ostensibly, was to advance science (unless they participated for financial incentives, which I might argue [as others already have] should be disclosed in the reporting of a trial - see http://content.nejm.org/cgi/content/extract/358/22/2316.)
I will leave it to other bloggers and commentators to speculate whether the profit or other motives were the impetus behind this delay and whether medical ghostwriting was in any way involved in the publication of this article. Suffice it to say that there are certain irregularities in the way a trial is reported (in addition to those with which it was conducted) that should give us pause. Prucalopride has now shown itself to be worthy of a bright spotlight of intense scrutiny.
Wednesday, May 14, 2008
Troponin Predicts Outcome in Heart Failure - But So What?
In today's NEJM, Peacock and others (http://content.nejm.org/cgi/content/short/358/20/2117 ) report that cardiac troponin is STATISTICALLY associated with hospital mortality in patients with acute decompensated heart failure, and that this association is independent of other predictive variables. Let us assume that we take the results for granted, and that this is an internally and externally valid study with little discernible bias.
In the first paragraph of the discussion, the authors state that "These results suggest that measurement of troponin adds important prognostic information to the initial evaluation of patients with acute decompensated heart failure and should be considered as part of an early assessment of risk."
Really?
The mortality in patients in the lowest quartile of troponin I was 2.0% and that in the highest quartile was 5.3%. If we make the common mistake of comparing things on a relative scale, this is an impressive difference - in excess of a twofold increase in mortality. But that is like saying that I saved 60% off the price of a Hershey Kiss which costs 5 cents - so I saved 3 cents! As we approach zero, smaller and smaller absolute differences can appear impressive on a relative scale. But health should not be appraised that way. If you are "buying" something, be it health or some other commodity, you shouldn't care about your relative return on your investment, only the absolute return. You have, after all, only some absolute quantity of money. Charlie (from the Chocolate Factory) may find 3 cents to be meaningful, but we are not here talking about getting a 3% reduction in mortality - we are talking about predicting for Charlie whether he will have to pay $0.05 for his kiss or $0.02 for it, and even if our prediction is accurate, we do not know how to help him get the discounted kiss - he's either lucky or he's not.
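To make the relative-versus-absolute distinction concrete, here is the arithmetic on the study's own quartile figures (a quick sketch; the "patients per extra death" framing is my own illustrative add-on, not a quantity the authors report):

```python
low, high = 0.020, 0.053   # in-hospital mortality: lowest vs highest troponin I quartile

relative_risk = high / low           # ~2.65-fold: sounds dramatic
absolute_difference = high - low     # ~3.3 percentage points: what a patient actually faces

# Equivalent framing: how many high-quartile patients correspond to one extra death
patients_per_extra_death = 1 / absolute_difference

print(round(relative_risk, 2), round(absolute_difference, 3), round(patients_per_extra_death))
```

A 2.65-fold relative increase collapses to roughly one expected extra death per 30 patients - and the test changes management for none of them.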
Imagine that you are a patient hospitalized for acute decompensated heart failure. Does it matter to you if your physician comes to you carrying triumphantly the results of your troponin I test and informs you that because it is low, your mortality is 2% rather than 5%? It probably matters very little. It matters even less if your physician is not going to do anything differently given the results of that test. Two percent, 5 percent, it doesn't matter if it can't be changed.
Then there is the cost associated with this test. My hospital charges on the order of $200 for it. Consider the opportunity costs - what else could that $200 be spent on, in the care of American patients, and perhaps even more importantly in the context of global health and economics? Also consider the value of the test to a patient who might have to pay out of pocket for it - is it worth $200 to discriminate within an in-hospital mortality range of 2-5%?
This study, while meticulously conducted and reported, underscores the important distinction between statistical significance and clinical significance. With the aid of a ginormous patient registry, the authors clearly demonstrated a statistically significant result that is at least mildly interesting from a biological perspective (is it interesting that a failing heart spills some of its contents into the bloodstream and that they can be detected by a highly sensitive assay?). But the clinical significance of the findings appears to be negligible, and I worry that this report will encourage the already rampant mindless use of this expensive test, which, outside the context of clinical pre-test probabilities, already serves to misguide care and run up healthcare costs in a substantial proportion of the patients in whom it is ordered.
Tuesday, April 29, 2008
Blood Substitutes Doomed by Natanson's Meta-Analysis in JAMA
When the ARMY gives up on something, you should be on the lookout for red flags. (Pentagon types beholden to powerful contractors and highly susceptible to sunk cost bias still haven't given up on that whirligig of death called the Osprey, have they?) But the ARMY's abandonment of a blood substitute that it found was killing animals in tests was apparently no deterrent to Northfield Laboratories, Inc., makers of "Polyheme", or to Wall Street investors in this and other companies working on products with a similar goal - to cook up an extracellular hemoglobin-based molecule that can be used in lieu of red blood cell transfusions in trauma patients and others.
Charles Natanson, an intramural researcher at the NIH, and co-workers performed a meta-analysis of trials of blood substitutes, published online today on the JAMA website: http://jama.ama-assn.org/cgi/content/full/299.19.jrv80007 . They found that these trials, which were powered for outcomes such as the number of transfusions required or other "surrogate-sounding" endpoints, demonstrate when combined that these products were killing subjects in these studies. The relative risk of death for study subjects receiving one of these products was 1.3, and the risk of myocardial infarction increased more than threefold. The robustness of these findings is enhanced by the biological plausibility of the result - cell-free hemoglobin is known to eat up nitric oxide from the endothelium of the vasculature, leading to substantial vasoconstriction and other untoward downstream outcomes.
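For the curious, the mechanics of pooling relative risks across trials can be sketched with a Mantel-Haenszel fixed-effect estimator. The trial counts below are invented for illustration only - they are not the data from the Natanson meta-analysis - though they happen to pool to a similar estimate.

```python
def mh_pooled_rr(trials):
    """Mantel-Haenszel fixed-effect pooled relative risk.

    Each trial is a tuple: (events_treated, n_treated, events_control, n_control).
    """
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in trials)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in trials)
    return numerator / denominator

# Hypothetical trials: deaths and arm sizes for substitute vs control
trials = [(13, 100, 10, 100), (20, 150, 15, 150), (9, 60, 7, 60)]
print(round(mh_pooled_rr(trials), 2))
```

Each small trial here is individually unimpressive; the pooled estimate is what exposes the signal.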
In addition to my penchant for cautionary tales, my interest in this study has to do with study design. We are beholden to "conventional" study design expectations, where a p-value is a p-value, they're all 0.05, and an outcome is an outcome - whether it be bleeding, or pain, or death, we don't differentially value them. But if you're studying a novel agent, looking for some crummy surrogate endpoint like number of transfusions, and your alpha threshold for that is 0.05, then the alpha threshold for death should be higher (say 0.25 or so), especially if you're underpowered to detect excess deaths. That kind of arrangement would imply that we value death at least 5 times higher than transfusion (I for one would rather have 500 or more transfusions than be dead, but that's a topic for another discussion).
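The power argument can be quantified with a standard normal-approximation calculation. In the hypothetical scenario below (invented numbers: 25% versus 33% mortality, 100 patients per arm), relaxing alpha from 0.05 to 0.25 roughly doubles the chance that an underpowered trial flags the excess deaths.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p1, p2, n_per_arm, alpha):
    """Approximate power of a two-sided two-proportion z-test (normal approximation)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)   # critical value for this alpha
    se = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z_effect = abs(p1 - p2) / se         # standardized effect size
    return nd.cdf(z_effect - z_crit)

for alpha in (0.05, 0.25):
    print(alpha, round(power_two_proportions(0.33, 0.25, 100, alpha), 2))
```

In this made-up scenario, power to detect the harm goes from roughly one in four to better than even - which is the whole case for asymmetric alpha thresholds when the outcome is death rather than a transfusion count.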
Fortunately for any patients who may have been recruited to participate in such studies, Natanson et al undertook this perspicacious meta-analysis, and the editorialists extended their recommendations for more transparency in data dissemination to argue, almost, that future trials of blood substitutes should be banned or boycotted. Even if the medical community does not have the gumption to go that far, prospective participants in such studies and their surrogates can at least perform a simple Google search, and from now on the Natanson article is liable to be on the first page.
Charles Natanson, an intramural researcher at the NIH and co-workers performed a meta-analysis of trials of blood substitutes which was published on-line today at the JAMA website: http://jama.ama-assn.org/cgi/content/full/299.19.jrv80007 . They found that these trials, which were powered for outcomes such as number of transfusions provided or other "surrogate-sounding" endpoints, when combined demonstrate that these products were killing subjects in these studies. The relative risk of death for study subjects receiving one of these products was 1.3 and the risk of myocardial infarction increased more than threefold. The robustness of these findings is enhanced by the biological plausibility of the result - cell-free hemoglobin is known to eat up nitric oxide from the endothelium of the vasculature leading to substantial vasoconstriction and other untoward downstream outcomes.
In addition to my penchant for cautionary tales, my interest in this study has to do with study design. We are beholden to "conventional" study design expectations where a p-value is a p-value (they're all 0.05) and an outcome is an outcome; whether it be bleeding, pain, or death, we don't differentially value them. But if you're studying a novel agent, looking for some crummy surrogate endpoint like number of transfusions, and your alpha threshold for that is 0.05, then the alpha threshold for death should be higher (say 0.25 or so), especially if you're underpowered to detect excess deaths. That kind of arrangement would imply that we value death at least 5 times more than transfusion (I for one would rather have 500 or more transfusions than be dead, but that's a topic for another discussion).
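A rough sketch of what a looser alpha for the safety endpoint buys you, using illustrative numbers (not from any actual trial): in a trial sized for a surrogate endpoint, the power to flag a true excess in deaths can roughly double when alpha for harm is relaxed from 0.05 to 0.25.

```python
from statistics import NormalDist

nd = NormalDist()

def power_two_prop(p_ctrl, p_tx, n_per_arm, alpha):
    """Approximate power of a two-sided two-proportion z-test
    (normal approximation; the negligible opposite tail is ignored)."""
    se = ((p_ctrl * (1 - p_ctrl) + p_tx * (1 - p_tx)) / n_per_arm) ** 0.5
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(abs(p_tx - p_ctrl) / se - z_crit)

# Hypothetical small trial: 300 patients/arm, 10% control-arm mortality,
# and a true harm raising treatment-arm mortality to 13% (RR 1.3).
for alpha in (0.05, 0.25):
    print(f"alpha {alpha}: power to flag excess deaths = "
          f"{power_two_prop(0.10, 0.13, 300, alpha):.2f}")
```

Under these assumed numbers the power to detect the excess mortality rises from roughly one in five to roughly one in two, which is the practical content of valuing death more than transfusion.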
Fortunately for any patients that may have been recruited to participate in such studies, Natanson et al undertook this perspicacious meta-analysis, and the editorialists extended their recommendations for more transparency in data dissemination to argue, almost, that future trials of blood substitutes should be banned or boycotted. Even if the medical community does not have the gumption to go that far, prospective participants in such studies and their surrogates can at least perform a simple Google search, and from now on the Natanson article is liable to be on the first page.
Thursday, April 3, 2008
A [now open] letter to Congress re: Proposed Medicare Reimbursement Cuts
I'm not sure that this is entirely in keeping with the theme of this blog, but I will justify it by saying that the health of the healthcare system is of vital interest to all stakeholders, including researchers with an interest in clinical trials. The following letter was sent via the ACCP to my senators and congressmen regarding the Medicare reimbursement cuts that are to be instituted in July of this year. We were solicited by the professional society to be a voice in opposition to the cuts....
Dear Sir or Madam-
Physicians' income, especially that of primary care providers, upon whom patients rely most heavily for basic care, has been falling in real dollars (not keeping pace with inflation) for years, and the newest cuts will markedly exacerbate the disconcerting trend that already exists.
Most physicians do not begin earning income in earnest until they are over 30 years old, a significant lost opportunity due to prolonged schooling and training. This compounds the problem of substantial debt burden that recent graduates must bear. Economically speaking, medicine, especially in the essential primary care fields, is no longer an attractive option for many talented students and graduates. From a job satisfaction standpoint, medicine has also become far less attractive due to regulatory burdens, paperwork, lack of adequate time to spend with patients, and fragmentation of care.
This fragmentation of care is in fact at least partially driven by Medicare cuts. When reimbursement to an individual physician is cut, s/he simply "farms out" parcels of the overall care of the patient to other physicians and specialists. This "multi-consultism" militates against any cost savings that might be achieved by cuts in reimbursement to individual physicians. Perhaps more alarming is the fact that care delivery is less comprehensive, more fragmented, and less satisfying to patients and physicians alike, the latter of whom may feel a "diffusion of responsibility" regarding patients' care when multi-consultism is employed. Reduced reimbursements also likely drive the excess ordering of laboratory tests and radiographic scans, both in situations where the physician stands to profit from the testing and when s/he does not, in the latter case because the care is being "farmed out" not to another physician, but to the laboratory or radiology suite. The result is that Medicare "cuts" may paradoxically increase overall net healthcare expenditures. Physicians are already squeezed as much as they can tolerate being squeezed. Further cuts are certain to backfire in this and myriad other ways.
A perhaps more insidious, invidious, and pernicious result of reimbursement cuts is that it is driving the talent out of medicine, especially primary care medicine. Were it not for the veritable reimbursement shelter that I experience as a practitioner at an academic medical center, I would surely not be practicing medicine in any traditional way - it is simply not worth it. Hence we have the genesis and proliferation of "concierge practices" where the wealthy pay an annual fee for entry into the practice, only cash payments are accepted, and more traditional service from your physician (e.g., time to talk to him/her in an unhurried fashion) can be expected by patients. Hence we have, as pointed out in a recent New York Times article (http://query.nytimes.com/gst/fullpage.html?res=9C05E6D81E38F93AA25750C0A96E9C8B63&scp=2&sq=dermatology&st=nyt ), the siphoning of medical student talent into specialties such as dermatology and plastic surgery because the lifestyle is more attractive and reimbursement is not a problem since the "clientele" (aka patients) are affluent and pay out-of-pocket. Hence we have the brightest physicians, such as my colleague and close friend Michael C., MD, leaving medicine altogether to work on Wall Street in the financial sector. All of these disturbing trends threaten to undermine what was heretofore (and hopefully still is) one of the best healthcare systems on the planet. I, for one, will not recommend a career in primary care to any medical student who seeks my advice, and to undergraduates contemplating a career in medicine I say "enter medicine only if it is the only field you can envision yourself ever being happy in."
The system is broken, and we as a country cannot endure and thrive if our healthcare expenditures continue to eat up 15+% of our GDP. But cutting the payments to physicians, the very workforce upon which delivery of any care depends, is no longer a viable solution to the problem. Other excesses in the system, such as use of branded pharmaceuticals (e.g., Vytorin or Zetia) when generic alternatives are as good or better, use of expensive scans of unproven benefit (screening CT scans for lung cancer) when cheaper alternatives exist (stopping smoking), excessive and wasteful laboratory testing of unproven benefit (daily laboratory testing on hospital inpatients, wanton ordering of chest x-rays, head CTs, EKGs, and echocardiograms), use of therapeutic modalities of very high cost and modest benefit (AICDs, lung transplantation, back surgery, knee arthroscopy, coated stents, etc.), and provision of futile care at the end of life are better targets for cost savings, limitations on which are far less likely to compromise delivery of generally effective and affordable care for the average citizen.
I urge Congress to consider the far-reaching but difficult to measure consequences of further reimbursement cuts before an entire generation of the most talented physicians and potential physicians determines that the financial, lifestyle, and opportunity costs of practicing medicine, especially primary care medicine, are just too much to bear.
Regards,
Scott K Aberegg, MD, MPH, FCCP
Assistant Professor of Medicine
The Ohio State University College of Medicine
Columbus,
Monday, March 31, 2008
MRK and SGP: Ye shall know the truth, and the truth shall send thy stock spiralling
Apparently, the editors of the NEJM read my blog (even though they stop short of calling for a BOYCOTT):
"...it seems prudent to encourage patients whose LDL cholesterol levels remain elevated despite treatment with an optimal dose of a statin to redouble their efforts at dietary control and regular exercise. Niacin, fibrates, and resins should be considered when diet, exercise, and a statin have failed to achieve the target, with ezetimibe reserved for patients who cannot tolerate these agents."
Sound familiar?
The full editorial can be seen here: http://content.nejm.org/cgi/content/full/NEJMe0801842
along with a number of other early-release articles on the subject.
The ENHANCE data are also published online (http://content.nejm.org/cgi/content/full/NEJMoa0800742), and there's really nothing new to report. We have known the results for several months now. What is new is doctors' nascent realization that they have been misled and bamboozled by the drug reps, Big Pharma, and their own long-standing, almost religious faith in surrogate endpoints (see post below). It's like you have to go through the Kubler-Ross stages of grief (denial, anger, bargaining, depression, and finally acceptance) before you give up on your long-cherished notions of reality. Amazingly, the ACC, whose statement just months ago appeared to be intended to allay patients' and doctors' concerns about Zetia, has done an apparent 180 on the drug: "Go back to Statins" is now their sanctimonious advice: http://acc08.acc.org/SSN/Documents/ACC%20D3LR.pdf
I was briefly at the ACC meeting yesterday (although I did not pay the $900 fee to attend the sessions). The Big Pharma marketing presence was nauseating. A Lipitor-emblazoned bag was given to each attendee. A Lipitor lanyard was used to hold your $900 ID badge. Buses throughout the city were emblazoned with Vytorin and Lipitor advertisements among others. Banners covered numerous floors of the facades of city buildings. The "exhibition hall," a veritable orgy of marketing madness, was jam-packed with the most aesthetically pleasing and best-dressed salespersons with their catchy displays and gimmicks. (Did you know that abnormal "vascular reactivity" is a heretofore unknown "risk factor"? And that with a little $20,000 device that they can sell you (which you can probably bill for), you can detect said abnormal vascular reactivity?) The distinction between science, reality, and marketing is blurred imperceptibly if it exists at all. Physicians from all over the world greedily scramble for free pens, bags, and umbrellas (as if they cannot afford such trinkets on their own - or was it the $900 entrance fee that squeezed their pocketbooks?) They can be seen throughout the convention center with armloads of Big Pharma propaganda packages: flashlights, laser pointers, free orange juice and the like.
I just wonder: How much money does the ACC receive from these companies (for this Big Pharma Bonanza and for other "activities")? If my guess is in the right ballpark, I don't have to wonder why the ACC hedged in its statement when the ENHANCE data were released in January. I think I might have an idea.
Labels:
ACC,
alternatives,
big pharma,
boycott,
ezetimibe,
marketing,
Merck,
MRK,
opportunity costs,
profiteering,
Schering-Plough,
SGP,
Simvastatin,
Surrogate End-points,
Vytorin,
zetia
Wednesday, March 26, 2008
Torcetrapib, Ezetimibe, and Surrogate Endpoints: A Cautionary Tale
In today's JAMA, (http://jama.ama-assn.org/cgi/content/extract/299/12/1474 ), Drs. Psaty and Lumley echo many of the points on this blog over the last six months about ezetimibe and torcetrapib (see posts below.) While they stop short of calling for a boycott of ezetimibe, and their perspective on torcetrapib is tempered by Pfizer's early conduct of a trial with hard outcomes as endpoints, their commentary underscores the dangers inherent in the long-standing practice of almost unquestioningly accepting the validity of "established" surrogate endpoints. The time to re-examine the validity of surrogate endpoints such as glycemic control, LDL, HDL, and blood pressure is now. Agents that act on these targets are abundant and widely accessible, so potential delays in the discovery and approval of new agents are no longer a suitable argument for a "fast track" approval process. We have seen time and again that such "fast tracks" are nothing more than expressways to profit for Big Pharma.
Psaty and Lumley's chronology of the ezetimibe studies is itself timely and should refocus needed scrutiny on the role of pharmaceutical companies as the stewards of scientific data and discovery.
Monday, March 10, 2008
The CORTICUS Trial: Power, Priors, Effect Size, and Regression to the Mean
The long-awaited results of another trial in critical care were published in a recent NEJM: (http://content.nejm.org/cgi/content/abstract/358/2/111). Similar to the VASST trial, the CORTICUS trial was "negative" and low dose hydrocortisone was not demonstrated to be of benefit in septic shock. However, unlike VASST, in this case the results are in conflict with an earlier trial (Annane et al, JAMA, 2002) that generated much fanfare and which, like the Van den Berghe trial of the Leuven Insulin Protocol, led to widespread [and premature?] adoption of a new therapy. The CORTICUS trial, like VASST, raises some interesting questions about the design and interpretation of trials in which short-term mortality is the primary endpoint.
Jean Louis Vincent presented data at this year's SCCM conference with which he estimated that only about 10% of trials in critical care are "positive" in the traditional sense. (I was not present, so this is basically hearsay to me - if anyone has a reference, please e-mail me or post it as a comment.) Nonetheless, this estimate rings true. Few are the trials that show a statistically significant benefit in the primary outcome, fewer still are trials that confirm the results of those trials. This raises the question: are critical care trials chronically, consistently, and woefully underpowered? And if so, why? I will offer some speculative answers to these and other questions below.
The CORTICUS trial, like VASST, was powered to detect a 10% absolute reduction in mortality. Is this reasonable? At all? What is the precedent for a 10% ARR in mortality in a critical care trial? There are few, if any. No large, well-conducted trials in critical care that I am aware of have ever demonstrated (least of all consistently) a 10% or greater reduction in mortality of any therapy, at least not as a PRIMARY PROSPECTIVE OUTCOME. Low tidal volume ventilation? 9% ARR. Drotrecogin-alfa? 7% ARR in all-comers. So I therefore argue that all trials powered to detect an ARR in mortality of greater than 7-9% are ridiculously optimistic, and that the trials that spring from this unfortunate optimism are woefully underpowered. It is no wonder that, as JLV purportedly demonstrated, so few trials in critical care are "positive". The prior probability is exceedingly low that ANY therapy will deliver a 10% mortality reduction. The designers of these trials are, by force of pragmatic constraints, rolling the proverbial trial dice and hoping for a lucky throw.
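The arithmetic behind this optimism is easy to check with the standard normal-approximation sample-size formula for two proportions (a sketch; exact numbers vary slightly with the method used). Assuming a 50% control-arm mortality, halving the detectable ARR roughly quadruples the required enrollment, which is exactly the pressure that pushes designers toward rosy effect-size assumptions:

```python
import math
from statistics import NormalDist

nd = NormalDist()

def n_per_arm(p_ctrl, arr, alpha=0.05, power=0.80):
    """Patients per arm to detect an absolute risk reduction `arr`
    with a two-sided two-proportion z-test (normal approximation)."""
    p_tx = p_ctrl - arr
    p_bar = (p_ctrl + p_tx) / 2
    za, zb = nd.inv_cdf(1 - alpha / 2), nd.inv_cdf(power)
    num = (za * math.sqrt(2 * p_bar * (1 - p_bar))
           + zb * math.sqrt(p_ctrl * (1 - p_ctrl) + p_tx * (1 - p_tx))) ** 2
    return math.ceil(num / arr**2)

# Illustrative: 50% baseline mortality, 80% power, alpha 0.05
for arr in (0.10, 0.07, 0.05):
    print(f"target ARR {arr:.0%}: {n_per_arm(0.50, arr)} patients per arm")
```

Under these assumptions, a 10% ARR needs roughly 400 patients per arm while a 5% ARR needs over 1,500, so a trial honestly powered for a realistic effect is several times larger (and costlier) than the trials we usually see.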
Then there is the issue of regression to the mean. Suppose that the alternative hypothesis (Ha) is indeed correct in the generic sense that hydrocortisone does beneficially influence mortality in septic shock. Suppose further that we interpret Annane's 2002 data as consistent with Ha. In that study, a subgroup of patients (non-responders) demonstrated a 10% ARR in mortality. We should be excused for getting excited about this result, because after all, we all want the best for our patients and eagerly await the next breakthrough, and the higher the ARR, the greater the clinical relevance, whatever the level of statistical significance. But shouldn't we regard that estimate with skepticism since no therapy in critical care has ever shown such a large reduction in mortality as a primary outcome? Since no such result has ever been consistently repeated? Even if we believe in Ha, shouldn't we also believe that the 10% Annane estimate will regress to the mean on repeated trials?
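A quick simulation makes this concrete (the parameters are illustrative, not the actual Annane trial). If the true ARR is 5% but we only pay attention to trials that cross the significance threshold, the effect estimates we see in the literature are systematically inflated and are bound to regress on replication:

```python
import random
from statistics import NormalDist

random.seed(0)
Z_CRIT = NormalDist().inv_cdf(0.975)
P_CTRL, TRUE_ARR, N = 0.55, 0.05, 300   # control mortality, true effect, n/arm

def one_trial():
    """Simulate one two-arm trial; return (observed ARR, 'significant'?)."""
    d_ctrl = sum(random.random() < P_CTRL for _ in range(N))
    d_tx = sum(random.random() < P_CTRL - TRUE_ARR for _ in range(N))
    arr = (d_ctrl - d_tx) / N
    p_c, p_t = d_ctrl / N, d_tx / N
    se = ((p_c * (1 - p_c) + p_t * (1 - p_t)) / N) ** 0.5
    return arr, arr / se > Z_CRIT

results = [one_trial() for _ in range(5000)]
mean_all = sum(arr for arr, _ in results) / len(results)
winners = [arr for arr, sig in results if sig]
mean_sig = sum(winners) / len(winners)
print(f"true ARR 5%; mean observed ARR over all trials: {mean_all:.3f}")
print(f"mean observed ARR among 'significant' trials:   {mean_sig:.3f}")
```

In this sketch the average observed ARR across all simulated trials sits near the true 5%, but the average among the "significant" ones is roughly double that: an underpowered trial can only reach significance by overestimating the effect, so a 10% estimate from a subgroup should be expected to shrink on repetition.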
It may be true that therapies with robust data behind them become standard practice, equipoise dissipates, and the trials of the best therapies are not repeated - so they don't have a chance to be confirmed. But the knife cuts both ways - if you're repeating a trial, it stands to reason that the data in support of the therapy are not that robust and you should become more circumspect in your estimates of effect size - taking prior probability and regression to the mean into account.
Perhaps we need to rethink how we're powering these trials. And funding agencies need to rethink the budgets they will allow for them. It makes little sense to spend so much time, money, and effort on underpowered trials, and to establish the track record that we have established where the majority of our trials are "failures" in the traditional sense and which all include a sentence in the discussion section about how the current results should influence the design of subsequent trials. Wouldn't it make more sense to conduct one trial that is so robust that nobody would dare repeat it in the future? One that would provide a definitive answer to the question that is posed? Is there something to be learned from the long arc of the steroid pendulum that has been swinging with frustrating periodicity for many a decade now?
This is not to denigrate in any way the quality of the trials that I have referred to. The Canadian group in particular as well as other groups (ARDSnet) are to be commended for producing work of the highest quality which is of great value to patients, medicine, and science. But in keeping with the advancement of knowledge, I propose that we take home another message from these trials - we may be chronically underpowering them.
Sunday, March 9, 2008
The "Trials" and Tribulations of Powering Clinical Trials: The Case of Vasopressin for Septic Shock (VASST trial)
Nobody likes "negative" trials. They're just not as exciting as positive ones. (Unless they show that something we're doing is harmful or that a product that Wall Street has bet heavily on is headed for the chopping block.) But "negative" studies such as an excellent one by Russell et al in a recent NEJM (http://content.nejm.org/cgi/content/abstract/358/9/877 ) show just how difficult it is to design and conduct a "positive" trial. The [non-significant] trends in this study, namely that vasopressin is superior to norepinephrine in reducing mortality in septic shock, were demonstrated in a study that had an a priori power of 80%, based on an expected mortality rate of 60% in the placebo group. Actual power in the study was significantly less, not because, as the authors appear to suggest, the observed placebo mortality was only ~39%, but rather because the observed effect size fell markedly short of the anticipated 10% absolute mortality reduction. In order to demonstrate a mortality benefit of the magnitude observed in the current trial (~4% ARR) at a significance level of 0.05, approximately 1500 patients in each study arm would be required. This is a formidable number for a critical care trial.
Thus, this trial illustrates the trials and tribulations of designing and conducting studies with 28-day mortality as an endpoint. These studies not only entail substantial costs, but pose challenges for patient recruitment, necessitating the participation of numerous centers in a multinational setting. The coordination of such a trial is daunting. It is understandable, therefore, that investigators may wish to be optimistic about the ARR they can expect from a therapy, as this will reduce sample size and increase the chances that the trial will be successfully completed in a reasonable period of time. (For an example of a study which had to be terminated early because of these challenges, see Mancebo et al : http://ajrccm.atsjournals.org/cgi/content/short/200503-353OCv1 ). Powering the trial at 80% instead of 90% likewise represents a compromise between optimism for the efficacy of the therapy and optimism for patient recruitment. In essence, the lower the power, the more "faith" there must be that a roll of the trial dice will confirm the alternative hypothesis.
These realities played out [disappointingly] in the Russell trial. The p-value for the ARR (28-day mortality, the primary endpoint) associated with vasopressin compared with norepinephrine was 0.26, while that associated with 90-day mortality (a prespecified secondary endpoint) was 0.11. Thus, this trial is considered negative by conventional standards.
But its being "negative" does not mean that it is not of value to practitioners. This large experience with vasopressin demonstrates both that this agent is a viable alternative to norepinephrine for raising the MAP into the goal range, and that we can expect no significant excess of adverse events when this agent is used. In my opinion, this study represents a veritable "green light" for continued use of this agent, as I agree with the editorialist (http://content.nejm.org/cgi/reprint/358/9/954.pdf ) that many patients with sepsis who are not responding to norepinephrine respond dramatically and favorably to this agent.
Perhaps there is a larger lesson here. Should we use the same p-value threshold for a study of, say, an antidepressant as we do for a study of an agent that may reduce mortality? In the former case, we may be most concerned about exposure of patients to a costly drug with no benefits and potential side effects; in essence, we are most concerned with a Type I Error, i.e., concluding that there is a benefit when in reality there is none. Perhaps in a trial of a potentially life-saving therapy (e.g., vasopressin) we should be most concerned with a Type II Error, i.e., concluding that there is no real benefit when in reality one exists. If that were the case (and you may have guessed by now that I believe it should be), we could address this concern by loosening the standard of statistical significance for studies of potentially life-saving agents.
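The trade-off argued for here can be made concrete: holding sample size and effect fixed, relaxing alpha mechanically raises power, i.e., shrinks the Type II error. A rough sketch using the normal approximation and illustrative numbers in the vicinity of the Russell trial (roughly 400 patients per arm, 39% vs. 35% mortality; these are my assumptions, not the published analysis):

```python
from math import sqrt
from statistics import NormalDist

def approx_power(p1, p2, n, alpha):
    """Approximate power of a two-sided two-proportion z-test
    with n patients per arm."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n)          # SE under H0
    se_alt = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)   # SE under H1
    return nd.cdf((abs(p1 - p2) - z_a * se_null) / se_alt)

# Power climbs as the significance threshold is loosened
for alpha in (0.05, 0.10, 0.20):
    print(alpha, round(approx_power(0.39, 0.35, 400, alpha), 2))
```

At these numbers the trial is badly underpowered no matter the threshold, but the direction of the effect is the point: a looser alpha buys a meaningfully smaller chance of missing a real mortality benefit.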
The standards notwithstanding, critical care practitioners are free to interpret these data as they see fit. And one reasonable conclusion is that, the trends being in the right direction and the side effect profile being acceptable, we should be using more vasopressin in septic shock.
Or, we must make a tough call: do we want to invest the resources in a much larger trial to determine if vasopressin can be shown to reduce mortality at the conventional p-value level of 0.05? Can we recruit the necessary 3000 patients?
Monday, February 18, 2008
Wake Up and Smell the Coffee then Wake Up Your Patients and Let Them Breathe
A few weeks ago in The Lancet (http://www.thelancet.com/journals/lancet/article/PIIS0140673608601051/abstract ) appeared a wonderful and pragmatic article demonstrating the effectiveness of combining Spontaneous Awakening Trials (SATs) with Spontaneous Breathing Trials (SBTs) in the ICU. This strategy of "Wake Up and Breathe" was highly effective and critical care practitioners everywhere should take heed. Unfortunately, a penchant for the status quo and a heaping of omission bias led the editorialist to foment skepticism for the adoption of "wake up and breathe." My colleagues and I find this skepticism unfounded and frankly dangerous in that it risks reducing the adoption of this highly effective strategy, the benefits of which clearly exceed the risks. Our letter to the editor of The Lancet was not accepted for publication, but is posted below. Hats off to Girard and Ely and co-workers for this vital addition to our literature. Now if we can just convince critical care practitioners to wake up and wake their patients up...
We read with interest the report of the ABC Trial which demonstrated the efficacy of combining daily awakenings with breathing trials in mechanically ventilated patients (1). In the accompanying editorial, Dr. Brochard contends that “sedation is also an important component of care for critically ill patients,” but he cites only one review article to support this claim (2). It is unknown if the disturbing weaning experiences he references are related to sedation restriction. What is known with reasonable certainty is that oversedation is common and associated with increased delirium (1;3), neuroimaging (4), long-term psychiatric consequences (5) and mortality (1) and longer duration of mechanical ventilation and ICU stay (1;4). The ABC trial adds to this body of literature by demonstrating the practical utility of combining daily sedation cessation with spontaneous breathing trials. That 92% of spontaneous awakening trials were well-tolerated strongly suggests that patients were no worse without sedation, and is consistent with prior studies showing that oversedation, not undersedation, is the principal risk to critically ill patients.
For too long, we suffered from a dearth of quality evidence to guide the care of the critically ill. Now that such evidence is available, we would be wise to act upon it. We therefore disagree with Dr. Brochard’s statement that “more information is needed to show that the approach is feasible and safe.” Each year that we await another confirmatory trial is another year that our patients suffer prolonged mechanical ventilation and illness due to our fondness for the status quo.
Reference List
1. Girard TD, Kress JP, Fuchs BD, Thomason JW, Schweickert WD, Pun BT et al. Efficacy and safety of a paired sedation and ventilator weaning protocol for mechanically ventilated patients in intensive care (Awakening and Breathing Controlled trial): a randomised controlled trial. Lancet 2008;371(9607):126-34.
2. Brochard L. Sedation in the intensive-care unit: good and bad? Lancet 2008;371(9607):95-7.
3. Pandharipande P, Shintani A, Peterson J, Pun BT, Wilkinson GR, Dittus RS et al. Lorazepam is an independent risk factor for transitioning to delirium in intensive care unit patients. Anesthesiology 2006;104(1):21-6.
4. Kress JP, Pohlman AS, O'Connor MF, Hall JB. Daily interruption of sedative infusions in critically ill patients undergoing mechanical ventilation. N Engl J Med 2000;342(20):1471-7.
5. Kress JP, Gehlbach B, Lacy M, Pliskin N, Pohlman AS, Hall JB. The long-term psychological effects of daily sedative interruption on critically ill patients. Am J Respir Crit Care Med 2003;168(12):1457-61.
James M. O'Brien, MD, MSc
Naeem A. Ali, MD
Scott K. Aberegg, MD, MPH
Friday, January 18, 2008
Have the Peddlers of Antidepressants (Big Pharma) been Successful in Suppressing Negative Trial Results?
Yes, according to this article in yesterday's NEJM:
http://content.nejm.org/cgi/content/short/358/3/252
Talk about publication bias. According to Erick H. Turner, M.D. and coauthors, the selective publication of only "positive" trials, along with the positive spin given to studies that the FDA considered "negative", leads to a 32% increase in the apparent efficacy of antidepressant drugs, on average (range 11-69%). Once again, profit trumps science, safety, and patient and public health.
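One way to see how selective publication inflates apparent efficacy is a toy Monte Carlo: simulate many small trials of a drug with a modest true effect, "publish" only those reaching p < 0.05, and compare the published average to the truth. All of the numbers below are illustrative assumptions, not Turner's data:

```python
import random
import statistics

random.seed(0)
true_effect = 0.3        # assumed true standardized effect size
n_per_arm = 50
se = (2 / n_per_arm) ** 0.5   # approximate SE of a standardized difference

all_effects, published = [], []
for _ in range(2000):
    # each trial observes the true effect plus sampling noise
    observed = random.gauss(true_effect, se)
    all_effects.append(observed)
    # "positive" trials (z > 1.96, i.e., p < .05 one-sided here) get published
    if observed / se > 1.96:
        published.append(observed)

print(statistics.mean(all_effects))   # close to the true 0.30
print(statistics.mean(published))     # noticeably inflated
```

The average across all simulated trials recovers the truth; the average across the "published" subset overshoots it substantially, because the significance filter preferentially selects trials whose sampling noise happened to point upward.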
What can we do about it? First, reduce by one third the effect size of any antidepressant results you see in an industry-sponsored clinical trial. Next, carefully consider whether whatever [probably modest] effect remains is worth the side effects (e.g., increase in suicide), cost, and nuisance of the drug. Third, prescribe generic agents. Fourth, don't allow pharmaceutical reps to speak with you about new products. Fifth, consider alternative treatments.
I am reminded of a curious occurrence relating to a drug that I think is definitely worth the cost, side effects, and nuisance associated with it: Chantix (varenicline) - Pfizer's smoking cessation drug. In JAMA in July 2006,
(http://jama.ama-assn.org/content/vol296/issue1/index.dtl)
two nearly identical articles described two nearly identical studies, which shared many of the same authors. What was the intent of this? Why not conduct one larger study? Was the intent to diversify the risk of failure and allow for selective publication of positive results? I'm very interested in any information anyone can provide about this curious arrangement, which appears to be without precedent. Please leave your comments below.
Wednesday, January 16, 2008
Is the American College of Cardiology (ACC) Complicit with Big Pharma (Merck and Schering-Plough)?
I am reminded of the surgical attending at Johns Hopkins who (perhaps apocryphally) would scream at the intern in the morning when a patient had done poorly overnight:
"Whose side are you on, the patient or the disease?!"
And I ask the ACC, "Whose side are you on? Patients' or Big Pharma's"?!
Their main web page now links to this statement:
http://www.acc.org/enhance.htm
which states:
"The American College of Cardiology recommends that major clinical decisions not be made on the basis of the ENHANCE study alone."
Is it really a "major clinical decision" to stop Zetia/Vytorin and take a statin or niacin until the very efficacy of Vytorin and Zetia is sorted out?
I'd say that the ACC and its members need to reconsider the rather major decision they made to support the use of this drug based on surrogate end-points. As with torcetrapib, they're going to have to learn the hard way to take their lashings.
The statement goes on to say:
"The ACC recommends that Zetia remain a reasonable option for patients who are currently on a high dose statin but have not reached their goal. The ACC also notes that Zetia is a reasonable option for patients who cannot tolerate statins or can only tolerate a low dose statin."
Well, that sounds reasonable, but do you really think that the majority of patients on Zetia or Vytorin are on it because they failed a reasonable attempt to use a high-dose statin? We all know that after it hits the market, a drug is generally prescribed willy-nilly rather than carefully and rationally in selected patient groups. The ACC should know this. Hence my suspicion of complicity.
It bothers me how entrenched the use of these drugs becomes and how hard it is to remove patients from them. This is a serious status quo bias that I have commented upon before. Few physicians would start a patient on Avandia now, but the ones who are already on it get left on it. The same is true, it appears, with Vytorin, and the ACC is contributing to the status quo bias!
The mandate for physicians and the FDA is to prescribe only SAFE and EFFECTIVE therapies. The burden of scientific proof is on the drug companies who are driven by profit to promote these drugs. It is up to physicians to stand between patients' health and the companies' profits and prescribe only drugs that have met the burden of proof. And Vytorin and Zetia have not. Boycott them until the proof is in. Use alternative agents in the meantime.
"Whose side are you on, the patient or the disease?!"
And I ask the ACC, "Whose side are you on? Patients' or Big Pharma's"?!
Their main web page now links to this statement:
http://www.acc.org/enhance.htm
which states:
"The American College of Cardiology recommends that major clinical decisions not be made on the basis of the ENHANCE study alone."
Is it really a "major clinical decision" to stop Zetia/Vytorin and take a statin or niacin until the very efficacy of Vytorin and Zetia is sorted out?
I'd say that the ACC and its members need to reconsider the rather major decision they made to support the use of this drug based on surrogate end-points. As with torcetrapib, they're going to have to learn the hard way to take their lashings.
The statement goes on to say:
"The ACC recommends that Zetia remain a reasonable option for patients who are currently on a high dose statin but have not reached their goal. The ACC also notes that Zetia is a reasonable option for patients who cannot tolerate statins or can only tolerate a low dose statin."
Well, that sounds reasonable, but do you really thing that the majority of patients on Zetia or Vytorin are on it because they failed a reasonable attempt to use a high-dose statin? We all know that after it hits the market, a drug is generally prescribed willy-nilly rather than carefully and rationally in selected patient groups. The ACC should know this. Hence my suspicion of complicity.
It bothers me how entrenched the use of these drugs becomes and how hard it is to remove patients from them. This is a serious status quo bias that I have commented upon before. Few physicians would start a patient on Avandia now, but the ones who are already on it get left on it. The same is true, it appears, with Vytorin, and the ACC is contributing to the status quo bias!
The mandate for physicians and the FDA is to prescribe only SAFE and EFFECTIVE therapies. The burden of scientific proof is on the drug companies who are driven by profit to promote these drugs. It is up to physicians to stand between patients' health and the companies' profits and prescribe only drugs that have met the burden of proof. And Vytorin and Zetia have not. Boycott them until the proof is in. Use alternative agents in the meantime.
Monday, January 14, 2008
Vytorin Vanquished: ENHANCE comes out from hiding and the call for a BOYCOTT gathers steam
Merck (MRK) and Schering-Plough (SGP) have finally released the ENHANCE data and they do not look good, neither for MRK and SGP stock prices (both of which were significantly down in pre-market trading!) nor for patients who have been taking ezetimibe as either Vytorin or Zetia - all the trends were in the WRONG DIRECTION (i.e., they favored simvastatin alone) IN SPITE OF robust additional LDL lowering with ezetimibe:
http://biz.yahoo.com/bw/080114/20080114005752.html?.v=1
This further evidence that this drug does not influence important clinical outcomes should renew interest in BOYCOTTING ezetimibe in all forms until/unless improved clinically meaningful outcomes can be shown with this agent in properly designed and conducted trials with sufficient transparency.
(Of course, I recognize that Vytorin is Vanquished only in this battle, that others will follow, and that MRK and SGP will say that these "real trials" are still being conducted, as if they funded ENHANCE for no good reason, and as if, had it been a positive study, they would have downplayed its significance and emphasized cautious interpretation of the results, pending completion of the "real trials".)
Friday, January 11, 2008
Jumping the Gun with Intensive Insulin Therapy (Leuven Protocol): How ICUs across the nation rushed to adopt a therapy which is probably not beneficial
In this week's NEJM is an anxiously awaited article about intensive insulin therapy in severely septic patients in the ICU: http://content.nejm.org/cgi/content/short/358/2/125
This business of intensive insulin therapy began with the publication in the NEJM in 2001 of an article by Van den Berghe et al showing a remarkable reduction in mortality in surgical (mostly post-cardiac surgery) patients in a surgical ICU. Thereafter ensued a veritable rush to adopt this therapy, and ICUs around the country began developing and adopting protocols for "tight glucose control" in spite of concerns about the study and its generalization to non-surgical patients who were not being fed concentrated intravenous dextrose solutions....
I vividly remember one of the ICU attendings at Johns Hopkins Hospital, Dr. Jimmy Sylvester, telling us on the morning after the study was published that "this is either the largest break-through in intensive care therapeutics ever, or these data are faked". In essence what he was saying was that the prior expectation of a result as dramatic as demonstrated by Van den Berghe was very low (see also: http://jama.ama-assn.org/cgi/content/full/294/17/2203 ). That lower prior probability should have reduced our confidence in the results, and made us more skeptical of the population studied and the dextrose solutions and the applicability to non-surgical patients. Well then, why didn't it?
My colleague James M. O'Brien, Jr, MD, MSc and I have one possible explanation for the rush to adopt "intensive insulin therapy" which we have dubbed the "normalization heuristic." Physicians, for all of our training, remain quite simple-minded. We like simple, feel-good fixes. Normalizing lab values is one of those things. "Make it normal and all will be fine," goes the mantra. We like to make the potassium normal. We like to make the hematocrit normal. We love it when the magnesium increases after we order 4 grams. It's satisfying. And it feels like we're doing some measurable, that is, easily measurable good in the world. Normalizing blood sugars fits that paradigm and makes us feel like we are doing good. But are we?
We have learned the hard way over the years that many of the things we do to "normalize" some surface value causes an undercurrent of harm for patients. Think suppression of PVCs (the CAST trial: http://content.nejm.org/cgi/content/abstract/321/6/406 ) or transfusion thresholds (the TRICC study and others: http://content.nejm.org/cgi/content/abstract/340/6/409 ). Oftentimes, it seems, our efforts to "normalize" some value cause more harm than good. It is quite possible that this is also the case with intensive insulin, and that the "feel-good" appeal of making the blood sugars normal in the short term in acutely ill patients propelled us to early adoption of this probably useless and possibly harmful therapy.
(For an analogous contemporaneous story about biology's complexity and defiance of simple explanations and logic such as the normalization heuristic, see: http://www.nytimes.com/2008/01/11/science/11ants.html?scp=1&sq=aiding+trees+can+kill+them.)
The interesting thing regarding the "adoption" of Van den Berghe's "Leuven protocol" is that no ICU I have worked in really adopted that protocol. They softened it up, making the target blood sugar not 80-120, but rather 120-150 or some similar range. So what was adopted was "moderate insulin therapy" rather than intensive insulin therapy. Nobody has any idea whether such an approach is beneficial. It's certainly safer. But it has substantial costs in terms of nursing care that might be better spent on other interventions (think sedation interruption).
(I have been highly critical of Van den Berghe's medical insulin article, and my criticisms were published in the NEJM. I was delighted that she did not even address me/them in "the authors reply" - apparently I left her speechless: http://content.nejm.org/cgi/content/extract/354/19/2069.)
So this wonderful article in the current issue by Brunkhorst et al is music to my ears. Rather than hiding the high rate of severe hypoglycemia in supplementary material, Brunkhorst et al come right out and say that not only was the Leuven protocol NOT associated with reduced mortality, but also that it had a very high incidence of severe side effects and that their DSMB had the wherewithal to stop the study early for safety reasons. Bravo!
We await the results of several other ongoing studies of intensive insulin therapy before we nail shut the coffin on the Leuven protocol. Meanwhile, I hope that someone somewhere will design a protocol to test the "moderate insulin therapy" that we rushed to adopt after the first Van den Berghe article as a half-hearted hedge/compromise between our "normalization heuristic", our tempered enthusiasm for the Leuven protocol, our desire to "do something" for critically ill patients, and our fear of causing side effects that result directly from our interventions (omission bias: http://mdm.sagepub.com/cgi/content/abstract/26/6/575 ).
Thank you, Brunkhorst et al, for testing the Leuven protocol in an even-handed and scientifically unbiased manner and for reporting your results candidly.
Merck and Schering's "Secret Vytorin Panel"
Matthew Herper continues to lead the pack in investigating the shenanigans perpetrated by Schering-Plough (SGP) and Merck (MRK) in the conduct of the ENHANCE trial of Vytorin. I reiterate that it is my strong but measured and carefully considered opinion that this drug, and ezetimibe alone, should NOT be used in ANY patients until definitive evidence of efficacy is available, since more proven alternatives exist. Patients' health should not be risked on this drug. There is too much uncertainty, and too many proven alternatives.
Matthew's article describes more intriguing aspects of this saga, and I couldn't state it any better than he, so I invite you to read his article:
http://www.forbes.com/2008/01/10/merck-schering-vytorin-biz-cx_mh_0111enhance.html?partner=email
Monday, December 31, 2007
Is there any place for the f/Vt (the Yang-Tobin index) in today's ICU?
Recently, Tobin and Jubran performed an eloquent re-analysis of the value of “weaning predictor tests” (Crit Care Med 2008; 36: 1). In an accompanying editorial, Dr. MacIntyre does an admirable job of disputing some of the authors’ contentions (Crit Care Med 2008; 36: 329). However, I suspect space limited his ability to defend the recommendations of the guidelines for weaning and discontinuation of ventilatory support.
Tobin and Jubran provide a whirlwind tour of the limitations of meta-analyses. These are important considerations when interpreting the reported results. However, lost in this critique of the presumed approach used by the McMaster group and the joint task force are the limitations of the studies on which the meta-analysis was based. Tobin and Jubran provide excellent points about systematic error limiting the internal validity of the study but, interestingly, do not apply such criticism to studies of f/Vt.
For the sake of simplicity, I will limit my discussion to the original report by Yang and Tobin (New Eng J Med 1991; 324: 1445). As a reminder, this was a single-center study which included 36 subjects in a “training set” and 64 subjects in a “prospective-validation set.” Patients were selected if “clinically stable and whose primary physicians considered them ready to undergo a weaning trial.” The authors then looked at a variety of measures to determine predictors of those “able to sustain spontaneous breathing for ≥24 hours after extubation” versus those “in whom mechanical ventilation was reinstituted at the end of a weaning trial or who required reintubation within 24 hours.” While not explicitly stated, it looks as if all the patients who failed a weaning trial had mechanical ventilation reinstituted, rather than failing extubation.
In determining the internal validity of a diagnostic test, one important consideration is that all subjects have the “gold standard” test performed. In the case of “weaning predictor tests,” what is the condition we are trying to diagnose? I would argue that it is the presence of respiratory failure requiring continued ventilatory support. Alternatively, it is the absence of respiratory failure requiring continued ventilatory support. I would also argue that the gold standard test for this condition is the ability to sustain spontaneous breathing. Therefore, to determine the test performance of “weaning predictor tests,” all subjects should undergo a trial of spontaneous breathing regardless of the results of the predictor tests. Now, some may argue that the spontaneous breathing trial (SBT) is, indeed, this gold standard. I would agree if SBTs were perfectly accurate in predicting removal of the endotracheal tube and spontaneous breathing without a ventilator in the room. This is, however, not the case. So, truly, what Yang and Tobin are assessing is the ability of these tests to predict performance on a subsequent SBT.
Dr. MacIntyre argues that “since the outcome of an SBT is the outcome of interest, why waste time and effort trying to predict it?” I would agree with this within limits. Existing literature supports the use of very basic parameters (e.g., hemodynamic stability, low levels of FiO2 and PEEP, etc.) as screens for identifying patients for whom an SBT is appropriate. Uncertain is the value of daily SBTs in all patients, regardless of passing this screen or not. One might hypothesize that simplifying this step even further might provide incremental benefit. Yang and Tobin, however, must consider a failure on an SBT to have deleterious effects. They consider “weaning trials undertaken either prematurely or after an unnecessary delay…equally deleterious to a patient’s health.” There is no reference supporting this assertion. Recent data suggest that inclusion of “weaning predictor tests” does not save patients from harm by avoiding SBTs destined to fail (Tanios et al. Crit Care Med, 2006; 34: 2530). On the contrary, inclusion of the f/Vt as the first of Tobin and Jubran’s “three diagnostic tests in sequence” resulted in prolonged weaning time.
Tobin and Jubran also note the importance of prior probabilities in determining the performance of a diagnostic test. In the original study, Yang and Tobin selected patients who “were considered ready to undergo a weaning trial” by their primary physicians. Other studies have reported that such clinician assessments are very unreliable, with predictive values marginally better than a coin flip (Stroetz et al, Am J Resp Crit Care Med, 1995; 152: 1034). Perhaps the clinicians whose patients were in this study are better than this. However, we are not provided with strict clinical rules which define this candidacy for weaning but can probably presume that “readiness” implies at least a 50% prior probability of success. Using Yang and Tobin’s sensitivity of 0.97 and specificity of 0.64 for f/Vt, we can generate a range of posterior probabilities of success on a weaning trial:

As one can see, the results of the f/Vt assessment have a dramatic effect on the posterior probabilities of successful SBTs. However, is there a threshold below which one would advocate not performing an SBT if one’s prior probability is 50% or higher? I doubt it. Even with a pre-test probability of successful SBT of 50% and a failed f/Vt, roughly 1 in 22 patients (a posterior probability of about 4.5%) would actually do well on an SBT. I am not willing to forgo an SBT with such data since, in my mind, SBTs are not as dangerous as continued, unneeded mechanical ventilation. I would consider low f/Vt values completely non-informative since they do not instruct me at all regarding the success of extubation – the outcome in which I am most interested.
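The arithmetic behind these posterior probabilities is plain Bayes' rule. A minimal sketch in Python, using only the sensitivity (0.97) and specificity (0.64) quoted above, and assuming a "positive" test means an f/Vt below the threshold (i.e., predicting weaning success):

```python
# Bayes' rule for a diagnostic test: posterior probability of success on a
# weaning trial, given the f/Vt operating characteristics quoted above
# (sensitivity 0.97, specificity 0.64 for predicting success).

def posterior(prior, sens, spec, test_positive):
    """P(success | test result), where a 'positive' test (low f/Vt)
    predicts success on the weaning trial."""
    if test_positive:
        num = sens * prior
        den = sens * prior + (1 - spec) * (1 - prior)
    else:
        num = (1 - sens) * prior
        den = (1 - sens) * prior + spec * (1 - prior)
    return num / den

for prior in (0.5, 0.7, 0.9):
    passed = posterior(prior, 0.97, 0.64, True)
    failed = posterior(prior, 0.97, 0.64, False)
    print(f"prior {prior:.0%}: pass f/Vt -> {passed:.1%}, fail f/Vt -> {failed:.1%}")
```

With a 50% prior, a failed f/Vt still leaves a posterior probability of success of about 4.5%, i.e., roughly 1 in 22 patients.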
Other studies have used f/Vt to predict extubation failure (rather than SBT failure), and these are nicely outlined in a recent summary by Tobin and Jubran (Intensive Care Medicine 2006; 32: 2002). Even if we ignore different cut-points of f/Vt and take the most optimistic specificities (96% for f/Vt <100, Uusaro et al, Crit Care Med 2000; 28: 2313) and sensitivities (79% for f/Vt <88, Zeggwagh et al., Intens Care Med 1999; 25:1077), the f/Vt may not help much. As with the prior table, using prior probabilities and the results of the f/Vt testing, we can generate posterior probabilities of successful extubation:

As with the predictions of SBT failure, a high f/Vt greatly lowers the posterior probability of successful extubation. However, one must consider the cutoff in posterior probability below which one would not even attempt an SBT. Even with a 1% posterior probability, 1 in 100 patients will be successfully extubated. This is the rate when the prior probability of successful extubation is only 20% AND the patient has a high f/Vt! What rate of failed extubation is acceptable or, even, preferable? Five percent? Ten percent? If one never reintubates a patient, it is more likely that he is waiting “too long” to extubate than that he possesses perfect discrimination. Furthermore, what is the likelihood that patients with poor performance on an f/Vt will do well on an SBT? I suspect this failure will prohibit extubation, and the high f/Vt values will only spare the effort of performing the SBT. Does performing SBTs on those destined to fail really cost more time than the added complexity of using the f/Vt to determine whether a patient should receive an SBT at all? Presuming that we require an SBT prior to extubation, low f/Vt values remain non-informative. One could argue that with a posterior probability of >95% we should simply extubate the patient, but I doubt many would take this approach, except in those intubated for reasons not related to respiratory problems (e.g., mechanical ventilation for surgery or drug overdose).
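The ~1% figure above follows from the same Bayes' rule arithmetic with the roles reversed. A sketch, under the assumption that the quoted sensitivity and specificity describe detection of extubation *failure* (so a high f/Vt is the "positive" test):

```python
# Posterior probability of successful extubation given a HIGH f/Vt,
# treating high f/Vt as a positive test for extubation failure, using the
# optimistic sensitivity (0.79) and specificity (0.96) cited above.

def p_success_given_high_fvt(prior_success, sens_fail, spec_fail):
    """Prior is the probability of successful extubation; sensitivity and
    specificity describe detection of failure by a high f/Vt."""
    prior_fail = 1 - prior_success
    # P(high f/Vt) = true positives (failures caught) + false positives
    p_high = sens_fail * prior_fail + (1 - spec_fail) * prior_success
    return (1 - spec_fail) * prior_success / p_high

print(round(p_success_given_high_fvt(0.20, 0.79, 0.96), 4))  # ≈ 0.0125
```

With a 20% prior probability of successful extubation and a high f/Vt, the posterior is about 1.25%, which rounds to the "1 in 100" figure in the text.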
Drs. Tobin, Jubran and Marini (who writes an additional, accompanying editorial, Crit Care Med 2008; 36: 328) are master clinicians and physiologists. When they are at the bedside, I do not doubt that their “clinical experience and firm grasp of pathophysiology” (as Dr. Marini mentions) can match or even exceed the performance of protocolized care. Indeed, expert clinicians at Johns Hopkins have demonstrated that protocolized care did not improve the performance of the clinical team (Krishnan et al., Am J Resp Crit Care Med 2004; 169: 673). I have heard Dr. Tobin argue that this indicates that protocols do not provide benefit for assessment of liberation (American Thoracic Society, 2007). I doubt that the authors would strictly agree with his interpretation of their data, since several of the authors note in a separate publication that “the regularity of steps enforced by a protocol as executed by nurses or therapists trumps the rarefied individual decisions made sporadically by busy physicians” (Fessler and Brower, Crit Care Med 2005; 33: S224). What happens to the first patient who is admitted after Dr. Tobin leaves service? What if the physician assuming the care of his patients is more interested in sepsis than ventilatory physiology? What about the patient admitted to a small hospital in suburban Chicago rather than one of the Loyola hospitals? Protocols are not intended to set the ceiling on clinical decision-making and performance, but they can raise the floor.
Friday, December 28, 2007
Results of the Poll - Large Trials are preferred
The purpose of the poll that has been running alongside the posts on this blog for some months now was to determine whether physicians/researchers (a convenience sample of folks visiting this site) are intuitively Bayesian when they think about clinical trials.
To summarize the results, 43/68 respondents (63%) reported that they preferred the larger 30-center RCT. This differs significantly from the hypothesized value of 50% (p=0.032).
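For the curious, this is a standard binomial test of 43 successes in 68 trials against a null of 50%. A sketch in pure Python (an exact two-sided test lands a little under 0.05; different software conventions, such as a normal approximation, account for values like the 0.032 quoted):

```python
from math import comb

def binom_two_sided_p(k, n, p0=0.5):
    # Exact binomial test. For a symmetric null (p0 = 0.5) the two-sided
    # p-value is simply twice the upper-tail probability of k or more.
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper)

p = binom_two_sided_p(43, 68)
print(p)  # in the 0.03-0.04 range
```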
From a purely mathematical and Bayesian perspective, physicians should be indifferent to the choice between a large(r) 30-center RCT involving 2100 patients showing a 5% mortality reduction at p=0.0005, and 3 small(er) 10-center RCTs involving 700 patients each showing the same 5% mortality reduction at p=0.04. In essence, unless respondents were reading between the lines somewhere, the choice is between two options with identical posterior probabilities. That is, if the three smaller trials are combined, they are equal to the larger trial, and the meta-analytic p-value is 0.0005. Looked at from a different perspective, the large 30-center trial could have been analyzed as 3 10-center trials based on the region of the country in which the centers were located or any other arbitrary classification of centers.
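The equivalence can be checked directly. Combining three independent trials, each with a two-sided p of 0.04 and effects in the same direction, via Stouffer's Z-method gives a combined p of roughly 0.0004 (a sketch; Stouffer's is one of several combination conventions that land near the ~0.0005 quoted):

```python
from statistics import NormalDist

nd = NormalDist()

def stouffer(pvals):
    # Convert each two-sided p (effects assumed in the same direction)
    # to a z-score, combine with equal weights, convert back to two-sided p.
    zs = [nd.inv_cdf(1 - p / 2) for p in pvals]
    z = sum(zs) / len(zs) ** 0.5
    return 2 * (1 - nd.cdf(z))

p_combined = stouffer([0.04, 0.04, 0.04])
print(p_combined)  # roughly 0.0004
```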
Why this result? I obviously can't say based on this simple poll, but here are some guesses: 1.) People are more comfortable with larger multicenter studies, perhaps because they are accustomed to seeing cardiology mega-trials in journals such as NEJM; or 2.) The p-value of 0.04 associated with the small(er) studies seems "marginal," the combination of the three studies is non-intuitive, and/or it is not easy to see that the combined p-value will match that of the large trial. However, I have some (currently unpublished) data which show that [paradoxically] for the same study, physicians are more willing to adopt a therapy with a higher rather than a lower p-value.
Further research is obviously needed to determine how physicians respond to evidence from clinical trials and whether or not their responses are normative. In this poll, it appears that they were not.
Friday, December 21, 2007
Patients and Physicians should BOYCOTT Zetia and Vytorin: Forcing MRK and SGP to come clean with the data
You wouldn't believe it - or would you? The NYT reports today that SGP has data from a number of - go figure - unpublished studies that may contain important data about increased [and previously undisclosed] risks of liver toxicity with Zetia and Vytorin: http://www.nytimes.com/2007/12/21/business/21drug.html . Unproven benefits, undisclosed risks? If I were a patient, I would want to be taken off this drug and put on atorvastatin or simvastatin or a similar agent. If the medical community would get on board and take patients off of this unproven and perhaps risky drug, that might at least force the companies to come clean with their data.
In fact, I'm astonished at the medical community's reluctance to challenge the status quo represented by widespread use of drugs such as this and Avandia, for which there is no proof of efficacy save for surrogate endpoints, and for which there is evidence of harm. These drugs are not good bets unless alternatives do not exist, and of course they do. I am astonished in my pulmonary clinic to see many patients referred for dyspnea, with a history of heart disease and/or cardiomyopathy, who remain on Avandia. Apparently, protean dyspnea is not a sufficient wake-up call to change the diabetes management of a patient who is receiving an agent of unproven efficacy which is known to cause fluid retention and CHF. This just goes to show how effective pharmaceutical marketing campaigns are, how out-of-control things have become, and how non-normative physicians' approach to the data is.
The profit motive impels them forward. The evidence does not support the agents proffered. Evidence of harm is available. Alternatives exist. Why aren't physicians taking patients off drugs such as Vioxx, Avandia, Zetia, and Vytorin, and using alternative agents until the confusion is resolved?
Sunday, December 16, 2007
Dexmedetomidine: a New Standard in Critical Care Sedation?
In last week's JAMA, Wes Ely's group at Vanderbilt report the results of a trial comparing dexmedetomidine to lorazepam for the sedation of critically ill patients:
http://jama.ama-assn.org/cgi/content/short/298/22/2644
This group, along with others, has taken the lead as innovators in research related to sedation and delirium in the ICU (in addition to other topics), and this is a very important article in this area. In short, the authors found that, when compared to lorazepam, dexmed led to better targeted sedation and less time in coma, with a trend toward improved mortality.
One of the most impressive things about this study is stated as a post-script:
“This investigator-initiated study was aided by receipt of study drug and an unrestricted research grant for laboratory and investigational studies from Hospira Inc….Hospira Inc had no role in the design or conduct of the study; in the collection, analysis, and interpretation of the data; in the preparation, review, or approval of this manuscript; or in the publication strategy of the results of this study. These data are not being used to generate FDA label changes for this medication, but rather to advance the science of sedation, analgesia, and brain dysfunction in critically ill patients….”
Investigator-initiated....investigator-controlled design and publication, investigators as stewards of the data.....music to my ears.
But is dexmed going to be the new standard in critical care sedation? For that question, it would appear that it is too early for answers. I have the following observations:
• This study used higher doses of dexmed for longer durations than the product labeling advises. Should practitioners use the doses studied or the approved doses? My very small experience with this drug so far at the labeled doses is that it is difficult to use, in that it does not achieve adequate sedation in the most agitated patients - those receiving the highest doses of benzos and narcotics, in whom lightening of sedation is assigned the highest priority.
• The most impressive primary endpoint achieved by the drug was days alive without delirium or coma, but most of it was driven by coma-free days. Perhaps this is not surprising given two aspects of the study's design:
1. Patients did not have daily interruptions of sedative infusions, a difficult-to-employ but evidence-based practice to reduce oversedation and coma.
2. Lorazepam was titrated upwards without boluses between dose increases. Given the long half-life of this drug, we would expect overshoot by the time steady-state pharmacokinetics were achieved.
So is it surprising that patients in the dexmed group had fewer coma-free days?
• We are not told about the tracheostomy practices in this study. Getting a trach earlier may lead to both sedation reduction and improved mortality (See http://ccmjournal.org/pt/re/ccm/abstract.00003246-200408000-00009.htm;jsessionid=HlfG93Qfvb113sCpnD10053YzKqMB3zFfDTdbGvgCQPdlMZ3S8kV!1219373867!181195629!8091!-1?index=1&database=ppvovft&results=1&count=10&searchid=1&nav=search).
• We are not told the proportion of patients in each group who had withdrawal of support. Anecdotally, I have found that families have greater willingness to withdraw support for patients who are comatose, regardless of other underlying physiological variables or organ failures. Can the trend towards improved mortality with dexmed be attributed to differences in the willingness of families to withdraw support?
• In spite of substantial data that delirium is associated with mortality (http://jama.ama-assn.org/cgi/content/abstract/291/14/1753?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&fulltext=delirium&searchid=1&FIRSTINDEX=0&resourcetype=HWCIT ), and these data showing a TREND towards fewer days of delirium with dexmed, the hypothesis that dexmed improves mortality via improvement in delirium is one that can only be tested by a study with mortality as a primary endpoint.
The data from the current study are compelling, and Ely and investigators are to be commended for the important research they are doing (this article is only the tip of that iceberg of research). However, it remains to be seen if one sedative compared to others can lead to improvements in mortality or more rapid recovery from critical illness, or whether limitation of sedation in general with whatever agent is used is primarily responsible for improved outcomes.
http://jama.ama-assn.org/cgi/content/short/298/22/2644
This group, along with others, has taken the lead as innovators in research related to sedation and delirium in the ICU (in addition to other topics), and this is a very important article in this area. In short, the authors found that, when compared to lorazepam, dexmed led to better targeted sedation and less time in coma, with a trend toward improved mortality.
One of the most impressive things about this study is stated as a post-script:
“This investigator-initiated study was aided by receipt of study drug and an unrestricted research grant for laboratory and investigational studies from Hospira Inc….Hospira Inc had no role in the design or conduct of the study; in the collection, analysis, and interpretation of the data; in the preparation, review, or approval of this manuscript; or in the publication strategy of the results of this study. These data are not being used to generate FDA label changes for this medication, but rather to advance the science of sedation, analgesia, and brain dysfunction in critically ill patients….”
Investigator-initiated....investigator-controlled design and publication, investigators as stewards of the data.....music to my ears.
But is dexmed going to be the new standard in critical care sedation? For that question, it would appear that it is too early for answers. I have the following observations:
• This study used higher doses of dexmed for longer durations than the product labeling advises. Should practitioners use the doses studied or the approved doses? My admittedly small experience with this drug so far at the labeled doses is that it is difficult to use: it does not achieve adequate sedation in the most agitated patients - those receiving the highest doses of benzodiazepines and narcotics, in whom lightening of sedation is assigned the highest priority.
• The most impressive result was the primary endpoint, days alive without delirium or coma, but most of the difference was driven by coma-free days. Perhaps this is not surprising given two aspects of the study's design:
1. Patients did not have daily interruptions of sedative infusions, a difficult-to-employ but evidence-based practice to reduce oversedation and coma.
2. Lorazepam was titrated upwards without boluses between dose increases. Given the long half-life of this drug, we would expect overshoot by the time steady-state pharmacokinetics were achieved.
So is it surprising that patients in the dexmed group had more coma-free days?
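The overshoot argument above can be made concrete with a back-of-the-envelope calculation. This is a minimal sketch assuming a one-compartment model with first-order elimination and purely illustrative (not patient-specific) numbers; only the lorazepam half-life of roughly 14 hours is drawn from standard pharmacology references:

```python
import math

def infusion_conc(rate, clearance, half_life_h, t_h):
    """Concentration during a constant-rate infusion in a one-compartment
    model with first-order elimination: C(t) = (R/CL) * (1 - e^(-k*t)),
    where k = ln(2) / half-life."""
    k = math.log(2) / half_life_h
    return (rate / clearance) * (1 - math.exp(-k * t_h))

# Illustrative values: lorazepam half-life ~14 h; rate and clearance are
# arbitrary units because only their ratio (the steady-state level) matters.
half_life = 14.0
rate = 1.0
clearance = 1.0
c_ss = rate / clearance  # concentration at steady state

# If the clinician reassesses sedation 4 h after a rate change, the
# patient is at only a small fraction of the eventual steady-state level:
fraction_at_4h = infusion_conc(rate, clearance, half_life, 4.0) / c_ss
print(round(fraction_at_4h, 2))  # ~0.18, i.e. ~18% of steady state
```

In other words, a patient judged "not yet sedated enough" a few hours after an uptitration is still far below the concentration that dose will eventually produce, so repeated uptitrations without intervening boluses predictably overshoot once steady state is reached days later.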
• We are not told about the tracheostomy practices in this study. Getting a trach earlier may lead to both sedation reduction and improved mortality (See http://ccmjournal.org/pt/re/ccm/abstract.00003246-200408000-00009.htm;jsessionid=HlfG93Qfvb113sCpnD10053YzKqMB3zFfDTdbGvgCQPdlMZ3S8kV!1219373867!181195629!8091!-1?index=1&database=ppvovft&results=1&count=10&searchid=1&nav=search).
• We are not told the proportion of patients in each group who had withdrawal of support. Anecdotally, I have found that families are more willing to withdraw support from patients who are comatose, regardless of other underlying physiological variables or organ failures. Can the trend towards improved mortality with dexmed be attributed to differences in families' willingness to withdraw support?
• In spite of substantial data that delirium is associated with mortality (http://jama.ama-assn.org/cgi/content/abstract/291/14/1753?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&fulltext=delirium&searchid=1&FIRSTINDEX=0&resourcetype=HWCIT ), and despite these data showing a TREND towards more delirium-free days with dexmed, the hypothesis that dexmed improves mortality via improvement in delirium can only be tested by a study with mortality as its primary endpoint.
The data from the current study are compelling, and Ely and investigators are to be commended for the important research they are doing (this article is only the tip of that iceberg of research). However, it remains to be seen if one sedative compared to others can lead to improvements in mortality or more rapid recovery from critical illness, or whether limitation of sedation in general with whatever agent is used is primarily responsible for improved outcomes.
Wednesday, December 12, 2007
ENHANCE trial faces congressional scrutiny
Merck and Schering-Plough had better get their houses in order. Congress is on the case:
http://www.nytimes.com/2007/12/12/business/12zetia.html?_r=1&oref=slogin
Apparently, representatives of the US populace, which pays for a substantial portion of the Zetia sold, are not pleased by the delays in the release of data from the ENHANCE trial. The chicanery is going to be harder to sustain.
I certainly hope for everyone's sake (especially patients') that there is no foul play afoot with this trial or ezetimibe - Merck can hardly withstand another round of Vioxx-type suits, can it? Or can it? Merck's stock price (MRK: http://finance.yahoo.com/q/bc?s=MRK&t=5y&l=on&z=m&q=l&c=) is at the same level it was in January 2004. That's a high price to pay for obfuscating the truth, concealing evidence of harm, bilking insurers and the American public and government out of billions of dollars for a prescription painkiller when equivalent non-branded products were available, and causing thousands of heart attacks in the process....
The consequences should be harsher the second time around.....
Tuesday, December 11, 2007
Pronovost, Checklists, and Putting Evidence into Practice
In this week's New Yorker:
http://www.newyorker.com/reporting/2007/12/10/071210fa_fact_gawande
Atul Gawande, a popular physician-writer who may be familiar to readers from his columns in the NEJM and the NYT, chronicles the herculean efforts of Peter Pronovost, MD, PhD at Johns Hopkins Hospital to make sure that the mundane but effective does not always take a back seat to the heroic but largely symbolic efforts of critical care doctors.
One of my chronic laments is that evidence is not utilized and that physician efforts do not appear to be rationally apportioned to what counts most. There appears to be too much emphasis on developing evidence and too little on making sure it is expeditiously adopted and employed; too much emphasis on diagnosis, too little on evidence-based treatment; too much focus on the "rule of rescue", too little on the "power of prevention". Pronovost has demonstrated that simple checklists can have bountiful yields in terms of teamwork, prevention, and delivery of effective care - so why aren't we all familiar with his work? Why doesn't every ICU use his checklists?
My own experience at the Ohio State University Medical Center is emblematic of the challenges of getting an unglamorous thing like a checklist accepted as a routine part of clinical practice in the ICU. In spite of the evidence supporting it, its obvious rational basis, and widespread recognition that we often miss things if we aren't rigorous and systematic, adopting an adapted version of Pronovost's checklist at OSUMC has proven challenging (albeit possible). As local champion of a checklist that I largely plagiarized from Pronovost's original, I have been told by colleagues that it is "cumbersome", by RNs that it is "superfluous", by fellows that it is a "pain", and by people of all disciplines that they "don't see the point". I have been frustrated to find that when I do not personally ensure it is being done daily (by walking through the ICU and checking), it is abandoned as yet another "chore", another piece of bureaucratic red tape that hampers the delivery of more important "patient-centered" care - such as procedures and the ordering of tests.
All of these criticisms are delivered despite my admonition that the checklist, like a fishing expedition, is not expected to yield a "catch" on every cast, but that if it is cast enough, things will be caught that would otherwise be missed; despite my reminder that it is an opportunity to improve our communication with our multidisciplinary ICU team (and to learn the names of its constituents); and despite my producing evidence of its benefit and evidence of underutilization of the evidence-based therapies the checklist reminds practitioners to consider. If I were not personally committed to making sure that the checklist is photocopied, available, and consistently filled out (by our fellows, who deserve great credit for doing so), it would quickly fall by the wayside, another relic of a well-meaning effort to encourage conscientiousness through bureaucracy and busy-work (think HIPAA here - the intent is noble, but the practical result an abject failure).
So what is the solution? How are we to increase acceptance of Pronovost's checklist and recognition of its utility and necessity? It could be through fiat, through education, or through a variety of other means. But it appears that it has survived at Hopkins because of Pronovost's ongoing efforts to promote it, to extol its benefits and virtues, and to get "buy-in" from other stakeholders: RNs, patients, administrators, the public, and other physicians. This is not an easy task - but then again, rarely is anything that is worth doing. Hopefully other champions of this and other unglamorous innovations will continue to advocate for mundane but effective interventions to improve communication among members of multidisciplinary healthcare teams, the utilization of evidence-based therapies, and outcomes for patients.