Medical Evidence Blog: March 2008

Monday, March 31, 2008

MRK and SGP: Ye shall know the truth, and the truth shall send thy stock spiralling

Apparently, the editors of the NEJM read my blog (even though they stop short of calling for a BOYCOTT):

"...it seems prudent to encourage patients whose LDL cholesterol levels remain elevated despite treatment with an optimal dose of a statin to redouble their efforts at dietary control and regular exercise. Niacin, fibrates, and resins should be considered when diet, exercise, and a statin have failed to achieve the target, with ezetimibe reserved for patients who cannot tolerate these agents."

Sound familiar?

The full editorial can be seen here: http://content.nejm.org/cgi/content/full/NEJMe0801842
along with a number of other early-release articles on the subject.

The ENHANCE data are also published online (http://content.nejm.org/cgi/content/full/NEJMoa0800742 )
and there's really nothing new to report. We have known the results for several months now. What is new is doctors' nascent realization that they have been misled and bamboozled by the drug reps, Big Pharma, and their own long-standing, almost religious faith in surrogate endpoints (see post below). It's like you have to go through the stages of grief (Kubler-Ross) before you give up on your long-cherished notions of reality (denial, anger, bargaining, then, finally, acceptance). Amazingly, the ACC, whose statement just months ago appeared to be intended to allay patients' and doctors' concerns about Zetia, has done an apparent 180 on the drug: "Go back to Statins" is now their sanctimonious advice: http://acc08.acc.org/SSN/Documents/ACC%20D3LR.pdf

I was briefly at the ACC meeting yesterday (although I did not pay the $900 fee to attend the sessions). The Big Pharma marketing presence was nauseating. A Lipitor-emblazoned bag was given to each attendee. A Lipitor lanyard was used to hold your $900 ID badge. Buses throughout the city were emblazoned with Vytorin and Lipitor advertisements, among others. Banners covered numerous floors of the facades of city buildings. The "exhibition hall," a veritable orgy of marketing madness, was jam-packed with the most aesthetically pleasing and best-dressed salespersons with their catchy displays and gimmicks. (Did you know that abnormal "vascular reactivity" is a heretofore unknown "risk factor"? And that with a little $20,000 device that they can sell you (which you can probably bill for), you can detect said abnormal vascular reactivity?) The distinction between science, reality, and marketing is blurred beyond recognition, if it exists at all. Physicians from all over the world greedily scramble for free pens, bags, and umbrellas (as if they cannot afford such trinkets on their own - or was it the $900 entrance fee that squeezed their pocketbooks?). They can be seen throughout the convention center with armloads of Big Pharma propaganda packages: flashlights, laser pointers, free orange juice and the like.

I just wonder: How much money does the ACC receive from these companies (for this Big Pharma Bonanza and for other "activities")? If my guess is in the right ballpark, I don't have to wonder why the ACC hedged in its statement when the ENHANCE data were released in January. I think I might have an idea.

Wednesday, March 26, 2008

Torcetrapib, Ezetimibe, and Surrogate Endpoints: A Cautionary Tale

In today's JAMA (http://jama.ama-assn.org/cgi/content/extract/299/12/1474 ), Drs. Psaty and Lumley echo many of the points made on this blog over the last six months about ezetimibe and torcetrapib (see posts below). While they stop short of calling for a boycott of ezetimibe, and their perspective on torcetrapib is tempered by Pfizer's early conduct of a trial with hard outcomes as endpoints, their commentary underscores the dangers inherent in the long-standing practice of almost unquestioningly accepting the validity of "established" surrogate endpoints. The time to re-examine the validity of surrogate endpoints such as glycemic control, LDL, HDL, and blood pressure is now. Agents to treat these maladies are abundant and widely accessible, so potential delays in discovery and approval of new agents are no longer a suitable argument for a "fast track" approval process for new agents. We have seen time and again that such "fast tracks" are nothing more than expressways to profit for Big Pharma.

Psaty and Lumley's chronology of the studies of ezetimibe and their timing is itself timely and should refocus needed scrutiny on the role of pharmaceutical companies as the stewards of scientific data and discovery.

Monday, March 10, 2008

The CORTICUS Trial: Power, Priors, Effect Size, and Regression to the Mean

The long-awaited results of another trial in critical care were published in a recent NEJM: (http://content.nejm.org/cgi/content/abstract/358/2/111). Similar to the VASST trial, the CORTICUS trial was "negative" and low-dose hydrocortisone was not demonstrated to be of benefit in septic shock. However, unlike VASST, in this case the results are in conflict with an earlier trial (Annane et al, JAMA, 2002) that generated much fanfare and which, like the Van den Berghe trial of the Leuven Insulin Protocol, led to widespread [and premature?] adoption of a new therapy. The CORTICUS trial, like VASST, raises some interesting questions about the design and interpretation of trials in which short-term mortality is the primary endpoint.

Jean Louis Vincent presented data at this year's SCCM conference with which he estimated that only about 10% of trials in critical care are "positive" in the traditional sense. (I was not present, so this is basically hearsay to me - if anyone has a reference, please e-mail me or post it as a comment.) Nonetheless, this estimate rings true. Few are the trials that show a statistically significant benefit in the primary outcome; fewer still are the trials that confirm the results of those trials. This raises the question: are critical care trials chronically, consistently, and woefully underpowered? And if so, why? I will offer some speculative answers to these and other questions below.

The CORTICUS trial, like VASST, was powered to detect a 10% absolute reduction in mortality. Is this reasonable? At all? What is the precedent for a 10% ARR in mortality in a critical care trial? There are few, if any. No large, well-conducted trials in critical care that I am aware of have ever demonstrated (least of all consistently) a 10% or greater reduction in mortality with any therapy, at least not as a PRIMARY PROSPECTIVE OUTCOME. Low tidal volume ventilation? 9% ARR. Drotrecogin-alfa? 7% ARR in all-comers. I therefore argue that all trials powered to detect an ARR in mortality of greater than 7-9% are ridiculously optimistic, and that the trials that spring from this unfortunate optimism are woefully underpowered. It is no wonder that, as JLV purportedly demonstrated, so few trials in critical care are "positive". The prior probability is exceedingly low that ANY therapy will deliver a 10% mortality reduction. The designers of these trials are, by force of pragmatic constraints, rolling the proverbial trial dice and hoping for a lucky throw.
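To put rough numbers on the "underpowered" argument, here is a minimal sketch of how power collapses when a trial is sized for a 10% ARR but the true effect is smaller. It uses the textbook normal approximation for comparing two proportions; the function name, the 60% control mortality, and the 388 patients per arm are illustrative assumptions, not the actual design parameters of CORTICUS or any other trial.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treatment, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two mortality rates.

    Normal approximation; the negligible chance of a "significant" result
    in the wrong direction is ignored.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    delta = abs(p_control - p_treatment)
    p_bar = (p_control + p_treatment) / 2
    # Standard error under the null (pooled rate) and under the alternative.
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt(
        (p_control * (1 - p_control) + p_treatment * (1 - p_treatment)) / n_per_arm
    )
    return NormalDist().cdf((delta - z_alpha * se_null) / se_alt)


# Hypothetical design: 60% control mortality, 388 patients per arm.
for true_arr in (0.10, 0.07, 0.05):
    pw = power_two_proportions(0.60, 0.60 - true_arr, n_per_arm=388)
    print(f"true ARR {true_arr:.0%}: power = {pw:.2f}")
```

Under these assumed inputs the design has about 80% power if the true ARR really is 10%, but only around 30% if the true ARR is 5%. In other words, a trial sized for the optimistic effect is more likely than not to come up "negative" even when the therapy works.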

Then there is the issue of regression to the mean. Suppose that the alternative hypothesis (Ha) is indeed correct in the generic sense that hydrocortisone does beneficially influence mortality in septic shock. Suppose further that we interpret Annane's 2002 data as consistent with Ha. In that study, a subgroup of patients (non-responders) demonstrated a 10% ARR in mortality. We should be excused for getting excited about this result, because after all, we all want the best for our patients and eagerly await the next breakthrough, and the higher the ARR, the greater the clinical relevance, whatever the level of statistical significance. But shouldn't we regard that estimate with skepticism since no therapy in critical care has ever shown such a large reduction in mortality as a primary outcome? Since no such result has ever been consistently repeated? Even if we believe in Ha, shouldn't we also believe that the 10% Annane estimate will regress to the mean on repeated trials?
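The regression-to-the-mean point can be illustrated with a small simulation. This is a sketch under made-up assumptions (a true ARR of 5%, 60% control mortality, 150 patients per arm, a simple pooled z-test), not a model of the Annane or CORTICUS data. It runs many identical small trials and compares the average observed ARR across all of them with the average among only those that reached significance in favor of treatment.

```python
import random
from math import sqrt


def simulate_winners(true_control=0.60, true_arr=0.05, n_per_arm=150,
                     n_trials=2000, z_crit=1.96, seed=1):
    """Simulate identical two-arm trials and return
    (mean ARR over all trials, mean ARR over 'positive' trials, count of positives)."""
    rng = random.Random(seed)
    p_treat = true_control - true_arr
    all_arr, winning_arr = [], []
    for _ in range(n_trials):
        # Each patient dies with the arm's true mortality probability.
        deaths_c = sum(rng.random() < true_control for _ in range(n_per_arm))
        deaths_t = sum(rng.random() < p_treat for _ in range(n_per_arm))
        arr = (deaths_c - deaths_t) / n_per_arm
        p_pool = (deaths_c + deaths_t) / (2 * n_per_arm)
        se = sqrt(2 * p_pool * (1 - p_pool) / n_per_arm)
        all_arr.append(arr)
        # Keep only trials that are significant in favor of treatment.
        if se > 0 and arr / se > z_crit:
            winning_arr.append(arr)
    mean_all = sum(all_arr) / len(all_arr)
    mean_win = sum(winning_arr) / len(winning_arr) if winning_arr else None
    return mean_all, mean_win, len(winning_arr)


mean_all, mean_win, n_win = simulate_winners()
print(f"mean observed ARR, all trials:      {mean_all:.3f}")
print(f"mean observed ARR, positive trials: {mean_win:.3f} ({n_win} trials)")
```

Averaged over all simulated trials, the observed ARR sits near the true 5%. Among the "positive" trials it is necessarily much larger, because with this sample size only an observed ARR of roughly 11% or more can cross the significance threshold. A replication of one of those positive trials should therefore be expected to show a smaller effect, even though the therapy genuinely works in this simulated world.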

It may be true that therapies with robust data behind them become standard practice, equipoise dissipates, and the trials of the best therapies are not repeated - so they don't have a chance to be confirmed. But the knife cuts both ways - if you're repeating a trial, it stands to reason that the data in support of the therapy are not that robust and you should become more circumspect in your estimates of effect size - taking prior probability and regression to the mean into account.

Perhaps we need to rethink how we're powering these trials. And funding agencies need to rethink the budgets they will allow for them. It makes little sense to spend so much time, money, and effort on underpowered trials, and to build a track record in which the majority of our trials are "failures" in the traditional sense, each including a sentence in the discussion section about how the current results should influence the design of subsequent trials. Wouldn't it make more sense to conduct one trial that is so robust that nobody would dare repeat it in the future? One that would provide a definitive answer to the question that is posed? Is there something to be learned from the long arc of the steroid pendulum that has been swinging with frustrating periodicity for many a decade now?

This is not to denigrate in any way the quality of the trials that I have referred to. The Canadian group in particular as well as other groups (ARDSnet) are to be commended for producing work of the highest quality which is of great value to patients, medicine, and science. But in keeping with the advancement of knowledge, I propose that we take home another message from these trials - we may be chronically underpowering them.

Sunday, March 9, 2008

The "Trials" and Tribulations of Powering Clinical Trials: The Case of Vasopressin for Septic Shock (VASST trial)

Nobody likes "negative" trials. They're just not as exciting as positive ones. (Unless they show that something we're doing is harmful or that a product that Wall Street has bet heavily on is headed for the chopping block.) But "negative" studies such as an excellent one by Russell et al in a recent NEJM (http://content.nejm.org/cgi/content/abstract/358/9/877 ) show just how difficult it is to design and conduct a "positive" trial. The [non-significant] trends in this study, namely that vasopressin is superior to norepinephrine in reducing mortality in septic shock, were demonstrated in a study that had an a priori power of 80%, based on an expected mortality rate of 60% in the control (norepinephrine) group. Actual power in the study was significantly less, not because, as the authors appear to suggest, the observed control-group mortality was only ~39%, but rather because the observed effect size fell markedly short of the anticipated 10% absolute mortality reduction. In order to demonstrate a mortality benefit of the magnitude observed in the current trial (~4% ARR) at a significance level of 0.05, approximately 1500 patients in each study arm would be required. This is a formidable number for a critical care trial.
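For readers who want to experiment with this kind of calculation, here is a minimal sketch of the standard normal-approximation sample size formula for comparing two proportions. The function name and defaults are my own for illustration. The answer is very sensitive to the assumed mortality rates, the chosen power, whether the test is one- or two-sided, and whether a continuity correction is applied, so the output should not be expected to reproduce the figure quoted above exactly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Patients needed in each arm to detect p_control vs p_treatment
    with a two-sided z-test (normal approximation, no continuity correction)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treatment) / 2
    delta = abs(p_control - p_treatment)
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treatment * (1 - p_treatment))
    ) ** 2
    return ceil(numerator / delta ** 2)


# Design-stage assumptions described in the post: 60% control mortality, 10% ARR.
print("planned design, per arm:", n_per_arm(0.60, 0.50))

# Approximate observed rates (~39% control mortality, ~4% ARR), at 80% and 90% power.
print("observed effect, 80% power, per arm:", n_per_arm(0.39, 0.35))
print("observed effect, 90% power, per arm:", n_per_arm(0.39, 0.35, power=0.90))
```

The design-stage inputs require only a few hundred patients per arm, while the effect size actually observed requires thousands. That gap, far more than the lower-than-expected control mortality, is what makes a definitive trial so daunting.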

Thus, this trial illustrates the trials and tribulations of designing and conducting studies with 28-day mortality as an endpoint. These studies not only entail substantial costs, but pose challenges for patient recruitment, necessitating the participation of numerous centers in a multinational setting. The coordination of such a trial is daunting. It is understandable, therefore, that investigators may wish to be optimistic about the ARR they can expect from a therapy, as this will reduce sample size and increase the chances that the trial will be successfully completed in a reasonable period of time. (For an example of a study which had to be terminated early because of these challenges, see Mancebo et al: http://ajrccm.atsjournals.org/cgi/content/short/200503-353OCv1 ). Powering the trial at 80% instead of 90% likewise represents a compromise between optimism for the efficacy of the therapy and optimism for patient recruitment. In essence, the lower the power, the more "faith" there must be that a roll of the trial dice will confirm the alternative hypothesis.

These realities played out [disappointingly] in the Russell trial. The p-value for the ARR (28-day mortality - the primary endpoint) associated with vasopressin compared with norepinephrine was 0.26, while that associated with 90-day mortality (a prespecified secondary endpoint) was 0.11. Thus, this trial is considered negative by conventional standards.

But its being "negative" does not mean that it is not of value to practitioners. This large experience with vasopressin demonstrates both that this agent is a viable alternative to norepinephrine with regard to raising the MAP to within the goal range, and that we can expect no significant excess of adverse events when this agent is used. In my opinion, this study represents a veritable "green light" for continued use of this agent, as I agree with the editorialist (http://content.nejm.org/cgi/reprint/358/9/954.pdf ) that many patients with sepsis who are not responding to norepinephrine respond dramatically and favorably to this agent.

Perhaps there is a larger lesson here. Should we use the same p-value threshold for a study of, say, an antidepressant as we do for a study of an agent that may reduce mortality? In the former case, we may be most concerned about exposure of patients to a costly drug with no benefits and potential side effects - in essence, we are most concerned with a Type I Error, i.e., concluding that there is a benefit when in reality there is none. Perhaps in a trial of a potentially life-saving therapy (e.g., vasopressin) we should be most concerned with a Type II Error, i.e., concluding that there is no real benefit when in reality one exists. If that were the case, and you may have already guessed that I believe that it should be, we could address this concern by loosening the standard of statistical significance for a study of potentially life-saving agents.

The standards notwithstanding, critical care practitioners are free to interpret these data as they see fit. And one reasonable conclusion is that, the trends being in the right direction and the side effect profile being acceptable, we should be using more vasopressin in septic shock.

Or, we must make a tough call: do we want to invest the resources in a much larger trial to determine if vasopressin can be shown to reduce mortality at the conventional p-value level of 0.05? Can we recruit the necessary 3000 patients?