A Truncated Summation of the Adventure of the Cardboard Box

 


One gets the sense when reading the literature on endovascular therapy for acute ischemic stroke that one is on a small seafaring vessel attempting to map the shoreline through a dense fog. There are moments when the fog lifts and you catch a glimpse of the topographic details of the shore, and then the cloud again rolls in, obscuring any further ascertainment. Similarly, the recent publications on endovascular therapy for acute ischemic stroke have demonstrated there is a definitive benefit to mechanical reperfusion therapy, and yet each publication in itself is so incomplete that it is difficult to perceive anything more than this general appearance of benefit. The finer details are obscured by the premature truncation of trials, too early to definitively characterize the benefits and risks of endovascular therapy.

MR CLEAN, published earlier this year in the NEJM, and discussed ad nauseam in previous posts, marked the first of what is now a litany of trials demonstrating benefit for endovascular therapy in acute ischemic stroke (1). Its release resulted in the subsequent premature stoppage of a number of key trials examining endovascular therapy. Although all these trials boast impressive results, each stopped their enrollment prematurely, not due to a preplanned interim analysis, but rather due to MR CLEAN’s positive results. ESCAPE and EXTEND-IA were the first to halt enrollment and hastily publish their results (2,3). More recently the NEJM has reported on the findings from the next two trials prematurely stopped due to MR CLEAN’s success.

The first of these studies is the SWIFT-PRIME trial published by Saver et al (4). This trial’s initial results were presented earlier this year alongside EXTEND-IA and ESCAPE at the 2015 International Stroke Conference. Like its counterparts, this trial examined patients presenting with large ischemic infarcts and radiographically identified occlusions in the terminal internal carotid artery (ICA) or first branch (M1) of the middle cerebral artery (MCA). Additionally, patients had to demonstrate a favorable core-to-ischemic penumbra ratio on perfusion imaging. Patients were enrolled if they were able to undergo endovascular intervention within 6 hours of symptom onset.

Like ESCAPE and EXTEND-IA, the results of SWIFT-PRIME are impressive. The authors boast a 25% absolute difference in the number of patients with an mRS of 0-2 at 90 days. Though notable, the magnitude of effect is hardly concrete. The authors cite an NNT of 4 to have one more patient alive and independent at 90 days, and an NNT of 2.6 to have one patient less disabled. These figures are derived from their dichotomous and ordinal analyses respectively. Although the authors cite impressive p-values (<0.001), the confidence interval surrounding this 25% point estimate is far broader (11-38%), meaning the NNT is somewhere between 2.6 and 9 patients. EXTEND-IA and ESCAPE have similarly wide confidence intervals surrounding their point estimates (4). EXTEND-IA’s confidence interval is 8% to 50% surrounding a point estimate of 31% (2). Likewise ESCAPE has a confidence interval of 13% to 34% surrounding its 23.7% point estimate (3). All three of these trials were stopped early secondary to MR CLEAN’s results. And though both EXTEND-IA and ESCAPE came close to reaching their pre-defined sample size, SWIFT-PRIME was stopped before its first interim analysis (n<200) (4).
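The arithmetic behind these NNT ranges is simply the reciprocal of the absolute difference, applied to the point estimate and to each end of its confidence interval. A minimal sketch in Python, using only the figures quoted above (the function name `nnt_range` is illustrative, not something taken from the trials):

```python
def nnt_range(point, lower, upper):
    """Convert an absolute risk difference and its CI (as proportions) into NNTs.

    The NNT is the reciprocal of the absolute difference, so the upper CI
    bound yields the most optimistic NNT and the lower bound the least.
    """
    return 1 / point, 1 / upper, 1 / lower


# Absolute differences in mRS 0-2 at 90 days, as quoted in the text.
trials = {
    "SWIFT-PRIME": (0.25, 0.11, 0.38),
    "EXTEND-IA": (0.31, 0.08, 0.50),
    "ESCAPE": (0.237, 0.13, 0.34),
}

for name, (point, lower, upper) in trials.items():
    nnt, best, worst = nnt_range(point, lower, upper)
    print(f"{name}: NNT {nnt:.1f} (plausible range {best:.1f} to {worst:.1f})")
```

On these figures, EXTEND-IA’s interval alone is compatible with an NNT anywhere from 2 to 12.5.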

Like EXTEND-IA, ESCAPE and SWIFT-PRIME, the second trial just published in NEJM, the REVASCAT trial by Jovin et al, was stopped prematurely secondary to the publication of the MR CLEAN data. In fact, even though it failed to reach the prospectively determined efficacy threshold for stopping the trial at the first interim analysis, the data and safety board felt that given the MR CLEAN data, there was a loss of equipoise and further randomization would be unethical (5). Despite its apparent success, the results of the REVASCAT trial are far less impressive than those of EXTEND-IA, ESCAPE or SWIFT-PRIME. The REVASCAT trial planned to enroll 690 patients presenting to the Emergency Department in 4 centers across Catalonia with symptoms consistent with a large vessel stroke that could be treated with endovascular therapy within 8 hours of symptom onset. Unlike EXTEND-IA, ESCAPE or SWIFT-PRIME, the REVASCAT trial did not use perfusion imaging to select patients with favorable areas of salvageable tissue. Rather, it employed CTA to identify occlusion in the ICA or M1 branch of the MCA, and utilized the less accurate ASPECT score, derived from the initial non-contrast CT, to assess potential for viable ischemic tissue (5).

REVASCAT enrolled 206 patients before its premature termination. Like the three trials before it, it demonstrated a statistically significant improvement in mRS at 90 days in the patients who underwent endovascular therapy. The REVASCAT trial cites an absolute increase of 15.5% in the number of patients with an mRS of 0-2. This is surrounded by a confidence interval of 2.4% to 28.5%. Furthermore, unlike the previous three trials that either boast an outright benefit in mortality or demonstrate trends in favor of endovascular therapy, REVASCAT demonstrated an impressive 4.8% absolute increase in the rate of death within the first 7 days after randomization (5).

The results of REVASCAT are far from positive. If they were not included in the optimistic fervor that currently surrounds endovascular therapy, it might even be considered a negative trial. Why were the results of REVASCAT far less impressive than those of EXTEND-IA, ESCAPE and SWIFT-PRIME? Was it just random chance, the true effect size of endovascular therapy falling somewhere between the two extremes of the 13.5% difference observed in MR CLEAN and the 31% seen in EXTEND-IA? Or rather was it that the patient population selected in EXTEND-IA, ESCAPE and SWIFT-PRIME led to their success? EXTEND-IA, ESCAPE and SWIFT-PRIME all utilized some form of advanced imaging to determine the size of viable ischemic tissue (2,3,4). MR CLEAN and REVASCAT used only the CTA to identify a reachable lesion and the non-contrast CT to determine tissue viability (1,5). If any one of these trials had been followed to completion, the results likely would have provided us with a better understanding of who will benefit from endovascular therapy and the exact magnitude of this benefit.

This is a problem of certainty. Our faith in endovascular interventions was so unyielding that at the first sign of success we claimed victory and discontinued any further scientific inquiries. The bloated results demonstrated in EXTEND-IA, ESCAPE, and SWIFT-PRIME are the result of this premature resolution. We know that trials stopped early for benefit are likely to over-estimate the effect size of the treatment in question. In fact, the smaller the sample size at the time of closure, the greater the amplification (6). In 1989, Pocock and Hughes demonstrated this to be a mathematical inevitability (7). This was later validated by Bassler et al in a meta-analysis examining 91 trials stopped prematurely for benefit (8). Bassler et al revealed that the degree of embellishment was directly related to the size of the sample population at cessation and independent of the quality of the trial or the presence of a predetermined methodology for early stoppage.
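The mechanism is easy to reproduce in a toy simulation. The sketch below is not the method used by the cited authors; it simply simulates many two-arm trials with a known true benefit and stops each one at the first interim look where the difference crosses a z threshold. Every parameter (a 30% vs 45% rate of good outcome, looks at 50, 100, 200 and 400 patients per arm, a stopping threshold of z > 2.5) is a hypothetical choice made for illustration.

```python
import random
from math import sqrt

TRUE_CONTROL = 0.30   # hypothetical rate of good outcome with standard care
TRUE_TREATED = 0.45   # hypothetical rate with treatment (true benefit 0.15)


def simulate_trial(rng, p_control, p_treat, looks, z_stop):
    """Run one trial; return (patients per arm at stop, observed difference)."""
    events_c = events_t = 0
    enrolled = 0
    diff = 0.0
    for n in looks:
        # Enroll both arms up to this interim look.
        for _ in range(n - enrolled):
            events_c += rng.random() < p_control
            events_t += rng.random() < p_treat
        enrolled = n
        diff = (events_t - events_c) / n
        pooled = (events_t + events_c) / (2 * n)
        se = sqrt(2 * pooled * (1 - pooled) / n)
        if se > 0 and diff / se > z_stop:
            break  # stopped early for benefit
    return enrolled, diff


def mean_estimate_by_stopping_point(n_trials=1000, seed=1):
    """Average observed benefit, grouped by the sample size at which trials ended."""
    rng = random.Random(seed)
    looks = [50, 100, 200, 400]
    by_n = {n: [] for n in looks}
    for _ in range(n_trials):
        n, diff = simulate_trial(rng, TRUE_CONTROL, TRUE_TREATED, looks, 2.5)
        by_n[n].append(diff)
    return {n: sum(d) / len(d) for n, d in by_n.items() if d}


for n, estimate in mean_estimate_by_stopping_point().items():
    print(f"ended at {n} per arm: mean observed benefit {estimate:.3f}")
```

Under these assumptions, a trial can only stop at the first look if its observed difference is well above the true 15 percentage points, so the earliest-stopped trials are the most inflated, and the inflation tends to shrink at later looks.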

Although the exact patient population that stands to benefit from endovascular therapy is unclear, it is certainly a small fraction of the overall patients who present to the Emergency Department with acute ischemic stroke. All patients enrolled in the REVASCAT trial were also included in a national registry known as SONIA. SONIA catalogued 2576 patients (only 15.6% of all stroke patients seen) with some form of reperfusion therapy over the time period REVASCAT enrolled patients (5). The vast majority of these patients, 2036 (79%), received only tPA. 540 (21%) patients underwent endovascular therapy. Of these only 111 (24%) were eligible for enrollment into the REVASCAT trial. Only 4.3% of the patients in the SONIA registry, and only 0.3% of all stroke patients during the 2-year period, were eligible for inclusion in the REVASCAT trial (5). This accounts for a small minority of the patients presenting to the Emergency Department with symptoms consistent with acute ischemic stroke. Of note, the criteria used in the REVASCAT trial to determine eligibility are more inclusive than those used in EXTEND-IA, ESCAPE, and SWIFT-PRIME. If you believe those trials were successful because of their inclusion criteria, then the eligible group would account for an even smaller portion of stroke patients presenting to the Emergency Department. In the SWIFT-PRIME trial it took 2 years and 39 centers to recruit 196 patients (4). That comes out to 0.2 patients per center per month. EXTEND-IA and ESCAPE recruited only 0.3 and 1.44 patients per center per month respectively (2,3).
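The recruitment figure for SWIFT-PRIME follows directly from the numbers quoted above. A short sketch of the calculation (the helper `recruitment_rate` is just an illustrative name):

```python
def recruitment_rate(patients, centers, months):
    """Average number of patients enrolled per center per month."""
    return patients / (centers * months)


# SWIFT-PRIME figures quoted in the text: 196 patients, 39 centers, 2 years.
rate = recruitment_rate(patients=196, centers=39, months=24)
print(f"SWIFT-PRIME: {rate:.2f} patients per center per month")
```

Put another way, the average SWIFT-PRIME center enrolled roughly one patient every five months.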

Even the most skeptical will find difficulty denying there is a definite treatment effect observed in the recent trials examining endovascular therapy in acute ischemic stroke. The magnitude of this effect has yet to be defined. Its borders are obscured by the murkiness of small sample sizes, extreme selection bias and prematurely stopped trials. There are also clear harms associated with this invasive procedure. Both the REVASCAT trial and the earlier trials examining endovascular therapy (IMS-3, SYNTHESIS and MR RESCUE) demonstrated that when performed on the wrong patient population, not only will endovascular therapy fail to provide benefit, it may in fact be harmful (5,9,10,11). This is simply not a yes or no question. The resources required to build an infrastructure capable of supporting endovascular therapy on a national level are daunting. Though we have reached a certain degree of clarity that endovascular therapy for acute ischemic stroke provides benefit, how well and in whom remains murky. The overeager truncation of important trials has left us adrift in a sea of fog. Unsure if the shoreline we paddle towards is a warm welcoming beachfront or a rocky coast prepared to demolish our vessel upon arrival.

Sources Cited:

  1. Berkhemer OA, Fransen PS, Beumer D, et al. A randomized trial of intraarterial treatment for acute ischemic stroke. N Engl J Med. 2015;372(1):11-20.
  2. Campbell BC, Mitchell PJ, Kleinig TJ, et al. Endovascular Therapy for Ischemic Stroke with Perfusion-Imaging Selection. N Engl J Med. 2015.
  3. Goyal M, Demchuk AM, Menon BK, et al. Randomized Assessment of Rapid Endovascular Treatment of Ischemic Stroke. N Engl J Med. 2015.
  4. Saver JL, Goyal M, Bonafe A, et al. Stent-Retriever Thrombectomy after Intravenous t-PA vs. t-PA Alone in Stroke. N Engl J Med. 2015.
  5. Jovin TG, Chamorro A, Cobo E, et al. Thrombectomy within 8 Hours after Symptom Onset in Ischemic Stroke. N Engl J Med. 2015.
  6. Guyatt GH, Briel M, Glasziou P, Bassler D, Montori VM. Problems of stopping trials early. BMJ. 2012;344:e3863.
  7. Pocock SJ, Hughes MD. Practical problems in interim analyses, with particular regard to estimation. Control Clin Trials. 1989;10(4 Suppl):209S-221S.
  8. Bassler D, Briel M, Montori VM, et al. Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta-regression analysis. JAMA. 2010;303(12):1180-7.
  9. Broderick JP, Palesch YY, Demchuk AM, et al. Endovascular therapy after intravenous t-PA versus t-PA alone for stroke. N Engl J Med. 2013;368(10):893-903.
  10. Ciccone A, Valvassori L, Nichelatti M, et al. Endovascular treatment for acute ischemic stroke. N Engl J Med. 2013;368(10):904-13.
  11. Kidwell CS, Jahan R, Gornbein J, et al. A trial of imaging selection and endovascular treatment for ischemic stroke. N Engl J Med. 2013;368(10):914-23.

 

 

 

 

 

The Case of the Anatomic Heart Part 2


The PROMISE Trial, like any aptly named study, chose an acronym meant to inspire: in this case, the hope for a better tomorrow. And though the authors of the Prospective Multicenter Imaging Study for Evaluation of Chest Pain trial were not clear on the specific details their promise entailed, I fear the results of this trial will leave us feeling betrayed and forsworn.

The authors of the PROMISE Trial presented the findings from their massive undertaking at the 2015 ACC scientific assembly. The results were published simultaneously in the NEJM. Douglas et al randomized 10,003 patients to either standard non-invasive functional testing, as determined by the treating physician, or CTCA. Patients were recruited from outpatient facilities across North America when presenting with new onset chest pain in which the treating physician was suspicious of cardiac origin and had already ruled out ACS. Patients were excluded if they presented with unstable vitals, EKG changes, or positive biomarkers. Given the pragmatic nature of the trial, all other treatment decisions were left to the prerogative of the treating physician (1).

The authors found no difference in their primary outcome, the composite endpoint of death, MI, hospitalization for UA, or major procedural complications over the follow-up period (at least 12 months with an average follow-up of 24 months), between the CTCA and traditional testing groups (3.3% vs 3.0%). In fact, other than a small decrease in the number of negative invasive catheterizations seen in the CTCA arm (3.4% vs 4.3%), the authors were unable to find any statistically significant differences in the multitude of secondary endpoints measured. As far as safety outcomes, the authors did cite some relevant concerns. Most notably, those randomized to receive CTCA as their screening test underwent significantly more downstream testing and interventions. In the CTCA arm 12.2% underwent invasive catheterization compared to 8.1% in the standard testing arm, and 6.2% compared to 3.2% underwent subsequent revascularization, including a 1.5% vs 0.76% rate of coronary artery bypass grafting (CABG) (1).

Now some might argue that the PROMISE trial was not performed on Emergency Department patients and thus its application to our low risk chest pain population is questionable. In some senses this may be true. Patients evaluated in the Emergency Department for chest pain are inherently at higher risk than their counterparts seen in primary care offices. Conversely, the PROMISE Trial evaluated a cohort of chest pain patients in whom the treating physician suspected the symptoms were likely of cardiac origin. Before being enrolled in the trial all of these patients were ruled out for ACS with negative EKGs and biomarkers. Additionally, the treating physician felt further provocative testing was necessary. This is not unlike the cohort of patients we include in our low-risk chest pain population in the Emergency Department. Furthermore, we have four trials with over 3,000 Emergency Department patients evaluating the efficacy of CTCA, which demonstrate almost identical results to the PROMISE Trial (2,3,4,5). Each of these studies determined that CTCA adds no additional prognostic value to our standard risk stratification strategies and likely leads to increased invasive procedures. In a meta-analysis of these four trials published in JACC in 2013, Hulten et al found a significant increase in the number of invasive angiographies, PCIs and revascularizations performed in the patients randomized to the CTCA arm (6). PROMISE demonstrated the exact same tendencies of CTCA in a much larger cohort (1).

Why did PROMISE fail to find a difference? What are we to infer about the acuity and severity of a disease state that does not benefit from a timely and accurate diagnosis? We know CTCA is far more accurate than our more traditional forms of provocative testing. And yet, why in this massive trial did it fail to find any difference in clinically relevant outcomes? Might it be that a time-sensitive anatomical definition of CAD is unnecessary?

The first reason why PROMISE failed to show a difference is that the population enrolled in the trial was at such low risk for the disease state in question, they are likely to do well whatever diagnostic testing strategy they undergo. Only 3.1% of the group had any event during the follow-up period. Only 1.5% died and only 0.7% had a MI (1). With such a low event rate, even if CTCA is an effective means of identifying and preventing MI and cardiac death, a statistically significant benefit is unlikely to be found even with a sample size as large as 10,000 patients.
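To see why a low event rate blunts even a large trial, one can run a standard two-proportion sample size approximation. The sketch below takes the 3.1% event rate reported above and pairs it with a hypothetical 20% relative reduction, 80% power and a two-sided alpha of 0.05; those three design values are assumptions chosen for illustration and are not taken from the PROMISE protocol.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control, p_treat, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided comparison of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2)


event_rate = 0.031           # overall event rate quoted in the text
relative_reduction = 0.20    # hypothetical effect size, not from the trial
needed = n_per_arm(event_rate, event_rate * (1 - relative_reduction))
print(f"About {needed} patients per arm, {2 * needed} in total")
```

Under these assumptions the required enrollment is roughly double the 10,003 patients PROMISE randomized.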

The second reason why the PROMISE Trial is likely to have failed, is simply because we are functioning under the misconception that when we diagnose these patients with obstructive CAD, an invasive strategy is superior to optimal medical management. Though we know that reperfusion therapy has objective benefits in patients actively experiencing a myocardial infarction, these same benefits have failed to translate to the more stable lesions of CAD. Multiple large RCTs have failed to find a benefit of PCI over optimal medical management in patients with stable obstructive CAD (7,8). Stergiopoulos et al have now published a number of meta-analyses examining these trials, which have also failed to uncover benefits that may have been missed in the weaker powered individual trials (9,10).

The PROMISE trial was not the only trial presented at the ACC Scientific Assembly examining the pragmatic use of CTCA for the diagnostic work up of chest pain. The SCOT-HEART trial was yet another massive undertaking, the results published online in The Lancet in concert with the oral presentation. In this trial, investigators randomized 4,146 patients referred to chest pain clinics across Scotland to either a standard work up or a standard work up plus the addition of CTCA. Although by sheer quantity it does not possess the statistical power of the PROMISE trial, it does present us with some insights which the PROMISE trial proved incapable of providing (11).

The unique design of the SCOT-HEART trial ensured all patients received a full standardized evaluation, often including (85% of the time) an exercise stress test. It was only after the treating physician assessed the patient, reported his or her baseline estimate of the likelihood of CAD and determined what further testing and treatment strategies he or she would recommend, that the patients were randomized to either receive CTCA or standard care. Like PROMISE, this was a pragmatic trial design and, other than the use of CT angiography, clinicians were given free rein to treat each patient as they deemed appropriate. At 6 weeks the physicians were then asked again to assess the likelihood of CAD (11).

What the authors revealed was that the use of CTCA significantly improved the clinicians’ confidence in their diagnosis of both CAD and angina of cardiac origin (the trial’s primary endpoint). They also found a statistically significant increase in the number of patients diagnosed with CAD in the group randomized to receive CTCA (23% vs 11%). Additionally, patients in the CTCA arm were more frequently shifted towards more aggressive and invasive modes of management when compared to the standard care arm. Specifically, more patients in the CTCA group saw an increase in the number of medical therapies prescribed and invasive catheterizations performed (11).

In summary, patients randomized to CTCA were more often given the diagnosis of CAD and were more likely to be treated with medical therapies and invasive procedures than the patients in the standard care group. But did all of these investigations and interventions lead to better outcomes? Simply put, no. The rate of cardiovascular death and myocardial infarction during the follow-up period (1.7 years) was 1.3% vs 2.0%, a 0.7% difference that was not statistically significant. The overall mortality was 0.8% vs 1.0%, respectively. Even the decrease in the quality and severity of the patients’ symptoms (the reason the patients presented to the clinic in the first place) at 6 weeks was identical (11).

The PROMISE trial demonstrated the use of CTCA promotes increased downstream testing and intervention. The SCOT-HEART trial validated these findings. The SCOT-HEART trial also demonstrated CTCA provides a significant degree of diagnostic certainty to the treating physician, leading to more aggressive medical management. And yet knowing a lot and doing a lot failed to equate to a reduction in mortality or myocardial infarctions. These are coronary mirages, promising the weary clinicians water when in reality they are just leading them deeper into the barren desert.

Despite its size and decisively negative results, perhaps the most important study arm in the PROMISE Trial did not exist: an arm in which patients were randomized to not receive any form of provocative testing, but rather were treated medically as per the judgment of their physician. Both the PROMISE and SCOT-HEART trials demonstrated that a cohort of outpatient chest pain patients are at such low risk for adverse events, they are likely to do equally well with whatever provocative test is used, or more importantly without any at all. Surely it is time to examine such a hypothesis, to add a third arm to the PROMISE cohort. The ISCHEMIA Trial is currently enrolling patients to compare medical management vs invasive strategies in the setting of a positive provocative test. Unfortunately this trial’s applicability is limited by the fact that the authors insist all patients undergo a CTCA before enrollment to rule out the presence of left main arterial disease. And though this may be a step in the right direction, we still can’t escape our need for anatomical certainty in the face of diminishing clinical utility. Surely it is time we define the value of both provocative and anatomical testing in the low risk chest pain population, truly a Promise worth keeping.

Sources Cited:

  1. Douglas PS, Taylor A, Bild D, et al. Outcomes research in cardiovascular imaging: report of a workshop sponsored by the National Heart, Lung, and Blood Institute. Circ Cardiovasc Imaging 2009;2:339-348
  2. Goldstein JA, Chinnaiyan KM, Abidov A, et al. The CT-STAT (Coronary Computed Tomographic Angiography for Systematic Triage of Acute Chest Pain Patients to Treatment) trial. J Am Coll Cardiol 2011;58:1414–22.
  3. Hoffmann U, Truong QA, Schoenfeld DA, et al. Coronary CT angiography versus standard evaluation in acute chest pain. N Engl J Med 2012;367:299–308.
  4. Litt HI, Gatsonis C, Snyder B, et al. CT Angiography for safe discharge of patients with possible acute coronary syndromes. N Engl J Med 2012;366:1393–403.
  5. Goldstein JA, Gallagher MJ, O’Neill WW, Ross MA, O’Neil BJ, Raff GL. A randomized controlled trial of multi-slice coronary computed tomography for evaluation of acute chest pain. J Am Coll Cardiol 2007;49:863–71.
  6. Hulten E, Pickett C, Bittencourt MS, et al. Outcomes after coronary computed tomography angiography in the emergency department: a systematic review and meta-analysis of randomized, controlled trials. J Am Coll Cardiol. 2013;61(8):880-92.
  7. Boden WE, O’Rourke RA, Teo KK, et al. Optimal medical therapy with or without PCI for stable coronary disease. N Engl J Med. 2007;356(15):1503-16.
  8. Mehta SR, Cannon CP, Fox KA, et al. Routine vs selective invasive strategies in patients with acute coronary syndromes: a collaborative meta-analysis of randomized trials. JAMA. 2005;293(23):2908-17.
  9. Stergiopoulos K, Brown DL. Initial Coronary Stent Implantation With Medical Therapy vs Medical Therapy Alone for Stable Coronary Artery Disease: Meta-analysis of Randomized Controlled Trials. Archives of Internal Medicine 2012 Feb;172(4):312.
  10. Stergiopoulos K, Boden WE, Hartigan P, et al. Percutaneous Coronary Intervention Outcomes in Patients With Stable Obstructive Coronary Artery Disease and Myocardial Ischemia: A Collaborative Meta-analysis of Contemporary Randomized Clinical Trials. JAMA Intern Med. 2014;174(2):232-240.
  11. The SCOT-HEART investigators. CT coronary angiography in patients with suspected angina due to coronary heart disease (SCOT-HEART): an open-label, parallel group multicenter trial. Lancet. 2015; (published online March 15.)

The Case of Dubious Squire


I often get the sense that the makers of many biomarkers envision us as helpless damsels in distress drowning in an icy pond or trapped in a monumental tower with no obvious means of descent. I imagine they think that in our desperate grasps for aid, we will cling to whatever assistance they may offer, independent of its buoyancy. But in these moments of fear and uncertainty we must remember that for a test to be useful to a clinician, it must not only be accurate and reliable, it must also add diagnostic value above the clinician’s own inherent aptitude. B-type natriuretic peptide (BNP) and its natriuretic derivatives are a classic example of such a test, heralded for its isolated diagnostic properties without asking the simple question, how does it help the physician? Through statistical misdirection, the distributors of natriuretic peptides have published research hailing their diagnostic prowess when examined in isolation. Such publications have led to these assays becoming recommended components of the workup for any patient suspected of having acute decompensated heart failure (1,2,3). A recent meta-analysis performed by the helpful folks responsible for the NICE guidelines sought to examine the validity of these recommendations and determine the true diagnostic accuracy of natriuretic peptides (4). And yet, I fear these authors, in their effort to provide an accurate representation of the assay’s diagnostic accuracy, have forgotten to take into account the most important factor when evaluating any diagnostic test, the clinician.

In this meta-analysis, Roberts et al examined the clinical accuracy of BNP, NTproBNP, and MRproANP for the diagnosis of acute decompensated heart failure in the Emergency Department. Specifically, the goal was to evaluate the low risk criteria proposed by the 2012 European Society of Cardiology guidelines for heart failure: a BNP ≤100 ng/L, an NTproBNP ≤300 ng/L, and an MRproANP ≤120 pmol/L. They also examined the utility of these assays at intermediate and high levels (100-500 ng/L, and >500 ng/L for BNP; 300-1800 ng/L, and >1800 ng/L for NTproBNP; and >120 pmol/L for MRproANP) (4).

The authors identified 42 articles, examining 37 different cohorts, that met criteria for inclusion into their meta-analysis. Combining these studies, the authors calculated pooled test characteristics for each of the natriuretic assays in question. They found that at the low thresholds proposed by the European Society of Cardiology, the assays performed equally poorly. All three demonstrated high sensitivities: 95%, 99%, and 95% respectively. Of course, by selecting such a low cutoff, the authors ensured that a large proportion of the patients without acute heart failure would also test positive. The specificities of each of these assays were a dismal 63%, 43%, and 56% respectively. As with any diagnostic tool, by raising the threshold of what you consider positive, the authors were able to improve the assay’s specificity. When the intermediate thresholds were utilized, the specificities increased to 86% and 76% for BNP and NTproBNP respectively (the authors did not have enough data on MRproANP to adequately calculate accuracy in this intermediate range). Of course this amplified specificity came at the price of a loss of sensitivity, 85% and 90% respectively. When using the high threshold, the authors were able to augment the tests’ specificity even further, but of course at this high level a large portion of patients with acute decompensated heart failure are missed. At a threshold of ≥500 ng/L, diagnostic meta-analysis was not performed due to inadequate data. BNP demonstrated sensitivities from the individual studies ranging from 35% to 83%, with paired specificities from 78% to 100%. Likewise, at a threshold of ≥1800 ng/L, NTproBNP reported sensitivities ranging from 67% to 87% with paired specificities ranging from 72% to 95%. Finally, at the threshold of >120 pmol/L, MRproANP demonstrated sensitivities ranging from 84% to 98% and paired specificities from 40% to 84% (4).
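One way to see what these pooled figures mean at the bedside is to convert each sensitivity and specificity pair into likelihood ratios. The sketch below applies the standard definitions to the low-threshold numbers quoted above; the likelihood ratios themselves are derived here and are not values reported by Roberts et al.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled estimates at the low (rule-out) thresholds quoted in the text.
assays = {
    "BNP, 100 ng/L": (0.95, 0.63),
    "NTproBNP, 300 ng/L": (0.99, 0.43),
    "MRproANP, 120 pmol/L": (0.95, 0.56),
}

for name, (sens, spec) in assays.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}")
```

At these thresholds a positive result multiplies the odds of heart failure by less than 3 for every assay, while a negative result divides them by more than 10.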

The authors conclude, “The use of NTproBNP and B type natriuretic peptide at the rule-out threshold recommended by the recent European Society of Cardiology guidelines on heart failure provides excellent ability to exclude acute heart failure in the acute setting with reassuringly high sensitivity. The specificity is modest at all but the highest values of natriuretic peptide, therefore confirmatory testing by cardiac imaging is required in patients with positive test results (4).”

At face value this is a fair conclusion, as all three of these assays seem to perform moderately well at either extreme of their diagnostic spectrum. At very low levels it is safe to say that the likelihood that the patient’s symptoms were caused by heart failure is fairly low. Likewise, when significantly elevated, these assays boast specificities high enough for clinical use. Unfortunately these results do very little to explain the true utility of natriuretic peptides. By isolating these assays’ test characteristics outside the clinical arena, the authors have falsely inflated the utility of BNP and its natriuretic derivatives.

The first issue that is pervasive throughout the literature expounding the utility of natriuretic peptides is the gold standard used to evaluate their diagnostic capabilities. The most prevalent gold standard used is a retrospective review performed by two Cardiologists blinded to the results of the natriuretic peptide in question. 31 of the 37 cohorts in this meta-analysis used some derivative of this questionable gold standard. In one of the largest trials conducted, the Breathing Not Properly (BNP) trial by Maisel et al, authors examined 1586 patients presenting to the Emergency Department with acute dyspnea (5). They found that the two Cardiologists disagreed with the initial Emergency Physician’s diagnoses 14% of the time and disagreed with each other 10.7% of the time (6). This suggests that the cases in question were clearly not straightforward. If two Cardiologists with access to the patients’ entire hospital course disagreed with each other almost as often as they disagreed with the initial diagnosis of the Emergency Physician, then it is fair to say using this definition as the gold standard is less than ideal.

Despite this tarnished gold standard the question remains, how do natriuretic peptides perform when used in the clinical arena? More specifically, how well do natriuretic peptide assays help the Emergency Physician differentiate the causes of dyspnea in the subset of patients in which there is considerable diagnostic uncertainty? In the BNP trial Maisel et al examined the Emergency Physician’s ability to correctly identify acutely decompensated heart failure. They found that our overall accuracy, when compared to the less than perfect gold standard of a retrospective review performed by two Cardiologists, was 86% (6). In the subset of patients in which the Emergency Physician was certain the patients’ dyspnea was not cardiac in origin (<5% chance of CHF), their diagnostic accuracy was superb (92%). Likewise, in the group of patients in which the Emergency Physician was 95% certain the patient did in fact have CHF, they were correct 95% of the time (7). It was only in the intermediate group (between 20%-80% probability), in which the Emergency Physician was unsure of the likelihood of CHF, that their diagnostic capabilities were understandably poor. It is in this intermediate group that we would hope the natriuretic peptides could provide us with some guidance. We should not ask how accurately peptide assays predict acute decompensated heart failure, but rather how well peptide assays predict acute decompensated heart failure in the subset of patients that present a diagnostic challenge to the Emergency Physician. When charged with such a task these assays are far less impressive.

Although in their initial publication Maisel et al failed to disclose the diagnostic abilities of the Emergency Physicians, citing only BNP’s performance using the retrospective cutoff of 100 ng/L (sensitivity of 90%, a specificity of 76%), the authors later published these findings in a secondary analysis. Published by McCullough et al in Circulation, the authors revealed that when the Emergency Physician was certain that the patient’s cause of dyspnea was either definitely CHF or definitely not CHF, their unstructured judgment outperformed that of the BNP assay. For patients in which the Emergency Physician was certain CHF was not the cause of their dyspnea their accuracy was 92% vs the BNP which was only 84%. Likewise when the Emergency Physician was certain the patient did in fact have CHF, again their judgment outperformed the diagnostic abilities of the BNP assay (accuracy of 95% vs 92%) (7). In fact even in the subset of patients where the Emergency Physician was fairly certain the diagnosis was CHF (>80%), their positive likelihood ratio of 11.5 was far more impressive than that of the BNP (3.4) (8). In the 27.8% of patients in which the Emergency Physician was unclear of the diagnosis, the very group in which we would hope the BNP could provide guidance, its performance was unhelpful. In this subset of patients, at a cutoff of 100 ng/L, the assay demonstrated no clinical utility with a sensitivity and specificity of 79% and 71% respectively (8).
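To put the intermediate-group numbers in more familiar terms, the sensitivity and specificity quoted above can be converted into likelihood ratios and then into post-test probabilities. What follows is only a rough sketch: the 50% pre-test probability is an assumption chosen to represent a clinician who is truly undecided, and the function names are illustrative rather than taken from any of the cited papers.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Figures quoted above for the intermediate-probability subgroup (cutoff 100 ng/L).
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.79, specificity=0.71)

# Assumption for illustration only: a clinician who is genuinely on the fence.
pre_test = 0.50
after_positive = post_test_probability(pre_test, lr_pos)
after_negative = post_test_probability(pre_test, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"Post-test probability after a positive BNP: {after_positive:.0%}")
print(f"Post-test probability after a negative BNP: {after_negative:.0%}")
```

Under that assumed starting point, a positive result moves the estimate to roughly 73% and a negative result to roughly 23%, neither of which is decisive enough to rule the diagnosis in or out.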

Each of the 37 studies included in the Roberts et al meta-analysis failed to truly examine how natriuretic peptides perform clinically. As discussed, the majority of these trials employed a less than ideal gold standard comparator and were so confounded by spectrum bias, they rarely examined the subgroup of patients in which the diagnosis was unclear. Additionally most of these studies used a retrospectively derived cutoff calculated to demonstrate the assay’s optimal performance. This type of overfitting inevitably leads to decreased performance when validated in a novel cohort. Ideally a randomized trial comparing a natriuretic peptide guided management to standard practice could demonstrate what, if any, clinical utility these assays provide. A number of such trials have been conducted.

The first was published in the NEJM in 2004 by Mueller et al. In this trial the authors randomized 452 patients presenting to the emergency department with acute dyspnea to either a diagnostic strategy utilizing a BNP assay or a standard work up (9). Authors powered their study to detect a 20% reduction in time to discharge (an interesting primary endpoint to choose if one thinks BNP possesses true clinical relevance), defined as the interval from presentation at the Emergency Department to discharge. The authors found a significant difference in time to discharge (8 vs 11 days) as well as shorter times to treatment for the BNP group (63 vs 90 minutes), decreased rates of hospitalization (75% vs 85%) and decreased admission to the ICU (15% vs 24%). In fact every outcome variable trended towards better in the group randomized to receive the BNP-guided diagnostic strategy. Initially these results seem significantly in favor of using BNP in the diagnostic workup of acute dyspnea, until one examines the other RCTs evaluating this question (9).

The second RCT examining natriuretic peptides for the management of acute dyspnea was published by Moe et al in Circulation in 2007 (10). In this trial, the authors randomized 500 patients to either a NT-proBNP guided strategy or standard care. Like the previous study the authors used the clinically dubious endpoint of initial ED visit duration as their primary endpoint. Though the authors found a statistically significant difference in initial ED visit time, the 0.7-hour difference (5.6 hrs vs 6.3 hrs) hardly seems clinically relevant. In fact the remainder of clinically important variables all favored the usual care group (in-hospital mortality 4.4% vs 2.4% and 60-day mortality 5.4% vs 4.4%) (10). Three other trials published subsequently found similar results. Other than clinically questionable reductions in length of stay, the use of natriuretic peptides had no meaningful effect on clinical outcomes (11,12,15). When these trials’ data were pooled in a meta-analysis published by Trinquart et al in The American Journal of Emergency Medicine in 2011, authors found no significant difference in any of the multitude of clinically relevant variables including hospital admission rate, length of hospital stay, mortality or rates of re-hospitalization (13). Even in the long-term management of patients with known heart failure, when compared to a symptom guided approach, a BNP guided protocol led to further diagnostic testing and more aggressive medical therapy without producing a difference in clinically relevant outcomes (18-month survival free of any hospitalization was 41% vs 40%) (16).

This is not a proclamation of the infallibility of the Emergency Physician but rather the recognition of our shortcomings. There is a clear group of patients that present a diagnostic challenge, for whom further confirmatory investigations could provide guidance. Despite the industry-sponsored studies designed to propagate an overinflated self-worth, a close examination of the natriuretic peptides reveals they add little value to Physicians’ judgment. When we as Emergency Physicians are certain of the diagnosis of acute decompensated heart failure, our intrinsic diagnostic capabilities outperform those of natriuretic peptides. In the patients that present as a diagnostic challenge, these assays are far too insensitive and non-specific to add substantial diagnostic clarity. Furthermore we have other, more diagnostically robust, tools like point of care ultrasound to assist in these challenging circumstances (14). Natriuretic peptides are not the diagnostic saviors that they are commonly proclaimed as. More importantly we are not in need of rescue as often as the makers of these peptides would have us believe. On the rare occasion we do require aid, should we not demand a far more resolute champion?

Sources Cited:

  1. Yancy CW, Jessup M, Bozkurt B, Butler J, Casey DE, Drazner M, et al. ACCF/AHA guideline for the management of heart failure: a report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. Circulation 2013;128:e240-327.
  2. McMurray JJV, Adamopoulos S, Anker SD, Auricchio A, Bohm M, Dickstein K, et al. ESC guidelines for the diagnosis and treatment of acute and chronic heart failure 2012: The Task Force for the Diagnosis and Treatment of Acute and Chronic Heart Failure 2012 of the European Society of Cardiology. Developed in collaboration with the Heart Failure Association (HFA) of the ESC. Eur J Heart Fail 2012;14:803-69.
  3. Thygesen K, Mair J, Mueller C, Huber K, Weber M, Plebani M, et al. Recommendations for the use of natriuretic peptides in acute cardiac care: a position statement from the Study Group on Biomarkers in Cardiology of the ESC Working Group on Acute Cardiac Care. Eur Heart J 2012;33:2001-6.
  4. Roberts E, Ludman AJ, Dworzynski K, Al-Mohammad A, Cowie MR, McMurray JJV, et al. The diagnostic accuracy of the natriuretic peptides in heart failure: systematic review and diagnostic meta-analysis in the acute care setting. BMJ 2015;350:h910.
  5. Maisel AS, Krishnaswamy P, Nowak RM, et al. Rapid measurement of B-type natriuretic peptide in the emergency diagnosis of heart failure. N Engl J Med. 2002;347:(3)161-7.
  6. McCullough PA, Nowak RM, McCord J, et al. B-type natriuretic peptide and clinical judgment in emergency diagnosis of heart failure: analysis from Breathing Not Properly (BNP) Multinational Study. Circulation. 2002;106:(4)416-22.
  7. Schwam E. B-type natriuretic peptide for diagnosis of heart failure in emergency department patients: a critical appraisal. Acad Emerg Med. 2004;11:(6)686-91.
  8. Hohl CM, Mitelman BY, Wyer P, Lang E. Should emergency physicians use B-type natriuretic peptide testing in patients with unexplained dyspnea? CJEM. 2003;5:(3)162-5.
  9. Mueller C, Scholer A, Laule-Kilian K, Martina B, Schindler C, Buser P, et al. Use of B-type natriuretic peptide in the evaluation and management of acute dyspnea. N Engl J Med 2004;350(7):647-54.
  10. Moe GW, Howlett J, Januzzi JL, Zowall H. N-terminal pro-B-type natriuretic peptide testing improves the management of patients with suspected acute heart failure: primary results of the Canadian prospective randomized multicenter IMPROVE-CHF study. Circulation 2007;115(24):3103-10.
  11. Rutten JH, Steyerberg EW, Boomsma F, van Saase JL, Deckers JW, Hoogsteden HC, et al. N-terminal pro-brain natriuretic peptide testing in the emergency department: beneficial effects on hospitalization, costs, and outcome. Am Heart J 2008;156(1):71-7.
  12. Schneider HG, Lam L, Lokuge A, Krum H, Naughton MT, De Villiers Smit P, et al. B-type natriuretic peptide testing, clinical outcomes, and health services use in emergency department patients with dyspnea: a randomized trial. Ann Intern Med 2009;150(6):365-71.
  13. Trinquart L, Ray P, Riou B, Teixeira A. Natriuretic peptide testing in EDs for managing acute dyspnea: a meta-analysis. Am J Emerg Med. 2011;29:(7)757-67.
  14. Al Deeb M, Barbic S, Featherstone R, Dankoff J, Barbic D. Point-of-care ultrasonography for the diagnosis of acute cardiogenic pulmonary edema in patients presenting with acute dyspnea: a systematic review and meta-analysis. Acad Emerg Med. 2014;21:(8)843-52.
  15. Singer AJ, Birkhahn RH, Guss D, et al. Rapid Emergency Department Heart Failure Outpatients Trial (REDHOT II): a randomized controlled trial of the effect of serial B-type natriuretic peptide testing on patient management. Circ Heart Fail. 2009;2:(4)287-93.
  16. Pfisterer M, Buser P, Rickli H, et al. BNP-guided vs symptom-guided heart failure therapy: the Trial of Intensified vs Standard Medical Therapy in Elderly Patients With Congestive Heart Failure (TIME-CHF) randomized trial. JAMA. 2009;301:(4)383-92.

The Adventure of the Second Stain Continues

The CT-LP (lumbar puncture) diagnostic pathway has been a permanent fixture in the arsenal of the Emergency Physician for what seems like an eternity. Steadfast in its dependability, for many generations, the LP was a necessity for Emergency Physicians to safely exclude the diagnosis of subarachnoid hemorrhage (SAH). And yet, rarely a moment has passed over the past few years when Dr. Jeffrey Perry has not politely demonstrated how little we truly know about this disease process and the diagnostic tools associated with it. His 2011 paper questioning the necessity of an LP following a negative head CT under 6-hours from symptom onset, shook the once solid ground that the LP firmly stood upon (1). As if this attack on our reliable comrade was not enough, his most recent publication examining the diagnostic capabilities of the lumbar puncture itself has left our confidence in this once dependable testing strategy in turmoil.

In this paper, published in February of 2015 in The BMJ, Perry et al utilized a subset of patients from two cohorts originally enrolled to derive and validate his Ottawa SAH rule (3,4). Authors examined 1739 of these patients who received a lumbar puncture as part of their workup for SAH (2). They then sought to assess the diagnostic accuracy of this tool. Similar to common practice, they prospectively defined a positive tap as greater than 1 RBC on fluid aspirate. When this impossibly low threshold was upheld, LP’s performance was less than stellar. Of the 1739 patients who received an LP, 641 (36.9%) had positive findings, only 15 of which were actually from subarachnoid blood. Most of these false positive results were trivial, as 476 (74.3%) had counts of ≤100×10⁶/L and 94 (14.8%) had counts of 101-1000×10⁶/L. Only 10.4% of these patients were found to have clinically concerning levels of RBCs in their CSF (counts of >1000×10⁶/L). Despite the predominance of low RBC counts, a great majority of the patients in whom the LP was positive (419) received invasive angiographic imaging.
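The counts in this paragraph are enough to work out how little a "positive" tap meant at this threshold. The short sketch below simply restates the arithmetic using the figures quoted above; the variable names are mine and nothing here comes from the paper's own analysis code.

```python
# Counts quoted above from Perry et al (2).
lp_performed = 1739
lp_positive = 641      # taps with more than 1 RBC in the CSF
true_sah = 15          # positive taps actually due to subarachnoid blood
angiography = 419      # LP-positive patients who went on to angiographic imaging

false_positive = lp_positive - true_sah
positive_rate = lp_positive / lp_performed
ppv = true_sah / lp_positive                     # positive predictive value
angiography_fraction = angiography / lp_positive

print(f"Positive taps: {positive_rate:.1%} of all LPs")
print(f"False positives: {false_positive}")
print(f"Positive predictive value: {ppv:.1%}")
print(f"LP-positive patients sent for angiography: {angiography_fraction:.1%}")
```

In other words, roughly 2 in every 100 positive taps represented a true bleed, while about two thirds of the patients with a positive tap were exposed to invasive imaging.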

When the LP was found to be negative (no RBCs in the CSF), it boasted a sensitivity of 100%. In an attempt to compensate for the unacceptably high number of false positives the authors retrospectively determined the ideal RBC cutoff to be 2000×10⁶/L. At this threshold the LP had a sensitivity of 93.3% (95% confidence interval 66.0% to 99.7%) and specificity of 92.8% (90.5% to 94.6%) for aneurysmal subarachnoid hemorrhage. If visual xanthochromia was added to this RBC cutoff, the sensitivity for ruling out SAH became 100% (95% confidence interval 74.7% to 100.0%).

These numbers are of course fraught with methodological pitfalls. The threshold of 2000×10⁶/L was retrospectively derived to best fit this specific cohort. Only 15 of the 1739 patients examined actually had the disease in question, making these calculations incredibly unstable (the confidence intervals surrounding their 100% sensitivity dropped as low as 74.7%). The threshold of 2000×10⁶/L is hardly robust enough for clinical use and will inevitably fail when applied in prospective fashion to a novel cohort.

Though this data is not definitive and further studies validating these findings are required, a number of valuable conclusions can be inferred. Surprisingly the most important of these has little to do with the diagnostic utility of the lumbar puncture.

In 2011 Perry et al published their game changing article in The BMJ examining the accuracy of a non-contrast head CT performed under 6-hours from symptom onset for the diagnosis of SAH (1). This paper was a secondary analysis of the original cohort used to derive the Ottawa SAH Rules (4). Using this preexisting cohort they assessed the accuracy of head CT for the diagnosis of SAH before and after a 6-hour threshold. The authors claim a sensitivity of 100% when CT was performed within 6-hours of symptom onset. However when the CT was performed after this 6-hour threshold, the sensitivity fell to 85.7%. This suggests that when performed within 6-hours, a non-contrast CT is sufficient to rule out SAH, allowing practitioners to forego a subsequent lumbar puncture. Though many have viewed this as practice changing, others argue a number of flaws in the study’s design prevent us from interpreting these conclusions with such conviction.

The most obvious and often discussed weakness of this study is the use of a surrogate endpoint in place of a true gold standard. Not all patients who had a negative head CT underwent a confirmatory lumbar puncture. In its place, the authors used a 6-month proxy outcome to demonstrate the safety of CT alone. Patients underwent a structured phone interview at the 6-month mark to ascertain their wellbeing. When attempts to reach patients over the phone failed, authors endeavored to determine their status by searching medical records from regional neurosurgical centers as well as coroner’s death records. Patients were considered to be free of SAH if on 6-month follow-up they were alive and well. In the case of patients who were discovered to have passed away during the follow-up period, if the cause of death was determined to be due to something other than SAH, their deaths were not counted as a missed diagnosis. Of the 1931 patients examined, 421 were lost to follow-up. Authors found 8 of these patients had passed away since their initial workup for subarachnoid hemorrhage. Although none of these patients were determined to have died because of SAH, the reliability of post mortem cause of death is questionable at best (5).

A far less discussed aspect of this study was how the authors’ definition of a positive CT influenced the validity of their results. The standard that Perry et al used to calculate the sensitivity of head CT was based upon the Neuroradiologist’s official report. In most facilities (as was the case at the centers participating in this study) what guides Emergency Physicians’ clinical decision-making is the initial wet read usually done by Radiology house staff or even the ED physicians themselves. The sensitivity we are concerned with is that of the wet read. The Neuroradiologists in this study were not blinded to the patients’ lab findings. As such we are unable to ascertain how many CTs done within 6 hours were initially read as negative, and only later after a positive LP was performed was the final report recorded as positive. If this had occurred with any frequency it would obviously harm the internal validity of the results. We are able to get a sense of how frequently this occurred by examining how many of the patients who were diagnosed with SAH had both a positive CT and LP. At least in theory, if the CT was positive then there would be no reason to perform the subsequent LP.

Of the 15 patients with SAHs that were diagnosed using a positive LP, 10 underwent head CTs and LPs that were both positive. The vast majority of these subarachnoid bleeds (n=8) were found in patients who received their CTs beyond the 6-hour threshold. There were however two patients that were identified as having received their CTs within 6-hours of symptom onset. In both these patients their initial CT was read as negative and only after a positive lumbar puncture was the final report changed to positive. If these two patients are taken into account, the adjusted sensitivity of CT under 6-hours from symptom onset is only 98.3% (with the confidence interval dropping as low as 93.6%).

These findings of course do nothing except muddy the already cloudy waters. Head CT, though fairly sensitive, will on occasion miss a subarachnoid bleed. The addition of CSF aspirate will very often offer a further degree of ambiguity. Furthermore the utilization of LP, at least in its current strategy, leads to an unacceptable number of false positives, exposing a large number of patients to needless downstream testing. If a more liberal view towards RBCs in the CSF is taken, the LP’s utility may be justifiable. Even with the retrospective best fit diagnostic capabilities calculated by Perry et al, the prevalence of SAH following a negative CT in under 6-hours is so low that further testing will likely lead to identifying far more false positive results than true subarachnoid bleeds. Clearly the conviction and certainty we once held for this testing strategy has suffered. Perhaps it is time for a shared decision making model. After all it is our patients’ value systems rather than our own biases that should guide these investigative journeys. Dr. Perry has demonstrated that the CT-LP pathway is far from straightforward. Perhaps it is time we confess these imperfections to the world at large and begin a far more honest conversation.
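The claim that false positives will outnumber true bleeds at low prevalence can be illustrated with a quick back-of-the-envelope calculation. This is a sketch under stated assumptions: the sensitivity and specificity are the retrospectively derived best-fit figures quoted above (93.3% and 92.8%), while the prevalence values are hypothetical placeholders, since the true rate of SAH after a negative early CT is not given here. Applying the published specificity to every patient tapped is itself an assumption.

```python
def expected_results(prevalence, sensitivity, specificity, cohort=1000):
    """Expected true and false positives in a cohort of the given size."""
    diseased = cohort * prevalence
    healthy = cohort - diseased
    true_positives = diseased * sensitivity
    false_positives = healthy * (1.0 - specificity)
    return true_positives, false_positives


# Best-fit figures quoted in the text for the 2000×10⁶/L cutoff.
SENSITIVITY = 0.933
SPECIFICITY = 0.928

# Hypothetical prevalences of SAH after a negative CT (assumptions, not data).
for prevalence in (0.001, 0.005, 0.01):
    tp, fp = expected_results(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"Prevalence {prevalence:.1%}: about {tp:.1f} true positives "
          f"and {fp:.1f} false positives per 1000 LPs")
```

Even at the most generous of these assumed prevalences, the false positives outnumber the true bleeds several times over, which is the crux of the argument for a shared decision making model.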

Sources Cited:

  1. Perry JJ, Stiell IG, Sivilotti ML, et al. Sensitivity of computed tomography performed within six hours of onset of headache for diagnosis of subarachnoid haemorrhage: prospective cohort study. BMJ. 2011;343:d4277.
  2. Perry JJ, Alyahya B, Sivilotti ML, et al. Differentiation between traumatic tap and aneurysmal subarachnoid hemorrhage: prospective cohort study. BMJ. 2015;350:h568.
  3. Perry JJ, Stiell IG, Sivilotti ML, et al. Clinical decision rules to rule out subarachnoid hemorrhage for acute headache. JAMA. 2013;310:(12)1248-55.
  4. Perry JJ, Stiell IG, Sivilotti ML, et al. High-risk clinical characteristics for subarachnoid haemorrhage in patients with acute headache: prospective cohort study. BMJ. 2010;341:c5204.
  5. Wexelman BA, et al. Survey of New York City resident physicians on cause-of-death reporting, 2010. Prev Chronic Dis. 2013;10:E76.

The Adventure of the Cardboard Box Continues

For those whose beliefs are already firmly in favor of endovascular therapy for acute ischemic stroke, the publication of the MR CLEAN trial earlier this year and more recently the EXTEND-IA and ESCAPE trials only serve as a big fat, “I TOLD YOU SO!” For the perpetual disbelievers, each of these trials possesses enough flaws to discredit their findings. For the appropriately skeptical among us, though these trials initially appear to discredit our well meaning rants, on closer examination they are far more validating.

Earlier this year the publication of a large, well done, RCT examining the use of endovascular treatment for acute ischemic stroke threatened to drastically change the acute management of CVA as we know it. And though this trial was given a most unfortunate name (MR CLEAN), it marked the first time endovascular therapy has demonstrated any clinically relevant benefit (1). We have discussed this trial in depth in two previous posts. While MR CLEAN’s results were promising, there are many reasons why they should be viewed with a healthy dose of skepticism. Before we commit to a resource heavy intervention like that of endovascular therapy, more studies validating these findings are required. Since the publication of MR CLEAN, two active trials were stopped early for benefit, seeming to be the very validation for which we asked. The results of both of these studies, EXTEND-IA and ESCAPE, were recently published in the NEJM (2,3).

The first trial, Extending the Time for Thrombolysis in Emergency Neurological Deficits — Intra-Arterial (EXTEND-IA) trial, by Campbell et al, is a multi-center RCT that examined the efficacy of endovascular treatment in patients with CVA whose symptoms began within 4.5 hours of randomization. Like MR CLEAN this trial was a stunning success. In fact its results far outpaced the, by comparison, paltry benefits found in MR CLEAN. EXTEND-IA was stopped early after enrolling 70 patients for overwhelming benefit. The rate of significant improvement after 3 days (reduction in NIHSS > 8) was 80% vs 37% in the endovascular group and control group respectively. Likewise the rate of favorable outcome at 90-days (mRS of 0-2) was 71% vs 40% respectively, boasting an absolute difference of 31% (2).

The second and far more statistically robust trial is the Endovascular Treatment for Small Core and Anterior Circulation Proximal Occlusion with Emphasis on Minimizing CT to Recanalization Times (ESCAPE) trial, published by Goyal et al. In this trial, authors examined patients up to 12-hours after symptom onset (though the large majority of the patients enrolled were evaluated within 3-hours of symptom onset). Like EXTEND-IA, the ESCAPE trial was an overwhelming success. Authors randomized 316 patients to either standard care or standard care plus endovascular therapy. Like EXTEND-IA, the authors found overwhelming benefits of the endovascular therapy. The rate of functional independence at 90-days (mRS of 0-2) was 53.0% vs 29.3% in favor of the endovascular arm, a 23.7% absolute increase in positive outcomes in patients who received endovascular therapy. For the first time in the history of reperfusion therapies for acute ischemic stroke, a clinically significant mortality benefit was demonstrated. 90-day mortality was 10.4% in the endovascular group compared to 19.0% in the control group. Not to mention the surprisingly low rate of intracranial hemorrhage (3.6% vs 2.7%) (3).

Neither trial is definitive in its own right. The EXTEND-IA cohort only examined the efficacy of endovascular therapy in 70 patients. Originally intending to enroll 100 patients, this trial was stopped prematurely after an interim analysis demonstrated such impressive results. This premature investigation of the sealed data was not performed because of a pre-planned interim analysis, but rather because of the publication of MR CLEAN (2). Though the remaining 30 patients would most likely not have altered the results, we cannot view this poorly powered trial as anything more than hypothesis building. In isolation, EXTEND-IA can only offer a guideline for the future of endovascular management in acute ischemic stroke. Even the authors themselves conceded this point in the statistical analysis plan they published in January 2014, in which they clearly defined EXTEND-IA as a phase II trial (4). That definition is conveniently left out of the formal publication in the NEJM, an oversight possibly induced by the unexpected magnitude of their success causing well deserved delusions of grandeur.

ESCAPE, though far more statistically hardy than EXTEND-IA, is still a rather small cohort suffering from the same unfortunate biases. Originally intending to enroll 500 patients, the authors called for an early stoppage, prior to their planned interim analysis, again because of the results of MR CLEAN. Although the sample size of 316 patients lends a stronger validity than the 70 patients examined in EXTEND-IA, the early stoppage prevents us from confidently assessing the true effect size this treatment may provide. Interestingly when implementing this unplanned analysis, the authors utilized a dichotomous outcome comparing the mRS scale of patients alive and independent (mRS of 0-2) at 90-days rather than the ordinal analysis they had originally chosen and utilized as their primary outcome when performing the power calculation. The ordinal scale has recently gained favor as an outcome measure in stroke trials because of its ability to augment the p-value and turn otherwise negative trials into statistical successes. Conversely it is almost impossible to determine the clinical relevance of the odds ratio it produces. Given the impressive benefits of both trials, the small statistical augmentations offered by ordinal analysis are irrelevant. As such the authors of both trials favored the more traditional dichotomous outcome. The 23.7% absolute difference measured by the dichotomous scale in the ESCAPE trial appears far more impressive than an odds ratio of 2.6 offered by ordinal analysis (3).

With the overwhelming success of both EXTEND-IA and ESCAPE, the MR CLEAN data appears almost lacking. In the MR CLEAN cohort, patients randomized to receive endovascular therapy had a 14% absolute benefit over those in the control group. It is safe to say neither group did all that well, with the number of patients alive and independent at 90-days reported as 33% and 19% respectively (1). The EXTEND-IA and ESCAPE cohorts however did far better (71% vs 40% and 53.0% vs 29.3% respectively) (2,3). Are we truly looking at the same patients as were examined in MR CLEAN, or do the EXTEND-IA and ESCAPE cohorts represent a completely different population?
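For readers who prefer to see the comparison laid out, the absolute differences and the corresponding numbers needed to treat can be worked out from the 90-day functional independence rates reported for each trial (MR CLEAN 33% vs 19%, EXTEND-IA 71% vs 40%, ESCAPE 53.0% vs 29.3%). This is only an illustrative sketch; the function and variable names are mine, and the calculation ignores the uncertainty around each estimate, which given the early stoppages is considerable.

```python
def absolute_benefit(treated_rate, control_rate):
    """Return (absolute rate difference, number needed to treat)."""
    difference = treated_rate - control_rate
    nnt = 1.0 / difference
    return difference, nnt


# Proportion alive and independent (mRS 0-2) at 90 days: (endovascular, control).
trials = {
    "MR CLEAN": (0.33, 0.19),
    "EXTEND-IA": (0.71, 0.40),
    "ESCAPE": (0.530, 0.293),
}

results = {name: absolute_benefit(*rates) for name, rates in trials.items()}

for name, (difference, nnt) in results.items():
    print(f"{name}: absolute difference {difference:.1%}, NNT about {nnt:.1f}")
```

On these figures the number needed to treat falls from roughly 7 in MR CLEAN to roughly 3 to 4 in the two imaging-selected cohorts, which is what makes the question of patient selection so pressing.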

It should come as no surprise that both the EXTEND-IA and ESCAPE cohorts included vastly different patients than those enrolled in MR CLEAN. In MR CLEAN, to be eligible for inclusion patients were required to have an occlusion of the distal intracranial carotid artery or middle cerebral artery (M1, M2) or anterior cerebral artery (A1) as identified by CT angiography (CTA), magnetic resonance angiography (MRA) or digital subtraction angiography (DSA) (1). Both EXTEND-IA and ESCAPE had far stricter inclusion restrictions. Patients who were enrolled in the EXTEND-IA cohort needed to demonstrate an ischemic penumbra on perfusion imaging with a small infarcted core (2). Though slightly different criteria were utilized, like EXTEND-IA, the ESCAPE cohort used CT angiographic imaging to identify patients with small infarcted cores and large areas of salvageable tissue (3). These inclusion criteria significantly narrowed the subset of stroke patients examined. These differences in patient selection are not only responsible for the almost unbelievable efficacy demonstrated in both of the EXTEND-IA and ESCAPE trials, they mark the first time that imaging criteria were successfully used to identify a cohort of stroke patients who may benefit from reperfusion therapy.

There has been a long history of failure in the use of perfusion imaging for the management of acute ischemic stroke. Early studies investigating the use of diffusion weighted MRI to identify potentially salvageable ischemic brain failed to show benefit (5,6,7,8,9). These failures may be due in part to the industry bias of only enrolling patients presenting > 3 hours after onset, in the hopes of extending FDA approved treatment windows and more importantly their profit margins. Though these trials showed promising rates of reperfusion, the consistently high incidence of intracranial hemorrhage overshadowed the minimal benefits. The MR RESCUE trial, published in NEJM in February 2013, was the first to utilize this technology to identify potential candidates for endovascular therapy (10). Again this trial failed to demonstrate that patients with ischemic penumbrae benefitted from revascularization. However this may have been due more to the trial’s flawed design than the technology’s deficiencies. The authors of MR RESCUE only enrolled patients after initial IV tPA failure. In contrast to these historical failures both the EXTEND-IA and ESCAPE cohorts, unencumbered by fears of disproving tPA’s early successes, aggressively pursued reperfusion therapy after salvageable tissue was identified on CT imaging. In doing so, these trials have, for the first time, identified the population that will most likely benefit from reperfusion therapy.

At the risk of sounding optimistic, both EXTEND-IA and ESCAPE are impressively positive trials. Although small and methodologically flawed, with likely exaggerated effect sizes, when viewed in concert with MR CLEAN, these trials present endovascular therapy in a promising light. For some time now legitimate cries for more data regarding tPA’s safety and efficacy in acute ischemic stroke management have been disregarded and marginalized. This almost fanatical acceptance is based around the success of the NINDS trial, a single poorly powered study which treated patients with IV tPA within 3-hours of symptom onset. Despite the many methodological flaws of NINDS, its results were never duplicated because of the pharmaceutical industry’s fear of losing the tenuous ground they had gained. Although there are significant harms associated with the administration of tPA, the literature has consistently suggested that there is a subset of patients who will benefit from its administration. Rather than working to identify this narrow population, we have witnessed an industry driven effort to expand the indications for reperfusion therapy. EXTEND-IA and ESCAPE have identified potential cohorts of patients who will likely benefit from reperfusion therapy. If these results can be confirmed, no longer will we be forced to use the blunt tool of perceived time from symptom onset to determine which patients are eligible for treatment. These trials should inspire us to not only explore the successful utilization of endovascular therapy, but also reexamine the harmful practice of thrombolytic therapy we currently employ.

Sources Cited:

  1. Berkhemer OA, Fransen PS, Beumer D, et al. A randomized trial of intraarterial treatment for acute ischemic stroke. N Engl J Med. 2015;372:(1)11-20.
  2. Campbell BC, Mitchell PJ, Kleinig TJ, et al. Endovascular Therapy for Ischemic Stroke with Perfusion-Imaging Selection. N Engl J Med. 2015.
  3. Goyal M, Demchuk AM, Menon BK, et al. Randomized Assessment of Rapid Endovascular Treatment of Ischemic Stroke. N Engl J Med. 2015.
  4. Campbell BC, Mitchell PJ, Yan B, et al. A multicenter, randomized, controlled study to investigate EXtending the time for Thrombolysis in Emergency Neurological Deficits with Intra-Arterial therapy (EXTEND-IA). Int J Stroke 2014;9:126-132
  5. Davis SM, Donnan GA, Parsons MW, et al. Effects of alteplase beyond 3 h after stroke in the echoplanar imaging thrombolytic evaluation trial (EPITHET): a placebo-controlled randomised trial. Lancet Neurol. 2008;7:299–309.
  6. Albers GW, Thijs VN, Wechsler L, et al. Magnetic resonance imaging profiles predict clinical response to early reperfusion: the diffusion and perfusion imaging evaluation for understanding stroke evolution (DEFUSE) study. Ann Neurol. 2006;60:508–517
  7. Hacke W, Albers G, Al-Rawi Y, et al. The desmoteplase in acute ischemic stroke trial (DIAS): a phase II MRI-based 9-hour window acute stroke thrombolysis trial with intravenous desmoteplase. Stroke. 2005;36:66–73.
  8. Furlan AJ, Eyding D, Albers GW, et al. Dose Escalation of Desmoteplase for Acute Ischemic Stroke (DEDAS): evidence of safety and efficacy 3 to 9 hours after stroke onset. Stroke. 2006;37:1227–1231.
  9. Hacke W, Furlan AJ, Al-Rawi Y, et al. Intravenous desmoteplase in patients with acute ischaemic stroke selected by MRI perfusion-diffusion weighted imaging or perfusion CT (DIAS-2): a prospective, randomised, double-blind, placebo-controlled study. Lancet Neurol. 2009;8:(2)141-50.
  10. Kidwell CS, Jahan R, Gornbein J, et al. A trial of imaging selection and endovascular treatment for ischemic stroke. N Engl J Med. 2013;368:(10)914-23.

The Adventure of the Blanched Soldier

So often in the management of the critically ill we are forced to choose between the lesser of two evils. The transfusion of blood products in the face of hemorrhagic shock is in some ways the best compromise of less than ideal choices. Every drop of resuscitative fluid given that does not mimic the blood a patient has recently lost further dilutes their already diminished coagulative capabilities. And yet an overly zealous administration of blood products has the potential to cause a multitude of adverse events downstream, further complicating the patient’s potentially arduous recovery. That being said, it is easy to accept that replenishing as close a surrogate to whole blood as logistically possible should be beneficial. Yet despite this strong biological plausibility, the balanced administration of packed red blood cells (PRBCs), plasma and platelets has never been demonstrated to be efficacious beyond this physiologic reasoning. A number of retrospective studies examining this concept have claimed benefit (1,2,3,4), but their results are so confounded by survivor bias that it is difficult to interpret their true meaning (8). Even the PROMMTT trial (5), the largest trial to examine this question in a prospective fashion, failed to include a prospectively randomized control group and as such, its results were equally limited. With the publication of the PROPPR trial, the first large RCT to evaluate the efficacy of a balanced transfusion strategy, we finally have some strong data to guide us (6). On first glance this well-done RCT seems to have vindicated those in support of the 1:1:1 transfusion strategy, but I fear that in reality it may have left us with more questions than answers.

The Pragmatic, Randomized Optimal Platelet and Plasma Ratios (PROPPR) Trial by Holcomb et al, published in JAMA on February 3, 2015, sought to identify the preferential ratio of plasma, platelets, and red blood cells when resuscitating the critically ill trauma patient. The authors randomized 680 patients to either a 1:1:1 or 1:1:2 ratio of plasma to platelets to PRBCs. Patients were included if they were identified as having severe bleeding or being at risk of severe bleeding (defined as having at least 1 U of any blood component transfused prior to hospital arrival or within one hour of admission and prediction by an Assessment of Blood Consumption score of 2 or greater or by physician judgment of the need for a massive transfusion). Although the authors specified the order and ratio in which blood components should be transfused, the decision to administer products was left to the discretion of the treating physician. Using this pragmatic trial design the authors hoped to examine the effects of each transfusion strategy on the primary endpoints, 24-hour and 30-day mortality. Holcomb et al also examined a number of important secondary endpoints, including time to hemostasis and the number and type of blood products administered until hemostasis was achieved.

On first glance the difference in transfusion strategies did not seem to make a difference, as the authors failed to find statistical significance in either of their two primary endpoints. A closer look reveals that this was more likely due to the authors’ overestimation of the true effect size of the 1:1:1 ratio rather than a lack of efficacy for this balanced transfusion strategy. Specifically, the 24-hour mortality was 12.7% and 17.0% in the 1:1:1 and 1:1:2 groups respectively. Though not statistically significant, this 4.3% absolute difference in favor of the more aggressive transfusion strategy clearly trends towards clinical relevance, especially given that the rate of death due to exsanguination (9.2% vs 14.6%) and the percentage of patients who achieved hemostasis (86.1% vs 78.1%) were noticeably improved. Likewise, though the 30-day mortality failed to reach statistical significance, it did maintain a robust absolute difference of 3.7% in favor of the 1:1:1 group.
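For readers who like to check the arithmetic, here is a minimal sketch of the absolute differences quoted above. It uses only the percentages reported in this post; the function name is my own invention for illustration, and nothing here reproduces the authors’ statistical analysis or says anything about significance.

```python
def absolute_difference(rate_112_pct, rate_111_pct):
    """Absolute difference between two event rates, in percentage points.

    Hypothetical helper for illustration only: a plain subtraction,
    rounded to one decimal place to match how the rates are reported.
    """
    return round(rate_112_pct - rate_111_pct, 1)


# 24-hour mortality: 17.0% (1:1:2 group) vs 12.7% (1:1:1 group)
mortality_24h_diff = absolute_difference(17.0, 12.7)

# Death due to exsanguination: 14.6% (1:1:2 group) vs 9.2% (1:1:1 group)
exsanguination_diff = absolute_difference(14.6, 9.2)

print(f"24-hour mortality difference: {mortality_24h_diff} percentage points")
print(f"Exsanguination death difference: {exsanguination_diff} percentage points")
```

The first figure matches the 4.3% quoted above; the second puts the exsanguination difference at 5.4 percentage points.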

As far as transfusion-related adverse events are concerned, the 1:1:1 strategy appears to be safe when compared to a less aggressive protocol. None of the 23 adverse events prospectively recorded seemed to occur with greater regularity in patients randomized to the more aggressive strategy. There was a slight non-significant surge in the rate of systemic inflammatory response syndrome (SIRS) (5.2% absolute increase) in patients randomized to the 1:1:1 group, but it is hard to make much of this as the rates of both sepsis and acute respiratory distress syndrome seem equivalent.

It is important to note that, despite the authors’ best intentions, this trial did not truly compare 1:1:1 vs 1:1:2 resuscitative strategies. Rather, Holcomb et al examined a protocol intending to give 1:1:1 vs 1:1:2. In reality neither group truly reached its intended proportions. The 1:1:1 group in actuality was given products closer to a 2:1:2 ratio, while the 1:1:2 group only received products in a 2:1:4 ratio. It is difficult to know how these shortcomings affected outcomes.

For all intents and purposes it seems the rate of adverse reactions was not significantly increased when plasma and platelets were used more aggressively, though these results too may have been biased by the less than stringent implementation of each group’s assigned blood product ratio. Throughout the intervention period the 1:1:1 group received a significantly higher ratio of plasma to PRBCs and platelets to PRBCs than the 1:1:2 group. However, this ratio was reversed when the post-intervention period was examined. During the post-intervention period the treating physicians were able to select blood products in any ratio they deemed clinically relevant, and as such they attempted to replenish all the plasma and platelets they were restricted from giving during the intervention period. Though the total quantity was far less than what was given in the intervention period, the ratio of plasma and platelets to PRBCs was higher in the 1:1:2 group during the post-intervention period. This in and of itself may have led to an increase in the rates of adverse events observed in the 1:1:2 group without providing the coagulative benefits the early administration of these products provided in the 1:1:1 group.

Despite some minor inconsistencies, the results appear to be a validation of the balanced transfusion strategy. And yet one has to ask, “what did these authors truly demonstrate?” Holcomb et al compared a 1:1:1 strategy to the slightly more conservative 1:1:2 strategy. Ideally the only difference in these two groups should have been that the 1:1:1 group received marginally more platelets and plasma during the initial resuscitation. Are these two transfusion strategies really dissimilar enough to demonstrate a clinically relevant difference? Should they have compared a balanced transfusion strategy to a reactive method where platelets and plasma are only administered when patients develop a coagulopathy? More importantly, is any empirically chosen ratio the ideal strategy in today’s age of point-of-care testing? In 2013 CMAJ published a trial by Nascimento et al that compared a fixed ratio similar to that used by Holcomb et al (1:1:1) to a laboratory-guided transfusion strategy (7). In this laboratory-guided strategy, blood product administration was guided by INR, PTT, Hb and platelet values. Although the trial was far too small to be definitive (n=67), the results were interesting nonetheless. Mortality in the laboratory-guided group was far lower, at 14.3%, than the 32.5% observed with the 1:1:1 strategy. Although a lab-value-guided resuscitation strategy is clearly impractical in the acute resuscitation period, a point-of-care based system like TEG may provide us with the instantaneous feedback we require to tailor our resuscitation strategies to the specific needs of the patient rather than the empiric strategy currently advised.

I doubt these results will lead to a significant change in practice. It seems the 1:1:1 massive transfusion strategy has become firmly entrenched in trauma resuscitation dogma. At least the PROPPR trial offers support to the notion that if one is going to use an empirically based transfusion strategy, striving for an equal ratio of cells, plasma and platelets appears to be of some benefit.

Sources Cited:

1. Holcomb JB, Wade CE, Michalek JE, Chisholm GB, Zarzabal LA, Schreiber MA et al. Increased plasma and platelet to red blood cell ratios improves outcome in 466 massively transfused civilian trauma patients. Ann Surg 2008; 248: 447–458.

2. Borgman, M.A., Spinella, P.C., Perkins, J.G. et al. The ratio of blood products transfused affects mortality in patients receiving massive transfusions at a combat support hospital. J Trauma 2007; 63: 805–813.

3. Holcomb, J.B., Wade, C.E., Michalek, J.E. et al. Increased plasma and platelet to red blood cell ratios improves outcome in 466 massively transfused civilian trauma patients. Ann Surg 2008; 248: 447–458.

4. Maegele, M., Lefering, R., Paffrath, T. et al. Red-blood-cell to plasma ratios transfused during massive transfusion are associated with mortality in severe multiple injury: a retrospective analysis from the Trauma Registry of the Deutsche Gesellschaft für Unfallchirurgie. Vox Sang. 2008; 95: 112–119.

5. Holcomb, J.B., del Junco, D.J., Fox, E.E. et al. The Prospective, Observational, Multicenter, Major Trauma Transfusion (PROMMTT) study. JAMA Surg 2013; 148: 127–136.

6. Holcomb JB, Tilley BC, Baraniuk S, et al. Transfusion of plasma, platelets, and red blood cells in a 1:1:1 vs a 1:1:2 ratio and mortality in patients with severe trauma: the PROPPR randomized clinical trial. JAMA 2015;313:(5)471-82.

7. Nascimento B, et al. “Effect of a Fixed-Ratio (1:1:1) Transfusion Protocol Versus Laboratory-Results–guided Transfusion in Patients with Severe Trauma: a Randomized Feasibility Trial.” CMAJ : Canadian Medical Association Journal 185.12 (2013): E583–E589. PMC. Web. 6 Feb. 2015.

8. Ho AM, Zamora JE, Holcomb JB, Ng CS, Karmakar MK, Dion PW. The Many Faces of Survivor Bias in Observational Studies on Trauma Resuscitation Requiring Massive Transfusion. Ann Emerg Med 2015.

The Case of the Balanced Solution

Saline-based resuscitation strategies were first proposed as far back as 1831 during the cholera epidemic. In an article published in the Lancet in 1831, Dr. O’Shaughnessy suggested the injection of salts into the venous system as a means of combating the dramatic dehydration seen in patients afflicted with this bacterial infection (1). Saline’s potential harms were first observed in post-surgical patients who, after receiving large volumes of saline-based resuscitation fluids during surgery, were found to have a hyperchloremic acidosis (2). Though these changes appear transient and clinically trivial, it is theorized that when applied to the critically ill, the deleterious effects on renal blood flow may increase the rate of permanent renal impairment and even death. Unfortunately, no large prospective trials have demonstrated this hypothesis to be anything more than physiological reasoning. Small prospective trials have exhibited trivial trends in decreased renal blood flow, kidney function, and increased acidosis, though these perturbations were fleeting and of questionable clinical relevance (3, 4, 5, 6, 7). A larger retrospective study, bringing all the biases such studies are known to carry, demonstrated small improvements in mortality of ICU patients treated with a balanced fluid strategy, though it failed to demonstrate improvements in renal function (the theoretical model used to support balanced fluid administration) (8). In 2012 Yunos et al were the first to attempt to prospectively answer this question in an ICU population. Published in JAMA, on first glance the results seemed to vindicate those in support of the use of balanced fluids (9). Yet despite its superficial success, a closer look reveals this trial does little to demonstrate the deleterious effects of chloride-rich resuscitative strategies.
In a recent publication in Intensive Care Medicine, Yunos et al re-examine this question in the hopes of once again demonstrating the benefits of balanced fluid strategies for the resuscitation of the critically ill (10).

In the original publication Yunos et al, using a prospective open-label before and after cohort design, hoped to demonstrate that use of balanced fluids in ICU patients would lead to improved renal function and decreased administration of renal replacement therapy (RRT). For the initial 6-month period fluid administration was left entirely to the whims of the treating intensivist. This was followed by a 6-month span during which ICU staff were trained and educated on the evils of chloride-rich solutions and the benefits of a more balanced approach to fluid selection. Following this smear campaign on normal saline and its high-chloride co-conspirators, the authors spent the next 6 months recording fluid administration and subsequent patient outcomes. The authors’ co-primary outcomes were the increase in creatinine levels above baseline during ICU stay and the incidence of acute kidney injury (AKI) as defined by the RIFLE (Risk, Injury, Failure, Loss, End-stage) criteria. Secondary outcomes listed by the authors included the need for RRT, ICU length of stay, and mortality (9).

As far as convincing ICU staff that balanced solutions were beneficial, the authors’ experiment was an overwhelming success. 1,533 patients were examined, 760 patients during the 6-month control period and 773 patients during the subsequent 6-month intervention period. The total amount of normal saline used over the two periods was 2,411L and 52L respectively. Likewise the total chloride administration decreased by a total of 144,504 mmol, or by 198 mmol/patient (9).

At face value the study appears to have been a success, demonstrating statistically significant benefits for both primary outcomes. During the intervention period patients experienced a significantly smaller rise in creatinine levels, 14.8 μmol/L (95% CI, 9.8-19.9 μmol/L), than during the control period, 22.6 μmol/L (95% CI, 17.5-27.7 μmol/L). The authors also found a 5.6% absolute decrease in the rate of RIFLE-defined kidney injury and kidney failure in patients during the intervention period when compared to those in the control period (9).

These seemingly positive results should be tempered by the fact that while statistically significant, the differences are, for the most part, clinically irrelevant. A 7.8 μmol/L difference in creatinine translates to an approximately 0.09 mg/dl difference between the control and intervention periods, which is hardly clinically pertinent. The 5.6% difference in rate of AKI was primarily powered by the 3.3% difference in rate of the less severe RIFLE class, kidney injury. When kidney failure was examined alone, unaccompanied by this statistical augmentation, the difference was not statistically significant (9).
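The unit conversion above is easy to verify. What follows is a minimal sketch, assuming the standard creatinine conversion factor of 88.4 μmol/L per 1 mg/dL; the function and variable names are mine, chosen purely for illustration, and the two creatinine values are the ones quoted from the original trial.

```python
# Standard conversion factor for creatinine: 1 mg/dL = 88.4 umol/L
UMOL_PER_L_PER_MG_PER_DL = 88.4


def creatinine_umol_to_mgdl(umol_per_l):
    """Convert a creatinine value (or difference) from umol/L to mg/dL."""
    return umol_per_l / UMOL_PER_L_PER_MG_PER_DL


# Mean rise in creatinine reported for each period (umol/L)
control_rise = 22.6
intervention_rise = 14.8

difference_umol = control_rise - intervention_rise
difference_mgdl = creatinine_umol_to_mgdl(difference_umol)

print(f"{difference_umol:.1f} umol/L is about {difference_mgdl:.2f} mg/dL")
```

Put in the units most of us use at the bedside, the entire between-period difference is less than a tenth of a milligram per deciliter.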

Even the 3.7% absolute decrease in RRT in the intervention period (10.0% vs 6.3%) is hard to conclusively attribute to the balanced fluid strategy, given the open nature of the trial design and the fact that these benefits did not translate into a decrease in either the rate of long-term dialysis requirements or mortality. Furthermore, the annual rates of RRT during the control and intervention periods are almost identical (7.4% vs 7.9%). In fact, the rates of RRT in the years bookending this study are highly variable, which suggests that unmeasured bias and the cyclic nature of random chance, rather than the intervention in question, may have caused the observed differences between these groups. It is important to remember that though RRT appears to be a definite, objective endpoint, it is largely dependent on the treating physician’s subjective judgement. In an open-label design such as this, in which the authors are clearly in favor of one intervention over another, the potential for bias affecting this outcome is evident (9).

In a secondary analysis of their data set, Yunos et al hoped to address some of these uncertainties. In this manuscript, published in Intensive Care Medicine in 2014, the authors added an additional 6 months of patient data to both the control and intervention periods, with the intention to demonstrate that the positive findings of their initial publication were due to the favorable influences of balanced fluids. The control period was expanded to include patient data (n=716) from the 6-month period prior to the study’s original start date. The authors then incorporated an additional 6 months of data to the intervention group (n=745) after its original stop date. Overall the two augmented periods ran from February 2007 to February 2008 and August 2008 to August 2009. The authors again found success. And though their primary endpoints remained of questionable clinical significance, the magnitude of their triumph was certainly more impressive (10).

With the addition of this 12-month period of data, the authors boast a 4.8% absolute decrease in the rate of moderate or severe kidney injury as compared to the control. Though the absolute difference in the rate of RRT decreased from 3.6% to 3.0% when the additional patients were added to the analysis, the difference still remained statistically significant (10). Interestingly, despite both the added control and intervention groups regressing to the mean, the overall magnitude of benefit reported by the authors seemed to increase. This sleight of hand was achieved not by some complex form of statistical wizardry, but rather by simply lowering the bar for what the authors defined as success.

In their original manuscript, Yunos et al used the RIFLE criteria to define the varying degrees of AKI. Conversely in the more recent publication, AKI was evaluated using the Kidney Disease: Improving Global Outcomes (KDIGO) scale. Despite its grandiose title, in reality this scale is essentially the amalgamation of the previous two scales traditionally used to define AKI (the RIFLE and AKIN criteria). Creators of the KDIGO criteria hoped to identify a greater proportion of patients who would benefit from RRT, and thus created a novel tool by incorporating both definitions of AKI (11). Of course, as is typical with any diagnostic tool, augmenting its sensitivity is achieved by sacrificing its specificity.

Such is the case for the KDIGO score. Not surprisingly, when examined, the KDIGO score identified significantly more patients in renal failure than either the RIFLE or AKIN criteria. In a study published in Critical Care in 2014, Luo et al compared the abilities of RIFLE, AKIN and KDIGO to identify clinically important AKI (12). They found that the use of the KDIGO criteria identified more overall patients as having AKI (51% compared to 46.9% and 38.4% respectively) as well as classified a larger subset of patients as being in failure (16% compared to 13.8% and 12.8% respectively). Despite the increased yield, no difference was seen in each respective criterion’s ability to predict death (AUC were 0.738, 0.746, 0.757 respectively). It is still unclear whether the additional patients identified using the KDIGO criteria benefit from early aggressive management of their subtle renal impairment or are harmed by the invasive interventions performed in hopes of treating pathology that would likely resolve without interference. What is clear is that changing from the more conservative RIFLE criteria to the more liberal KDIGO makes interpreting the clinical relevance of Yunos et al’s results difficult.

In the 2014 publication by Yunos et al, the absolute difference in AKI is similar to that described in the 2012 publication (4.8% vs 5.6%), but unlike their original population there is a shift to a more severe spectrum of renal impairment. Using the KDIGO criteria the authors found significantly more stage 3 AKI than in their original publication. In the original manuscript the difference in RIFLE failure (class 3) AKI failed to reach statistical significance. In their updated cohort the authors now cite a statistically significant decrease in the rate of KDIGO stage 3 AKI (the equivalent of RIFLE failure). The original trial states an absolute difference in the rate of RIFLE class 3 AKI of 2.1%. In their more recent document Yunos et al now cite a 4% (14% vs 10%) absolute decrease in KDIGO stage 3 AKI. Likewise, the original manuscript states an absolute difference of 3.3% in the rate of RIFLE class 2 AKI. In the more recent document this same difference is now stated to be only 2%. Clearly the use of the KDIGO criteria has shifted the severity of the cohort in an alarming fashion. This increase in class 3 AKI may be a more accurate interpretation of reality, but given that these differences did not translate into a decrease in either long-term dialysis or mortality, its clinical relevance is unlikely.

Even these clinically questionable differences cannot be directly attributed to the more balanced fluid strategy utilized during the intervention period. It is equally likely the multiple biases introduced by a before and after study design were responsible. Using a multivariable regression model, Yunos et al hoped to account for many of these biases. On initial presentation the authors seem to be vindicated in their assertions that these differences in renal function were due to the change in fluid administration. When the extended control and intervention periods were included in the multivariable analysis, the differences in the rates of KDIGO stage 2 and 3 AKI and RRT remained statistically significant. This benefit was powered completely by the initial cohort; the addition of the extended cohorts served only to regress these benefits towards the mean. The odds ratio in the original cohort for preventing AKI was 1.68 (1.28-2.21). When the extended groups were incorporated the odds ratio falls to 1.32 (1.11-1.58). In fact a thorough examination comparing the four time periods uncovers that the initial results are hardly as robust as they originally appear. When the extended time period is examined alone (control vs intervention), there was no difference in the incidence of AKI or RRT. Additionally, when the extended control is compared to the original intervention period, the decrease in AKI remains significant but the difference in the rate of RRT is no longer statistically significant. There is even a statistically significant increase in the rate of AKI when the original intervention period is compared to the extended intervention period. In fact this is the very same difference in both AKI and RRT that is observed when comparing the original control group to the extended intervention group (10).
Essentially, though it was the authors’ intent to validate the findings of their initial study, the inconsistent benefits demonstrated in the extended cohort do just the opposite. These differences seem to be due more to random chance than any beneficial effects of a balanced fluid strategy.

The interpretation of medical literature is very rarely as straightforward as we would like to imagine. Much like searching for truth in a magic mirror, so often it serves only to confirm our own beliefs and support our incredulities. And yet if we are to claim to be authentic curators of truth in medicine, it is important we apply just as much academic rigor when examining topics which we support as we do with those we distrust. A balanced approach to fluid administration has a strong physiological base to support its use. But physiologic reasoning has led us down many blind paths and dark alleys. It is only when we shine the light of critical research that we reveal which are dead ends and which lead us and our patients to a better place. Currently we are uncertain as to whether the success of a balanced fluid strategy is due to its chloride-sparing effects or due to the uncontrollable bias introduced by a non-randomized, unblinded trial design, with serious potential for the Hawthorne effect. It may very well be that any fluid in excess is harmful and that “balanced” fluids high in acetate and lactate have their very own unintended consequences when administered in high volumes. The SPLIT trial (scheduled to be published in 2015) may validate our beliefs in the superiority of a balanced fluid strategy, but until then it is important we resist the urge to become quite so dogmatic with our cries of indignation towards chloride-rich solutions.

A brief disclosure: I am, in fact, overwhelmingly and irredeemably in favor of the Stewart approach to acid-base disorders. Although there is no convincing evidence directly demonstrating its superiority over the more traditional Henderson-Hasselbalch model, its elegance and intuitive nature make it perfect for the swirling chaos and uncertainty of the Emergency Department. As such it is not hard to imagine that the more judicious administration of fluid, specifically those high in chloride content, would benefit our patients by reducing hyperchloremic acidosis and the concomitant renal failure. I am, however, less enthused by the evidence supporting this premise.

-A special thanks to Anand Swaminathan (@EMSwami) for his thoughts and guidance during the writing of this post.

-As always a special thanks to my ever patient wife, Rebecca Talmud (@DinosaurPT), for her editorial wizardry without which this blog would be the unstructured ramblings of a madman.

Sources Cited:

  1. O’Shaughnessy, WB (1831). “Proposal for a new method of treating the blue epidemic cholera by the injection of highly-oxygenated salts into the venous system”. Lancet 17 (432): 366–71
  2. Scheingraber S, Rehm M, Sehmisch C, Finsterer U. Rapid saline infusion produces hyperchloremic acidosis in patients undergoing gynaecologic surgery. Anesthesiology. 1999;90:1265–1270
  3. Quilley CP, Lin Y-S, McGiff JC. Chloride anion concentration as a determinant of renal vascular responsiveness to vasoconstrictor agents. Br J Pharmacol. 1993;108:106–110
  4. Hansen PB, Jensen BL, Skott O. Chloride regulates afferent arteriolar contraction in response to depolarization. Hypertension. 1998;32:1066–1070.
  5. O’Malley CM, Frumento RJ, Hardy MA, Benvenisty AI, Brentjens TE, Mercer JS, Bennett-Guerrero E. A randomized, double-blind comparison of lactated Ringer’s solution and 0.9% NaCl during renal transplantation. Anesth Analg. 2005;100:1518–1524
  6. Waters JH, Gottlieb A, Schoenwald P, Popovich MJ, Sprung J, Nelson DR. Normal saline versus lactated Ringer’s solution for intraoperative fluid management in patients undergoing abdominal aortic aneurysm repair: an outcome study. Anesth Analg. 2001;93:817–822.
  7. Hatherill M, Salie S, Waggie Z, Lawrenson J, Hewitson J, Reynolds L, Argent A. Hyperchloraemic metabolic acidosis following open cardiac surgery. Arch Dis Child. 2005;90:1288–1292
  8. Raghunathan K, Shaw A, Nathanson B, Stürmer T, Brookhart A, Stefan MS, Setoguchi S, Beadles C, Lindenauer PK (2014) Association between the choice of IV crystalloid and in-hospital mortality among critically ill adults with sepsis. Crit Care Med 42:1585–1591
  9. Yunos NM, Bellomo R, Hegarty C, Story D, Ho L, Bailey M (2012) Association between a chloride-liberal vs chloride-restrictive intravenous fluid administration strategy and kidney injury in critically ill adults. JAMA 308:1566–1572
  10. Yunos NM, Bellomo R, Glassford N, Sutcliffe H, Lam Q, Bailey M. Chloride-liberal vs. chloride-restrictive intravenous fluid administration and acute kidney injury: an extended analysis. Intensive Care Med. 2014.
  11. http://www.kdigo.org/clinical_practice_guidelines/pdf/CKD/KDIGO_2012_CKD_GL.pdf
  12. Luo X, Jiang L, Du B, et al. A comparison of different diagnostic criteria of acute kidney injury in critically ill patients. Crit Care. 2014;18:(4)R144.

 

 

A Secondary Examination of The Adventure of the Cardboard Box-Addendum

Published in the NEJM on December 17th 2014, ushered in with the inflated fanfare only the medical industry is capable of, MR CLEAN marks the first successful trial of interventional therapy for acute ischemic stroke. In direct contrast to IMS-3, SYNTHESIS and MR RESCUE, MR CLEAN is a significantly positive trial. The authors demonstrated success in their primary outcome, “improved neurological outcomes at 90 days”, with an adjusted odds ratio of 1.67 (95% confidence interval [CI], 1.21 to 2.30) (1). Why MR CLEAN was positive when the three trials that came before were negative is still unclear. As discussed in my previous post (as well as far more elegant posts on emlitofnote.com and stemlynsblog.org) it may be due to better equipment, faster symptom-onset-to-recanalization times, and the incorporation of CT angiography to identify a cohort of patients who would truly benefit from these invasive interventional strategies. Conversely it may simply be due to the traditional therapy group performing so poorly.

A closer look at the results from MR CLEAN reveals that though the interventional group outperformed the control group by a significant amount (an absolute increase of 13.5% in the number of patients alive and independent), compared to its peers its performance was far from exceptional. In the IMS-3 trial, 40.2% of patients in the control arm (tPA alone) were alive and independent at 90 days compared to only 32.6% of the patients in the intervention arm of the MR CLEAN trial (1,2). Even the placebo groups in NINDS and ECASS-3, who received no reperfusion therapies, had better outcomes than the patients receiving interventional therapies in the MR CLEAN trial. 26% and 45.5% of the control patients in the NINDS and ECASS-3 trials respectively had a mRS of 0 or 1 at 90 days (3,4), compared to only 11.6% of the patients in the interventional arm of MR CLEAN.

Though it is difficult and not completely appropriate to compare groups from different trials, it does call into question the reasons for MR CLEAN’s staggering success. It may very well have been that the patients in the MR CLEAN cohort were far sicker than those in the earlier stroke trials (though both their presenting NIHSS and 90-day mortality rates seem quite similar). It may be that the utilization of CT angiography to select patients for recruitment excluded the majority of the stroke mimics who were included in these earlier trials and who will universally have good outcomes (this seems to be the answer given by the authors when queried by Dr. Ryan Radecki). The subgroup analysis, which indicated that only the patients with a NIHSS of greater than 20 demonstrated a statistically significant benefit from endovascular therapy, seems to support this supposition (1). The authors point to a meta-analysis of the six trials examining endovascular therapy for acute ischemic stroke as additional proof (5). In this analysis by Fargen et al, published in the J NeuroIntervent Surg, the authors examine the subgroup of patients with radiographically confirmed large vessel occlusion (LVO). Similar to MR CLEAN, patients receiving endovascular therapy had better outcomes at 90 days (a mRS of 0-2 in 38.3% vs 25.8% respectively). Even in this combined cohort with radiographically confirmed LVO, outcomes were not as dire as those observed in the MR CLEAN trial.

MR CLEAN’s success may be attributed to the advancements in both procedural proficiency and technological prowess. But it is equally likely that the whimsy of random chance was responsible for these impressive results. On a final note it is important to remember that all of the trials examining endovascular therapy in acute ischemic stroke were compared to a control group that included the administration of IV tPA, an intervention whose own efficacy is very much in doubt. Although the rate of adverse events in the intervention arm of MR CLEAN did not differ significantly from those given IV tPA, this is only because alteplase comes with its own terrifying set of unpleasantries. When compared to a true placebo group, I am certain the rate of symptomatic intracranial hemorrhage and new ischemic stroke (7.7% and 5.6% respectively) would appear far more concerning.

Given the universal failure of the previous three trials examining the very same question, surely more confirmatory evidence is required before investing the unimaginable resources required to support the vast infrastructure needed to make interventional therapy a reality. Since MR CLEAN’s success was announced at the 9th annual World Stroke Conference in Istanbul held in October 2014, two trials examining endovascular therapy in acute ischemic stroke, ESCAPE and EXTEND IA, have halted enrollment early for benefit. It will be interesting to see if these premature stoppages were because of preplanned interim analyses or if MR CLEAN’s success influenced their early termination. I hope we invest the time and resources required to answer these questions with the methodological rigor they deserve. It would be frustratingly tragic to once again be forced to practice with continual doubt because we halted all further investigations out of the fear of discovering that reality is not as promising as the false-truth gained from interpreting only the data that pleases us.

Sources Cited:

  1. The MR CLEAN Investigators. A Randomized Trial of Intraarterial Treatment for Acute Ischemic Stroke. N Engl J Med, December 17, 2014.
  2. Broderick JP, Palesch YY, Demchuk AM, et al. Endovascular therapy after intravenous t-PA versus t-PA alone for stroke. N Engl J Med. 2013;368(10):893-903.
  3. Tissue plasminogen activator for acute ischemic stroke. The National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. N Engl J Med. 1995;333(24):1581-7.
  4. Hacke W, Kaste M, Bluhmki E, et al. Thrombolysis with alteplase 3 to 4.5 hours after acute ischemic stroke. N Engl J Med. 2008;359(13):1317-29.
  5. Fargen KM, Neal D, Fiorella DJ, et al. J NeuroIntervent Surg. Published Online First. doi:10.1136/neurintsurg-2014-011543.

A Case of Identity Part Two

Our standards for acceptable benefit of antiplatelet agents in the management of ACS have become deplorably low. When ISIS-2 was first published we defined success only by aspirin’s ability to affect mortality. The number commonly cited, 2.4%, only describes aspirin’s absolute benefit to decrease death (1). In the one trial that examined its properties to prevent further infarction, published in the NEJM in 1988, aspirin demonstrated additional capabilities to decrease myocardial infarction as well as save lives (2). If in ISIS-2 aspirin had performed as poorly as clopidogrel did in its efficacy defining study, the CURE trial, it may have never gained the stature it currently holds in the management of ACS (3). To date aspirin has been the only antiplatelet agent that has demonstrated a consistent and clinically relevant mortality benefit. Despite the obvious benefits, it was not long before we turned towards other agents in an attempt to supplement aspirin’s antiplatelet properties. The concept of dual antiplatelet therapy was so appealing, its theoretical basis so believable, that it soon became a perfunctory part of our management strategy for patients presenting to the Emergency Department with ACS.

Our current concept of dual antiplatelet therapy was first defined in 1996 with the publication of the CAPRIE trial. Published in The Lancet in November of 1996, this trial was the first of many to indicate clopidogrel’s efficacy in finding clinically irrelevant reductions of methodologically deceitful composite endpoints (4). So began the era of dual antiplatelet therapy in which, based off weak endpoints and statistical chicanery, clopidogrel and its post-patent clones have bullied and fibbed their way to success.

Throughout the literature examining clopidogrel in a multitude of ACS cohorts of varying degrees of severity, its vaunted dual antiplatelet capabilities have never demonstrated a clinically relevant mortality benefit. Its success has been powered by the reduction of non-fatal MIs, the majority of which are peri-procedural troponin leaks of questionable clinical significance. The only clinically relevant endpoint clopidogrel has consistently demonstrated is a 1% absolute increase in serious bleeding. Despite these mediocre capabilities, Bristol-Myers Squibb, through sheer force of will and marketing prowess, catapulted their drug to the forefront of pharmaceutical sales for over a decade. The true level of clopidogrel’s mundanity has been discussed in a previous post. The more important question for Emergency Physicians is, does the upstream administration of P2Y12 inhibitors provide added benefit over only administering these medications after confirming appropriate coronary anatomy during catheterization?

Upstream use of P2Y12 inhibitors in ACS is commonly believed to be beneficial as it deters thrombus formation through additional inhibition of platelet adherence vs the singular capabilities of aspirin alone. Despite a dearth of evidence validating this claim and reasonable data stating otherwise, this practice has been routinely implemented since the publication of the CURE trial over a decade ago.

So is there a role for dual antiplatelet therapy in today’s Emergency Department?

Two recent publications sought to address this very question. The more methodologically ambitious of these was a systematic review and meta-analysis published in the BMJ in October of 2014 in which Bellemain-Appaix et al attempted to examine all the literature addressing the potential benefit of upstream use of P2Y12 inhibitors in NSTEMI patients undergoing PCI (5). Much to the chagrin of the authors, only one trial included (the recently published ACCOAST trial) specifically examines this question. The authors, not to be dissuaded, attempted a piecemeal collection of various subgroups from a number of trials that seemed to meet their entry criteria. This resulted in a statistically and clinically heterogeneous cohort whose subsequent data should probably not have been combined into a single analysis. Nevertheless the authors plunged ahead, examining the NSTEMI patients in both the CREDO and CURE trials, as well as the subset of patients in the ACUITY trial who were pre-treated with clopidogrel. They also included the entire cohort from the ACCOAST trial, the only included trial examining prasugrel. Combining these trials with the data from three observational cohorts, the authors collected information on 32,383 patients.

Contrary to most trials attempting to examine potential benefit in the use of P2Y12 inhibitors, these authors set aside the traditional composite endpoint (stroke, MI, or cardiovascular death) and looked exclusively at mortality and major bleeding. The authors found no mortality benefit in the entire cohort, the subgroup of patients from the RCTs, or the group that underwent PCI. Conversely, an increased risk of bleeding was observed in patients treated with upstream P2Y12 inhibitors. No difference was seen in the rate of stroke, myocardial infarction or urgent revascularization. However when a composite outcome the authors termed “adverse cardiac outcomes” was measured, they found a small reduction in risk (odds ratio 0.84, CI 0.72-0.98). As with all individual trials examined, this composite benefit is powered by a small decrease in myocardial infarctions that, when examined alone, did not reach statistical significance (odds ratio of 0.81, CI 0.64-1.03). When the subgroup of patients who underwent PCI was isolated this decrease in “adverse cardiac events” was no longer significant (odds ratio 0.83, CI 0.99-0.1.03). This speaks more to the drastic decrease in sample size (32,383 to 17,545), decreasing the statistical power to detect clinically irrelevant differences, than to a meaningful difference between those who undergo PCI and those who do not (5).
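For readers who want the significance claims above made explicit, here is a minimal sketch of the rule being applied: an odds ratio is conventionally called statistically significant only when its confidence interval excludes 1.0. The function name `ci_excludes_null` is my own invention for illustration, and the intervals are simply the two quoted from the meta-analysis above (assumed to be 95% intervals).

```python
def ci_excludes_null(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """Return True when the confidence interval does not contain the null value.

    For an odds ratio the null value is 1.0 (no difference between groups).
    Hypothetical helper for illustration only.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return not (lower <= null_value <= upper)


# Composite "adverse cardiac outcomes": odds ratio 0.84, CI 0.72-0.98
composite_significant = ci_excludes_null(0.72, 0.98)

# Myocardial infarction alone: odds ratio 0.81, CI 0.64-1.03
mi_significant = ci_excludes_null(0.64, 1.03)

print(composite_significant, mi_significant)
```

The composite clears the bar while its main component does not, which is the pattern described above: significance that exists only in the aggregate.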

Unfortunately the clinical heterogeneity between the trials included in this meta-analysis is quite high. The trials included span different eras and different strategies in the overall management of patients experiencing NSTEMIs. It is questionable whether they should have been included in a formal analysis at all. Fortunately one does not require the above stated statistical manipulations to reach the same conclusions garnered by the authors. Each trial that examined the efficacy of upstream use of P2Y12 inhibitors failed to identify any clinically meaningful benefit.

The CURE trial was the earliest trial included in this analysis. This trial paved the way for clopidogrel’s use in the Emergency Department simply by utilizing composite endpoints of questionable clinical significance and a sample size large enough to bully even the smallest deviation from placebo towards statistical relevance. Published in NEJM in 2001, the authors claimed a statistically significant 2.1% absolute decrease in cardiovascular death and myocardial infarction (MI). This difference was completely powered by the 1.5% absolute difference in MIs, the majority of which were type IV MIs (peri-procedural). This small reduction is of questionable clinical consequence as the mortality rate between groups was identical at 30 days (1.0% vs 1.1%) as well as at the end of follow up (2.3% vs 2.4%)(mean of 8 months)(3).

The CREDO trial, which sought to answer the question of whether treatment with clopidogrel prior to angiography was beneficial, examined patients scheduled for urgent cardiac catheterization. Patients were randomized to upstream administration of clopidogrel or placebo 3-24 hours prior to cath. All patients received daily clopidogrel following PCI. Like CURE before it, the CREDO authors found no clinically important benefit to the upstream use of clopidogrel. No statistical difference was recorded in the rates of death, stroke or MI at 28 days between the placebo and upstream clopidogrel groups. The authors claim success in the statistical significance of a secondary endpoint, the per-protocol analysis of the 1-year outcomes, noting a 2% absolute reduction in the rate of death, MI, and stroke. Similar to CURE, this difference was powered by the 1.9% reduction of MIs. Interestingly, statistical significance is lost upon examining any of the three endpoints individually, or when the same composite endpoint is analyzed using the authors’ primary outcome and more statistically appropriate methodology, the intention-to-treat analysis. As is typical with all P2Y12 inhibitor trials there was a 1% increase in major bleeding events in both the CURE and CREDO trials, the vast majority of which were related to subsequent CABG procedures after coronary anatomy revealed less than ideal conditions for stent placement (6).

Neither of these trials is methodologically ideal to address the question: in the modern Emergency Department, does the upstream use of P2Y12 inhibitors result in improved patient oriented benefits? In the CURE trial less than half the patients (43.8%) underwent coronary angiography and only 21.2% received PCI. In the CREDO trial only 67% of the patients were actually having an enzyme defined NSTEMI (3,6).

The final RCT included in the Bellemain-Appaix et al meta-analysis, the ACCOAST trial, is far more relevant. Using modern PCI techniques and standards, Montalescot et al specifically examined whether upstream administration of prasugrel was beneficial when compared to its administration in the cath lab after visualizing the coronary anatomy. The recent marketing disaster that was the publication of the TRILOGY trial, a completely negative study comparing prasugrel to clopidogrel in patients with ACS, was the first obstacle in prasugrel’s race to fill the post-patent void at the top of the antiplatelet hierarchy. ACCOAST was an attempt to regain the momentum lost with TRILOGY’s failure. Authors randomized 4,033 patients with NSTEMI in the Emergency Department to either 30 mg of prasugrel 2-48 hours prior to PCI or placebo. Following visualization of the anatomy during angiography and stent placement when appropriate, the prasugrel group received the remaining 30 mg of the 60 mg loading dose that is recommended by Eli Lilly. The placebo group received the full 60 mg dose of prasugrel at the time of PCI if stent placement was thought to be beneficial. The authors failed to demonstrate a benefit in upstream administration of prasugrel when compared to its administration in the cath lab, with no difference in cardiovascular death, myocardial infarct, stroke, urgent revascularization or glycoprotein IIb/IIIa bailout (10.8% vs 10.8%). As is consistent with the rest of the literature examining the use of P2Y12 inhibitors, the pretreatment group was found to have an approximately 1% increase in major bleeding, most of it due to an increase in bleeding during CABG (20.7% vs 13.7%) (7).

Finally, what about the added benefit of dual antiplatelet therapy in the hyperacute patient? What is the benefit of P2Y12 inhibitors in patients with a time dependent lesion? In an article published in September 2014 in the NEJM, Montalescot et al examined this very question. The authors of the “Administration of Ticagrelor in the Cath Lab or in the Ambulance for New ST Elevation Myocardial Infarction to Open the Coronary Artery (ATLANTIC)” trial randomized patients to a 180 mg loading dose of ticagrelor in either the ambulance on the way to the hospital or in the cath lab prior to angiography. The authors and their benefactors, AstraZeneca, hoped to demonstrate that upstream use of ticagrelor was an important addition to the management of these hyperacute patients. Though they included patients with ST-elevation MIs up to 6 hours after symptom onset, the majority of patients were identified within 70 minutes of symptom onset (well within the time dependent portion of STEMI pathology)(8).

A total of 1,862 patients were randomized to either pre-hospital or in-hospital treatment. As their co-primary endpoints the authors selected two unfortunate, clinically irrelevant measures: the proportion of patients who did not have 70% or greater resolution of ST-segment elevation before PCI, and the proportion of patients who did not meet the criteria for TIMI flow grade 3 in the infarct-related artery at angiography before PCI. Thankfully we were saved the discussion of why neither of these metrics translated into patient oriented outcomes as the authors failed to find a difference between the pre-hospital and in-hospital groups. Nor was there a difference in the more traditional composite endpoint (rate of cardiovascular deaths, MIs, strokes, urgent revascularizations, or definite stent thrombosis) so commonly used in P2Y12 inhibitor trials (4.5% vs 4.4%). In an altogether unsurprising twist, the authors claim this trial a success after dredging up a single secondary endpoint from the many measured that achieved a p-value of significance. Patients who received pre-hospital administration of ticagrelor experienced a smaller rate of definite stent thrombosis when compared to patients receiving the drug in the cath lab (0.2% vs 1.2%). This difference seemed to have no clinical relevance as there was no difference in the rate of myocardial infarction or death at 30 days between the groups. In fact the mortality rate in the pre-hospital group was alarmingly higher, though the 1.3% absolute difference (3.3% vs 2.0%) in 30-day mortality failed to reach statistical significance (8).

As is the case with any in depth examination of the literature supporting the use of P2Y12 inhibitors we are left entirely underwhelmed. From as far back as the CURE trial the theoretical benefits of dual antiplatelet therapy have consistently lacked evidentiary support of clinically relevant patient oriented outcomes. Even the small industry manipulated advantages seemingly evaporate when drugs are compared to their own administration downstream in the cath lab once suitable anatomy has been defined. Recently published trials examining long-term use of P2Y12 inhibitors after stent placement have been far from stellar (9,10). Both a meta-analysis and a large RCT demonstrated that even in cases of anatomically confirmed disease with stent placement, these medications offer very limited benefits over aspirin therapy alone and in some cases even demonstrate a small increase in mortality (0.5% increase in all-cause mortality). This is not a case where more evidence is required. It is time we reexamine the utility of dual antiplatelet therapy in the Emergency Department. Clearly we now have convincing data that upstream use of P2Y12 inhibitors, whether administered pre-hospital or in the Emergency Department, does not provide any patient oriented benefits and can only lead to harm. In fact the only benefit that can be gained by maintaining this dual antiplatelet delusion is to ensure the well being of the pharmaceutical companies whose lies and manipulation have led us down this fool’s path in the first place.

Sources Cited:

  1. Randomized trial of intravenous streptokinase, oral aspirin, both, or neither among 17,187 cases of suspected acute myocardial infarction: ISIS-2. ISIS-2 (Second International Study of Infarct Survival) Collaborative Group. Lancet. 1988;2(8607):349-60.
  2. Théroux P, Ouimet H, Mccans J, et al. Aspirin, heparin, or both to treat acute unstable angina. N Engl J Med. 1988;319(17):1105-11.
  3. Yusuf S, Zhao F, Mehta SR, et al. Effects of clopidogrel in addition to aspirin in patients with acute coronary syndromes without ST-segment elevation. N Engl J Med. 2001;345(7):494-502.
  4. A randomised, blinded, trial of clopidogrel versus aspirin in patients at risk of ischaemic events (CAPRIE). CAPRIE Steering Committee. Lancet. 1996;348(9038):1329-39.
  5. Bellemain-Appaix A, Kerneis M, O’Connor SA, et al. Reappraisal of thienopyridine pretreatment in patients with non-ST elevation acute coronary syndrome: a systematic review and meta-analysis. BMJ. 2014;349:g6269.
  6. Steinhubl SR, Berger PB, Mann JT, et al. Early and sustained dual oral antiplatelet therapy following percutaneous coronary intervention: a randomized controlled trial. JAMA. 2002;288(19):2411-20.
  7. Montalescot G, Bolognese L, Dudek D, et al. Pretreatment with prasugrel in non-ST-segment elevation acute coronary syndromes. N Engl J Med. 2013;369(11):999-1010.
  8. Montalescot G, Hof AW, Lapostolle F, et al. Prehospital Ticagrelor in ST-Segment Elevation Myocardial Infarction. N Engl J Med. 2014.
  9. Mauri L, Kereiakes DJ, Yeh RW, et al. Twelve or 30 Months of Dual Antiplatelet Therapy after Drug-Eluting Stents. N Engl J Med. 2014.
  10. Elmariah S, Mauri L, Doros G, et al. Extended duration dual antiplatelet therapy and mortality: a systematic review and meta-analysis. Lancet. 16 November 2014.

A Secondary Examination of The Adventure of the Cardboard Box

In November of 1995 stroke care as we know it drastically and permanently changed. With the publication of NINDS-2 the NEJM ushered in the interventional era of acute ischemic stroke (1). No longer were we powerless in our management of these patients. Finally we could offer them more than an aspirin to chew on, a corner to sit in, and an appointment with a neurologist in the morning. And yet NINDS-2 was not the first trial examining thrombolytic therapy for acute ischemic stroke. In fact three trials were published prior to NINDS-2, all of which were negative (NINDS-1, MAST-I, ECASS-1), with two finding an increase in mortality in patients given thrombolytics (1,2,3). With the publication of NINDS-2 all this was forgotten. NINDS-2 was impressively positive, demonstrating a 13% absolute increase in patients who were given tPA that were alive and independent (mRS of 0 or 1) at 90 days (1). Supporters justified the 6% absolute increase in symptomatic intracranial hemorrhage by arguing that it did not increase 90-day mortality (21% vs 17%). Despite these impressive results there were still three negative trials to account for. What made NINDS-2 different from all the trials that came before it? Was it the agent? Supporters claim that tPA was the superior thrombolytic and we should ignore all trials studying other agents. Was it time? NINDS examined patients who received tPA within 180 minutes of symptom onset (half in under 90 minutes); two of the earlier trials examined patients who received thrombolytic therapy over a much broader treatment window. Was it the patient population? The authors of NINDS used very strict selection criteria to determine which patients were acceptable candidates. There was of course a fourth reason proposed by a less enthusiastic contingent, that being random chance.
This more skeptical party posited that an intervention that possesses little or no efficacy, if studied enough times, would eventually demonstrate positive results simply by chance alone. They reminded the more eager supporters of tPA therapy that though the findings of NINDS-2 may be true, taking these results at face value without further validation was not only bad science, but even worse medicine. Despite these warnings the FDA fast tracked the approval of tPA for acute ischemic stroke within 3 hours of symptom onset, and all other trials attempting to validate this benefit were abandoned. As Elliot Grosbard, Genentech scientist, said in internal communications in regards to further trials comparing streptokinase to tPA for acute coronary syndrome:

 We do not know how another trial would turn out, and if we don’t come out ahead we would have a terribly self inflicted wound… (another study) may be a good thing for America, but it wouldn’t be a good thing for us.

Six consecutive trials were published following NINDS (all examining time windows greater than 3 hours); all six were negative, with four demonstrating harm (4,5,6,7,8,9). It wasn’t until the 2008 publication of ECASS-3 that another trial examining thrombolytics for ischemic stroke demonstrated benefit (10). These benefits, though not as impressive as those of NINDS-2, were convincing enough to make us forget the seven other negative trials examining similar time windows. Unlike with NINDS-2, we were unable to claim ECASS-3 was different from these other negative trials examining similar patients during similar time windows. So instead, we just ignored them. Like our nostalgia for our endless childhood summers we have chosen to selectively interpret the literature that confirms our biases. Remembering only the fireworks, campfire tales, and days spent in the crashing waves of the Atlantic Ocean, we conveniently forget the sun burnt shoulders, poison ivy scorched legs and the tattered knees so commonly acquired during childhood adventures. This tunnel-vision has (mis)guided stroke care for the last two decades. Investigators continue to roll the dice, ignoring all numbers that do not suit their purposes.

19 years after the publication of NINDS changed stroke management, the Multicenter Randomized Clinical Trial of Endovascular Treatment for Acute Ischemic Stroke in the Netherlands (given the unfortunate acronym MR CLEAN) once again threatens to overthrow the infrastructure of stroke care (11). The results of this multi-center trial comparing endovascular therapy to standard care were presented at the 9th annual World Stroke Conference in Istanbul held in October 2014. Though these results have not yet been published, given the fanfare NINDS-2 experienced after its initial announcement, it is difficult not to see the similarities between these two trials. Like NINDS-2, this is the first trial to show benefit of a novel therapy in acute ischemic stroke. Like NINDS, these results are in direct contrast to the 3 trials published to date examining endovascular interventions for acute ischemic stroke.

On February 7th 2013, NEJM published 3 articles all examining the efficacy of endovascular treatment for acute ischemic stroke. All 3 trials were negative, each one failing to demonstrate benefit in its own unique manner (12,13,14). These trials were instantly discredited by endovascular apologists, who stated a number of reasons why they should be ignored. For one, they enrolled the wrong patients. These trials primarily used only a non-contrast CT to select patients appropriate for endovascular therapies. Most experts argue this is not the optimal imaging technique for selecting patients for endovascular interventions and that all patients should have undergone CT-angiography before enrollment. These trials also examined endovascular therapy during the wrong time period. Proponents of these interventions argued that time-to-clot retrieval was far longer than their current standards and that these delays erased any benefits endovascular therapy may have provided. Finally, and most importantly, these trials used the wrong equipment. The Merci retrieval device was prominently featured in all three of these initial trials. It is a device that most interventionists now consider antiquated and that, in a number of small trials, demonstrated suboptimal performance when compared to the newer devices currently used by most interventionists.

In direct contrast to these results, MR CLEAN is a significantly positive trial. Although the official manuscript has yet to be published, a great deal can be garnered from the poster presentation abstract, press release and previously published protocol and statistical analysis plan (11, 15, 16). MR CLEAN is a multicenter randomized clinical trial of endovascular treatment for acute ischemic stroke conducted in the Netherlands. It used the prospective open label, blinded endpoint (PROBE) design, comparing patients randomized to receive either endovascular treatment (consisting of intra-arterial tPA or urokinase followed by clot retrieval) or standard therapy (91% of whom received intravenous tPA alone). Over a 5-year period, authors enrolled 500 patients aged 18 years or older with acute ischemic stroke and a symptomatic anterior proximal artery occlusion, which could be treated within 6 hours after stroke onset.

The authors found patients randomized to endovascular treatment were more likely to have improved neurological outcomes at 90 days with an adjusted odds ratio of 1.67 (95% confidence interval [CI], 1.21 – 2.30). Though there was a significantly higher rate of adverse reactions in the endovascular group (47% vs 42%), these rates did not seem to affect either 90 day neurological outcomes or mortality (21% vs 22%). If you were to use the NINDS-2 definition of good neurological outcomes (mRS of 0 or 1), endovascular treatment would have demonstrated a 14% absolute increase in patients alive and independent at 90 days, with a number needed to treat (NNT) of 7.
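The NNT quoted above is simply the reciprocal of the absolute difference in the rate of good outcomes. As a minimal sketch of that arithmetic (the function name is my own, and the only input taken from the trial is the 14% absolute difference cited above):

```python
def number_needed_to_treat(absolute_risk_difference: float) -> float:
    """Return the unrounded NNT, the reciprocal of the absolute risk difference.

    The difference is expressed as a proportion (0.14 for 14%).
    Hypothetical helper for illustration only.
    """
    if absolute_risk_difference == 0:
        raise ValueError("NNT is undefined when the groups do not differ")
    return 1.0 / abs(absolute_risk_difference)


# 14% absolute increase in patients alive and independent at 90 days
nnt = number_needed_to_treat(0.14)
print(f"NNT = {nnt:.1f}")  # prints "NNT = 7.1"
```

The same reciprocal applied to an absolute increase in harm gives a number needed to harm, which is why small absolute differences in either direction translate into large numbers of patients treated per event.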

So why was MR CLEAN positive when the 3 trials which came before were negative? Was it the patients? It is true that the authors of MR CLEAN were far more selective in the patients they included. In fact authors required patients to have an occlusion of the distal intracranial carotid artery, middle cerebral artery (M1, M2) or anterior cerebral artery (A1) demonstrated with CT angiography (CTA), magnetic resonance angiography (MRA) or digital subtraction angiography (DSA) before they were enrolled in the trial. Was it time? Patients received both IV tPA and endovascular treatments far faster than any of the patients in IMS-3, SYNTHESIS or MR RESCUE. In the MR CLEAN trial patients received their IV tPA on average 85 to 87 minutes after symptom onset and underwent endovascular therapy 196 to 204 minutes after symptom onset. Was it the devices used? In contrast to the initial 3 trials, 97% of the endovascular interventions performed in the MR CLEAN cohort utilized a modern retrievable stent device. Or was it just the random fickle nature of fortune that provided us with these impressive results?

Currently we do not have the full publication of MR CLEAN, so a detailed analysis of the results proves difficult. That being said there are a number of interesting points we can take away from the published protocol and the results presented during the 9th annual World Stroke Conference. Firstly the authors claim success in their primary outcome, which they define as “the score on the mRS at 90 days” (11,15). They claim this benefit by citing an adjusted odds ratio of 1.67 (95% confidence interval [CI], 1.21 – 2.30). What are we to take from this odds ratio? What exactly were they measuring and what imbalances were they attempting to adjust for that randomization would not account for? In the statistical analysis portion of their protocol the authors only slightly clarify the vague nature of this outcome. The authors used an ordinal analysis in an attempt to quantify the benefits of endovascular therapy over the entire mRS. They then decided to further adjust these outcomes using multivariable logistic regression in an attempt to “adjust for chance imbalances in main prognostic variables between intervention and control group”. The specific variables they chose to adjust for were age, stroke severity (NIHSS), time since onset, previous stroke, atrial fibrillation, carotid top occlusion and diabetes mellitus (11). This is the same statistical wizardry used in IST-3 to magically transform a decidedly negative trial into a statistically positive one (17). MR CLEAN marks the first time this type of statistical analysis was used as a trial’s primary end point rather than a secondary experimental, trial saving outcome.

To be clear this trial was an overwhelming success and this analysis is in no way intended to take away from these findings. Rather it is intended to question whether an adjusted ordinal analysis is the appropriate outcome to assess efficacy. We have discussed the problems with ordinal analyses in depth in a prior post, but briefly it is an attempt to granularize the data so as to detect smaller changes in outcomes than the more traditional dichotomous cutoff (mRS 0, 1, or 2 vs 3, 4, 5, or 6) is capable of detecting. On face value this seems like a noble pursuit, but logistically it presents a number of problems when employed in a trial. Most importantly, is an ordinal analysis an appropriate measure of functional outcomes? Ordinal analysis is an attempt to examine shifts across an entire functional scale, capturing minute changes in outcomes that would be missed by a dichotomous measurement. To do so one has to assume flawlessness of the collection process and intrinsic reliability of the functional assessment tool. We know that the reliability of the mRS is questionable at best. In fact when two neurologists assess the same patient their results will often differ by up to 2 points (19).

The MR CLEAN 90 day mRS data was assessed using a structured phone interview conducted by a trained research nurse. This trial employed an open design in which the patients were not blinded to their group assignments, and used an outcome scale of questionable reliability, collected by a phone interview. Authors then utilized a secondary adjustment for variables that should have been controlled by the randomization process (11). This data is far from flawless. To think you can granularize such data and then extract meaningful outcomes is certainly an error in judgment. Such analysis should be reserved for secondary measures only after a more robust means of appraisal has proven fruitful.

Like all the stroke literature, this leaves us trying to compare the soft endpoints of functional neurological outcomes to the hard endpoints of mortality and intracranial hemorrhage (ICH). Despite its success, like NINDS before it, MR CLEAN failed to demonstrate a mortality benefit for endovascular therapies in acute ischemic stroke. The mortality at 90 days was 21% and 22% in the two groups (15). Add to that a 5 percentage point increase in the rate of serious adverse events (47% vs 42%) in the endovascular therapy group. Despite the claim that the newer endovascular devices were safer and caused fewer bleeds, the rate of clinically relevant ICH was statistically similar to that of the patients who received IV tPA alone (6.0% vs 5.2%) (15). This is the same rate of ICH seen in both the IMS-3 and SYNTHESIS trials, in which the MERCI retrieval devices were the primary means of clot retrieval (12,13). Furthermore, there was a concerning increase in the rate of secondary ischemic strokes in a different vascular territory (5.6% vs 0.4%) and in the number of hemicraniectomies performed (6.0% vs 4.9%) in the endovascular treatment group, though given that the overall functional outcomes at 90 days were markedly improved in the endovascular therapy group, these strokes may not be clinically relevant (15).
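
The harms quoted above are easier to weigh as absolute differences. The short sketch below does that arithmetic on the percentages reported at the conference (15) and converts each into a rough "one extra event per N patients treated" figure. These are point estimates only. I do not have the group sizes, so no confidence intervals are attempted, and the smaller differences may well be chance. The function name is my own.

```python
def absolute_difference(pct_endovascular, pct_control):
    """Absolute risk increase in percentage points, and the number of
    patients treated per one additional event (None if no increase)."""
    diff = pct_endovascular - pct_control
    nnh = 100 / diff if diff > 0 else None
    return diff, nnh

# Percentages as reported: (endovascular group, control group).
reported = {
    "serious adverse events": (47.0, 42.0),
    "clinically relevant ICH": (6.0, 5.2),
    "new stroke, different territory": (5.6, 0.4),
    "hemicraniectomy": (6.0, 4.9),
}

for name, (endo, ctrl) in reported.items():
    diff, nnh = absolute_difference(endo, ctrl)
    print(f"{name}: +{diff:.1f} points, about 1 extra event per {nnh:.0f} treated")
```

On these numbers the new strokes in a different vascular territory stand out: roughly one additional event for every 19 patients sent to the endovascular suite.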

So why did endovascular interventions perform so much better in MR CLEAN than in any of the 3 trials that came before it? Was it the modern devices that create superior reperfusion with fewer complications? Interestingly, the rate of recanalization at 24 hours was approximately 80% in the intervention group compared to 32% in the IV tPA group alone (15). In IMS-3 the 24-hour recanalization rate in the intervention group was also approximately 80%, and the rate in the IV tPA group (35%) was similar to that seen in MR CLEAN (12). Furthermore, the rates of ICH and secondary ischemic infarction seem to be no lower than what was observed during IMS-3 and SYNTHESIS. Seemingly these newer devices add little as far as objective effectiveness. Was it time to reperfusion? Patients in MR CLEAN received both IV tPA and endovascular therapy incredibly fast, so fast that some may question the trial’s external validity. Despite the fact that patients in MR CLEAN underwent both IV and mechanical reperfusion significantly earlier than patients in IMS-3 and SYNTHESIS, earlier treatment with endovascular therapy did not appear to improve outcomes (15). In fact, in MR CLEAN, patients who received IV tPA more than 120 minutes after their symptom onset did better when randomized to the endovascular intervention arm. Conversely, when patients received IV tPA earlier than 120 minutes after symptom onset, endovascular therapy demonstrated no added benefit. In both IMS-3 and SYNTHESIS no temporal benefit could be demonstrated for patients receiving endovascular therapy (12,13). Was it the patients MR CLEAN selected that made a difference? Though MR CLEAN required CT angiographic proof of a large vessel occlusion, the resulting population seems very similar to the patients in IMS-3. The median age and presenting NIHSS were fairly similar (65-66 vs 68-69 and 17-18 vs 16-17 respectively) (12,15). Even the variation in stroke location was similar, with the large majority of the clots located in the M1 segment of the middle cerebral artery (MCA), followed by a third found in the carotid artery terminus and a small minority found in the M2 segment of the MCA.

A few final thoughts of interest. The authors measured change in NIHSS at 24-hour and 1-week intervals. It will be interesting to see whether these findings are expanded upon in the published document, but as far as I can tell from the data presented at the conference, the difference in NIHSS scores between the groups was 2.3 points at 24 hours and 2.9 points at 1 week. I cannot tell if this difference reached statistical significance, but seemingly it is under the threshold of a 4-point improvement on the NIHSS that was deemed clinically relevant by the authors of NINDS in their original publication (1). If this data does prove to be accurate, then it means that the anecdotal stories of patients rising from the cath lab table shortly after clot removal were just that, anecdote. Finally, it is important to point out that this trial compared endovascular treatment to standard care, which for all intents and purposes was IV tPA (91% of the control group received IV tPA). It is by no means certain that IV tPA provides any added benefit over placebo alone, and some skeptics, such as myself, think there is a suggestion of harm. An additional control group, comparing placebo to both IV tPA and endovascular therapy, is needed. In the subgroup analysis, though endovascular therapy performed better than standard care in patients who received tPA, these benefits were not seen when IV tPA was withheld.

Surely we are left with more uncertainty than when we started this line of investigation. Thankfully there are a number of studies currently underway that may provide us the clarity we require. MR CLEAN is the first trial to demonstrate the potential benefits endovascular therapy may provide, but one trial should not define the standard of care, especially when multiple trials have concluded quite the opposite. The cost and resources needed to create an infrastructure capable of delivering patients to the endovascular suite with the swiftness seen in this cohort would be extraordinary. To justify these costs, we should require more than an ambiguous odds ratio bolstered by further needless statistical adjustments.

Sources Cited:

  1. Tissue plasminogen activator for acute ischemic stroke. The National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. N Engl J Med. 1995;333(24):1581-7.
  2. Randomised controlled trial of streptokinase, aspirin, and combination of both in treatment of acute ischaemic stroke. Multicentre Acute Stroke Trial–Italy (MAST-I) Group. Lancet. 1995;346(8989):1509-14.
  3. Hacke W, Kaste M, Fieschi C, et al. Intravenous thrombolysis with recombinant tissue plasminogen activator for acute hemispheric stroke. The European Cooperative Acute Stroke Study (ECASS). JAMA. 1995;274(13):1017-25.
  4. Thrombolytic therapy with streptokinase in acute ischemic stroke. The Multicenter Acute Stroke Trial–Europe Study Group. N Engl J Med. 1996;335(3):145-50.
  5. Donnan GA, Davis SM, Chambers BR, et al. Streptokinase for acute ischemic stroke with relationship to time of administration: Australian Streptokinase (ASK) Trial Study Group. JAMA. 1996;276(12):961-6.
  6. Hacke W, Kaste M, Fieschi C, et al. Randomised double-blind placebo-controlled trial of thrombolytic therapy with intravenous alteplase in acute ischaemic stroke (ECASS II). Second European-Australasian Acute Stroke Study Investigators. Lancet. 1998;352(9136):1245-51.
  7. Clark WM, Wissman S, Albers GW, Jhamandas JH, Madden KP, Hamilton S. Recombinant tissue-type plasminogen activator (Alteplase) for ischemic stroke 3 to 5 hours after symptom onset. The ATLANTIS Study: a randomized controlled trial. Alteplase Thrombolysis for Acute Noninterventional Therapy in Ischemic Stroke. JAMA. 1999;282(21):2019-26.
  8. Clark WM, Albers GW, Madden KP, Hamilton S. The rtPA (alteplase) 0- to 6-hour acute stroke trial, part A (A0276g): results of a double-blind, placebo-controlled, multicenter study. Thrombolytic Therapy in Acute Ischemic Stroke Study investigators. Stroke. 2000;31:811-6.
  9. Hacke W, Furlan AJ, Al-rawi Y, et al. Intravenous desmoteplase in patients with acute ischaemic stroke selected by MRI perfusion-diffusion weighted imaging or perfusion CT (DIAS-2): a prospective, randomised, double-blind, placebo-controlled study. Lancet Neurol. 2009;8(2):141-50.
  10. Hacke W, Kaste M, Bluhmki E, et al. Thrombolysis with alteplase 3 to 4.5 hours after acute ischemic stroke. N Engl J Med. 2008;359(13):1317-29.
  11. Fransen PS, Beumer D, Berkhemer OA, et al. MR CLEAN, a multicenter randomized clinical trial of endovascular treatment for acute ischemic stroke in the Netherlands: study protocol for a randomized controlled trial. Trials. 2014;15:343.
  12. Broderick JP, Palesch YY, Demchuk AM, et al. Endovascular therapy after intravenous t-PA versus t-PA alone for stroke. N Engl J Med. 2013;368(10):893-903.
  13. Ciccone A, Valvassori L, Nichelatti M, et al. Endovascular treatment for acute ischemic stroke. N Engl J Med. 2013;368(10):904-13.
  14. Kidwell CS, Jahan R, Gornbein J, et al. A trial of imaging selection and endovascular treatment for ischemic stroke. N Engl J Med. 2013;368(10):914-23.
  15. http://www.medscape.com/viewarticle/834064#vp_1
  16. Dippel et al. WSC-1158 Results of the multicenter randomized clinical trial of endovascular treatment for acute ischemic stroke in The Netherlands. The MR CLEAN Investigators. International Journal of Stroke. 2014;9(S3):16-40.
  17. Sandercock P, Wardlaw JM, Lindley RI, et al. The benefits and harms of intravenous thrombolysis with recombinant tissue plasminogen activator within 6 h of acute ischaemic stroke (the third international stroke trial [IST-3]): a randomised controlled trial. Lancet. 2012;379(9834):2352-63.
  18. Saver JL. Novel end point analytic techniques and interpreting shifts across the entire range of outcome scales in acute stroke trials. Stroke. 2007;38:3055-62.
  19. Banks JL, Marotta CA. Outcomes validity and reliability of the modified Rankin scale: implications for stroke clinical trials: a literature review and synthesis. Stroke. 2007;38(3):1091-6.