- Received March 8, 2016
- Revision received April 18, 2016
- Accepted April 19, 2016
- Published online August 8, 2016.
- Yohei Sotomi, MDa,
- Rafael Cavalcante, MD, PhDb,
- David van Klaveren, MScc,
- Jung-Min Ahn, MDd,
- Cheol Whan Lee, MDd,
- Robbert J. de Winter, MD, PhDa,
- Joanna J. Wykrzykowska, MD, PhDa,
- Yoshinobu Onuma, MD, PhDb,
- Ewout W. Steyerberg, PhDc,
- Seung-Jung Park, MD, PhDd and
- Patrick W. Serruys, MD, PhDe,∗
- aAcademic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
- bThoraxCenter, Erasmus Medical Center, Rotterdam, the Netherlands
- cDepartment of Public Health, Erasmus Medical Center, Rotterdam, the Netherlands
- dHeart Institute, University of Ulsan College of Medicine, Asan Medical Center, Seoul, South Korea
- eInternational Centre for Circulatory Health, National Heart and Lung Institute, Imperial College London, London, United Kingdom
- ∗Reprint requests and correspondence:
Dr. Patrick W. Serruys, International Centre for Circulatory Health, NHLI, Imperial College London, London SW7 2AZ, United Kingdom.
Objectives The study sought to validate the SYNTAX (Synergy Between Percutaneous Coronary Intervention With Taxus and Cardiac Surgery) score II mortality prediction model after percutaneous coronary intervention (PCI) or coronary artery bypass grafting in a large pooled population of patients with multivessel coronary disease (MVD) and/or unprotected left main disease (UPLMD) enrolled in the PRECOMBAT (Bypass Surgery Versus Angioplasty Using Sirolimus-Eluting Stent in Patients With Left Main Coronary Artery Disease) and BEST (Artery Bypass Surgery and Everolimus-Eluting Stent Implantation in the Treatment of Patients with Multivessel Coronary Artery Disease) randomized controlled trials.
Background For patients with MVD and/or UPLMD, the choice of the best revascularization strategy remains challenging.
Methods Pooled individual patient-level data from PRECOMBAT and BEST were used to assess calibration and discrimination of the SYNTAX score II prediction model for all-cause mortality after PCI and coronary artery bypass grafting at 4-year follow-up. The study population comprised 1,480 patients (600 with UPLMD, 880 with MVD).
Results The overall incidence of all-cause mortality was 6.1% after a median follow-up period of 4.9 years. Validation plots showed good model calibration overall and across treatment groups but tended to overestimate all-cause mortality in the highest risk quintiles of patients in the whole population and the PCI arm. The SYNTAX score II showed moderate discrimination ability for the whole population (C index = 0.683) and discriminated better among patients receiving PCI than among those receiving CABG (C index = 0.718 vs. 0.662 in patients with UPLMD, C index = 0.700 vs. 0.661 in those with MVD). Observed all-cause mortality was higher when the treatment received was at variance with that recommended by the model and similar when it was concordant.
Conclusions The SYNTAX score II has good calibration but only moderate discrimination ability for long-term mortality prediction in this randomized population. This score provides an important tool to help guide the heart team’s decision-making process regarding the selection of the best revascularization strategy for patients with MVD and/or UPLMD. (Bypass Surgery Versus Angioplasty Using Sirolimus-Eluting Stent in Patients With Left Main Coronary Artery Disease, NCT00422968; Bypass Surgery Versus Everolimus-Eluting Stent Implantation for Multivessel Coronary Artery Disease, NCT00997828)
- coronary artery disease
- external validation
- multivessel disease
- randomized control trial
- SYNTAX Score II
- unprotected left main disease
In patients with multivessel coronary disease (MVD) and/or unprotected left main disease (UPLMD), the choice of the best revascularization strategy is a rather complex undertaking. The factors that drive adverse events after percutaneous coronary intervention (PCI) and coronary artery bypass grafting (CABG) are not always the same. For instance, angiographic anatomic complexity has a major impact on PCI results, while it does not affect the results after CABG (1). In contrast, advanced age and clinical comorbidities such as chronic obstructive pulmonary disease have a more negative impact on surgery than PCI results (2).
Current guidelines recommend a heart team approach for the decision-making process regarding the revascularization strategy (3,4). They also recommend the use of the anatomic SYNTAX (Synergy Between Percutaneous Coronary Intervention With Taxus and Cardiac Surgery) score and the Society of Thoracic Surgeons score to help guide this decision. The SYNTAX score II is a tool created using the predictors of 4-year mortality after both treatments in the landmark all-comers SYNTAX trial (2). This prediction model is already recommended for risk stratification in the European guidelines as class IIa (4). It takes into account not only the anatomic complexity of the disease but also clinical comorbidities that were shown to affect mortality in that trial and the interaction of these factors with both PCI and CABG. This score provides an individualized estimation of long-term mortality for both revascularization strategies. On the basis of the difference between estimates, it gives a treatment recommendation of PCI, CABG, or both (equipoise) as the preferred method of revascularization for an individual patient.
The score was externally validated in the DELTA (Drug-Eluting Stent for Left Main Coronary Artery Disease) registry of patients with UPLMD (5). It also showed good calibration and discrimination in the CREDO-Kyoto (Coronary Revascularization Demonstrating Outcome Study in Kyoto) registry of patients with MVD and/or UPLMD (6). Although registries are a good example of real-world patients, they are subject to the flaws of observational studies and to the fact that the patients have already been selected for their treatment. In the present study, therefore, we pooled together individual patient-level data with long-term follow-up from 2 large randomized controlled trials that included patients with 3-vessel and/or unprotected left-main disease, the BEST (Artery Bypass Surgery and Everolimus-Eluting Stent Implantation in the Treatment of Patients with Multivessel Coronary Artery Disease) and PRECOMBAT (Bypass Surgery Versus Angioplasty Using Sirolimus-Eluting Stent in Patients With Left Main Coronary Artery Disease) trials (7,8). The objective of the present analysis was to assess the mortality prediction provided by the SYNTAX score II in this large population of randomized patients and to compare it with the observed mortality at long-term follow-up.
The methods and designs of the BEST and PRECOMBAT trials have been described previously (7,8). Some differences between them are worth noting and are summarized as follows.
The BEST trial was a randomized trial conducted at 27 sites in South Korea, China, Malaysia, and Thailand that included 880 patients with MVD and without left main involvement. The PRECOMBAT trial was a randomized trial conducted at 13 sites in Korea that included 600 patients with UPLMD. In both trials, the populations were randomized to undergo CABG or PCI. Differences in the SYNTAX score II variables and in model-predicted mortality among the BEST, PRECOMBAT, and SYNTAX trials were evaluated in this analysis.
In both studies, patients deemed eligible for both PCI and CABG by an interventional cardiologist and a cardiac surgeon were prospectively enrolled and randomized in a 1:1 ratio, with the use of an interactive Web-response system, to undergo PCI or CABG. Whereas in the BEST trial, PCI procedures were done with the use of everolimus-eluting stents, in the PRECOMBAT trial, sirolimus-eluting stents were used. The other details of the procedures for PCI and CABG were described elsewhere (7,8).
Outcomes and definitions
The primary endpoint of the BEST trial was a composite of all-cause death, myocardial infarction, or target vessel revascularization. The primary endpoint of the PRECOMBAT trial was the composite of all-cause death, myocardial infarction, stroke, and ischemia-driven target vessel revascularization. The definitions of events were described elsewhere (7,8). In both studies, independent clinical events committees blinded to group allocation adjudicated all events. All angiographic data were analyzed in the angiographic core laboratory of the Cardio-Vascular Research Foundation (Seoul, South Korea). Data quality was monitored systematically as described in our previous publications (7,8).
Our primary objective was to assess the capacity of the SYNTAX score II for PCI and CABG to appropriately stratify the risk for all-cause mortality in patients with severe coronary artery disease. Therefore, for the current pooled data analysis, the primary endpoint was the incidence of all-cause death.
SYNTAX score II
The SYNTAX score II has been described in detail previously (2). Briefly, the SYNTAX score II, derived from the 1,800 patients randomized in the landmark SYNTAX trial, consists of 2 anatomic (UPLMD and anatomic SYNTAX score) and 6 clinical (age, creatinine clearance, left ventricular ejection fraction, sex, chronic obstructive pulmonary disease, and peripheral vascular disease) variables that were independently associated with long-term all-cause death in that trial. On the basis of the treatment effect interactions for PCI and CABG, the SYNTAX score II generates different scores and distinct estimated mortalities for each revascularization strategy. Thus, patients are recommended for CABG if the difference in the predicted mortality risk is in favor of CABG with 95% confidence. Likewise, patients are recommended for PCI if the difference in mortality risk predictions is in favor of PCI with 95% confidence. If the predicted mortality rates for PCI and CABG do not differ significantly (the difference lies within the 95% confidence interval [CI]), patients are classified as equipoise (both treatments are acceptable as equally safe). We used a computer-based automatic calculator for the SYNTAX score II instead of the nomogram, which eliminated the variability introduced by manual calculation.
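The three-way recommendation rule described above can be illustrated with a minimal sketch. It assumes the two predicted 4-year mortalities and a standard error for their difference are already available; the `Prediction` class, the `se_difference` field, and the normal-approximation interval are hypothetical simplifications for illustration, not the published calculator, which derives its interval from the underlying regression model.

```python
from dataclasses import dataclass

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


@dataclass
class Prediction:
    pci_mortality: float   # predicted 4-year mortality after PCI (0..1)
    cabg_mortality: float  # predicted 4-year mortality after CABG (0..1)
    se_difference: float   # hypothetical standard error of (PCI - CABG)


def recommend(p: Prediction) -> str:
    """Return 'CABG', 'PCI', or 'equipoise' from a 95% interval on the difference."""
    diff = p.pci_mortality - p.cabg_mortality
    lower = diff - Z_95 * p.se_difference
    upper = diff + Z_95 * p.se_difference
    if lower > 0:   # PCI risk exceeds CABG risk with 95% confidence
        return "CABG"
    if upper < 0:   # CABG risk exceeds PCI risk with 95% confidence
        return "PCI"
    return "equipoise"  # interval includes zero: both treatments acceptable


if __name__ == "__main__":
    # Illustrative inputs only; these are not patient data from the trials.
    print(recommend(Prediction(0.12, 0.05, 0.02)))
    print(recommend(Prediction(0.060, 0.063, 0.02)))
```

The point of the sketch is that the recommendation depends on the uncertainty of the difference, not only on which point estimate is lower.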
Continuous variables are presented as mean ± SD or median and interquartile range (IQR) as appropriate and were compared using Student t tests or Mann-Whitney U tests. Binary variables are expressed as counts and percentages and were compared using chi-square tests. Two patient subsets were predefined in this study: patients with UPLMD (with or without additional vessel involvement) in the PRECOMBAT trial and those with MVD in the absence of left main coronary disease in the BEST trial. In this pooled database, of 1,480 patients, there were missing values for SYNTAX score II variables in 106 cases: anatomic SYNTAX score in 34 patients, creatinine clearance in 31 patients, and left ventricular ejection fraction in 44 patients. To calculate the SYNTAX score II, multiple imputation of missing values in 106 cases was performed, taking into account the correlation between all potential predictors, and sensitivity analyses were done to account for missing values (2). A regression model was used for the imputation, including the following variables as predictors: age, sex, anatomic SYNTAX score, European System for Cardiac Operative Risk Evaluation score, height, body weight, diabetes, current smoking status, dyslipidemia, previous myocardial infarction, previous PCI, previous cerebrovascular event, chronic obstructive pulmonary disease, left main disease, categorical left ventricular ejection fraction, and creatinine clearance. The outcome (all-cause death) was analyzed using the Kaplan-Meier method. Receiver-operating characteristic curves were used to assess the discrimination ability of the SYNTAX score II to predict all-cause death in the whole population, the PCI arm, and the CABG arm in the UPLMD population and in the PCI arm and CABG arm in the MVD population (9,10). We also assessed the discrimination ability of the anatomic SYNTAX score in the whole population as a comparator. 
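As an illustration of the Kaplan-Meier method named above, the following sketch computes the product-limit survival estimate and the cumulative mortality at a chosen horizon. The function names and the toy follow-up data are assumptions made for this example; the study itself used SPSS.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each distinct death time.

    times  -- follow-up in days for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # deaths plus censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # patients censored at t were still at risk at t
    return curve


def mortality_at(curve, horizon):
    """Cumulative mortality (1 - survival) at the given horizon in days."""
    s = 1.0
    for t, surv in curve:
        if t > horizon:
            break
        s = surv
    return 1.0 - s


if __name__ == "__main__":
    # Toy data, not trial data: two deaths and three censored observations.
    curve = kaplan_meier([100, 200, 200, 300, 400], [1, 0, 1, 0, 0])
    print(curve, mortality_at(curve, 250))
```

Censored patients contribute to the risk set up to their last follow-up, which is why the estimate differs from a crude proportion of deaths.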
Discrimination was studied with the concordance (C) index, which is identical to the area under the receiver-operating characteristic curve. The C index estimates the probability that, of 2 randomly chosen patients, the patient with the more favorable prognostic score will outlive the patient with the less favorable prognostic score and ranges from 0.5 (no discrimination) to a theoretical maximum of 1 (11,12). To take clustering into account in the model evaluation, we assessed the predictive performance in individual trials and treatment arms (13). The calibration performance of the SYNTAX score II was evaluated using calibration plots (14) in the same populations by quintiles of SYNTAX score II–predicted 4-year risk. Calibration refers to the agreement between Kaplan-Meier-estimated and SYNTAX score II–predicted 4-year mortalities. The possible over- or underestimation of the predicted risks was graphically assessed with calibration plots. Differences in distributions of SYNTAX score II variables among the SYNTAX, PRECOMBAT, and BEST trials were evaluated using the Kruskal-Wallis test. A p value <0.05 was considered to indicate statistical significance. All analyses were undertaken using SPSS version 23.0 (IBM Corporation, Armonk, New York).
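The pairwise interpretation of the C index can be made concrete with a short sketch. It treats death as a binary outcome, in which case the C index equals the area under the receiver-operating characteristic curve, and it ignores censoring; the function name and inputs are illustrative assumptions, not the software used in the study.

```python
def c_index(predicted_risk, died):
    """Fraction of (died, survived) pairs in which the patient who died
    had the higher predicted risk; tied predictions count as one half."""
    concordant = 0.0
    pairs = 0
    for risk_dead, d1 in zip(predicted_risk, died):
        if not d1:
            continue
        for risk_alive, d2 in zip(predicted_risk, died):
            if d2:
                continue
            pairs += 1
            if risk_dead > risk_alive:
                concordant += 1.0
            elif risk_dead == risk_alive:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one death and one survivor")
    return concordant / pairs


if __name__ == "__main__":
    # Toy predictions, not trial data.
    print(c_index([0.30, 0.10, 0.20, 0.05], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to ranking pairs no better than chance, and 1 to ranking every pair correctly.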
Patient characteristics are shown in Table 1. The median age was 65 years (IQR: 57 to 71 years); 1,088 patients (73.5%) were men; 696 patients (47.2%) presented with stable angina and 509 (34.5%) with unstable angina. The median anatomic SYNTAX score was 24 (IQR: 18 to 30). Six hundred patients (40.5%) had UPLMD, and the remainder had MVD. Approximately 40% of the study population had diabetes, and 708 patients (47.8%) had hypercholesterolemia. In the CABG arm, 407 patients (65.1%) underwent off-pump CABG. In the PCI arm, 300 patients (40.7%) were treated with sirolimus-eluting stents and the remainder with everolimus-eluting stents. The median total stent number was 3.0 (IQR: 2.0 to 4.0), and the median total stent length was 74 mm (IQR: 50 to 102 mm). The rate of complete revascularization in the whole population was 64.0%, and those by SYNTAX score II–predicted mortality quintiles are indicated in Online Table 1. The incidence of all-cause mortality in the whole population was 6.1% after a median follow-up period of 4.9 years (1,800 days; IQR: 1,420 to 1,800 days) (91 deaths in 1,480 patients).
Predictive performance of SYNTAX score II
For the calibration analysis of SYNTAX score II, calibration plots demonstrate that the SYNTAX score II predicted 4-year all-cause mortality (Figure 1). Overall, good calibration was observed in the PCI and CABG arms, with UPLMD or MVD. In a sensitivity analysis, all analyses with the different iterations of the multiple imputation datasets and the complete data yielded similar results (Online Figure 1). The SYNTAX score II, however, tended to overestimate all-cause mortality in the highest risk quintiles of patients in the whole population and the PCI arm in patients with UPLMD and those with MVD. The model showed moderate discrimination ability, with a C index of 0.683 (95% CI: 0.628 to 0.738) for the whole population, 0.718 (95% CI: 0.611 to 0.825) for the PCI arm in patients with UPLMD, 0.662 (95% CI: 0.535 to 0.789) for the CABG arm in patients with UPLMD, 0.700 (95% CI: 0.605 to 0.795) for the PCI arm in patients with MVD, and 0.661 (95% CI: 0.561 to 0.761) for the CABG arm in patients with MVD. The C indexes for the PRECOMBAT (UPLMD) and BEST (MVD) trials were 0.684 (95% CI: 0.597 to 0.770) and 0.683 (95% CI: 0.612 to 0.754), respectively. Anatomic SYNTAX score demonstrated poor discrimination ability, with a C index of 0.546 (95% CI: 0.481 to 0.611).
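The quintile-based calibration check reported here can be sketched as follows. As a simplifying assumption, this example compares the mean predicted risk in each quintile with the crude observed proportion of deaths, whereas the study compared predictions with Kaplan-Meier estimates; the function name and toy inputs are hypothetical.

```python
def calibration_by_quintile(predicted, died, groups=5):
    """Return (group, mean predicted risk, observed death proportion) per risk group.

    Patients are ranked by predicted risk and split into equal-sized groups.
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    rows = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # fewer patients than groups
        mean_pred = sum(predicted[i] for i in idx) / len(idx)
        observed = sum(died[i] for i in idx) / len(idx)
        rows.append((g + 1, mean_pred, observed))
    return rows


if __name__ == "__main__":
    # Toy data, not trial data.
    predicted = [0.01 * i for i in range(1, 11)]
    died = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
    for group, mean_pred, observed in calibration_by_quintile(predicted, died):
        print(group, round(mean_pred, 3), observed)
```

In such a table, a group whose mean predicted risk exceeds its observed mortality indicates overestimation, the pattern described for the highest risk quintiles.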
The Kaplan-Meier estimate of all-cause mortality according to treatment recommendations by the SYNTAX score II is shown in Table 2. Although all the comparisons were statistically not significant because of the small numbers of events, observed all-cause mortality was higher when the treatment received was at variance with that recommended by the model and similar when it was concordant. In PCI-recommended patients, the Kaplan-Meier-estimated all-cause mortality was higher in the CABG arm than in the PCI arm (absolute difference −5.7% [PCI better], p = 0.475). In CABG-recommended patients, it was higher in the PCI arm than in the CABG arm (absolute difference 1.7% [CABG better], p = 0.442). In the equipoise population, similar mortality was observed in both arms (5.9% with PCI and 6.3% with CABG, p = 0.810). Table 3 indicates the differences of SYNTAX score II variables in the SYNTAX, PRECOMBAT, and BEST trials. The cumulative frequency distribution curves of SYNTAX score II–predicted mortality in the 3 trials (Figure 2) demonstrated that a relatively higher risk population was included in the SYNTAX trial compared with the PRECOMBAT and BEST trials.
The main findings of this study are as follows: first, the validity of the recommendation for PCI or CABG by the SYNTAX score II model was supported for patients with MVD and/or UPLMD; second, the SYNTAX score II model showed good calibration but only moderate discrimination for the individual prediction of 4-year all-cause mortality after PCI and CABG in this population.
The heart team approach and the anatomic SYNTAX score represented significant advances that influenced the care of patients with MVD across the globe and are strongly recommended in both American and European guidelines (3,4). Despite those advances, the application of the results from randomized trials for individual patients in everyday clinical practice remains challenging. This is influenced mainly by the lack of ways to individually predict outcomes for the different treatment strategies available, namely, PCI and CABG.
The SYNTAX score II is a tool developed for that purpose. Still, to reach ubiquitous application (i.e., generalizability and transportability), such a tool needs to be tested many times in different and independent studies (15). The model was first externally validated in 1 UPLMD cohort (2). Nevertheless, because the model includes UPLMD as 1 important predictor with significant interaction between PCI and CABG, this might have biased the first validation. The model was then tested in a population including both MVD and UPLMD, more similar to the development (SYNTAX trial) population (6). However, in both instances, observational registry–type cohorts, prone to selection biases, were used. Therefore, we report for the first time a validation of the model in a large randomized population including patients with MVD and those with UPLMD, with long-term follow-up.
In comparison with previous validations, our analysis showed that the model performed slightly worse in terms of discrimination, with an overall C index of 0.68 (95% CI: 0.628 to 0.738) against a C index of 0.72 for the DELTA registry and C indexes of 0.70 and 0.75 for the CABG and PCI arms of the CREDO-Kyoto registry, respectively. Similarly, in the development cohort (SYNTAX trial), the C index was 0.73.
One interesting finding of the present analysis is the fact that the model overestimated mortality risk more in the PCI arms than in the CABG arms, especially in the higher quintiles of predicted mortality (Figure 1). One possible explanation is related to the fact that different stents were used in the development and validation populations. In the SYNTAX trial, patients were treated with the paclitaxel-eluting Taxus stent, whereas in the PRECOMBAT and BEST trials, sirolimus- and everolimus-eluting stents were used, respectively. Recent studies have shown the superiority of stents eluting drugs of the limus family to paclitaxel-eluting ones with regard to both efficacy and safety outcomes, including all-cause mortality (16,17). Thus, it is possible that a model developed with a less safe and effective stent would overestimate mortality in patients treated with safer and more efficacious coronary stents. A second possible explanation is that, in higher quintiles of SYNTAX score II–predicted mortality, PCI resulted in lower rates of complete revascularization (Online Table 1), which might confound the mortality prediction. Another possible explanation is that the population in the present validation analysis had a lower mortality rate related to the somewhat different clinical risk profile observed in the BEST and PRECOMBAT trials (Table 3), with fewer patients presenting with chronic obstructive pulmonary disease and peripheral vascular disease, as well as overall lower anatomic SYNTAX scores. These characteristics are all important independent mortality predictors (Table 3). This higher risk for the SYNTAX trial population can be observed in the cumulative distribution of mortality predictions of the 3 trials (Figure 2). When we focused on the lower risk patients with predicted mortality ≤11.59% (first to fourth quintiles), the prediction by the SYNTAX score II performed quite well.
Future validation of this score in a population with higher anatomic complexity and higher risk for comorbidities is warranted.
When assessing the observed mortality of the patients according to SYNTAX score II model–derived treatment recommendations, we note that the rates of death were lower when the treatment assigned at randomization (intention-to-treat analysis) was concordant with the model recommendation than when it was discordant. These numeric differences did not reach statistical significance, probably because of the very low event rates. Nevertheless, in PCI-recommended patients, mortality was almost twice as high when they were treated with CABG. Moreover, the large equipoise group shows that the score performed well in predicting similar mortality rates with both treatment strategies. Taken together, these findings demonstrate good performance of the score in these 2 recommendation groups (PCI and equipoise). In the CABG recommendation group, the score could not clearly separate the mortality rates after PCI and CABG. The reasons for that are still unclear. One possible explanation is the apparently greater homogeneity in outcomes once CABG is the treatment performed, as suggested by the higher discrimination ability of the score in the PCI arm than in the CABG arm (PCI C index = 0.72 vs. CABG C index = 0.66 in patients with UPLMD, PCI C index = 0.70 vs. CABG C index = 0.66 in patients with MVD). Although this validation, because of the small number of events, could not prove unquestionably the value of the score, it demonstrated the ability of the score to help the decision-making process. A further validation in a larger patient population (e.g., the EXCEL trial) will help establish the actual role of the SYNTAX score II model.
Last, other factors related to more contemporary clinical management used in the more recent trials included in this validation and racial (Caucasian vs. Asian) differences could have played a role in the overall performance of the model.
The main advantage of the present analysis is that it included a large, randomized population not related to that of the development cohort. The issue of validating this score in a registry population is related to the fact that the observed performance of the score could be due to several known and unknown or unobserved confounders that might have driven treatment selection and therefore consequent outcome for each individual patient. In a randomized population, these confounders are accounted for at randomization as much as possible, and differences in treatment outcomes more reliably reflect treatment effect. The finding of a similar or even better calibration in registries is a very positive one in the sense that they reflect an appropriate treatment selection by doctors in real-world practice, as is shown by a good agreement with the prediction model, whereas in the randomized trials, doctors did not influence treatment selection. It is obvious that in real-world practice, doctors are likely to select the treatment associated with better outcomes. This might be what was shown by a good calibration of the SYNTAX score II in the DELTA and CREDO-Kyoto registries (2,6). In addition, the outcome predicted by the model is all-cause mortality, which is not subject to adjudication and is less prone to problems with reporting. Furthermore, in the present study, we report our findings following the guidelines of the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis statement (18).
First, the number of events (91 deaths) in the present study might not be large enough to provide sufficient statistical power. Although we evaluated the Kaplan-Meier estimate of all-cause mortality by treatment recommendation (Table 2), the number of PCI-recommended patients was only 50, resulting in statistical underpowering. Second, even though all patients in this study presented with MVD and/or UPLMD, the difference in the clinical profile between the development and validation populations is large enough to potentially influence the outcome measured (i.e., all-cause mortality), and this probably affected the performance of the model. Furthermore, the validation cohort was composed of a heterogeneous group of patients from 2 different trials that occurred in different scenarios and at different time points. Nevertheless, our strategy to pool data from those trials was undertaken to compensate for a lack of a large enough randomized population with MVD and/or UPLMD in published research. Last, because of the very rapid technological advances seen in the field of coronary disease treatment, predictors of mortality such as anatomically complex disease (i.e., high SYNTAX score) or left ventricular dysfunction might lose power over time, following the introduction of better performing coronary stents or newer ventricular assist devices. Thus, a model based on such predictors might tend to lose predictive ability as time goes by.
The SYNTAX score II has good calibration but only moderate discrimination ability for 4-year mortality prediction after PCI and CABG in patients with MVD and/or UPLMD in the populations as randomized for the BEST and PRECOMBAT trials. This score provides an important tool to help guide the heart team’s decision-making process regarding the selection of the best revascularization strategy for this patient population.
WHAT IS KNOWN? The SYNTAX score II is a mortality prediction tool created on the basis of the SYNTAX trial.
WHAT IS NEW? The present study demonstrated that the SYNTAX score II has good calibration but only moderate discrimination ability for long-term mortality prediction in the randomized population. This score provides an important tool to help guide the heart team’s decision-making process regarding the selection of the best revascularization strategy for patients with MVD and/or UPLMD.
WHAT IS NEXT? A further validation in a larger patient population (e.g., the EXCEL trial) will help establish the actual role of the SYNTAX score II model.
For a supplemental table and figure, please see the online version of this article.
The authors have reported that they have no relationships relevant to the contents of this paper to disclose. Drs. Sotomi and Cavalcante contributed equally to this work.
- Abbreviations and Acronyms
- CABG = coronary artery bypass grafting
- CI = confidence interval
- IQR = interquartile range
- MVD = multivessel coronary disease
- PCI = percutaneous coronary intervention
- UPLMD = unprotected left main disease
- American College of Cardiology Foundation
- Farooq V., van Klaveren D., Steyerberg E.W., et al.
- Levine G.N., Bates E.R., Blankenship J.C., et al.
- Windecker S., Kolh P., Alfonso F., et al.
- Chieffo A., Meliga E., Latib A., et al.
- Ahn J.M., Roh J.H., Kim Y.H., et al.
- Dangas G.D., Serruys P.W., Kereiakes D.J., et al.
- Collins G.S., Reitsma J.B., Altman D.G., Moons K.G., for the TRIPOD Group