Author information
- Received May 7, 2008
- Revision received September 22, 2008
- Accepted September 24, 2008
- Published online February 1, 2009.
- Lloyd W. Klein, MD, FACC⁎,
- Paul Kolm, PhD†,
- Xin Xu, MS†,
- Ronald J. Krone, MD, FACC‡,
- H. Vernon Anderson, MD, FACC§,
- John S. Rumsfeld, MD, PhD, FACC∥,
- Ralph G. Brindis, MD, MPH, FACC¶ and
- William S. Weintraub, MD, FACC†
- ⁎Reprint requests and correspondence: Dr. Lloyd W. Klein, Director, Clinical Cardiology Associates, Gottlieb Memorial Hospital, 675 West North Avenue, Melrose Park, Illinois 60160
Objectives This study applied risk adjustment methods to evaluate member institutions of the American College of Cardiology–National Cardiovascular Data Registry with respect to in-hospital mortality in percutaneous coronary intervention patients over a 4-year period to assess variability in risk-adjusted performance measures.
Background Cardiac catheterization laboratories, hospital networks, and third-party payers are interested in assessing the outcomes of percutaneous coronary interventions. Evaluation of outcomes without considering case selection may lead to erroneous conclusions about program quality.
Methods The National Cardiovascular Data Registry database was queried for all percutaneous coronary intervention cases performed between January 1, 2001, and September 30, 2004. Random effects logistic regression was used to develop models of in-hospital mortality and compute an expected mortality rate for each program. The observed mortality rate in each program was divided by the program's predicted rate to obtain the observed/expected (O/E) mortality ratio. Change in the O/E ratio was assessed by a generalized estimating equation approach to repeated measures. An index of variability was calculated by the mean absolute difference between O/E ratios of each pair of years.
Results There were 664,909 interventional procedures performed in 403 National Cardiovascular Data Registry programs from 2001 to 2004. There was no significant systematic change in O/E ratios over the 4-year period, but there was significantly greater variation in O/E ratios associated with lower percutaneous coronary intervention volume programs.
Conclusions Our risk-adjustment models had very good discrimination and were relatively consistent over the study period. There was substantial within-program variation in O/E ratios. This information would provide an indication for a detailed examination of individual programs.
Cardiac catheterization laboratories, hospital networks, and third-party payers are interested in assessing the outcomes of percutaneous coronary interventions (PCI) and comparing the results objectively across institutions. Evaluating unadjusted clinical outcomes without considering case selection may lead to erroneous conclusions about program quality (1–5). Present trends toward public release of quality information to evaluate programs on the basis of performance make it necessary to maintain, as much as possible, a level playing field for the laboratories involved.
Risk-adjusted mortality rate (RAMR) is a standard method of assessing quality of care in the interventional catheterization laboratory. Risk-adjusted mortality rate adjusts on the basis of risk factors that affect, or potentially affect, PCI outcomes and thus compensates to some degree for differences in case mix across institutions. A component of RAMR is the ratio of the observed outcome rate to the expected outcome rate, the latter computed from a regression model of relevant risk factors. The observed to expected (O/E) mortality ratio has been used, along with RAMR, to compare outcomes of various procedures and outcomes across institutions (6–9). Because the expected outcome rate is computed from a model, sampling variability and model discriminatory ability affect the accuracy of the expected rate and thus the O/E ratio and RAMR. In addition, the value of model parameters, or the combination of model parameters, may change over time and consequently affect estimates for individual institutions.
The purpose of this study was to evaluate member institutions of the American College of Cardiology–National Cardiovascular Data Registry (ACC-NCDR) with respect to in-hospital mortality in PCI patients over a 4-year period using risk-adjustment methods. We assessed variation of O/E ratios over the 4-year period and whether O/E ratios differed by institution demographics of location, PCI volume, number of board-certified cardiologists, and teaching or nonteaching status.
Data registry and selection
The ACC-NCDR is a voluntary national registry that currently receives data from >700 participating hospitals. The basic details of the dataset used here have previously been published (10,11). All data elements are linked to the American Heart Association/American College of Cardiology PCI guidelines. The data collection process used by the ACC-NCDR has been described in detail (11). Data at each participant institution are entered locally into software purchased from vendors certified to accurately acquire and transmit data to the ACC-NCDR. Local institutional PCI programs audited all data for completeness and accuracy. Many local quality assurance programs are based on the collected data. Additionally, a national audit program sponsored by NCDR has reviewed about 5% of all cases. Only data meeting strict predefined criteria for completeness and accuracy (10,11) are entered into the ACC-NCDR registry and used in this analysis. Each data element is predefined, linked to the American Heart Association/ACC PCI guidelines, and available on the ACC website (12).
The NCDR database was queried for all PCI cases performed between January 1, 2001, and September 30, 2004. There were 403 member institutions enrolling patients within this time frame (though not all for all 4 years). These institutions are diverse and widely distributed throughout the country. Member institutions include rural, suburban, and urban centers, both teaching and community hospitals, and institutions of all sizes.
The ACC-NCDR previously developed and validated risk-adjustment models for in-hospital mortality in PCI (13,14) using data collected between 1998 and 2000. Using the same set of risk factors (Table 1), models were developed using the current data to calculate expected mortality. Ejection fraction was categorized as "not done," ≤40%, and >40%, with the latter being the reference category for calculation of odds ratios. Lesion severity was categorized as high, moderate, and low, with low being the reference category. Acute myocardial infarction was categorized as ST-segment elevated, nonelevated, and no acute myocardial infarction. Smoking history was categorized as current, former, and never smoked, with never smoked being the reference category. Age was analyzed in 10-year increments and body mass index in 5-kg/m² units. Angina class and New York Heart Association functional class were analyzed as ordinal categories. All other risk factors were dichotomous. For this study, we developed separate risk-adjustment models for each year: 2001, 2002, 2003, and 2004.
Random effects logistic regression was used to develop models of in-hospital mortality. The random effects model accounts for patient clustering within institutions. Beginning with the 27 risk factors, a backward elimination strategy was used to develop reduced models. Nested models were compared using the likelihood ratio test. Bootstrap methods were used to validate and calibrate bias-corrected indexes of model performance (15). The C-statistic index (equivalent to the area under a receiver-operating characteristic curve) was calculated to assess model discrimination, the ability of the model to correctly identify patients with respect to in-hospital mortality.
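To make the modeling step concrete, the following Python sketch fits a logistic mortality model and computes the C-statistic. The file name, column names, and risk-factor subset are hypothetical, and for brevity it fits an ordinary (fixed-effects) logistic regression, omitting the per-institution random intercept, backward elimination, and bootstrap validation used in the actual analysis.

```python
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

# Hypothetical patient-level table: 'died' is 0/1 in-hospital mortality;
# the other columns stand in for a subset of the 27 candidate risk factors.
df = pd.read_csv("pci_patients.csv")  # assumed file; columns are illustrative

y = df["died"]
X = sm.add_constant(df[["shock", "acute_mi", "renal_failure", "age_decades"]])

# Plain logistic regression; the published models additionally include a
# random intercept per institution to account for patient clustering.
fit = sm.Logit(y, X).fit(disp=0)

# C-statistic (area under the ROC curve) measures discrimination:
# 0.5 = no discrimination, 1.0 = perfect. The paper reports about 0.9.
pred = fit.predict(X)
print(f"C-statistic: {roc_auc_score(y, pred):.3f}")
```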
Missing risk factor data were assumed to be missing at random and imputed using Markov Chain Monte Carlo methods (16). A total of 3 imputed datasets were generated for each year. Regression coefficients from the 3 imputed datasets were combined, and a predicted probability of in-hospital mortality was calculated for each patient. Thus, all patients were used in the development of the mortality models.
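A minimal sketch of the combining step, assuming 3 fitted coefficient vectors (values invented here): point estimates are pooled by averaging across the imputed datasets; full Rubin's rules would additionally pool the within- and between-imputation variances.

```python
import numpy as np

# One fitted coefficient vector per imputed dataset (hypothetical values:
# intercept, shock, renal failure, acute MI).
coef_by_imputation = [
    np.array([-4.10, 2.45, 0.80, 0.65]),
    np.array([-4.00, 2.52, 0.77, 0.70]),
    np.array([-4.20, 2.48, 0.82, 0.66]),
]

# Pooled point estimates: the element-wise mean across imputations.
pooled_coef = np.mean(coef_by_imputation, axis=0)
print(pooled_coef)  # [-4.1, 2.4833, 0.7967, 0.67]
```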
A logit (natural logarithm of the odds of in-hospital mortality) for each patient was calculated from the logistic regression equations and transformed to a predicted probability of in-hospital death by taking the exponent of the logit and dividing by 1 plus the exponent of the logit. The predicted probabilities were averaged across patients at a participating institution to calculate the predicted mortality at that institution, given its case mix.
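The transform and per-program averaging can be written compactly; in this sketch the site labels and logits are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical linear predictors (logits) for patients at two sites.
df = pd.DataFrame({
    "site":  ["A", "A", "B", "B", "B"],
    "logit": [-4.6, -2.2, -5.1, -3.9, -0.7],
})

# Predicted probability of death: p = exp(logit) / (1 + exp(logit)),
# i.e., the inverse-logit (sigmoid) transform described above.
df["p_death"] = np.exp(df["logit"]) / (1.0 + np.exp(df["logit"]))

# Expected mortality for each program, given its case mix, is the mean
# predicted probability over that program's patients.
print(df.groupby("site")["p_death"].mean())
```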
The observed mortality rate in each program was divided by the program's predicted rate to obtain the O/E mortality ratio. The O/E ratio yields an index for which a value <1 means that in-hospital mortality was less than expected, and a value >1 means that in-hospital mortality was greater than expected. Programs with the largest O/E ratios would rank the lowest, and programs with smaller O/E ratios would rank higher. Risk-adjusted mortality rate is calculated as the O/E ratio times the overall mortality rate in a given year. In this study, we present results only for the O/E ratio. Bootstrap methods (17) were used to calculate 95% confidence intervals for the O/E ratios. Variation in program O/E ratios over the 4 years was assessed by computing the mean absolute difference of O/E ratios, and of rankings, for each pair of years. A small value would indicate similarity in rankings over the years, whereas a large value would indicate dissimilar rankings.
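Below is a sketch of the O/E calculation with a percentile-bootstrap 95% confidence interval for a single hypothetical program; the exact bootstrap scheme of the original analysis (17) may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# One program's patients: observed deaths (0/1) and model-predicted
# probabilities of death (both invented for illustration).
died  = np.array([0, 0, 1, 0, 0, 0, 0, 1, 0, 0])
p_hat = np.array([0.01, 0.02, 0.30, 0.01, 0.05,
                  0.02, 0.01, 0.25, 0.03, 0.01])

def oe_ratio(d, p):
    # Observed mortality rate divided by expected (mean predicted) rate.
    return d.mean() / p.mean()

# Nonparametric bootstrap: resample patients with replacement, recompute
# the O/E ratio, and take the 2.5th/97.5th percentiles as the 95% CI.
n = len(died)
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(oe_ratio(died[idx], p_hat[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"O/E = {oe_ratio(died, p_hat):.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```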
Generalized estimating equation models for repeated measures were used to compare O/E ratios with respect to number of board-certified cardiologists, PCI volume, location, and teaching status across years. The O/E ratios were right skewed and thus were modeled assuming a gamma distribution. One-way analysis of variance was used to compare absolute differences in rankings averaged over each pair of years for programs with data in all 4 years. SAS version 9.2 (SAS Institute, Cary, North Carolina) and Stata version 10 (Stata Corp., College Station, Texas) were used to develop regression models and compare O/E ratios.
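A hedged sketch of such a repeated-measures model, using statsmodels' GEE with a gamma family and log link; the file layout, column names, and link/working-correlation choices are assumptions, since the paper does not specify them.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical table: one row per program-year with the O/E ratio and
# program characteristics (column names are illustrative).
df = pd.read_csv("program_oe_by_year.csv")  # assumed file layout

# Right-skewed O/E ratios modeled under a gamma distribution, with
# repeated measures clustered within program.
# Note: a gamma model requires strictly positive responses; program-years
# with zero observed deaths (O/E = 0) would need special handling.
model = smf.gee(
    "oe_ratio ~ C(volume_group) + C(teaching) + C(year)",
    groups="program_id",
    data=df,
    family=sm.families.Gamma(link=sm.families.links.Log()),
)
print(model.fit().summary())
```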
There were 664,909 interventional procedures performed in the 403 programs from 2001 to 2004. The observed mortality rate declined significantly over the 4-year period (p < 0.001). Rates were 1.36%, 1.30%, 1.23%, and 1.12% in 2001, 2002, 2003, and 2004, respectively. Complete patient data were available for all 27 risk factors included in the risk models except recent congestive heart failure, New York Heart Association classification, and body mass index. Missing congestive heart failure and New York Heart Association data were <1%, whereas 3% of body mass index data were missing. These variables were imputed as described, and models of in-hospital mortality were obtained by combining the regression coefficients from the 3 imputed datasets.
Table 2 presents the risk-adjustment models developed for each year. Models were relatively consistent from year to year with respect to risk factor composition and impact on mortality. Shock at PCI had the greatest impact on mortality, with odds ratios ranging from 10 to 13. Acute myocardial infarction status, renal failure, left main disease, ejection fraction, nonelective PCI status, and lesion severity were consistent predictors of mortality in all 4 years with odds ratios of 2 or greater. All 4 models had good discrimination with C-statistic indexes of 0.9.
Although there were a total of 403 programs involved over the 4-year period, not all were available for evaluation in every year: some programs joined after 2001, some left the registry before 2004, and some may not have reported data in every year. Of the 403 programs, 45% had data for all 4 years, 21% for 3 of the 4 years, 24% for 2 years, and 10% had data for only 1 of the years. For 2001, there were 228 programs with data for evaluation. A total of 41 (18%) of these programs had O/E ratios significantly >1, ranging from 1.30 to 5.96. There were 285 programs evaluated in 2002, and 69 (24%) of these had O/E ratios significantly >1, ranging from 1.20 to 4.54. For 2003, 355 programs were evaluated, and 90 (25%) of these had O/E ratios significantly >1, ranging from 1.13 to 7.71. For 2004, 339 programs were evaluated, and 54 (16%) of these had O/E ratios significantly >1, ranging from 1.22 to 5.77.
There was considerable variability of O/E ratios among the various programs over the 4-year period. For presentation, we chose the 20 lowest ranked (largest O/E ratios) programs in each year (Table 3). Of the 403 programs, 339 (84%) were never among the 20 lowest-ranked programs in the 4-year period. There were a total of 64 programs that were among the 20 lowest-ranked programs over the 4-year period. Of the 64, 1 program had data for only 1 year, 9 programs had data for 2 years, 18 had data for 3 years, and the remaining 36 had data for all 4 years. Of the 64 programs, 53 (83%) appeared among the 20 lowest-ranked only once in the 4-year period. Eight appeared in 2 of the 4 years, 1 appeared in 3 of the 4 years, and 2 were ranked among the 20 lowest all 4 years.
There were a total of 180 programs (45%) for which O/E ratios could be computed in all 4 years. Using these data, the top panels of Figure 1 plot O/E ratios for the 20 lowest, 20 middle, and 20 highest-ranked programs in 2001 and corresponding O/E ratios in subsequent years. Generally, there was greater variability of O/E ratios over the years in the lowest-ranked programs, as well as the highest-ranked programs, when compared with middle-ranked programs. The lower panels highlight 2 programs from each group in the top panels. Programs 2, 4, 45, 54, and 29 are identified by the same number in Table 3 and illustrate yearly variation in rankings.
Analysis of O/E ratios across years indicated no significant relationship with location (p = 0.46), teaching status (p = 0.20), number of board-certified cardiologists (p = 0.22), or PCI volume (p = 0.24). Analyzing only those programs with 4 years of data also indicated no significant relationships (p = 0.47, 0.82, 0.95, and 0.18 for location, teaching status, cardiologists, and PCI volume, respectively). Comparing mean absolute difference in rankings for each pair of years indicated no significant differences between locations (p = 0.86), teaching status (p = 0.51), and number of board-certified cardiologists (p = 0.37). Lower-volume programs (<400 per year) had significantly greater differences in rankings over the years than medium- (400 to 800 per year) and high- (>800 per year) volume programs (p < 0.001 for both), likely reflecting sampling variability. Lower-volume programs had a mean absolute difference of 53 ± 20 places in rankings over the 4-year period compared with 40 ± 19 and 33 ± 16 for medium- and high-volume programs, respectively. The same findings were obtained for mean absolute difference in O/E ratios (data not shown).
Use of outcomes to assess coronary interventional program quality has become a common practice and is used in public domain reports such as the California CABG Mortality Reporting Program (18) and the New York Cardiac Surgery Reporting System (19). Such reports employ risk-adjustment methods because unadjusted observed outcome rates do not account for case mix and therefore are not usually appropriate for comparing programs or individuals. Reporting of risk-adjusted outcomes for individual institutions or physicians is regarded as the more appropriate measure of quality (20–22).
A number of studies have been done to assess the validity and use of risk-adjustment methods (1,7,8). Because risk-adjustment models involve sampling variability in estimating predicted probabilities of the outcome, program size, as well as number of events, will affect the confidence intervals of O/E ratios (or RAMR). In particular, smaller programs may have wider confidence intervals that include 1 for O/E (or 0 for RAMR) even though the ratio may be large. This would be the case whether confidence intervals were obtained by bootstrap or normal theory methods.
The discriminatory ability of the model will affect the accuracy of predicted outcomes. If model discrimination were low, then prediction accuracy may not be sufficient to use for program evaluation. Model discrimination for in-hospital mortality in our study was quite high (0.9), so there would be good confidence in the estimated expected probabilities for calculation of the O/E ratio. On the other hand, model discrimination for the outcome of unplanned coronary artery bypass graft in our data was much lower, about 0.7 (data not shown). Thus, use of unplanned coronary artery bypass graft as a measure of program quality would be less valid than in-hospital mortality. Setting a value for acceptable discrimination would be arbitrary, but it would be important for program evaluators to understand that an index of 0.5 indicates no discrimination and the closer the index to 1.0, the better the discrimination.
Our study evaluated NCDR participating programs over a 4-year period from 2001 to 2004 with respect to in-hospital mortality. An O/E ratio and 95% confidence interval were obtained for each program in each year. There were, on average, 40 programs each year that had fewer than 100 patients. However, over the 4-year period, only 9 programs with fewer than 100 patients had O/E ratios of 1.5 or greater and 95% confidence intervals that included 1. Thus, most NCDR programs had sufficient patient volume that determining whether their O/E ratio was significantly different from 1 was not an issue. Program evaluation of outcomes related to patient volume itself has been studied but has been questioned as an appropriate measure of quality (23–26). Our results indicated no difference in O/E ratios between programs with differing patient volumes, but the variability in year-to-year rankings and O/E ratios was greater in lower-volume programs.
Our data suggest that a few NCDR participant programs were habitually problematic and others fell in and out of the lowest O/E ratio rankings. This may reflect real changes in programmatic quality due to local quality assurance programs that used their data to identify systematic problems and fix them. However, chance also affects placement. If a longitudinal assessment is considered, as in our study, then in a 1-year period, the probability of a program appearing by chance among the 20 lowest-ranked of 400 programs is 1 in 20 (20 of 400). The probability of being ranked, by chance, in the lowest 20 in 2 of the years is 6 in 400 (4!/[(4-2)!*2!] * 1/20 * 1/20). For 3 of the years, the probability decreases to 4 in 8,000, and for all 4 years, 1 in 160,000. For a larger number of programs, the probabilities would be much smaller.
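These figures follow directly from binomial coefficients; the short check below reproduces the paper's simplified arithmetic, which, as in the text above, drops the (19/20)^(4−k) factor for the years a program is not in the bottom 20.

```python
from math import comb

# Per-year chance of landing in the bottom 20 of 400 programs.
p = 1 / 20

# Probability of appearing there in k of the 4 years: C(4, k) * p**k.
for k in (2, 3, 4):
    print(f"{k} of 4 years: {comb(4, k) * p**k:.2e}")
# 2 of 4: 1.50e-02 (6 in 400); 3 of 4: 5.00e-04 (4 in 8,000);
# 4 of 4: 6.25e-06 (1 in 160,000)
```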
A ranking among the lowest in 1 year may not be adequate to judge a program's performance, but the probability of being ranked among the lowest in more than 1 of the years by chance is small and would likely be a cause for concern. The variation in individual O/E ratios (and rankings) from year to year indicates that "program quality," at least in terms of mortality, is not as readily apparent as it might appear if only 1 year's data were analyzed. For example, program #2, with large O/E ratios in all 4 years, would probably want to evaluate its program in more depth to determine whether there was a systematic problem at the hospital level, or whether a particular operator required further scrutiny. However, program #29 may also want to evaluate its program to determine the reason for the large O/E ratio after 3 consecutive years of no deaths (Fig. 1).
We found no evidence of systematic change in O/E ratios over the 4-year period: that is, there was no overall trend, and none related to program demographics of PCI volume, number of board-certified cardiologists, or teaching status. There was considerable individual program variation in O/E ratios, and this was related to PCI volume, with larger variation associated with smaller-volume programs.
The primary limitation of this study, or any other where reporting is voluntary, is that member institutions were not necessarily the same from year to year. This may affect the models developed in each year and thus impact expected mortality differently from year to year. However, in a large data registry, this impact may be minimal, and our results suggest this is the case as the models were relatively similar from year to year and consistent with respect to discrimination. The prediction model itself is influenced by case selection (8) as well as advances in technique and adjunctive pharmacology. We developed models for each year, but applying the 2001 model to subsequent years resulted in some loss of accuracy, so the model would need to be recalibrated (7). At some point, it would likely be necessary to update the model.
Calculation of a meaningful institutional O/E ratio is dependent on adequate patient volume. For example, the death of 1 or 2 patients at a low-volume site with 10 cases in a given year will yield a high mortality rate that may not accurately reflect the program's quality. In relative terms, there were only a small number of low-volume sites among NCDR participants. However, the distribution of volume in NCDR is characteristic of the practice of medicine in the U.S., and NCDR participant institutions represent almost 30% of all PCI programs. It is possible that mainly institutions with a strong commitment to quality participate in NCDR, and thus, the NCDR may not reflect all institutions in their volume category.
It is also possible that certain operators might "game" the system: that is, recognizing that their data are being collected, they may purposely exaggerate the presence of high-risk variables that will lead to higher estimated mortality scores. However, there is no objective evidence of such a practice, and there is a robust external audit system in place designed to minimize this concern.
Third-party payers, including insurance carriers and the Centers for Medicare and Medicaid Services, are demanding an objective method to assess the quality of individual PCI programs. Risk-adjustment methods provide a way of making comparisons of outcomes more equitable by adjusting for case mix. In particular, the O/E ratio is an easily interpreted metric that indicates a program's standing with respect to worse-than-expected or better-than-expected outcomes. In our study, the risk-adjustment models used to calculate the expected outcome had very good discrimination. Over a 4-year period, there was no systematic change in O/E ratios, but there was substantial individual program variability. Variation in O/E ratios over some time frame could be provided for individual programs in the form of "quality control" plots (as in the lower panels of Fig. 1), and comparison with other programs could be provided by rankings, or percentiles of rankings, based on O/E ratios. Although the O/E ratio has a straightforward interpretation, understanding how it is derived, and the effects of sampling variability and model accuracy on its derivation, would help institutions use it as an indication for more detailed examination of their program rather than as an absolute number that defines the program as simply "high" or "low" quality.
E. Magnus Ohman, MD, was the Guest Editor of this report.
- Abbreviations and Acronyms
- ACC-NCDR = American College of Cardiology–National Cardiovascular Data Registry
- O/E = observed to expected mortality ratio
- PCI = percutaneous coronary intervention
- RAMR = risk-adjusted mortality rate
References
- Moscucci M., Eagle K.A., Share D., et al.
- Brindis R.G., Dehmer G.J.
- Epstein A.J., Rathore S.S., Volpp K.G.M., Krumholz H.M.
- Racz M.J., Hannan E.L., Isom O.W., et al.
- Peterson E.D., DeLong E.R., Muhlbaier L.H., et al.
- Anderson H.V., Shaw R.E., Brindis R.G., et al.
- Weintraub W.S., McKay C.R., Riner R.N., et al., American College of Cardiology Database Committee
- ACC National Cardiovascular Data Registry: Cardiac Cath Lab Module, version 2.0c [pdf]. www.accncdr.com/WebNCDR/NCDRDocuments/DataDictDefsOnlyv20c.pdf. Accessed June 7, 2007.
- Shaw R.E., Anderson H.V., Brindis R.G., et al.
- Harrell F.E. Jr.
- Efron B.
- California CABG Outcomes Reporting Program (CCORP)
- New York State Department of Health
- Go V., Calvin J.E., Klein L.W.
- Vakili B.A., Kaplan R., Brown D.L.
- Kuntz R.E., Normand S.L.T.