- Received April 22, 2019
- Revision received June 12, 2019
- Accepted June 18, 2019
- Published online October 21, 2019.
- James P. Howard, MA, MB BChira,∗,
- Christopher M. Cook, MBBS, BSca,∗,
- Tim P. van de Hoef, MD, PhDb,
- Martijn Meuwissen, MD, PhDc,
- Guus A. de Waard, MDd,
- Martijn A. van Lavieren, MScb,
- Mauro Echavarria-Pinto, MDe,
- Ibrahim Danad, MD, PhDd,
- Jan J. Piek, MD, PhDb,
- Matthias Götberg, MD, PhDf,
- Rasha K. Al-Lamee, MBBS, PhDa,
- Sayan Sen, MBBS, PhDa,
- Sukhjinder S. Nijjer, MBChB, PhDa,
- Henry Seligman, MBBSa,
- Niels van Royen, MD, PhDd,
- Paul Knaapen, MD, PhDd,
- Javier Escaned, MD, PhDe,
- Darrel P. Francis, MA, MDa,
- Ricardo Petraco, MD, PhDa and
- Justin E. Davies, MD, PhDa,∗ (@jerd10)
- aDepartment of Cardiology, Hammersmith Hospital, Imperial College Healthcare NHS Trust, London, United Kingdom
- bDepartment of Cardiology, Academic Medical Center, Amsterdam, the Netherlands
- cDepartment of Cardiology, Amphia Hospital, Breda, the Netherlands
- dDepartment of Cardiology, VU University Medical Center, Amsterdam, the Netherlands
- eDepartment of Cardiology, Hospital Clínico San Carlos, Madrid, Spain
- fDepartment of Cardiology, Skane University Hospital, Lund, Sweden
- ∗Address for correspondence: Dr. Justin E. Davies, Hammersmith Hospital, Du Cane Road, London W12 0NN, United Kingdom.
Objectives This study developed a neural network to perform automated pressure waveform analysis and allow real-time accurate identification of damping.
Background Damping of aortic pressure during coronary angiography must be identified to avoid serious complications and make accurate coronary physiology measurements. There are currently no automated methods to do this, and so identification of damping requires constant monitoring, which is prone to human error.
Methods The neural network was trained and tested versus core laboratory expert opinions derived from 2 separate datasets. A total of 5,709 aortic pressure waveforms of individual heart beats were extracted and classified. The study developed a recurrent convolutional neural network to classify beats as either normal, showing damping, or artifactual. Accuracies were reported using the opinions of 2 independent core laboratories.
Results The neural network was 99.4% accurate (95% confidence interval: 98.8% to 99.6%) at classifying beats from the testing dataset when judged against the opinions of the internal core laboratory. It was 98.7% accurate (95% confidence interval: 98.0% to 99.2%) when judged against the opinions of an external core laboratory not involved in neural network training. The neural network was 100% sensitive, with no damped beats misclassified, and had a specificity of 99.8%. The positive and negative predictive values were 98.1% and 99.5%, respectively. The 2 core laboratories agreed more closely with the neural network than with each other.
Conclusions Arterial waveform analysis using neural networks allows rapid and accurate identification of damping. This demonstrates how machine learning can assist with patient safety and the quality control of procedures.
Over 1,000,000 cardiac catheterizations are performed annually in the United States (1). Although increasingly safe (2), myocardial infarction can still occur as frequently as 1 in 400 cases and death as frequently as 1 in 2,000 (3). Catheter-associated dissections are a major cause of morbidity and mortality (4). Accordingly, performing cardiologists minimize the risk of these complications by preventing deep intubation of the coronary arteries and avoiding contrast injection when the catheter is tightly engaged. These 2 scenarios can often be identified due to the presence of “damping,” which refers to various characteristic appearances of the arterial pressure waveforms, indicating that the catheter tip is not transducing true aortic pressure. However, accurate identification of damping is dependent on operator training, experience, and attentiveness. These factors are variable, and thus the important safety signals of aortic pressure damping are frequently missed.
In addition to the aforementioned safety implications, aortic pressure damping can also lead to diagnostic errors during physiological stenosis assessment with pressure-based techniques such as the instantaneous wave-free ratio (iFR) or fractional flow reserve (FFR), or flow-based techniques such as the coronary flow reserve. The effects of hyperemia in particular can create a suction effect on the guiding catheter, promoting dynamic aortic pressure damping (5). Because damping falsely lowers the measured aortic pressure, this can lead to systematic underestimation of stenosis significance by FFR and potentially inappropriate decisions regarding revascularization. Furthermore, damping during FFR measurement can be particularly hard for operators to identify, as the signal from the pressure wire is typically superimposed over the arterial signal during these measurements (6).
Finally, previous studies have shown that the presence of damping in arterial waveforms can indicate an ostial coronary lesion (7), underlining the importance of continuous waveform analysis during coronary angiography.
In recent years, neural networks have shown increasing performance in the classification of medical imaging (8–10) and biological waveform data, including electrocardiograms (11), encephalograms (12), and even snoring from sleep sounds (13).
Owing to the importance of identifying damping, and the difficulties in doing so in routine clinical practice, we set out to create a neural network to accurately identify damped arterial waveform traces in real time during invasive coronary angiography.
We used 2 pre-existing datasets from patients undergoing invasive coronary angiography at 4 European cardiac centers. The larger dataset was designated as the training and validation dataset, and the smaller dataset as the testing dataset. Cases within the training dataset were randomized in a 3:1 ratio between being used to train the neural network (training) and assess the relative performances of different network designs and training strategies (validation), respectively. The testing set was kept aside during development and was only used to assess and report the model’s final accuracy.
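The 3:1 randomization between training and validation can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name, the fixed seed, and the use of integer case IDs are assumptions, and the text does not state whether the unit of randomization was the recording or the patient.

```python
import random

def split_cases(case_ids, train_parts=3, val_parts=1, seed=0):
    """Randomly split case IDs into training and validation groups (3:1 by default)."""
    rng = random.Random(seed)  # fixed seed is an assumption, used only for repeatability
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_parts / (train_parts + val_parts))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical IDs for 237 cases; splitting by case keeps all beats of a case together.
train_cases, val_cases = split_cases(range(237))
```

Splitting at the case level, rather than the beat level, avoids beats from one recording appearing on both sides of the split.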
Data extraction and labeling
For each included physiological recording, data from the arterial catheter and the electrocardiogram were extracted at a sampling frequency of 200 Hz. Individual beats were segmented out using electrocardiographic gating.
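A minimal sketch of electrocardiographic gating, assuming the R-peak sample indices have already been detected (the detection method is not described in the text). The toy trace and peak positions are invented for illustration.

```python
SAMPLING_HZ = 200  # sampling frequency stated in the text

def segment_beats(pressure, r_peak_indices):
    """Cut a pressure trace into beats, one per R-R interval."""
    beats = []
    for start, end in zip(r_peak_indices, r_peak_indices[1:]):
        beats.append(pressure[start:end])
    return beats

# Toy trace: 3 s of samples with hypothetical R peaks at 0.5 s, 1.4 s and 2.3 s.
pressure = [80.0 + (i % 180) * 0.2 for i in range(3 * SAMPLING_HZ)]
r_peaks = [100, 280, 460]
beats = segment_beats(pressure, r_peaks)
durations_s = [len(beat) / SAMPLING_HZ for beat in beats]
```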
Labeling of the data was performed by 2 separate core laboratories, serving 3 separate purposes: training, internal validation, and external validation. The internal core laboratory (C.M.C., J.E.D.) provided labels for both the training and testing datasets. The labels they provided for the training dataset were used to train the neural network. The labels they provided for the testing dataset were then used to assess the neural network’s performance. This performance represents the ability of the network to mimic the decision-making processes of the internal core laboratory and predict their responses on a dataset the network has not been trained on.
In contrast, the external core laboratory (T.P.v.d.H., M.M.) only labeled the test set. These labels were used in 2 ways. First, the agreement between the internal core laboratory and the external core laboratory provided a measure of the inter-reviewer reproducibility of damped beat analysis. Second, the accuracy of the network judged against the external core laboratory’s labels provided an estimate of the ability of the network to agree with an independent physician’s assessment.
Labeling of beats from the training and test sets was performed using custom-made software. Briefly, this software would choose a random case for reporting, and then a random R-R interval from that file, to ensure beats were equally distributed across cases rather than biased toward longer recordings. When labeling a beat, the software also showed the reporter the preceding and succeeding 100 beats so that the presence or absence of damping could be contextualized with the patient’s previous recordings to better mimic routine clinical practice.
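The two-stage sampling described above (random case, then random R-R interval) can be sketched as below. The case names and beat counts are hypothetical; the point is that a short and a long recording are chosen about equally often.

```python
import random

def pick_beat(cases, rng):
    """Pick a random case first, then a random R-R interval within that case."""
    case_id = rng.choice(sorted(cases))
    beat_index = rng.randrange(cases[case_id])
    return case_id, beat_index

# Hypothetical recordings: case ID -> number of R-R intervals.
cases = {"short": 50, "long": 5000}
rng = random.Random(1)
picks = [pick_beat(cases, rng) for _ in range(10000)]
share_short = sum(1 for case_id, _ in picks if case_id == "short") / len(picks)
```

Sampling beats uniformly from the pooled data would instead draw about 99% of beats from the long recording in this toy example.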
Morphological and clinical categories
Each core laboratory classified the beats into the 3 clinical classes (normal, damping, or artifactual) using their conventional clinical judgment. The study protocol indicated that pointers to damping were the hallmarks of “ventricularization” (namely the absence of the dicrotic notch, an abrupt diastolic downstroke, and a small positive deflection before systole representing atrial contraction) or a sudden reduction in systolic pressure (14,15).
The primary clinical aim of the neural network was to differentiate among these 3 clinical classes, which have different management implications, but each category may have more than 1 waveform morphology.
Therefore, we provided the neural network with 5 waveform categories. The “normal” (W1) and “noisy (nondamped)” (W2) waveform categories belonged to the “normal” clinical class. The “port open” (W3) and “other artifact” (W4) waveform categories belonged to the “artifactual” clinical class. Finally, the “damping” (W5) waveform category comprised the “damping” clinical class. This approach is advantageous because it allows the network to learn the characteristic differences between distinct waveform configurations, even when they share the same clinical class and management.
During the labeling process, the reviewer staff identified some beats with subtle features of damping which they felt in clinical practice would not be considered damped, but which they felt would warrant ongoing monitoring to ensure frank damping did not develop (Central Illustration). We termed these morphologies dampoid.
Reviewers therefore had 6 categories of waveforms to choose from: (W1) normal, (W2) noisy nondamped, (W3) port open, (W4) artifactual, (W5a) damped, or the extra category, (W5b) dampoid. To ensure maximum sensitivity and hence safety for any adverse waveforms, the neural network was trained to treat these last 2 categories (W5a and W5b) identically as both showing “damping” (W5).
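The relationship between the 6 reviewer categories, the 5 waveform categories used for training, and the 3 clinical classes can be written as two lookup tables. The dictionary and function names are illustrative only.

```python
# Reviewer labels W5a (damped) and W5b (dampoid) are merged into W5 for training.
REVIEWER_TO_WAVEFORM = {"W5a": "W5", "W5b": "W5"}

# Waveform category -> clinical class, as described in the text.
WAVEFORM_TO_CLINICAL = {
    "W1": "normal",       # normal
    "W2": "normal",       # noisy (nondamped)
    "W3": "artifactual",  # port open
    "W4": "artifactual",  # other artifact
    "W5": "damping",      # damped or dampoid
}

def clinical_class(reviewer_label):
    """Map a reviewer label to its clinical class."""
    waveform = REVIEWER_TO_WAVEFORM.get(reviewer_label, reviewer_label)
    return WAVEFORM_TO_CLINICAL[waveform]
```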
Neural network architecture and training
In this study we use a specific type of neural network, termed a 1-dimensional convolutional neural network. This type of neural network learns to classify 1-dimensional data (here aortic pressure vs. time) by sliding (or “convolving”) a series of small templates, termed kernels, through the data and looking for matches. As the data pass through the network, deeper layers of the network learn to make decisions based on which features are present and in which locations. This design is inspired by the human visual cortex, which at its most superficial layer (V1) comprises a series of detectors of basic shapes such as edges, whereas deeper layers learn to combine these features, allowing us to recognize complex images such as human faces.
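The sliding of a kernel through 1-dimensional data can be illustrated in a few lines. This is a sketch of the operation only: the kernel here is hand-picked to respond to an abrupt rise, whereas a trained network learns its kernels from data, and the toy trace is invented.

```python
def convolve_1d(signal, kernel):
    """Slide a kernel along a signal (valid cross-correlation, as used in CNN layers)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# Toy pressure-like trace with a sharp upstroke between samples 3 and 4.
trace = [80, 80, 80, 80, 120, 120, 120, 120]
edge_kernel = [-1, 1]  # hand-picked edge detector (an assumption for illustration)
response = convolve_1d(trace, edge_kernel)
```

The response is zero wherever the trace is flat and peaks at the position of the upstroke, which is the "match" the kernel is looking for.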
We trained this neural network to classify the arterial pressure traces. This is an automated process termed backpropagation, which involves adjusting both the kernels (i.e., what features the neural network identifies) and how the kernels are used to come to a final decision. Creating the final neural network model involved 2 stages (Figure 1).
In stage 1, we created a neural network that was trained to classify individual beats as 1 of the 5 waveform categories: normal (W1), noisy (nondamped) (W2), port open (W3), other artifact (W4), or damping (W5).
Examples of these classes are shown in Online Figure 1. This network’s design was inspired by the 2-dimensional ResNet architecture, which we adapted into a novel 1-dimensional design (16). A schematic of the network, and how the 5 waveform categories map to the 3 clinical classes, is shown and described in Figure 2.
In stage 2, we encased the 1-dimensional convolutional neural network in a larger “recurrent” network, which processed several sequential beats in parallel and classified the final beat in the context of the beats that preceded it. Such recurrent designs have been employed previously in the field of video processing (17), and we theorized this approach would be helpful, as it might allow the final network to better identify whether certain features (e.g., a small diastolic deflection) are genuine (a dicrotic notch) and present in successive beats or merely noise.
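The input to such a recurrent stage can be pictured as short sequences of consecutive beats, with the last beat in each sequence being the one to classify. The sketch below builds those sequences; the number of preceding beats used (`n_context`) is not stated in the text and is an assumption here.

```python
def context_windows(beats, n_context):
    """Pair each beat with the n_context beats that precede it."""
    windows = []
    for i in range(n_context, len(beats)):
        windows.append(beats[i - n_context : i + 1])  # last item is the beat to classify
    return windows

# Hypothetical beat identifiers standing in for segmented pressure waveforms.
beat_ids = ["b0", "b1", "b2", "b3", "b4"]
windows = context_windows(beat_ids, n_context=2)
```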
During both stages, loss was calculated over batches of 32 beats using the categorical cross-entropy loss function and weights were updated using the Adadelta optimizer (learning rate of 1.0, rho of 0.95, and epsilon of 1 × 10⁻⁶). Training continued until validation loss plateaued (14 epochs). Because the temporal characteristics of the waveform features (including durations and slopes) are critical for identifying damping, it would not be appropriate to add additional simulated datasets by distorting the real waveforms (data augmentation). Programming was performed with the Python programming language version 3.6, with the TensorFlow (18) and Keras (19) machine learning frameworks.
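To make the loss and optimizer concrete, the sketch below implements categorical cross-entropy and one textbook Adadelta update for a single scalar weight, using the hyperparameters quoted above as defaults. It is a plain-Python illustration of the update rule, not the Keras implementation, and the toy objective is an assumption.

```python
import math

def categorical_cross_entropy(true_one_hot, predicted_probs):
    """Mean negative log-likelihood of the true class over a batch."""
    total = 0.0
    for truth, probs in zip(true_one_hot, predicted_probs):
        total -= sum(t * math.log(p) for t, p in zip(truth, probs) if t)
    return total / len(true_one_hot)

def adadelta_step(weight, grad, avg_sq_grad, avg_sq_delta, lr=1.0, rho=0.95, eps=1e-6):
    """One Adadelta update for a single scalar weight."""
    avg_sq_grad = rho * avg_sq_grad + (1 - rho) * grad ** 2
    delta = -math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * grad
    avg_sq_delta = rho * avg_sq_delta + (1 - rho) * delta ** 2
    return weight + lr * delta, avg_sq_grad, avg_sq_delta

# One beat whose true class is the first of three, predicted with probability 0.5.
loss = categorical_cross_entropy([[1, 0, 0]], [[0.5, 0.25, 0.25]])

# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, g2, d2 = 0.0, 0.0, 0.0
for _ in range(100):
    w, g2, d2 = adadelta_step(w, 2 * (w - 3), g2, d2)
```

Adadelta scales each step by running averages of past gradients and past updates, so the weight moves toward the minimum without a hand-tuned step size.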
Endpoints and statistical analysis
The prespecified primary endpoint was the accuracy of the neural network when judged against the opinions of the internal core laboratory. Assuming a true accuracy of 97.5%, we calculated that a testing set of 1,800 beats would be necessary to demonstrate an accuracy exceeding 96% (i.e., the lower bound of the 95% confidence interval [CI] was above 0.96) using a 2-tailed general proportion test with continuity correction for 90% power at the 5% significance level. As well as raw accuracy, the performance of the neural network was also reported using 2 statistics less susceptible to imbalanced class sizes: Cohen’s kappa and the F1 score, defined as the harmonic mean of the precision (positive predictive value) and recall (sensitivity). Positive and negative predictive values were calculated with reference to the neural network identifying a beat as damped.
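The two imbalance-robust statistics can be computed from counts as sketched below. The F1 score is written as 2PR/(P + R), the harmonic mean of precision P and recall R. The counts are hypothetical and are not the study's confusion matrix; how the F1 score was aggregated across the 3 classes is not stated in the text.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from true/false positive and negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def cohens_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows: rater A, columns: rater B)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / total
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical 2-class agreement table.
matrix = [[90, 5], [5, 100]]
kappa = cohens_kappa(matrix)
f1 = f1_score(tp=90, fp=5, fn=5)
```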
The pre-specified safety endpoint was the ability of the network to correctly classify all beats labeled as damped by either of the 2 core laboratories.
Secondary endpoints included the accuracy of the neural network when judged against a second set of expert opinions: the external core laboratory. A further secondary analysis was to clarify the inter-core laboratory agreement as a proportion of beats that were classified identically across the 2 core laboratories, and this was compared with the agreement of the network (i.e., network accuracy) using McNemar’s chi-square test with 2-tailed p value of 0.05 as the threshold for statistical significance. An exact test was used for contingency tables including any counts below 25.
Statistical analysis was performed using R version 3.5.0 (R Foundation for Statistical Computing, Vienna, Austria).
The training dataset comprised 237 recordings from unique vessels in 107 patients. The testing dataset comprised 123 recordings from unique vessels in 53 patients. The baseline characteristics of the patients and lesions are outlined in Table 1.
A total of 3,855 beats across the training dataset were randomly selected and labeled by the internal core laboratory and used for training and validation.
A total of 1,854 beats were randomly selected and labeled by both the internal and external core laboratories and used for assessment of the trained models. The time taken to predict a single batch of beats was 0.9 s.
The distribution of labels across the 2 datasets is shown in Table 2. The proportion of beats classified as normal across the 2 datasets ranged between 77% and 87%. Damped beats comprised between 1% and 2% of the datasets, with dampoid beats making up between 4% and 9% of beats. The remaining beats were artifactual, the majority of which were due to the catheter side port being open, resulting in an inability to transduce pressure.
Neural network accuracy
The neural network was 99.4% accurate (95% CI: 98.8% to 99.6%) at correctly classifying whether the 1,854 beats within the testing set were normal, showed damping, or were artifactual, when judged against the labels provided by the internal core laboratory. The confusion matrix in Figure 3 demonstrates these results by class.
All 20 beats classified as damped by the internal core laboratory were correctly classified as showing damping by the network (100% sensitivity). A total of 135 of the 137 beats (98.5%) classified as dampoid by the internal core laboratory were correctly classified as showing damping by the network. A total of 1,624 of 1,627 beats classified as normal (or noisy nondamped) by the internal core laboratory were correctly classified as normal by the network, corresponding to a specificity of 99.8%. The corresponding Cohen’s kappa was 0.970, and the F1 score was 0.976. The positive predictive value for a neural network making a prediction of “damping” was 98.1%. The negative predictive value was 99.5%.
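The percentages quoted above follow directly from the stated counts, as this short check shows. The helper name is illustrative.

```python
def percent(numerator, denominator):
    """Percentage rounded to 1 decimal place."""
    return round(100 * numerator / denominator, 1)

damped_sensitivity = percent(20, 20)     # damped beats detected
dampoid_detection = percent(135, 137)    # dampoid beats classified as damping
specificity = percent(1624, 1627)        # normal beats classified as normal
```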
The secondary endpoint of the study was the ability of the neural network to agree with a second physician who did not influence the network’s training (i.e., the accuracy of the network when judged against the external core laboratory). The neural network’s predictions agreed with the external core laboratory’s labels in 98.7% of beats (95% CI: 98.0% to 99.2%). The corresponding Cohen’s kappa was 0.943, and the F1 score was 0.963. The confusion matrix in Figure 4 demonstrates these results by class.
The primary safety endpoint was the ability to correctly classify all beats that either core laboratory thought showed evidence of profound damping. The internal and external core laboratory classified 20 and 29 of the 1,854 beats as damped, respectively. The neural network correctly classified all these beats as showing damping (100% sensitivity).
Inter–core laboratory agreement
The neural network agreed with each core laboratory numerically more frequently (99.4% and 98.7% of beats, respectively) than the 2 core laboratories agreed with each other (98.2% of beats; 95% CI: 97.5% to 98.8%). This difference was not statistically significant, however (McNemar’s chi-square = 0.833, df = 1; p = 0.361). The confusion matrix in Online Figure 2 demonstrates these results by class.
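The reported p value can be cross-checked from the reported chi-square statistic, since for 1 degree of freedom the upper-tail probability is erfc(√(χ²/2)). The sketch below also shows McNemar's statistic and an exact binomial alternative of the kind the statistical methods mention for small counts. The discordant counts are not given in the text, so only the statistic-to-p conversion uses study values; the variant without continuity correction is an assumption.

```python
import math

def mcnemar_chi_square(b, c):
    """McNemar statistic from the two discordant counts (no continuity correction)."""
    return (b - c) ** 2 / (b + c)

def chi_square_p_value_df1(statistic):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

def mcnemar_exact_p_value(b, c):
    """Two-sided exact binomial p value, intended for small discordant counts."""
    n, k = b + c, min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Cross-check of the reported result: chi-square = 0.833, df = 1.
p_reported = chi_square_p_value_df1(0.833)
```

Evaluating `p_reported` gives approximately 0.361, in agreement with the value quoted in the text.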
We have found that a recurrent convolutional neural network is able to monitor arterial pressure traces during coronary angiography and accurately identify when damping is present. The performance of this algorithm, when defined as its ability to agree with a core laboratory, is at least as good as that of a second core laboratory. More specifically, it was able to successfully identify all beats that the 2 core laboratories classified as damped, while maintaining an exceptionally high specificity (Central Illustration). Furthermore, the time taken to process a beat was below 1 s, indicating that real-time waveform analysis using machine learning in the catheterization laboratory is possible. These results have implications for both patient safety and diagnostics.
Damping may provide information about adverse catheter placement. For example, damping often represents deep engagement of the catheter with the coronary artery, or when an ostial stenosis is present, or when the catheter size is too large for the vessel. This can result in ischemia, chest pain, and arrhythmia (20), and so if damping is not identified quickly and the catheter repositioned, complications may occur. Damped pressure traces may also represent a catheter orifice abutting a vessel wall (15), and injection of contrast in such circumstances can result in coronary dissection. Although the overall responsibility for the identification of damping lies with medical staff, algorithms such as this could provide an additional layer of safety for cardiologists during procedures, which is supported by the 100% sensitivity we report for the identification of damped beats.
Damping of arterial pressure waveforms is not only a safety concern but can provide valuable clinical information and prevent inaccurate physiological measurements being made.
Studies have shown that 2.3% of patients undergoing coronary angiography have pressure damping during the procedure, and intravascular ultrasound shows that in almost half of these patients there is a true atherosclerotic ostial stenosis (7). The ability to identify damping in these cases is especially important because ostial lesions in particular can be missed during coronary angiography, and by definition subtend a larger amount of myocardium than more distal stenoses (21).
Cardiologists frequently perform coronary physiology measurements such as FFR and iFR during catheterization procedures to assess the significance of intermediate coronary stenoses. These measurements rely on calculating pressure differences between the aorta and the coronary artery distal to the lesion. The presence of damping will result in falsely low aortic pressure readings, and may also affect coronary blood flow, reducing distal coronary pressures. This may result in an underestimation of the significance of a stenosis. These difficulties are compounded by the fact that “dynamic damping” occurs frequently during adenosine-mediated hyperemia, which is essential to FFR measurement (5), and yet may be difficult to identify when the distal coronary pressure waveform is superimposed on the measurement console (Figure 5).
This study has shown that a neural network is able to accurately classify beats it has not been trained on, from a large dataset of 1,854 beats from over 100 different coronary physiology recordings. However, like all tests in clinical medicine, this algorithm is not perfect, and its accuracy lies between 98.7% and 99.4% when judged against expert humans. We are unable to definitively state whether these discrepancies are due to misclassifications by the neural network or the core laboratories, but there are several possible reasons why they may differ. One contributor is the subjectivity of human judgment as to whether a beat is damped or not. This is supported by the fact that the 2 core laboratories agreed with each other in only 98.2% of beats. This should not be surprising when one considers that there must be a continuous scale between “completely normal” and “very damped” beats, and yet the core laboratories are being asked to categorize beats as if this were a simple dichotomy. For this reason, we also wished to report accuracies on beats that the core laboratories felt were unequivocally damped and that they would want to ensure the algorithm did not misclassify. For these beats, the neural network was 100% accurate.
The prevalence of definitive damping in this dataset was low, with only 1.3% of the training set being labeled as this category by the internal core laboratory. However, the neural network appears to have been able to correctly classify these beats, as demonstrated by the 100% sensitivity and the negative predictive value across damped and dampoid beats of 99.5%. This is likely in part due to the natural spectrum of changes between definitively damped beats and the much more common category of dampoid beats, from which the neural network has been able to learn features.
This neural network was trained using data derived from expert human core laboratories, and so the “gold standard” against which it has been assessed is human opinion, rather than an objective measure of failure to transduce aortic pressure accurately. However, there currently exists no system for this task, and so cardiologists must rely on their own decision-making and constant attention to avoid complications related to damping. This system could therefore become an invaluable safety and quality control tool that answers an unmet need in catheterization laboratories around the world.
This study demonstrates the first automated approach for identifying damping of arterial pressure waveforms during coronary angiography. This machine learning approach using a recurrent convolutional neural network was able to correctly identify all beats displaying definitive damping, with an overall accuracy above 99% when judged against the internal core laboratory. This has implications for both patient safety and diagnostics in cardiac catheterization laboratories worldwide.
WHAT IS KNOWN? Failure to identify damping of aortic pressure during coronary angiography can lead to inaccurate coronary physiology measurements and serious complications. Despite this, there are no automated methods for identifying damping, which makes the process prone to human error.
WHAT IS NEW? This study shows that artificial intelligence can be used to identify adverse aortic pressure waveforms with an accuracy comparable to that of expert humans.
WHAT IS NEXT? This study of over 5,000 aortic pressure waveforms will be followed by prospective clinical validation.
The authors thank Jeremy Walker and Thalamus AI for software support.
∗ Drs. Howard and Cook contributed equally to this work.
This work was supported by the Imperial College Healthcare NHS Trust Biomedical Research Centre. Drs. Cook, Al-Lamee, Sen, Nijjer, and van de Hoef have received speaker honoraria from Philips Volcano. Dr. Piek has received consultant and speaker fees from Abbott Vascular and Philips Volcano. Dr. Seligman has received research grant support from Amgen. Dr. van Royen has received research grant support from Abbott, Philips, and Biotronik; and has received honoraria from Medtronic, Microport, and Amgen. Dr. Escaned has received consulting and speaker fees from Philips Volcano, Boston Scientific, and Abbott/St. Jude Medical. Dr. Petraco has received research grant support from Amgen and Miracor; and has received consulting and speaker honoraria from Philips Volcano. Dr. van Lavieren is an employee of Philips. Dr. Davies holds patents pertaining to the iFR technology; and has served as a consultant for and received research grants from Philips Volcano. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.
- Abbreviations and Acronyms
- CI = confidence interval
- FFR = fractional flow reserve
- iFR = instantaneous wave-free ratio
- 2019 American College of Cardiology Foundation
- Mozaffarian D., Benjamin E.J., Go A.S., et al.
- Neumann F.-J., Sousa-Uva M., Ahlsson A., et al.
- Her A.-Y., Ann S.H., Singh G.B., Kim Y.H., Koo B.-K., Shin E.-S.
- Howard J.P., Fisher L., Shun-Shin M.J., et al.
- Rajpurkar P., Irvin J., Zhu K., et al.
- Zhang J., Gajjala S., Agrawal P., et al.
- Hannun A.Y., Rajpurkar P., Haghpanahi M., et al.
- Hong M.-K.
- Klein L.W., Korpu D.
- He K., Zhang X., Ren S., Sun J.
- Donahue J., Hendricks L.A., Rohrbach M., et al.
- Abadi M., Agarwal A., Barham P., et al.
- Chollet F.
- Kahraman Ay N.