(CCSP062) COMPARISON OF MACHINE LEARNING AND CONVENTIONAL STATISTICAL MODELING FOR PREDICTING READMISSION FOLLOWING ACUTE HEART FAILURE HOSPITALIZATION
Thursday, October 26, 2023
13:50 – 14:00 EST
Location: ePoster Screen 6
Disclosure(s):
Karem Abdul-Samad, BSc: No financial relationships to disclose
Background: Developing 30-day readmission prediction models with adequate discrimination for patients with heart failure has proven challenging. Emerging evidence suggests that machine learning (ML) algorithms may have an advantage over conventional statistical modelling (CSM). However, because few studies have reported calibration metrics for ML models, it remains unclear whether ML algorithms are truly superior to CSM. In this study, we developed ML models to predict 30-day cardiovascular and non-cardiovascular readmissions and compared their performance with that of models developed using conventional statistical methods.
Methods and Results: We studied patients aged >18 years diagnosed with heart failure and discharged alive from a hospital or emergency department between 2004 and 2007, selected by random sampling from the Enhanced Feedback for Effective Cardiac Treatment (EFFECT) and Emergency Department Heart Failure (EDHF) chart-review cohorts. We collected an extensive set of clinical and demographic factors for this cohort. The study sample was randomly divided into training (2/3) and validation (1/3) sets. The 30-day readmission prediction models were developed using Fine-Gray subdistribution hazard regression (treating death as a competing risk) and the ML algorithm Random Survival Forests. Models were evaluated in the validation set using discrimination and calibration metrics.
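For context, the Fine-Gray approach places a proportional-hazards structure on the subdistribution hazard of readmission (cause 1), which keeps patients who experience the competing event (death) in the risk set:

\[
\lambda_1(t \mid X) = \lambda_{10}(t)\exp\!\big(\beta^{\top} X\big),
\qquad
\lambda_1(t) = \lim_{\Delta t \to 0}\frac{\Pr\{t \le T \le t+\Delta t,\ \text{cause}=1 \mid T \ge t \ \text{or}\ (T < t \ \text{and}\ \text{cause} \ne 1)\}}{\Delta t}
\]

The sketch below illustrates the ML arm of such an analysis under stated assumptions: it uses scikit-survival's RandomSurvivalForest with a hypothetical analytic file, hypothetical column names, and illustrative hyperparameters, and, unlike the study's analysis, it censors death rather than modelling it as a competing risk, since this implementation handles a single event type.

```python
# Sketch: fit a Random Survival Forest for 30-day cardiovascular readmission
# and check discrimination on a held-out third of the cohort.
# Assumptions: a hypothetical DataFrame with numeric/encoded predictor columns,
# a `days_to_cv_readmit` time column, and a `cv_readmit` event indicator.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_censored
from sksurv.util import Surv

df = pd.read_csv("hf_cohort.csv")  # hypothetical analytic file
predictors = [c for c in df.columns
              if c not in ("cv_readmit", "days_to_cv_readmit")]

X = df[predictors].to_numpy(dtype=float)  # assumes predictors already numeric
y = Surv.from_arrays(event=df["cv_readmit"].astype(bool),
                     time=df["days_to_cv_readmit"])

# 2/3 training, 1/3 validation, mirroring the abstract's split
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=1/3, random_state=42)

rsf = RandomSurvivalForest(n_estimators=500, min_samples_leaf=15,
                           n_jobs=-1, random_state=42)
rsf.fit(X_tr, y_tr)

# Discrimination: Harrell's c-statistic on the validation set
risk_score = rsf.predict(X_va)
cindex = concordance_index_censored(y_va["event"], y_va["time"], risk_score)[0]
print(f"validation c-statistic: {cindex:.3f}")
```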
In a cohort of 10,919 patients, Random Survival Forests (c-statistic = 0.620) showed discrimination similar to the Fine-Gray competing-risk model (c-statistic = 0.621) for 30-day cardiovascular readmission. In contrast, for 30-day non-cardiovascular readmission, the Fine-Gray model (c-statistic = 0.641) slightly outperformed Random Survival Forests (c-statistic = 0.632). For both outcomes, Fine-Gray subdistribution hazard regression displayed better calibration than Random Survival Forests in plots comparing observed versus predicted risk across risk deciles (Panels A-D), with Random Survival Forests overestimating readmission risk across all deciles.
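Continuing the hypothetical Random Survival Forest sketch above, a decile calibration check of the kind described here can be approximated as follows; predicted 30-day risk is taken from the forest's survival curves, and observed risk is approximated by the crude proportion readmitted within 30 days (the study's calibration compared observed with predicted cumulative incidence, which properly accounts for competing death).

```python
# Sketch: calibration by deciles of predicted 30-day risk, continuing the
# hypothetical Random Survival Forest example above.
surv_fns = rsf.predict_survival_function(X_va)  # one step function per patient
# 30 days must lie within the observed follow-up range of the step functions
pred_risk_30 = np.array([1.0 - fn(30) for fn in surv_fns])

# Crude observed 30-day readmission indicator (ignores competing death,
# unlike the study's cumulative-incidence-based calibration)
observed_30 = (y_va["event"]) & (y_va["time"] <= 30)

deciles = pd.qcut(pred_risk_30, 10, labels=False, duplicates="drop")
calib = (pd.DataFrame({"decile": deciles,
                       "predicted": pred_risk_30,
                       "observed": observed_30.astype(float)})
         .groupby("decile")[["predicted", "observed"]].mean())
print(calib)  # predicted consistently above observed indicates over-estimation
```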
Conclusion: Fine-Gray subdistribution hazard regression showed discrimination comparable to the ML models for both outcomes but superior calibration. This study highlights the importance of reporting calibration metrics, in addition to discrimination, when evaluating prediction models generated using machine learning.