Breast Imaging Fellow, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts
Purpose: Patient eligibility for high-risk screening MRI is typically assessed with traditional risk assessment scores. However, emerging deep learning (DL) risk stratification models show promise, with recent studies demonstrating that DL-based models outperform traditional risk stratification methods. The objective of this study is to measure the impact of a DL risk assessment model in supporting more effective supplemental screening with MRI for patients at increased risk of developing breast cancer.
Materials and Methods: This retrospective, multisite study included consecutive patients, >40 years of age, undergoing high-risk breast cancer screening MRI from 9/18/2017 to 9/17/2020 at four facilities. Risk was assessed with the Tyrer-Cuzick version 8 (TC) and NCI Breast Cancer Risk Assessment Tool (NCI) 5-year and lifetime models and with a DL 5-year model. Increased risk was defined as >1.67% for the traditional 5-year models and >3.1 for the DL model; high risk was defined as >20% for the traditional lifetime models and >6.9 for the DL model. Patient demographics were retrieved from electronic medical records. Cancer outcomes were determined through linkage to a regional tumor registry. We compared risk model performance using Pearson’s chi-squared tests.
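The threshold definitions above can be sketched as a small classification helper. This is a minimal illustration only: the function names and dictionary keys are hypothetical, the cut-offs are the ones stated in the abstract, and the comparisons are assumed to be strict (">") as written.

```python
# Cut-offs as stated in the abstract; everything else here is illustrative.
TRAD_5YR_INCREASED = 1.67   # traditional 5-year risk, percent
TRAD_LIFETIME_HIGH = 20.0   # traditional lifetime risk, percent
DL_INCREASED = 3.1          # DL 5-year model score
DL_HIGH = 6.9               # DL 5-year model score


def classify_traditional(five_year_pct: float, lifetime_pct: float) -> dict:
    """Apply the traditional-model thresholds (TC or NCI) to one patient."""
    return {
        "increased_risk": five_year_pct > TRAD_5YR_INCREASED,
        "high_risk": lifetime_pct > TRAD_LIFETIME_HIGH,
    }


def classify_dl(dl_score: float) -> dict:
    """Apply the DL-model thresholds to one patient."""
    return {
        "increased_risk": dl_score > DL_INCREASED,
        "high_risk": dl_score > DL_HIGH,
    }


if __name__ == "__main__":
    # Hypothetical patient values, not taken from the study data.
    print(classify_traditional(five_year_pct=1.9, lifetime_pct=15.0))
    print(classify_dl(dl_score=7.2))
```

Under these definitions, a patient can meet the increased-risk criterion without meeting the high-risk criterion, and for the DL model every high-risk patient is also increased-risk because 6.9 > 3.1.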
Results: A total of 6153 patients underwent high-risk screening MRI during the study period. Median patient age was 54 years (IQR: 48-60 years). Of these patients, 91.8% (5535/6027) were White, non-Hispanic and 8.2% (492/6027) were patients of color/Hispanic. Increased-risk criteria were met by 33.4% (1434/4296) of patients by the DL model, 77.2% (1868/2419) by TC 5-year, and 75.3% (1574/2090) by NCI 5-year. High-risk criteria were met by 13.9% (597/4296) by the DL model, 49.7% (1203/2419) by TC lifetime, and 34.8% (760/2187) by NCI lifetime. Cancers detected per thousand women screened were higher in patients at increased risk (28.5; 95% CI: 17.9, 45.1) or high risk (20.2; 95% CI: 14.1, 28.9) by the DL model than by the TC (8.6; 95% CI: 5.3, 13.9) and NCI (6.4; 95% CI: 3.5, 11.7) 5-year models and the TC (7.5; 95% CI: 3.9, 14.2) and NCI (7.9; 95% CI: 3.6, 17.1) lifetime models (p < 0.01). Positive predictive values (PPV1, PPV2, and PPV3) were significantly higher for the DL model than for the traditional models (p < 0.05). There was no evidence of differences in sensitivity or specificity between the DL and traditional models (p > 0.05).
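The reported percentages follow directly from the counts given in the Results. The short sketch below recomputes them; the helper names `pct` and `per_thousand` are hypothetical, and the cancer counts needed to reproduce the per-thousand rates and their confidence intervals are not reported in the abstract, so `per_thousand` only shows the unit conversion.

```python
def pct(numerator: int, denominator: int) -> float:
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


def per_thousand(cancers: int, screened: int) -> float:
    """Cancer detection rate per 1000 patients screened."""
    return 1000 * cancers / screened


# Counts as reported in the Results (numerator, denominator).
reported = {
    "DL increased-risk": (1434, 4296),
    "TC 5-year increased-risk": (1868, 2419),
    "NCI 5-year increased-risk": (1574, 2090),
    "DL high-risk": (597, 4296),
    "TC lifetime high-risk": (1203, 2419),
    "NCI lifetime high-risk": (760, 2187),
}

if __name__ == "__main__":
    for label, (num, den) in reported.items():
        print(f"{label}: {pct(num, den)}% ({num}/{den})")
```

Note that the denominators differ between models (4296, 2419, 2090, 2187), so each percentage describes the subset of patients for whom that model's score was available.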
Conclusion: Improved risk assessment strategies are necessary to address the performance of breast MRI in women selected for screening by traditional methods. A DL model can effectively identify increased/high-risk patients who may benefit from supplemental screening with MRI.
Clinical Relevance Statement: Traditional risk assessment models do not adequately identify patients who benefit from screening MRI. A DL model may provide a more robust method than traditional models for identifying those who may benefit from screening MRI.