Introduction: We developed a classification model based on clustered deep learning features of hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) to predict the tumor mutation burden (TMB) level of patients with clear cell renal cell carcinoma (ccRCC). Because TMB has potential prognostic value, we further analyzed the relationship between the resulting deep TMB signature and patient survival outcomes.
Methods: 1. Patient cohorts and TMB threshold definition This study included two independent cohorts. The in-house discovery cohort consisted of 264 ccRCC patients with 295 WSIs; the matched data came from next-generation sequencing (NGS) tests performed for clinical purposes. The external validation cohort consisted of 302 ccRCC patients with 304 WSIs from The Cancer Genome Atlas (TCGA) data portal. We framed the task as a binary classification of high versus low TMB, and used segmented regression (“broken-stick analysis”) to define the threshold in the two independent cohorts.
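The broken-stick idea can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the abstract does not say which variables were regressed, so the sketch assumes the sorted TMB values are fitted against their rank with two independent least-squares lines, and that the first value of the second segment is taken as the threshold. The function names and the `min_segment` parameter are hypothetical.

```python
def _line_sse(xs, ys):
    """Sum of squared residuals of an ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def broken_stick_threshold(tmb_values, min_segment=3):
    """Return the TMB value that starts the second segment of the best two-line fit.

    Assumption (not from the source): TMB is sorted ascending and regressed on rank.
    """
    ys = sorted(tmb_values)
    xs = list(range(len(ys)))
    if len(ys) < 2 * min_segment:
        raise ValueError("not enough samples for two segments")
    best_k, best_sse = None, float("inf")
    # Try every admissible split and keep the one with the lowest total error.
    for k in range(min_segment, len(ys) - min_segment + 1):
        sse = _line_sse(xs[:k], ys[:k]) + _line_sse(xs[k:], ys[k:])
        if sse < best_sse:
            best_k, best_sse = k, sse
    return ys[best_k]
```

Under this sketch, a patient whose TMB is at or above the returned value would be labeled high-TMB. A continuity-constrained segmented regression, as implemented in dedicated statistical packages, may place the breakpoint differently.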
2. WSI processing In this study, we focused on the tumor regions of the WSIs to avoid interference from redundant information.
3. Deep learning features-based patient-level TMB prediction We used a ResNet50 model pre-trained on ImageNet to extract a 1,024-dimensional feature vector from each tile. Based on the top features (n=64), logistic regression with L2 regularization was applied to construct the deep TMB score for each patient. Hyperparameters were selected through 5-fold cross-validation on the training cohort.
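The scoring step can be illustrated with a small, dependency-free sketch of L2-regularized logistic regression whose penalty strength is chosen by 5-fold cross-validation. It is not the authors' implementation: the optimizer (batch gradient descent), the selection criterion (mean held-out log-loss), and all function names are assumptions made for illustration. Feature extraction and the selection of the top 64 features are not reproduced here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_l2_logistic(X, y, lam, lr=0.1, epochs=500):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by batch gradient descent.

    The bias term is left unpenalized. Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def score(model, x):
    """Predicted probability of the positive (high-TMB) class for one feature vector."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def log_loss(model, X, y):
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(score(model, xi), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def select_lambda(X, y, lambdas, k=5, seed=0):
    """Pick the penalty with the lowest mean held-out log-loss over k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    best_lam, best_loss = None, float("inf")
    for lam in lambdas:
        losses = []
        for fold in folds:
            held = set(fold)
            X_tr = [X[i] for i in idx if i not in held]
            y_tr = [y[i] for i in idx if i not in held]
            model = fit_l2_logistic(X_tr, y_tr, lam)
            losses.append(log_loss(model, [X[i] for i in fold], [y[i] for i in fold]))
        mean_loss = sum(losses) / k
        if mean_loss < best_loss:
            best_lam, best_loss = lam, mean_loss
    return best_lam
```

In this sketch, each row of `X` would be one patient's selected feature vector and `score` would play the role of the deep TMB score. In practice an established library implementation would normally be preferred over hand-written gradient descent.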
4. Statistical analysis We used Python (version 3.7.3) for data processing and model training, and used R software (version 3.6.1) for statistical analyses.
Results: 1. Patient clinical characteristics and TMB distribution We did not divide patients into high- and low-TMB groups by quantile. In the two cohorts, 44 (16.7%) and 40 (13.2%) patients, respectively, were assigned to the high-TMB group.
2. Deep learning features in WSI can be used to predict ccRCC TMB In total, we obtained 1,104,028 tiles in the discovery cohort and 1,449,222 tiles in the external validation cohort. The AUCs of the model in the training, internal validation, and external validation cohorts were 0.797, 0.813, and 0.655, respectively.
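For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen high-TMB patient receives a higher score than a randomly chosen low-TMB patient. The sketch below computes it by direct pairwise comparison; the function name and the toy inputs are illustrative and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half a win)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; rank-based formulas give the same value more efficiently.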
3. Prognostic value of TMB signatures Kaplan-Meier (K-M) survival analysis showed that the overall survival (OS) of patients in the high-TMB groups was significantly lower than that of patients in the low-TMB groups (P = 0.0067, log-rank test).
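The log-rank comparison behind such a P value can be sketched as below. This is a generic two-group log-rank test written for illustration, not the R code used in the study; the function name and argument layout are assumptions. Event indicators are 1 for an observed death and 0 for censoring.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value) with 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events observed exactly at time t, per group.
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group-A event count at time t.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("test is undefined: no informative event times")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Calling `logrank_test` with the follow-up times and event indicators of the high-TMB and low-TMB groups would return the statistic and its P value under the null hypothesis of identical survival curves.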
Conclusions: We constructed a TMB-level prediction model based on deep learning features of WSIs and obtained a patient-level deep TMB signature. The model had good classification performance and was validated on the internal and external validation cohorts. The deep TMB signature maintained good prognostic risk stratification performance in the advanced and metastatic ccRCC patient population, which can provide further support for the formulation of treatment plans.
Source of Funding: National Natural Science Foundation of China