An interpretable prediction model based on CT radiomics, deep learning, and clinical factors for viral pneumonia in patients with hematologic malignancies: Development and validation study
Abstract
Background: Viral pneumonia (VP) is a common complication in patients with hematologic malignancies (HM). This study aimed to construct and validate an interpretable prognostic prediction model for VP using machine learning (ML) techniques.
Methods: A total of 189 patients were randomly assigned to a training set (n = 133) and a test set (n = 56) at a ratio of 7:3. Independent clinical risk factors were identified using univariate analysis followed by multivariate logistic regression analysis. Radiomics features were extracted from CT images and then selected by LASSO regression. Deep learning (DL) features were extracted and screened using each of six pre-trained convolutional neural networks (CNNs). Subsequently, the radiomics score (Radscore), the DLscore, and the Rad-DLscore were calculated using a logistic regression (LR) classifier. A combined model integrating the independent clinical risk factors, the Radscore, and the DLscore was also developed. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to compare the predictive performance of the models. The SHapley Additive exPlanations (SHAP) algorithm was used to interpret the optimal prediction model. Finally, building on the optimal LR-based model, four additional ML classifiers were constructed to evaluate their predictive value for VP.
Results: Three clinical factors, five radiomics features, and eight DL features were selected for ML model construction. The combined model performed best, with an AUC of 0.928 in the training set and 0.897 in the test set. In the combined model, glucocorticoid use was the feature with the greatest impact on the model's predictions. The SHAP force plot visualized the direction and magnitude of each feature's influence on the model's predictions. When trained with multiple ML classifiers, the combined model showed good performance in predicting VP, with the support vector machine (SVM) achieving the highest test-set AUC of 0.914.
Conclusion: The interpretable ML model combining CT radiomics, deep learning, and clinical factors showed strong performance in predicting VP risk in patients with HM. The SHAP method enhanced the interpretability of the ML models, helping clinicians better understand the reasons behind the predictions.
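The evaluation steps named in the abstract (a 7:3 random split, a logistic-regression-style score, and comparison by AUC) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `split_7_3`, `combined_probability`, and `roc_auc` are hypothetical, the toy labels and scores are invented, and rounding the test-set size down is an assumption chosen only because it reproduces the reported 133/56 split.

```python
import math
import random


def split_7_3(n_patients, seed=0):
    """Shuffle patient indices and split them roughly 7:3 into train/test."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    # Assumption: the test-set size is rounded down (189 -> 56 test, 133 train).
    n_test = int(n_patients * 0.3)
    return indices[n_test:], indices[:n_test]


def combined_probability(features, weights, intercept):
    """Logistic-regression-style probability from a weighted feature sum.

    The weights and intercept are hypothetical; the abstract does not report them.
    """
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return correct / (len(pos) * len(neg))


train_idx, test_idx = split_7_3(189)
print(len(train_idx), len(test_idx))  # 133 56

# Toy labels and scores, invented purely for illustration.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

In practice, an established library such as scikit-learn (for example `LogisticRegression` and `roc_auc_score`) would normally replace these hand-written helpers. The pairwise AUC above is only meant to make the metric's definition concrete.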
Citation Information
@article{haiminglyu2026,
title={An interpretable prediction model based on CT radiomics, deep learning, and clinical factors for viral pneumonia in patients with hematologic malignancies: Development and validation study},
author={Haiming Lyu and Yingying Shi and Wenhui Nian and Kaixiang Zhang and Weijin Su},
journal={Scientific Reports},
year={2026},
doi={https://doi.org/10.21203/rs.3.rs-9157202/v1}
}