Article 2026-04-23 posted v1

Evaluating the predictive power of machine learning in preeclampsia

Fatemeh Abdi Iran University of Medical Sciences
Nasibeh Roozbeh Hormozgan University of Medical Sciences
Fatemeh Darsareh Hormozgan University of Medical Sciences
Vahid Mehrnoush Hormozgan University of Medical Sciences
Ali Haghighat Shiraz University
Anna Nami Shiraz University
Farideh Montazeri Hormozgan University of Medical Sciences
Mojdeh Banaei Hormozgan University of Medical Sciences

Abstract

Background Given that machine learning is one of the most effective approaches for identifying disease risk factors, this study aimed to identify predictors of preeclampsia using machine learning.

Methods This retrospective study evaluated two years of data (2020–2022) retrieved from the electronic health records of a tertiary care medical center in Bandar Abbas, Iran. The dataset comprised 36 features covering demographic and medical characteristics of 8,888 patients who gave birth at our center during the study period. The target variable categorized women into two groups: “preeclamptic” and “non-preeclamptic”. The input data were fed into 12 machine learning models. Model performance was evaluated using the area under the curve (AUC), accuracy, precision, recall, F1 score, Brier score, and area under the precision-recall curve (PR-AUC).

Results The incidence of preeclampsia was 6.5%. Because the data were imbalanced, the Synthetic Minority Over-sampling Technique (SMOTE) was applied before running the models. Random Forest was the top-performing model across multiple measures, with an accuracy of 0.993, an AUC of 0.972, a Brier score of 0.016, and a PR-AUC of 0.935, reflecting the lowest error among the models and indicating superior accuracy and dependability in predicting preeclampsia. Previous preeclampsia, gestational age, parity, maternal education, thyroid dysfunction, place of residency, diabetes, iron deficiency anemia, newborn sex, and maternal age were the most significant features in predicting preeclampsia. Previous preeclampsia was by far the most important predictor.

Conclusions After balancing the data with SMOTE, the Random Forest model was the top performer across multiple measures, showing superior accuracy and dependability in predicting preeclampsia. Previous preeclampsia was the most important predictor.
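The evaluation metrics named in the abstract behave very differently when only 6.5% of cases are positive. The following is a minimal sketch, not the authors' code, that computes accuracy, precision, recall, F1 score, and Brier score for a trivial baseline. The toy cohort of 1,000 patients and the helper names `brier_score` and `confusion_metrics` are assumptions made for illustration; only the 6.5% incidence comes from the abstract.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def confusion_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 score from hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy cohort: 1,000 patients, 65 positives (6.5% incidence).
y_true = [1] * 65 + [0] * 935

# Baseline 1: always predict "non-preeclamptic".
baseline = confusion_metrics(y_true, [0] * len(y_true))

# Baseline 2: always predict the base rate as the probability.
baseline_brier = brier_score(y_true, [0.065] * len(y_true))

print(baseline)        # accuracy is 0.935 while recall and F1 are 0.0
print(baseline_brier)  # approximately 0.0608
```

A classifier that never predicts preeclampsia already reaches 0.935 accuracy on this toy cohort while detecting no cases, which is why recall, F1 score, and PR-AUC are reported alongside accuracy for imbalanced data.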

Citation Information

@article{fatemehabdi2026,
  title={Evaluating the predictive power of machine learning in preeclampsia},
  author={Fatemeh Abdi and Nasibeh Roozbeh and Fatemeh Darsareh and Vahid Mehrnoush and Ali Haghighat and Anna Nami and Farideh Montazeri and Mojdeh Banaei},
  journal={Research Square},
  year={2026},
  doi={https://doi.org/10.21203/rs.3.rs-9292292/v1}
}