research employs a stacking model that combines
Support Vector Machine (SVM), Random Forest,
and XGBoost. Stacking is an ensemble learning
technique in which multiple base models are trained
separately and their predictions are combined by a
meta-model, thereby improving accuracy and robustness.
The method exploits the strengths of different
algorithms while compensating for their respective
weaknesses. Machine learning applications to
preterm birth prediction are still in early
development, and current research relies primarily
on clinical and demographic data rather than
physiological signals such as uterine contractions. This
study seeks to close this gap by evaluating machine
learning models trained on features extracted from
uterine contractions. The chosen algorithms offer
complementary strengths: SVM handles
high-dimensional data and nonlinear relationships
well through kernel functions; Random Forest
provides high accuracy and resistance to overfitting
by aggregating multiple decision trees; and XGBoost
provides feature-importance scores that clarify how
individual features contribute to predictions.
The study compares the models using standard
performance metrics: accuracy,
precision, recall, and F1-score. Among the standalone
models, Random Forest is the most
accurate, followed by SVM
and XGBoost. The stacking model further
improves prediction accuracy by fusing the
predictions of all three models, demonstrating the
value of ensemble learning for diagnostic
accuracy. The results suggest that machine
learning can enhance preterm birth prediction and
serve as a more refined tool to support healthcare
professionals in early diagnosis and intervention. With the
combination of machine learning and obstetric care,
this study adds to the body of evidence on preterm
birth prediction. The use of contraction-related
features offers new insight into patterns
of uterine activity, making risk assessment more
effective. Although machine learning has advanced
significantly, few studies apply such methods to
preterm birth prediction, especially using
contraction-related data. This study aims to close
that gap by comparing the performance of several
machine learning models for this purpose, with the
ultimate goal of improving maternal and neonatal
health.
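As an illustration of the stacking setup and evaluation metrics described above, the following Python sketch trains three base classifiers and a stacked ensemble with scikit-learn and reports accuracy, precision, recall, and F1-score for each. It is a minimal sketch under stated assumptions, not the pipeline used in this study: the data are synthetic (generated with make_classification as a stand-in for contraction features), GradientBoostingClassifier is swapped in for XGBoost so that only scikit-learn is required, and the logistic-regression meta-model, feature scaling, hyperparameters, and train/test split are illustrative choices.

```python
# Illustrative sketch only: synthetic data and assumed hyperparameters.
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: 400 samples, 12 features, binary label (1 = preterm).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def make_base_models():
    """Return fresh (name, estimator) pairs for the three base learners."""
    return [
        # Scaling is an assumed preprocessing step; SVMs are scale-sensitive.
        ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        # Stand-in for XGBoost so the sketch depends only on scikit-learn.
        ("gb", GradientBoostingClassifier(random_state=0)),
    ]


# Standalone models plus a stacked ensemble with an assumed meta-model.
models = dict(make_base_models())
models["stacking"] = StackingClassifier(
    estimators=make_base_models(),
    final_estimator=LogisticRegression(),
    cv=5,  # out-of-fold base predictions are used to train the meta-model
)

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary", zero_division=0
    )
    results[name] = {
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "precision": float(precision),
        "recall": float(recall),
        "f1": float(f1),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

On real data, the synthetic arrays X and y would be replaced by the extracted contraction features and the preterm/term labels. The scores printed for synthetic data carry no clinical meaning and should not be read as reproducing the results reported here.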
2 RELATED WORKS
Predicting preterm birth remains an open problem in
obstetric medicine, and multiple studies have sought
new ways to improve diagnostic accuracy.
With the growth of machine learning research,
the quality of prediction models and of
decision support for clinicians has improved.
Prior studies have extensively
explored various machine learning methods for the
prediction of high-risk pregnancy.
Liu et al. (2024) demonstrated the utility of a
machine learning model for predicting preterm birth
risk, incorporating clinical parameters
within a nomogram for improved accuracy. Their
research is in line with Xu, Zhang, and Zhang (2020),
who also demonstrated that hybrid machine
learning models incorporating electronic health
records could be especially effective. Likewise,
Goodwin, Maher, and Callaghan (2020) examined
predictive models based on electronic health record
data, further underscoring the importance of big
data in obstetric analytics.
Machine learning methods such as Support Vector
Machines (SVM), Random Forest, and XGBoost
have attracted considerable attention for their ability to
process high-dimensional data and model complex
relationships. Włodarczyk et al. (2021) conducted a
comprehensive examination of machine learning
techniques focused on predicting preterm birth,
highlighting the relevance of ensemble learning
approaches. This study extends these findings by
bringing contraction-based features into predictive
models, an area that the literature has addressed
only minimally. These features consist primarily
of changes in patterns or signals within and around
the uterus (Kavitha and Asha, 2024); the inclusion
of uterine contraction parameters has also been
suggested by Villar and Papageorghiou (2014).
Even though individual
machine learning models have proven useful,
ensemble methods such as stacking have been
shown to be more effective, especially at
improving predictive power. Combining
algorithms can enhance risk
assessment, as shown in a recent publication (Kavitha
and Asha, 2024), a finding largely supported by
the present study. The stacking model applied in this
study capitalizes on the advantages of SVM,
Random Forest, and XGBoost and provides a more
stable predictive model. Furthermore, the robustness
of hybrid SVM models for predicting preterm birth
was highlighted by Santoso and Wulandari (2018),