
quires extended processing times, making them less
ideal for real-time applications.
In this project, we employ the YOLOv8 model,
the latest iteration of the YOLO series, to advance
bone fracture detection in X-ray images. Our ap-
proach is designed to address two primary goals:
identifying the location of the fracture and quantify-
ing the fracture length, an aspect rarely explored in
existing research. By training the YOLOv8 model on
a diverse dataset, we aim to develop a robust, efficient
solution for fracture detection that can be deployed
across various healthcare settings, from well-resourced
hospitals to under-resourced clinics. We
further enhance model performance through data aug-
mentation techniques, optimizing the YOLOv8 algo-
rithm for pediatric wrist fractures.
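
To make the role of data augmentation in detection training concrete, the following is a minimal sketch of one common transform, a horizontal flip, applied to both an image and its bounding-box labels. It assumes labels in the normalized YOLO format (class, x-center, y-center, width, height); the helper names `hflip_image` and `hflip_boxes` and the toy image are hypothetical and are not taken from the pipeline used in this work.

```python
from typing import List, Tuple

# A YOLO-format label: (class_id, x_center, y_center, width, height),
# with all coordinates normalized to [0, 1] relative to the image size.
YoloBox = Tuple[int, float, float, float, float]


def hflip_image(image: List[List[int]]) -> List[List[int]]:
    """Mirror a grayscale image (rows of pixel values) left to right."""
    return [row[::-1] for row in image]


def hflip_boxes(boxes: List[YoloBox]) -> List[YoloBox]:
    """Mirror normalized boxes: only the x center changes (x -> 1 - x)."""
    return [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in boxes]


# Toy 2x4 "X-ray" with one bright pixel standing in for a fracture.
image = [[0, 0, 0, 9],
         [0, 0, 0, 0]]
# Hypothetical label: a box around the bright pixel in the top-right corner.
boxes = [(0, 0.875, 0.25, 0.25, 0.5)]

flipped_image = hflip_image(image)
flipped_boxes = hflip_boxes(boxes)
```

The point of the sketch is that geometric augmentations must transform the annotations together with the pixels; otherwise the flipped image would be paired with a box on the wrong side.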
Through experimental comparison, we assess the
YOLOv8 model’s performance against YOLOv7
and its improved variants, using mean average precision
at an IoU threshold of 0.5 (mAP 50) as the evaluation metric. Our findings
demonstrate that YOLOv8, when trained with tailored
data augmentation strategies, achieves the highest
mAP 50 score, underscoring its efficacy in accurately
detecting and quantifying fractures. This project, by
automating fracture detection and measurement, has
the potential to alleviate radiologists’ workloads,
ensure consistent diagnostic outcomes, and improve
patient care across diverse healthcare settings,
particularly in areas where access to radiology expertise is
limited.
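
As an illustration of the evaluation metric, the sketch below computes average precision for a single class at an IoU threshold of 0.5; mAP 50 is the mean of this value over all classes. It is a simplified version under stated assumptions: boxes are in corner format (x1, y1, x2, y2), predictions are greedily matched to ground truth in order of confidence, and the area under the precision-recall curve uses all-point interpolation. Library implementations differ in these details, and the example boxes are invented for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corner format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def average_precision_50(preds: List[Tuple[float, Box]],
                         gts: List[Box]) -> float:
    """AP for one class at IoU >= 0.5; preds are (confidence, box) pairs."""
    if not gts:
        return 0.0
    matched = set()
    tp_flags = []
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(gts):
            if j in matched:
                continue  # each ground-truth box can be matched only once
            v = iou(box, gt)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_iou >= 0.5:
            matched.add(best_j)
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    # Running precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing from right to left (interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap


# Invented example: two ground-truth fractures, three predictions.
gts = [(10, 10, 50, 50), (60, 60, 100, 100)]
preds = [(0.9, (12, 12, 50, 50)),      # good overlap with the first box
         (0.8, (200, 200, 240, 240)),  # no overlap: false positive
         (0.6, (60, 60, 100, 100))]    # exact match with the second box
ap50 = average_precision_50(preds, gts)
```

In this toy case the false positive ranked second lowers the precision at full recall, so the AP falls below 1 even though both fractures are eventually found.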
2 RELATED WORKS
The application of deep learning to medical image
analysis has grown significantly, and recent work
focuses on improving the accuracy and reliability of
bone fracture detection and quantification in X-ray
images. Much of this earlier research serves as a
foundation for modern approaches, including the
method presented in this study, which aims to deliver
enhanced diagnostic performance.
(A. Saad, 2023) developed a convolutional neural
network (CNN) using Keras to detect fractures in
X-ray images. The model was trained on a dataset of
9,103 X-ray images, with augmentation applied to
improve the diversity and robustness of the training
data. The CNN achieved an accuracy of 91%, with
precision and recall of 89.5% and 87%, respectively,
largely due to the data augmentation. While this
accuracy places it above several other methods, the
study notes a risk of false positives and suggests
further refinement before the model is suitable for
clinical settings (Kalb and Harris, 2021). The dataset
consisted of X-rays classified as fractured and
non-fractured, enhanced through augmentation
techniques. The results are promising, with the model
showing high accuracy; however, the research
highlights the need for comparisons with other models
to ensure consistency and reduce the rate of false positives.
(Zou and Arshad, 2022) explored the performance
of YOLO variants and two-stage models for fracture
detection, emphasizing Enhanced Intersection over
Union (EIoU) to improve bounding box precision.
The study found that the YOLOv7-ATT model
achieved a mean average precision (mAP) of 80.2%
and 86.2% on the FracAtlas dataset, outperforming
other models in terms of precision and recall
(M. Salimi, 2022). While YOLOv7-ATT stood out,
the research also revealed that the two-stage models
and SSD performed suboptimally, and that further
enhancements are needed to improve accuracy
(T. Gruber, 2022). The dataset included annotated
images representing four types of fractures, and the
models were evaluated using precision, recall, mAP,
and IoU. Overall, the YOLOv7-ATT results suggest
that single-stage models generally surpass two-stage
models in terms of both speed and detection accuracy.
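
To show how an EIoU-style loss refines bounding boxes, the sketch below implements the formulation commonly published under the name EIoU: one minus the IoU, plus penalties for the center distance and for the width and height differences, each normalized by the smallest enclosing box. Whether the cited study uses exactly this form is an assumption, and the function name `eiou_loss` and the `eps` stabilizer are illustrative choices.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corner format


def eiou_loss(pred: Box, target: Box, eps: float = 1e-9) -> float:
    """EIoU-style loss: 1 - IoU + center, width, and height penalties."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU term.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers.
    rho2 = (((px1 + px2) - (tx1 + tx2)) / 2.0) ** 2 \
         + (((py1 + py2) - (ty1 + ty2)) / 2.0) ** 2

    center_term = rho2 / (cw ** 2 + ch ** 2 + eps)
    width_term = (pw - tw) ** 2 / (cw ** 2 + eps)
    height_term = (ph - th) ** 2 / (ch ** 2 + eps)
    return 1.0 - iou + center_term + width_term + height_term


# A perfect prediction gives a loss near zero; a disjoint one is
# penalized beyond the plain 1 - IoU = 1 by the center-distance term.
loss_same = eiou_loss((0, 0, 1, 1), (0, 0, 1, 1))
loss_far = eiou_loss((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike a plain IoU loss, which is constant for all non-overlapping boxes, the extra terms still change as a distant or mis-sized prediction moves toward the target.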
(J. Li, 2021) employed DenseNet-201, a deep
learning model trained on 1,370 X-ray images, with
preprocessing and data augmentation applied to
enhance its accuracy. Performance was measured by
accuracy, sensitivity, specificity, and AUC: the model
achieved 94.1% accuracy and an AUC of 98.7%, with
sensitivity at 93.2% and specificity at 94.8%
(M. Oppenheimer, 2021). However, the study notes
that further clinical validation is necessary to ensure
its reliability for widespread clinical use. The dataset
focused on pediatric elbow fractures, providing a
specialized area for evaluation. DenseNet-201’s
promising results indicate high diagnostic potential,
especially for pediatric fractures, though broader
testing is recommended.
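
The classification metrics quoted in these studies all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts in the example are invented for illustration and do not reproduce any cited result, and the helper names are hypothetical.

```python
from typing import Dict


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics for a fractured (positive) vs. normal classifier."""
    return {
        # Share of all images classified correctly.
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        # Sensitivity (recall): share of true fractures that were found.
        "sensitivity": safe_div(tp, tp + fn),
        # Specificity: share of normal images correctly left unflagged.
        "specificity": safe_div(tn, tn + fp),
        # Precision: share of flagged images that really are fractures.
        "precision": safe_div(tp, tp + fp),
    }


# Invented counts: 100 fractured and 100 normal test images.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

A false-positive risk of the kind noted above appears as reduced specificity and precision, which accuracy alone can hide when the classes are imbalanced.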
(Riska, 2022) investigated a Decision Tree
classifier on 4,083 X-ray images, using Canny edge
detection and Hu Moments for feature extraction.
The model was validated through 5-fold
cross-validation, achieving a moderate accuracy range
of 69.89% to 74.05%, with balanced performance
metrics across evaluations (R. Hruby, 2023). Although
the classifier provided a reliable baseline for fracture
detection, the study highlights variability in
performance, suggesting that more advanced algorithms
and optimization are required to enhance accuracy.
The dataset used consisted of labeled X-ray images
Enhanced Bone Fracture Detection and Quantification in X-Ray Images Using Deep Learning