Hybrid VAE-XGBoost Framework for Efficient Classification of Diabetic Foot Ulcer Images

N. Nagarani, Gokul Priyan G. V., Sivanesan R. and Sukha Dev A.

Velammal College of Engineering and Technology, Madurai, Tamil Nadu, India

Keywords: Diabetic Foot Ulcer, Variational Autoencoder, XGBoost, Feature Extraction, Classification Accuracy.

Abstract: Diabetic Foot Ulcer (DFU) classification is critical for early diagnosis and treatment planning, with a view to minimizing complications. In this article, a new hybrid model is developed that uses a Variational Autoencoder (VAE) for feature extraction and XGBoost for classification, with the aim of improving accuracy and efficiency in the classification of DFU images. The VAE learns a low-dimensional, discriminative feature representation of ulcer images, encoding significant structures and textures while reducing dimensionality. The features extracted by the VAE are then fed into an optimized XGBoost classifier, which improves decision-making through gradient-boosted trees. The proposed model is evaluated on a benchmark DFU dataset and compared with traditional deep networks, showing considerable improvement in accuracy, precision, recall, and F1-score. Experimental observations indicate that combining VAE-based unsupervised feature extraction with XGBoost classification improves robustness and generalizability. This hybrid approach provides an efficient and interpretable model for computerized DFU classification, supporting clinicians in early and accurate diagnosis.

1 INTRODUCTION

Diabetic Foot Ulcer (DFU) is one of the most severe complications of diabetes mellitus and affects millions of patients worldwide. Approximately 15% of diabetes patients develop foot ulcers, which, if untreated, can become infected, result in gangrene, and lead to the loss of a limb. The presence of a DFU is associated with a high risk of hospitalization and even death; early detection and proper grading are therefore critical for effective therapy and for averting grave complications. Traditional DFU diagnosis consists of clinical examination, estimation of wound depth, and imaging modalities such as infrared thermography and Doppler ultrasound. Grading scales such as the Wagner Classification System and the University of Texas Wound Classification have been used to estimate ulcer severity, but such techniques are subjective, time-consuming, and prone to inter-observer variation, which makes computerized grading tools a necessity.

In recent years, Machine Learning (ML) and Deep Learning (DL) methods have become powerful tools for DFU image classification, offering higher accuracy and efficiency than conventional manual methods. Methods such as Support Vector Machines (SVM), Random Forest, and XGBoost have been adopted for DFU classification using hand-designed features such as texture, color, and shape descriptors. However, such methods are restricted by the need for feature engineering, a process that sometimes fails to capture complex visual structures in DFUs. Deep Convolutional Neural Networks (CNNs) such as ResNet, VGG, and EfficientNet have been seen to outperform them through their ability to learn discriminative features automatically, directly from raw DFU images. Despite this success, deep networks require large amounts of labelled data, are computationally expensive, and suffer from overfitting, particularly on small, unbalanced medical datasets.

To overcome such challenges, this work proposes a Hybrid VAE-XGBoost model that leverages a Variational Autoencoder (VAE) for unsupervised feature extraction and XGBoost for efficient classification of DFU images. The VAE is used to learn a concise, reduced-dimensional abstraction of DFU images that retains important structure and texture information. The resulting low-dimensional representations are then used to train an optimized XGBoost classifier, known for its efficiency in processing structured data and for producing strong predictions with less overfitting. By fusing deep feature extraction with a high-performance gradient-boosted decision tree classifier, the proposed model achieves higher accuracy, better generalizability, and lower computational cost than standalone deep learning and traditional ML models.

Nagarani, N., V., G. P. G., R., S. and A., S. D.
Hybrid VAE-XGBoost Framework for Efficient Classification of Diabetic Foot Ulcer Images.
DOI: 10.5220/0013895500004919
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies (ICRDICCT‘25 2025) - Volume 3, pages 209-214
ISBN: 978-989-758-777-1

The primary contributions of this study include: (1) Proposing a VAE-XGBoost model that achieves high accuracy and efficiency in the classification of DFU images. (2) Investigating latent-space representations to enhance DFU features while minimizing reliance on large datasets. (3) Conducting a comparison with leading deep neural networks in terms of accuracy, precision, recall, and F1-score. (4) Presenting a lightweight and transparent model suitable for real-world medical applications, facilitating early and computerized diagnosis of DFU for healthcare professionals.

The rest of this paper is organized as follows: Section 2 reviews related research in DFU classification, Section 3 details the proposed method, Section 4 presents experimental results and comparisons, Section 5 discusses the findings, and Section 6 concludes with suggestions for future research and final thoughts.

2 LITERATURE SURVEY

A. Huong et al. present an automated optimization technique for finding an ideal solution to a problem. Particle Swarm Optimization (PSO) was used to overcome a disadvantage of a conventional technique and to improve the training of a neural network for diabetic foot ulcer (DFU) applications. The system forms a platform for technology adaptability in DFU management and can also act as a decision-support tool for optimizing limb salvage and healing processes.

A. Huong et al. use a two-dimensional image as the basis and a collection of neural networks for image processing. The system can label an image with one of four categories: infection, ischaemia, both, or none.

X. Wu et al. developed a flexible model for creating an efficient augmentation pool for DFU medical images. In addition, they use ensemble learning to enhance model performance. Unlike conventional plurality voting, they present a scheme called "voting with expertise", which is biased towards predictions with reasonably sound values. Experimental testing confirmed the efficacy of the proposed techniques, and the integration of these two enhancements secured second rank in the DFUC2021 Challenge.

3 PROPOSED SYSTEM

The Hybrid VAE-XGBoost system is designed to classify DFU images more effectively by combining a deep learning-based Variational Autoencoder (VAE) for feature extraction with XGBoost for classification. Deep learning algorithms such as CNNs are known to require substantial computational power and large training sets, while traditional machine learning algorithms depend on hand-crafted feature extraction that does not perform well in every context. To address these issues, the system uses a VAE to obtain meaningful DFU image representations in the latent space. The encoder of the VAE compresses the input image x into a latent variable z following a Gaussian distribution:

$$q(z \mid x) = \mathcal{N}\left(z;\, \mu,\, \sigma^2\right) \qquad (1)$$

where μ and σ² represent the learned mean and variance. The decoder then reconstructs the original image from z, ensuring the preservation of crucial visual information. The loss function of the VAE consists of two components: the reconstruction loss (L_rec), which minimizes the difference between the input image x and the reconstructed image x̂, given by:

$$L_{rec} = \lVert x - \hat{x} \rVert^2 \qquad (2)$$

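and the Kullback-Leibler (KL) divergence loss L_KL, which ensures that the learned distribution remains close to a standard normal distribution (the sum runs over the latent dimensions j):

$$L_{KL} = -\frac{1}{2} \sum_{j} \left(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\right), \qquad L_{VAE} = L_{rec} + \beta\, L_{KL} \qquad (3)$$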

where β balances regularization of the latent space against reconstruction performance. Feature extraction is followed by a feature selection step, using Principal Component Analysis (PCA) or statistics-based importance measures, which removes redundant information while retaining only the most discriminative features.
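As a rough illustration of how the two VAE loss terms described above combine, the following sketch computes a squared-error reconstruction term, the closed-form KL term for a diagonal Gaussian, and their β-weighted sum for a single toy example. The function names, the toy values, and the choice of squared error as the reconstruction measure are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math


def reconstruction_loss(x, x_hat):
    """Sum of squared differences between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def kl_divergence(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions.

    log_var holds log(sigma^2) for each latent dimension.
    """
    return -0.5 * sum(1.0 + lv - m ** 2 - math.exp(lv) for m, lv in zip(mu, log_var))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Total loss: reconstruction term plus beta-weighted KL term."""
    return reconstruction_loss(x, x_hat) + beta * kl_divergence(mu, log_var)


# Toy example with hypothetical values (3 "pixels", 2 latent dimensions).
x = [0.2, 0.8, 0.5]
x_hat = [0.25, 0.7, 0.5]
mu = [0.1, -0.3]
log_var = [0.0, -0.2]
print(vae_loss(x, x_hat, mu, log_var, beta=0.5))
```

A larger β pulls the latent distribution more strongly towards the standard normal prior, at the cost of reconstruction fidelity.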


The selected features are then passed to the XGBoost classifier, which performs classification using gradient-boosted decision trees. XGBoost minimizes an objective function that consists of a loss term and a regularization term:

$$Obj = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega\left(f_k\right) \qquad (4)$$

where l is the training loss between the true label y_i and the prediction ŷ_i, and Ω penalizes the complexity of each tree f_k in the ensemble.

By combining deep feature learning from the VAE with the structured decision-making capability of XGBoost, the proposed system achieves higher classification accuracy, reduced computational cost, and better generalization compared to standalone deep learning models. This Hybrid VAE-XGBoost framework is particularly suited to small medical datasets, as it leverages unsupervised learning to extract robust representations and gradient boosting to make predictions. The model performance is evaluated using accuracy (reported as Success Rate, %), precision (Exactness, %), recall (Sensitivity, %), and F1-measure (%), to assess its reliability in real-world clinical applications.

Figure 1: Proposed System Modules.

The proposed Hybrid VAE-XGBoost framework is structured into multiple modules, ensuring an efficient workflow from data acquisition to classification and evaluation. Each module plays a vital role in improving the accuracy and generalization of the model for Diabetic Foot Ulcer (DFU) image classification. Figure 1 shows the proposed system modules. The key modules are as follows:

3.1 Data Acquisition and Pre-Processing

This module involves collecting DFU images from

publicly available datasets or hospital-based

repositories. Since medical images often contain

noise, illumination variations, and artifacts, proper

pre-processing techniques are applied to ensure

uniformity and quality before feature extraction.

Key pre-processing steps include:

• Image Resizing: Standardizing images to a fixed dimension for consistency.
• Normalization: Scaling pixel intensities to a standard range to improve model stability.
• Data Augmentation: Applying techniques such as rotation, flipping, contrast enhancement, and noise addition to increase dataset variability and reduce overfitting.
• Segmentation (if required): Extracting the ulcer region using U-Net or thresholding techniques to focus on relevant features.

This module ensures that the input data is optimized

for feature extraction and classification.
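The resizing, normalization, and flipping steps listed above can be sketched on a tiny grayscale image stored as nested lists. This is a minimal illustration only: the function names, the nearest-neighbour resizing rule, and the [0, 1] target range are assumptions, and a real pipeline would normally rely on an image library.

```python
def resize_nearest(image, new_h, new_w):
    """Resize to a fixed dimension using nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def normalize(image, max_value=255.0):
    """Scale pixel intensities from [0, max_value] to [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right (a simple augmentation)."""
    return [row[::-1] for row in image]


# Toy 2x4 grayscale "image" with hypothetical pixel values.
image = [[0, 64, 128, 255],
         [255, 128, 64, 0]]

resized = resize_nearest(image, 2, 2)
normalized = normalize(resized)
augmented = horizontal_flip(normalized)
print(resized)
print(augmented)
```

Applying the flip after normalization keeps the augmented copy in the same value range as the original.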

3.2 Feature Extraction Using Variational Autoencoder (VAE)

In this module, a Variational Autoencoder (VAE) is

utilized to extract significant features from DFU

images. The encoder maps the image into a

condensed, low-dimensional latent space, capturing

vital information and filtering out unnecessary noise

and redundancy. The decoder subsequently

reconstructs the image from this representation,

ensuring that only the most relevant visual elements

are preserved.

VAE is particularly effective in learning robust

and structured representations that are useful for

classification. Instead of using raw pixel values, the

latent space embeddings generated by the encoder

serve as input for the next stage of the pipeline.
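The way a latent embedding is obtained from the encoder outputs can be sketched with the usual reparameterization step, z = μ + σ·ε with ε drawn from a standard normal. The encoder outputs below are hypothetical numbers rather than values from a trained network, and passing the mean vector on as the deterministic feature is an assumption about a common practice, not a detail stated in this paper.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def latent_features(mu):
    """Deterministic embedding handed to the classifier (assumed: the mean)."""
    return list(mu)


rng = random.Random(0)            # fixed seed for reproducibility
mu = [0.4, -1.2, 0.05]            # hypothetical encoder means
log_var = [-2.0, -1.0, -3.0]      # hypothetical encoder log-variances

z = sample_latent(mu, log_var, rng)   # stochastic sample used during training
features = latent_features(mu)        # fixed-length vector for the next stage
print(z)
print(features)
```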

3.3 Feature Selection and Dimensionality Reduction

Since deep learning models often generate high-

dimensional feature spaces, it is crucial to select the

most informative features to enhance classification

efficiency. This module applies Principal Component

Analysis (PCA) or other statistical techniques to

eliminate redundant or less significant features. By

reducing dimensionality, the model ensures faster

training and better generalization while maintaining

important ulcer characteristics.
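To make the PCA step concrete, the sketch below finds the first principal component of a small set of feature vectors by power iteration on the covariance matrix and projects the data onto it. The data values and function names are hypothetical, and power iteration is used only to keep the example dependency-free; library implementations of PCA typically use a singular value decomposition instead.

```python
import math


def center(data):
    """Subtract the per-feature mean from every sample."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Coordinates of each sample along the given component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]


# Hypothetical 2-D latent features that are almost perfectly correlated.
latents = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
centered = center(latents)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)   # one value per sample instead of two
print(pc1)
print(scores)
```

Because the two toy features are strongly correlated, a single component retains almost all of their variance, which is the sense in which redundant features are removed.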

3.4 Classification Using XGBoost

The refined feature set is then passed into XGBoost, an optimized gradient boosting algorithm that builds multiple decision trees to classify images. XGBoost is chosen for its efficiency, scalability, and ability to handle imbalanced datasets. It constructs trees sequentially, with each tree correcting errors made by the previous ones.

Hyperparameters such as the learning rate, tree depth, and number of estimators are tuned to improve classification performance. The classifier outputs the final ulcer classification, distinguishing between normal skin, infected ulcer, and healing ulcer based on the extracted features.
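The idea that each tree corrects the errors of the ensemble built so far can be illustrated with a deliberately simplified boosting loop that uses one-split trees (stumps) on a single feature. This is plain gradient boosting for squared loss, not XGBoost itself: the second-order gradient information and the regularization term that XGBoost uses are omitted, and all names and values are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on a 1-D feature that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    """Return the leaf value of the stump for input x."""
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_estimators=20, learning_rate=0.3):
    """Fit stumps sequentially, each on the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, predictions


def mse(ys, predictions):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


xs = [1, 2, 3, 4, 5, 6]          # hypothetical single feature
ys = [0, 0, 0, 1, 1, 1]          # hypothetical binary target
base, stumps, predictions = boost(xs, ys)
print(mse(ys, [base] * len(ys)), mse(ys, predictions))
```

The learning rate and number of estimators play the same role here as the hyperparameters named above: a smaller learning rate needs more trees to reach the same training error.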

3.5 Performance Evaluation and Validation

To assess the effectiveness of the proposed system,

various performance metrics are calculated,

including:

• Success Rate (%): Measures the overall correctness of the model (accuracy).
• Exactness (%) and Sensitivity (%): Evaluate the model’s ability to correctly classify ulcers (precision and recall, respectively).
• F1-Measure (%): Balances precision and recall, especially for imbalanced datasets.
• AUC-ROC Curve: Analyzes the classifier’s ability to distinguish between ulcer types.
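The metrics listed above can be computed directly from predicted and true labels. The sketch below does this for one class treated as the positive class; the label names follow the three classes mentioned in Section 3.4, while the label sequences themselves are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical ground truth and predictions for six images.
y_true = ["infected", "infected", "healing", "normal", "infected", "healing"]
y_pred = ["infected", "healing", "healing", "normal", "infected", "infected"]

print(accuracy(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred, positive="infected"))
```

For a multi-class problem such as this one, the per-class values are usually averaged across classes; the averaging scheme used in this paper is not stated.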

Cross-validation techniques, such as k-fold

validation, are applied to ensure that the model

generalizes well to unseen data. The results are then

compared with traditional CNN-based models to

highlight the advantages of the Hybrid VAE-

XGBoost framework.
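A k-fold split can be sketched as follows: the sample indices are divided into k folds, and each fold serves once as the held-out set while the remaining folds are used for training. The value of k and the use of contiguous, unshuffled folds are assumptions made for illustration; in practice the indices are usually shuffled, and often stratified by class, before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


# Hypothetical setting: 10 samples, 5 folds.
folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Averaging a metric over the k held-out folds gives a less optimistic estimate of performance on unseen data than a single train/test split.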

4 RESULTS & DISCUSSION

For this work, we collected a dataset of 800 images from a website, which was divided into three phases. We split this dataset into a training set containing 80% of the images and a testing set containing the remaining 20%.

We trained the multi-scale architecture on the training set using a transfer learning technique, in which pre-trained weights of the proposed model were used to initialize the training weights. We trained the VAE model on the training set for 50 epochs with a batch size of 10 and a learning rate of 0.0001. Figure 2 shows the input image.

Figure 2: Input Image.

Figure 3: Validation and Testing Curve.

Figure 4: Classification Result.


Figures 3 and 4 show the validation and testing curve and the classification result, respectively.

The training curve represents the model's progress on the training dataset over time, whereas the validation curve illustrates its performance on the validation dataset. Ideally, the model's performance should improve with each epoch until it stabilizes at a certain level.

Table 1 presents the performance analysis of the proposed method and various other models in terms of accuracy, precision, sensitivity, and F1-score.

Table 1: Performance Comparison.

Approach                  Success Rate (%)   Exactness (%)   Sensitivity (%)   F1-Measure (%)
CNN-Based Model           85.3               82.7            83.1              82.9
ResNet-50                 88.9               87.5            86.8              87.1
VGG-16                    87.2               85.9            85.5              85.7
XGBoost (Raw Features)    83.5               81.2            80.9              81.0
Proposed VAE-XGBoost      92.4               91.1            90.5              90.8
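Since the F1-measure is the harmonic mean of precision and recall, the F1 column of Table 1 can be recomputed from the Exactness and Sensitivity columns as a quick consistency check. The dictionary below simply transcribes Table 1 (the second row is labelled ResNet-50 here, following Section 5); the function name is an illustrative choice.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (Exactness %, Sensitivity %, reported F1-Measure %) transcribed from Table 1.
table_1 = {
    "CNN-Based Model": (82.7, 83.1, 82.9),
    "ResNet-50": (87.5, 86.8, 87.1),
    "VGG-16": (85.9, 85.5, 85.7),
    "XGBoost (Raw Features)": (81.2, 80.9, 81.0),
    "Proposed VAE-XGBoost": (91.1, 90.5, 90.8),
}

for name, (precision, recall, reported) in table_1.items():
    recomputed = f1_from_precision_recall(precision, recall)
    print(f"{name}: recomputed {recomputed:.1f}, reported {reported}")
```

For every row, the recomputed value agrees with the reported F1-measure to one decimal place.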

5 DISCUSSIONS

The Hybrid VAE-XGBoost framework produces the highest accuracy (92.4%) compared with the CNN-based approaches (ResNet-50, VGG-16) and XGBoost on raw features. The precision, recall, and F1-score of the resulting model are also higher, which is attributed to efficient feature extraction by the VAE, which learns to identify discriminative ulcer patterns while reducing noise.

The traditional CNN methods, including ResNet-50 and VGG-16, reach high accuracies but are computationally demanding and require extensive training sets.

The performance of XGBoost trained on raw features is relatively inferior because these features perform worse than the deep-learned representations extracted by the VAE.

The technique combines the strengths of deep feature extraction (VAE) and machine learning classification (XGBoost) to gain improved generalizability and stability.

Impact on Medical Diagnosis: The results indicate that the Hybrid VAE-XGBoost system can effectively aid healthcare workers in early DFU identification to prevent amputation and serious complications. The system provides:

• Enhanced diagnostic specificity to reduce misclassification.
• Successful recognition of ulcer patterns.
• Reduced overfitting, because the VAE learns a structured feature space.
• Scalability and interpretability, making it appropriate for use in clinical practice.

The performance of this model suggests that it can be used in mobile diagnostic apps or in telemedicine systems to achieve automated, efficient, and accurate DFU classification.

6 CONCLUSIONS

The developed Hybrid VAE-XGBoost technique provides an accurate and efficient DFU classification method. By using the Variational Autoencoder (VAE) for deep feature extraction and XGBoost for reliable classification, the system provides improved performance in identifying diverse types of ulcers. Combining deep learning-based representation learning with machine learning-based classification improves generalizability, reduces overfitting, and raises diagnostic performance. Strong preprocessing, feature selection, and evaluation practices further support the reliability of the system. Experimental results indicate that the developed method outperforms conventional CNN-based methods while providing an efficient, scalable, and interpretable DFU diagnostic method. Future work can focus on real-world deployment, multi-modal fusion capabilities, and explainable AI practices to further support clinical utility.

The system can support professionals in early DFU identification, helping to avoid future complications and improve patient outcomes.
