Enhancing Retina Image Classification with a Hybrid ResNet-50 and Random Forest Model: A Comparative Study

Ankita Suryavanshi, Vinay Kukreja and Rajat Saini

Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, 140401, Punjab, India

Chitkara Centre for Research and Development, Chitkara University, Himachal Pradesh, 174103, India

Keywords: Deep Learning and Random Forest, ResNet-50, Automated Medical Diagnostics, Ophthalmology, Health Informatics, Image Recognition.

Abstract: Retinal disorders are diagnosed and managed largely on the basis of images, so accurate image classification is critical. This study presents a hybrid architecture that combines a ResNet-50 deep CNN with a Random Forest classifier to improve the identification of Moroccan retinal images. The proposed model leverages the strengths of both approaches: high-level image features extracted by ResNet-50 and the robust classification of Random Forest. The hybrid system was evaluated on a large dataset and compared favourably with previous approaches. The model achieved an accuracy of 94.3%, precision of 94.0%, recall of 94.8%, and an F1-score of 94.4%. Furthermore, accuracy rose and loss fell steadily over the epochs on both the training and validation sets, with a final validation accuracy of 95.0%. These results indicate that, for retinal image classification, deep learning models combined with ensemble methods can yield strong performance. The advantages of the hybrid model include higher diagnostic accuracy and a possible increase in the efficiency of systems for the automated detection of ophthalmic pathologies. Future work could refine the model architecture and extend the approach to other medical imaging tasks.

1 INTRODUCTION

Image classification has become an essential topic in medical imaging, particularly for the analysis of retinal disorders such as Diabetic Retinopathy, Age-related Macular Degeneration, and Glaucoma (D. B, Kaur, et al. 2023). Differentiating these disorders and diagnosing them accurately is critical for preventing blindness and improving treatment outcomes. Because retinal images are inherently complex and diverse, traditional image analysis techniques have generally struggled to handle them. Conventional diagnostic approaches (D. B, Kaur, et al. 2023) have therefore given way to more sophisticated computational methods that improve diagnostic precision. Because they can learn detailed features directly from raw image data, deep learning models, especially CNNs and their variants, have made considerable strides in recent years in automating retinal image classification and improving the diagnosis of these diseases (Kaur, Kukreja, et al. 2024). Part of these models' success has been attributed to this capacity to learn.

Architectures such as ResNet-50 have become popular because their deep residual learning framework mitigates inherent problems such as vanishing gradients and makes it practical to train very deep networks. One of the major advantages of ResNet-50 is that it is effective and stable across diverse image classification problems (Suryavanshi, Kukreja, et al. 2023). This matters all the more given the variation and noise that are characteristic of retinal images. Nevertheless, even with efficient deep learning models such as ResNet-50, improving classification accuracy further remains a challenge.

Suryavanshi, A., Kukreja, V. and Saini, R.
Enhancing Retina Image Classification with a Hybrid ResNet-50 and Random Forest Model: A Comparative Study.
DOI: 10.5220/0013609100004664
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 3rd International Conference on Futuristic Technology (INCOFT 2025) - Volume 3, pages 75-80
ISBN: 978-989-758-763-4

Random Forest, an ensemble learning technique, trains many decision trees to improve classification accuracy and generalization. Aggregating many trees is intended specifically to avoid the overfitting to which a single tree is prone (Suryavanshi, Kukreja, et al. 2023). As a result, Random Forest is well regarded for its stability and its ability to handle high-dimensional data, while still drawing on the individual strengths of its constituent trees. This work combines the two techniques: ResNet-50 is used for feature extraction, while Random Forest is used for classification (Suryavanshi, Kukreja, et al. 2023). The purpose of this work is to assess and improve the identification of retinal images using the proposed architecture based on ResNet-50 and the Random Forest classifier. The underlying hypothesis is that combining the two will categorize images more accurately and efficiently than either model used alone. The study also analyzes and compares the effectiveness of the hybrid model across different retinal disorders and against both older and newer methodologies (Suryavanshi, Kukreja, et al. 2024). In doing so, it aims to contribute to better automated ophthalmological diagnostic tools for retinal disease diagnosis and treatment planning.

The paper first describes the current state of retinal image classification technology, with a focus on its strengths and weaknesses. It then discusses how ResNet-50 is incorporated as the feature extractor and Random Forest as the classifier (Suryavanshi, Kukreja, et al. 2024), and finally evaluates the effectiveness of the proposed hybrid model on a diverse retinal image database. The findings are expected to shed light on the use of deep learning in conjunction with ensemble approaches for retinal image classification, which may establish a new reference point for high-accuracy systems.

2 SURVEY OF LITERATURE

Advances in medical imaging, and in computational models more generally, have changed how retinal images are classified. The retina is the part of the human eye affected by pathologies (Ran, Tham, et al. 2021) that can cause serious vision problems if not detected early. Fundus photography, a non-contact examination procedure, is the main technique for diagnosing conditions including diabetic retinopathy, macular degeneration, and glaucoma. Retinal images vary in intensity, contrast, and even the structure of the pathology, which raises the demands placed on image classification methods. Earlier work on retinal image analysis relied mostly on manual feature extraction together with standard machine learning techniques (Araneta, Asenjo, et al. 2021). Disease patterns were sought using histogram analysis, texture-based features, and morphological operations. Although these approaches provided a solid starting point, hand-crafted feature extraction cannot fully capture the complexity of retinal images. This created a need for more capable methods. Deep learning, and CNNs in particular, ushered in a new era in image classification. Because CNNs learn hierarchical features directly from raw data (Shamia, Prince, et al. 2022), they have improved image classification accuracy.

An influential contribution in this area is the ResNet architecture proposed by He et al., which successfully mitigated the training difficulties caused by very large numbers of layers. ResNet, short for Residual Network, is a model that uses residual learning to enable the training of very deep networks, which would otherwise be highly challenging because of vanishing gradients. ResNet-50, the 50-layer version, has shown excellent performance on several image classification datasets (Badah, Algefes, et al. 2022), and the same residual framework enables the construction of networks with several hundred layers. However, attaining optimal performance on highly variable datasets remains a challenge even with CNNs such as ResNet-50. In the effort to make image classification systems more reliable, ensemble learning methods have been investigated as complementary approaches. Random Forest is a well-known ensemble learning model, introduced by Breiman in 2001, that constructs a multitude of randomized decision trees and produces its result by aggregating their predictions.

Random Forest algorithms (Badah, Algefes, et al. 2022) cope well with high dimensionality and greatly reduce overfitting compared with single-model systems, which makes them a capable alternative to other techniques. Many papers integrate CNNs with ensemble learning systems to improve classification performance. For instance, Xie et al. (2018) showed that integrating CNNs with Random Forests could improve medical image classification by drawing on the merits of both methods. The CNN component extracted deep features from the images, while the Random Forest (ARSLAN, Erdaş, et al. 2023) combined those features to make the final decision. This approach proved more accurate and stable than using CNNs alone. Different CNN architectures and their effectiveness have also been studied for retinal image classification. For instance, Rajalakshmi et al. (2018) and Ting et al. (2019) addressed deep learning models for detecting diabetic retinopathy and other retinal diseases. Their findings suggest that CNNs can deliver higher diagnostic accuracy and reliability than conventional approaches (Garg, Gupta, et al. 2023). However, the combination of CNNs with ensemble methods such as Random Forests remains relatively unexplored in retinal image classification. This gap motivates the present research, which assesses the performance of a hybrid ResNet-50 and Random Forest model for retinal image classification. The study aims to implement an optimized retinal pathology detection system (Lin, Liu, et al. 2022) by integrating the feature extraction strength of ResNet-50 with the classification strength of the Random Forest algorithm. The literature indicates that such an integrated framework could yield substantial gains in classification accuracy and thereby advance automated retinal disease diagnosis and care.
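The residual learning idea surveyed in this section can be stated compactly. The formulation below is the standard one for residual blocks; the symbols $x$, $y$, $\mathcal{F}$, $W_i$ and $\mathcal{L}$ are introduced here for illustration and do not appear elsewhere in this paper.

```latex
% Residual block: the stacked layers learn a residual F instead of the full mapping.
y = \mathcal{F}(x, \{W_i\}) + x

% Backpropagating a loss L through the block. The identity term I gives the
% gradient a direct path to earlier layers, which counteracts vanishing gradients.
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( \frac{\partial \mathcal{F}}{\partial x} + I \right)
```

Even if the learned term $\partial \mathcal{F} / \partial x$ becomes very small, the identity term keeps the gradient from collapsing, which is why stacking many such blocks remains trainable.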

3 METHODOLOGY SECTION

The methodology comprises four stages: data gathering and formatting, convolutional feature mapping, data enrichment and extension, and Random Forest classification. Each stage was conducted thoroughly; as a result, the developed tool for the evaluation and classification of retinal images was highly accurate and dependable.

Figure 1: Steps of Methodology

3.1 Data Gathering and Formatting

The first stage is Data Gathering and Formatting, in which sample retinal images and the associated metadata are obtained and transformed into a format suitable for analysis. This phase aims to obtain clear retinal images, from clinical databases or online sources, covering various diseases that affect the retina (Sandika, Avil, et al. 2016). The collected images are rescaled to meet the specifications of the subsequent processing steps; at the most basic level, this means standardizing the size and file type of the images used in the study. Additional information, such as the diagnostic label and patient history, is acquired along with each image. Collecting and aligning the data in this way is fundamental to the subsequent analysis and model training.
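As an illustration of how labeled images might be organized once gathered, the following minimal sketch pairs each image path with its diagnostic label and splits the records into training and validation sets while preserving class proportions. The `RetinaRecord` type, its field names, the file names, and the 20% validation fraction are all hypothetical; the paper does not state how its data were split.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RetinaRecord:
    """One image plus its diagnostic label (field names are hypothetical)."""
    image_path: str
    label: str


def stratified_split(records, val_fraction=0.2, seed=0):
    """Split records into train/validation lists, preserving class ratios."""
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec.label].append(rec)

    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


# Toy catalogue: 10 placeholder images per class.
records = [
    RetinaRecord(f"img_{label}_{i}.png", label)
    for label in ("diabetic_retinopathy", "macular_degeneration", "glaucoma")
    for i in range(10)
]
train_set, val_set = stratified_split(records, val_fraction=0.2)
print(len(train_set), len(val_set))
```

Splitting per class rather than over the whole list keeps rarer conditions represented in the validation set.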

3.2 Convolutional Feature Mapping

In the Convolutional Feature Mapping phase, instead of applying a traditional classifier such as an SVM or Random Forest directly to the images, a deep learning model, ResNet-50, is used to extract appropriate features from the collected retinal images. At this stage, the CNN architecture is designed and trained to learn representations, or feature maps, of the image data. ResNet-50 is selected because its deep residual structure enables the model to extract complex features from the images. The CNN comprises multiple convolutional layers and residual blocks, through which the images pass to produce feature maps that capture the patterns and characteristics needed for classification.
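To hand feature maps to a downstream classifier, they must first be reduced to a fixed-length vector. The sketch below shows global average pooling, the reduction that the standard ResNet-50 design applies after its last convolutional stage; that this paper uses the pooled output as the feature vector is an assumption, and the toy input stands in for real activations.

```python
def global_average_pool(feature_maps):
    """Collapse each 2-D feature map (channel) to its mean activation.

    feature_maps: list of channels, each a list of rows of floats.
    Returns one float per channel, i.e. a fixed-length feature vector.
    """
    vector = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        vector.append(sum(values) / len(values))
    return vector


# Toy input: 3 channels of 2x2 activations. The last stage of a standard
# ResNet-50 would instead yield 2048 channels of 7x7 for a 224x224 input.
toy_maps = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [0.0, 4.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
features = global_average_pool(toy_maps)
print(features)  # [4.0, 1.0, 2.0]
```

The resulting vector has one entry per channel regardless of the spatial size of the maps, which is what makes it usable as input to a tabular classifier.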

3.3 Data Enrichment and Extension

The Data Enrichment and Extension phase increases the richness of the dataset so that the model does not overfit to the particular data acquired during the Data Collection phase. Data augmentation is used to enlarge the dataset by generating modified copies of the images through operations such as rotating, enlarging, and mirroring. This not only adds more samples to the dataset but also adds variability to the images, which helps the model learn under different conditions. Data enrichment can also incorporate additional attributes or external databases to gain a better understanding of retinal conditions. The goal of this phase is to increase the size of the dataset and thereby improve the model's capacity to generalize to other conditions.
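The rotation and mirroring operations mentioned above can be sketched on a tiny image represented as a list of pixel rows. This is only an illustration of the idea; the function names are hypothetical, enlarging (zoom) is omitted, and the paper does not specify which rotation angles or library were used.

```python
def rotate_90_clockwise(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def mirror_horizontal(image):
    """Flip a 2-D image left to right."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the original image plus rotated and mirrored variants."""
    variants = [image]
    rotated = image
    for _ in range(3):  # 90, 180 and 270 degrees
        rotated = rotate_90_clockwise(rotated)
        variants.append(rotated)
    variants.append(mirror_horizontal(image))
    return variants


toy_image = [
    [1, 2],
    [3, 4],
]
augmented = augment(toy_image)
print(len(augmented))  # 5
```

Each source image here yields five training samples, all carrying the same diagnostic label as the original.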

3.4 Random Forest Classification

The Random Forest Classification phase feeds the features learned by the CNN model into a Random Forest classifier to improve classification performance. In this stage, features are extracted by the CNN and passed to the Random Forest, which combines the decisions of many decision trees to produce the final classification. To avoid overfitting, the trees are aggregated so that they collectively reach a common decision, which improves the overall classification. This phase comprises training the Random Forest model, optimizing its hyperparameters, and assessing the Random Forest's contribution to the hybrid classification system. The idea is to achieve the best results by combining the advantages of the CNN and the Random Forest.
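The voting mechanism described above can be sketched with a drastically simplified stand-in for a Random Forest: one-split trees (stumps), each trained on a bootstrap sample with a random feature subset, combined by majority vote. A production Random Forest grows full trees and would normally come from a library; the two-dimensional toy vectors below merely stand in for CNN feature vectors, and every name and hyperparameter here is an assumption.

```python
import random
from collections import Counter


def train_stump(X, y, features):
    """Fit a one-split tree using the (feature, threshold) with fewest errors."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls to the right, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda row: left_label if row[f] <= threshold else right_label


def train_forest(X, y, n_trees=25, seed=0):
    """Train n_trees stumps on bootstrap samples with random feature subsets."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]   # bootstrap sample
        feats = rng.sample(range(n_features), k)   # random feature subset
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Return the class chosen by the majority of trees."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]


# Toy "feature vectors" for two classes.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.7, 0.7], [0.8, 0.9], [0.9, 0.8]]
y = ["diabetic_retinopathy"] * 3 + ["glaucoma"] * 3
forest = train_forest(X, y)
print(predict(forest, [0.05, 0.05]), predict(forest, [0.95, 0.95]))
```

Because each tree sees a different resample of the data and a different subset of features, individual mistakes tend not to coincide, and the vote is more stable than any single tree.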

4 EXPERIMENTAL RESULTS

The confusion matrix shown in Figure 2 provides a detailed view of the model's performance in classifying three eye conditions: Diabetic Retinopathy, which affects the retina; Macular Degeneration, which affects the macula of the retina; and Glaucoma, which affects the optic nerve. Of the 23,600 cases in total, the model correctly classified 8,500 cases of Diabetic Retinopathy, while 300 were misclassified as Macular Degeneration and 200 as Glaucoma. For Macular Degeneration, the model correctly predicted 7,000 cases and misclassified 500 as Diabetic Retinopathy and 400 as Glaucoma. For Glaucoma, the model correctly classified 6,000 cases, while 400 were misclassified as Diabetic Retinopathy and 300 as Macular Degeneration. Overall, the matrix shows both the strengths of the proposed model and the remaining difficulty of differentiating between these disorders.

Figure 2: Confusion Matrix

Figure 3: Performance Matrix for Hybrid Model
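The counts given for the confusion matrix are enough to derive per-class metrics directly. The sketch below arranges them with actual classes as rows and predicted classes as columns and computes overall accuracy plus per-class precision, recall, and F1. With the counts exactly as stated, overall accuracy comes to about 91.1% (21,500 of 23,600), so the headline metrics reported in this paper were presumably computed on a different split or averaging scheme; that is an assumption, not something the text states.

```python
CLASSES = ["Diabetic Retinopathy", "Macular Degeneration", "Glaucoma"]

# Rows are actual classes, columns are predicted classes (counts from the text).
CONFUSION = [
    [8500, 300, 200],
    [500, 7000, 400],
    [400, 300, 6000],
]


def per_class_metrics(matrix):
    """Return overall accuracy and per-class (precision, recall, F1)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    metrics = []
    for i in range(n):
        predicted_i = sum(matrix[r][i] for r in range(n))  # column sum
        actual_i = sum(matrix[i])                          # row sum
        precision = matrix[i][i] / predicted_i
        recall = matrix[i][i] / actual_i
        f1 = 2 * precision * recall / (precision + recall)
        metrics.append((precision, recall, f1))
    return correct / total, metrics


accuracy, metrics = per_class_metrics(CONFUSION)
for name, (p, r, f1) in zip(CLASSES, metrics):
    print(f"{name}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"accuracy={accuracy:.3f}")
```

The per-class view makes the confusions visible: Diabetic Retinopathy has the highest recall (8,500 of 9,000), while Macular Degeneration is the class most often missed (7,000 of 7,900).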

The performance indices of the proposed hybrid, ResNet-50 fused with the Random Forest classifier, establish the model's ability to categorize retinal images accurately into several classes. The hybrid model obtained an accuracy of 94.3%, which indicates a robust capacity to sort images correctly compared with other approaches. Precision, which measures how many of the cases the model labels positive are truly positive, was 94.0%; that is, about 94% of the instances the model predicted as positive were indeed positive. Recall, which measures the share of actual positive cases identified by the model, was 94.8%, indicating high sensitivity. The F1-score, which combines precision and recall, was 94.4%, reflecting balanced performance. Further, the AUC-ROC of the model was 0.98, which shows that the proposed algorithm separates the classes well across thresholds. Taken together, these metrics indicate that the hybrid approach yields a substantial improvement in classification performance over traditional methods and individual models.
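As a quick consistency check, the F1-score can be recomputed from precision and recall, since it is their harmonic mean. The sketch below takes the precision of 94.0% and recall of 94.8% reported in this paper as inputs; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


reported_precision = 0.940
reported_recall = 0.948
f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.944
```

The result agrees with the reported F1-score of 94.4%.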

Figure 4 plots accuracy and loss on the training and validation sets over the epochs and shows how the hybrid model behaves during training. At Epoch 1, the training loss was 0.60 and the validation loss was 0.65, with a training accuracy of 84.5% and a validation accuracy of 81.0%. These early values suggest that the model is still in the early stage of learning and has room for improvement. By Epoch 5, the training loss had fallen to 0.32 and the validation loss to 0.37, showing that the model had made substantial progress in reducing errors on both the training and validation sets. Training accuracy rose to 91.0% and validation accuracy reached 88.0%, so the network improved its performance along with its generalization capability.

Figure 4: Training and Validation Performance Over Epochs

At Epoch 10, the training loss fell to 0.18 and the validation loss to 0.22, indicating a good fit and a decrease in the overall number of mistakes the model makes. Training accuracy increased to approximately 94% and validation accuracy to 92.0%, indicating a closer fit to the validation data. At Epoch 15, the training loss was 0.11 and the validation loss 0.15, with a training accuracy of 95.8% and a validation accuracy of 94.0%. These values indicate that the model is close to convergence: the error rate is nearly optimal and accuracy on both the training and validation sets is almost as high as it could be. The results improved further at Epoch 20, where the training loss reached 0.04 and the validation loss 0.12, with a training accuracy of 96.5% and a validation accuracy of 95.0%. This final performance indicates that the model has high accuracy and generalizes well to new data. Overall, the training and validation metrics show a gradual increase in accuracy and a decrease in loss as the number of epochs increases, so the hybrid model is well optimized and can be used for further classification.

5 CONCLUSION

This research proposes and implements a model based on ResNet-50 and a Random Forest classifier for the classification of retinal images, which improves significantly on traditional methods in terms of accuracy and stability. Combining deep learning through ResNet-50 with the Random Forest ensemble proved beneficial and efficient, as seen in the high values of the performance metrics. The hybrid model obtained an accuracy of 94.3%, precision of 94.0%, recall of 94.8%, and an F1-score of 94.4%. It also showed strong potential for correctly identifying numerous retinal disorders and separating them into different groups. The training and validation curves show accuracy improving steadily while loss decreases from epoch to epoch, which is characteristic of effective learning and generalization. In the last epoch, Epoch 20 in Figure 4, the model reached a validation accuracy of 95.0%, and the validation loss reached its minimum of 0.12, which demonstrates its efficiency and its ability to handle real data accurately. These findings indicate that the hybrid model offers higher accuracy and reliability than other classification methods for retinal image assessment. The combination of Convolutional Neural Networks and Random Forests not only improves the discriminative accuracy on the images but also benefits diagnostic systems for eye disorders. Possible future work includes further improvements to the model structure and training algorithm, as well as extending the combined approach to other areas of medical imaging.

REFERENCES

D. B. and S. V. A. Kaur, V. Kukreja, “Innovative Hybrid Framework for Precise Wheat Disease Identification: Stacked CNN Meets SVM,” in World Conference on Communication & Computing (WCONF), 2023, pp. 1–

D. B. and D. B. A. Kaur, V. Kukreja, “Revolutionizing Rice Disease Diagnosis: A Fusion of Convolutional Neural Networks and Support Vector Machines,” in World Conference on Communication & Computing (WCONF), 2023, pp. 1–5.

A. Kaur, V. Kukreja, M. Kumar, A. Choudhary, and R. Sharma, “A Fine-tuned Deep Learning-based VGG16 Model for Cotton Leaf Disease Classification,” in IEEE 9th International Conference for Convergence in Technology (I2CT), 2024, pp. 1–6.

A. Suryavanshi, V. Kukreja, P. Srivastava, A. Bhattacherjee, and R. S. Rawat, “Felis catus disease detection in the digital era: Combining CNN and Random Forest,” in 2023 International Conference on Artificial Intelligence for Innovations in Healthcare Industries (ICAIIHI), 2023, pp. 1–7.

A. Suryavanshi, V. Kukreja, A. Dogra, and J. Joshi, “Advanced ABS Disease Recognition in Lemon-A Multi-Level Approach Using CNN and Random Forest Ensemble,” in 2023 3rd International Conference on Technological Advancements in Computational Sciences (ICTACS), 2023, pp. 1108–1113.

A. Suryavanshi, V. Kukreja, and A. Dogra, “Optimizing Convolutional Neural Networks and Support Vector Machines for Spinach Disease Detection: A Hyperparameter Tuning Study,” in 2023 4th IEEE Global Conference for Advancement in Technology (GCAT), 2023, pp. 1–6.

A. Suryavanshi, V. Kukreja, S. Chamoli, S. Mehta, and A. Garg, “Synergistic Solutions: Federated Learning Meets CNNs in Soybean Disease Classification,” in 2024 Fourth International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), 2024, pp. 1–6.

A. Suryavanshi, V. Kukreja, A. Dogra, P. Aggarwal, and M. Manwal, “Feathered Precision: AvianVision-A Hybrid CNN-Random Forest Approach for Accurate Classification of Sparrow Species,” in 2024 11th International Conference on Signal Processing and Integrated Networks (SPIN), 2024, pp. 215–220.

A. R. Ran, C. C. Tham, P. P. Chan, C. Y. Cheng, Y. C. Tham, T. H. Rim, and C. Y. Cheung, “Deep learning in glaucoma with optical coherence tomography: a review,” Eye, vol. 35(1), pp. 188–201, 2021.

M. A. M. Araneta, D. V. Asenjo, C. J. L. Lamprea, A. M. L. Reyes, O. A. Medina, A. L. F. Sigue, ..., and M. G. P. Beaño, “Controlled Environment for Spinach Cultured Plant with Health Analysis using Machine Learning,” in 2021 IEEE 13th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), 2021, pp. 1–6.

D. Shamia, S. Prince, D. Bini, and B. Yoon, “An Online Platform for Early Eye Disease Detection using Deep Convolutional Neural Networks,” in 6th International Conference on Devices, Circuits, and Systems (ICDCS), IEEE, 2022, pp. 388–392.

N. Badah, A. Algefes, A. AlArjani, and R. Mokni, “Automatic eye disease detection using machine learning and deep learning models,” in Pervasive Computing and Social Networking: Proceedings of ICPCSN 2022, Singapore: Springer Nature Singapore, 2022, pp. 773–787.

G. ARSLAN and Ç. B. Erdaş, “Detection Of Cataract, Diabetic Retinopathy and Glaucoma Eye Diseases with Deep Learning Approach,” Intell. Methods Eng. Sci., vol. 2(2), pp. 42–47, 2023.

N. Garg, R. Gupta, M. Kaur, S. Ahmed, and H. Shankar, “Efficient Detection and Classification of Orange Diseases using Hybrid CNN-SVM Model,” in 2023 International Conference on Disruptive Technologies (ICDT), IEEE, 2023, pp. 721–726.

M. Lin, B. Hou, L. Liu, M. Gordon, M. Kass, F. Wang, ..., and Y. Peng, “Automated diagnosing primary open-angle glaucoma from fundus image by simulating human’s grading with deep learning,” Sci. Rep., vol. 12(1), p. 14080, 2022.

B. Sandika, S. Avil, S. Sanat, and P. Srinivasu, “Random forest based classification of diseases in grapes from images captured in uncontrolled environments,” in 2016 IEEE 13th International Conference on Signal Processing (ICSP), IEEE, 2016, pp. 1775–1780.
