Instance Selection Framework for Alzheimer’s Disease Classification Using Multiple Regions of Interest and Atlas Integration

Juan A. Castro-Silva (1,2), Maria N. Moreno-Garcia, Lorena Guachi-Guachi and Diego H. Peluffo-Ordoñez (4,5)

Universidad de Salamanca, Salamanca, Spain

Universidad Surcolombiana, Neiva, Colombia

Department of Mechatronics, International University of Ecuador, Simon Bolivar Avenue 170411, Quito, Ecuador

College of Computing, Mohammed VI Polytechnic University, Lot 660, Hay Moulay Rachid Ben Guerir, 43150, Morocco

SDAS Research Group

Keywords:

Alzheimer’s Disease, Swin Transformer, Weighted Ensemble, Instance Selection, Multiple Region of Interest.

Abstract:

Optimal selection of informative instances from a dataset is critical for constructing accurate predictive models. As databases expand, instance selection techniques become imperative to condense data into a more manageable size. This research presents a novel framework designed to identify and select the most informative 2D brain image slices for Alzheimer’s disease classification. The framework integrates annotations from multiple regions of interest across multiple atlases and consists of six core components: 1) Atlas merging for ROI annotation and hemisphere separation. 2) Image preprocessing to extract informative slices. 3) Dataset construction to prevent data leakage, select subjects, and split data. 4) Data generation for memory-efficient batches. 5) Model construction for diverse classification training and testing. 6) Weighted ensemble for combining predictions from multiple models with a single learning algorithm. Our instance selection framework was applied to construct Transformer-based classification models, demonstrating an overall accuracy of approximately 98.33% in distinguishing between Cognitively Normal and Alzheimer’s cases at the subject level. It exhibited enhancements of 3.68, 3.01, and 3.62 percentage points for the sagittal, coronal, and axial planes, respectively, in comparison with the percentile technique.

1 INTRODUCTION

Alzheimer’s disease (AD) is the leading cause of de-

mentia in older adults. It is a progressive brain dis-

order that causes nerve cells to die, leading to sig-

niﬁcant brain volume reduction and affecting almost

all brain functions (WHO, 2023). AD affects crucial

brain regions, such as the Entorhinal Cortex, Fornix,

Hippocampus, Frontal lobe, Temporal lobe, and Pari-

etal lobe, impacting spatiotemporal orientation, cog-

nition, memory, intelligence, judgment, behavior, and

language (Yu and Lee, 2020).

Medical imaging, including modalities like MRI,

PET, and DTI, aids in diagnosis and treatment by vi-

sualizing brain structures. The performance of AD

classiﬁcation models is inﬂuenced by data quality and

quantity. Instance selection methods, which vary in strategy and in the number of slices selected, may remove slices based on position or informative content (Castro-Silva et al., 2022).

The mentioned instance selection methods have

limitations. Including less informative content can in-

crease computational time and introduce noise, harm-

ing model performance. Selecting a low or ﬁxed num-

ber of slices per volume may not guarantee MRI rep-

resentativeness and may exclude AD-related or infor-

mative instances.

Data leakage occurs when test data is used in the training process, leading to biased evaluation. It can be caused by incorrect dataset splitting, the lack of an independent test set, or biased transfer learning (Wen et al., 2020).

Deep learning in computer vision excels at detect-

ing brain structural changes via MRI, using 2D or 3D

models for image or ROI analysis. Originally de-

signed for sequence-to-sequence tasks like machine

translation, the widely adopted Transformer (Vaswani

et al., 2017) is now applied in diverse domains, in-

cluding NLP, CV, and speech processing. The Swin

Transformer (Liu et al., 2021), with its novel hierarchical architecture and shifted windows, is known for efficient computation, making it suitable for tasks like image classification.

Castro-Silva, J., Moreno-Garcia, M., Guachi-Guachi, L. and Peluffo-Ordoñez, D.
Instance Selection Framework for Alzheimer’s Disease Classification Using Multiple Regions of Interest and Atlas Integration.
DOI: 10.5220/0012469600003654
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2024), pages 453-460
ISBN: 978-989-758-684-2; ISSN: 2184-4313

Ensemble methods can outperform single classifiers for AD diagnosis (Young et al., 2018). Homogeneous ensembles use one base classifier with diverse training data, and weighted ensembles assign weights based on member performance.

To overcome the limitations mentioned above, in

this paper, we propose a novel framework for select-

ing the most informative 2D image slices based on

the annotations fusion of multiple Regions of Inter-

est (ROIs) such as the entorhinal cortex, fornix, hip-

pocampus, frontal, parietal, and temporal lobes. The

ROI analysis involves using multiple atlases or maps

to divide the brain into distinct regions, leading to a

method called Multiple ROI and Multiple Atlas-based

Instance Selection, which classiﬁes Cognitively Nor-

mal (CN) and AD cases.

The proposal consists of several components:

• A volume dataset builder to prevent data leakage,

conducting an early subject dataset split for creat-

ing independent training, validation, and test sets.

• A data generator in charge of producing multiple-

view instances from a volume belonging to a pre-

processed dataset with skull-stripping and regis-

tration.

• A weighted ensemble builder that combines ho-

mogeneous methods. This ensemble employs

a single base classiﬁer 2D Swin Transformer

model, trained on various datasets encompassing

ROIs, planes, and hemispheres.

The major contributions of this novel proposal can

be summarized as follows.

1. To capture the most informative images, we pro-

pose an ROI content extraction method. Using the

mode, it identiﬁes the centroid position (x, y) and

crops the 2D slice image accordingly.

2. To enhance diagnostic accuracy in AD tasks, we

propose Multiple ROI and Multiple Atlas-based

Instance Selection. It combines ROI annotations

from multiple atlases to select informative ROI

slice images and remove useless instances.

3. To tackle insufﬁcient sample utilization and lim-

itations of a single classiﬁer, we propose a ho-

mogeneous weighted ensemble using a 2D Swin

Transformer model. This method uses multiple-

view samples in a single classiﬁer with different

weights according to their accuracy performance.

4. The experimental results demonstrate that the pro-

posed method’s accuracy outperforms the state-

of-the-art related works.

The remainder of this paper is structured as fol-

lows: Section 2 presents some related works. The

materials and methods used for preprocessing and in-

stance selection are included in Section 3. Section

4 provides a detailed description of the experiments

conducted in this work and the parameter settings

used. The results of the experiments are discussed

in Section 5. Finally, Section 6 summarizes the con-

cluding remarks of this work.

2 RELATED WORKS

Most proposed approaches for image instance selec-

tion for AD classiﬁcation differ in the number of

slices selected, and the technique used to obtain the

most representative or discard the least informative

slices.

In previous studies (Choi and Lee, 2020), a com-

mon method involves computing entropy values for

slices, sorting by descending entropy, and selecting

a ﬁxed number (typically 8 to 32) of top slices. In

(Qin et al., 2022), a pre-trained U-Net is used for skull

stripping, resulting in 3D MRI images (64 × 64 × 64).

In (Hu et al., 2023; Altay et al., 2021), 96 axial slices are selected after skull stripping and volume registration. In (Lyu et al., 2022), images are resized and down-sampled (70 × 75 × 50) after preprocessing. Finally, in (Castro-Silva et al., 2022), instance selection uses percentile positions for 32 slices.
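The entropy-based selection described above can be sketched as follows. This is a minimal illustration rather than the procedure of any cited work: it assumes the entropy is the Shannon entropy of each slice's intensity histogram, and the names `slice_entropy` and `top_entropy_slices` are hypothetical. Plain nested lists stand in for the image arrays a real pipeline would use.

```python
import math
from collections import Counter


def slice_entropy(slice_2d):
    """Shannon entropy (in bits) of the intensity histogram of one 2D slice."""
    pixels = [p for row in slice_2d for p in row]
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def top_entropy_slices(volume, k):
    """Indices of the k slices with the highest entropy, best first."""
    ranked = sorted(range(len(volume)),
                    key=lambda i: slice_entropy(volume[i]),
                    reverse=True)
    return ranked[:k]


# Toy volume: three 2x2 slices with 1, 2 and 4 distinct intensity values.
volume = [
    [[0, 0], [0, 0]],
    [[0, 0], [1, 1]],
    [[0, 1], [2, 3]],
]
selected = top_entropy_slices(volume, k=2)
```

Because `k` is fixed, the same number of slices is kept for every volume regardless of where the AD-related structures lie, which is the limitation the proposed framework addresses.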

ROI extraction, a crucial image processing task, is approached in different ways. In (Zaabi et al., 2020), the MRI scan is divided into 32 × 32-pixel blocks, and only those containing the hippocampus are extracted as ROIs. In (Bae et al., 2020), the complete hippocampus is analyzed using 30 coronal MRI slices. Other studies, such as (Pan et al., 2022; Li et al., 2022), create an ensemble classifier by extracting ROI-based patches from different brain regions, including the hippocampus, amygdala, and insulae.

The previously mentioned instance selection methods have some limitations. The model’s performance greatly depends on the number of slices per volume (Castro-Silva et al., 2022). Adding more image slices with less informative content can result in redundant or less representative information, increase the computational cost (time) of training, introduce noise, and deteriorate model performance. On the other hand, selecting a fixed number of slice images could exclude AD-related or more informative instances. A low number of slices per volume (for example, 1 or 8) does not ensure the representativeness of the 170-256 slice instances that comprise an MRI volume.

Classiﬁcation algorithms must be evaluated with-

out bias to ensure clinical relevance. Biased evalua-

tions caused by data leakage issues, such as incorrect

dataset splitting, absence of an independent test set,

delayed splitting, and biased transfer can lead to mis-

leading results by inﬂating model performance (Wen

et al., 2020).

Vision and Swin Transformers are applied to AD

classiﬁcation using MRI data. Examples include the

hybrid 2D model Conv-Swinformer in (Hu et al.,

2023), merging a CNN module (VGGNet-16) with

a Swin Vision Transformer. In (Lyu et al., 2022),

a ViT model pre-trained on ImageNet-21K is used

for AD/CN classiﬁcation. Additionally, (Zhang and

Khalvati, 2022) proposes a Convolutional Voxel Vi-

sion Transformer (CVVT) for 3D MRI scans, and

(Altay et al., 2021) presents a 3D Recurrent Visual At-

tention Model and an Attention Transformer. Finally,

(Xin et al., 2023) introduces ECSnet, a two-stream

model combining CNN and Swin Transformer using

the 2.5D-subject approach.

This work proposes an innovative framework for

selecting informative 2D image slices. It merges

annotations from various ROIs (entorhinal cortex,

fornix, hippocampus, frontal, parietal, and temporal

lobes) from several atlases. To prevent data leak-

age, the dataset is separated early on at both the sub-

ject (volume) and slice levels. A data generator pro-

duces multiple perspectives from a volume, using pre-

processed datasets with skull-stripping and registra-

tion. The network’s input diversity is accomplished

by combining 2D perspectives, utilizing the cropped

ROI. A weighted classiﬁcation ensemble employs a

single Swin Transformer base classiﬁer trained on

various datasets to optimize performance and model

reliability.

3 MATERIALS AND METHODS

This section presents the dataset used and the framework proposed for building classification models. It also describes the image preprocessing techniques, instance selection, data generator, Swin Transformer classifiers, ensembles, and model performance evaluation used in this work.

3.1 Datasets

This study employs T1-weighted structural MRI im-

ages, merging diverse datasets—Alzheimer’s Dis-

ease Neuroimaging Initiative (ADNI, 2023), Aus-

tralian Imaging, Biomarker & Lifestyle Flagship

Study of Ageing (AIBL, 2023), and Open Access

Series of Imaging Studies (OASIS, 2023)—for en-

hanced model robustness and generalization. The

multicenter dataset characterizes subjects using the

Clinical Dementia Rating (CDR) scale (ranging from

0 to 3) to determine dementia severity. Cases with

CDR zero are Cognitively Normal (CN), while those

with CDR one or greater are identiﬁed as Alzheimer’s

Disease (AD) cases. Demographic information is

summarized in Table 1.

Table 1: Summary of participant demographics and global clinical dementia rating (CDR) scores of all the study datasets.

Dataset   Class   Subjects   Age (years)     Gender (F/M)   Total Subjects
ADNI      CN      70         78.63 ± 5.82    34/36          140
          AD      70         78.63 ± 6.50    31/39
AIBL      CN      70         74.56 ± 5.81    37/33          140
          AD      70         74.87 ± 7.57    43/27
OASIS     CN      70         69.89 ± 9.38    39/31          140
          AD      70         76.36 ± 9.15    34/36
ALL-420   CN      210        74.36 ± 8.01    110/100        420
          AD      210        76.62 ± 7.93    108/102

3.2 Proposed Framework

This framework involves six components: 1) an atlas merger in charge of fusing ROI annotations from multiple atlases; 2) an image preprocessor responsible for obtaining the most informative content of each slice; 3) a dataset builder in charge of avoiding data leakage by selecting subjects, volumes, and image slices early and splitting the data into training, validation, and test sets; 4) a data generator that provides data batch by batch so that it fits in memory; 5) a model builder to train and test different classification models; and 6) a weighted ensemble builder that combines the predictions from two or more models trained on multiple datasets. The proposed framework is shown in Figure 1.

3.2.1 Atlas Merging

In this study, various regions, including the Entorhi-

nal Cortex, Fornix, Hippocampus, and the Frontal,

Parietal, and Temporal Lobes, which are associated

with cognitive decline in Alzheimer’s Disease, are

employed. However, due to variations in voxel con-

tent within a given ROI across different atlases, our

proposal involves the fusion of multiple atlas annota-

tions to address this issue.

Figure 1: Proposed framework.

The instance selection proposal, which is based on the fusion of ROI annotations from multiple atlases, involves selecting appropriate atlases, merging them, and separating left and right hemisphere structures, outlined as follows:

[Step-1] Atlas Fusion. In this research, a set of atlases ($\mathbf{A}$) has been integrated, including the JHU DTI-based white-matter, Jülich histological, Talairach, and Harvard-Oxford cortical and subcortical structural atlases (FSL, 2023). These atlases, featuring ROI annotations for both hemispheres, are registered into MNI152 space. The $n$ selected atlases ($\mathbf{A}$) containing a specific region of interest are merged into a single mean-value map ($M^{*} \in \mathbb{R}^{w \times h \times d}$).

The merged atlas ($M^{*}$) is binarized by setting to 1 every voxel whose value is greater than zero. Let $\mathbf{M} \in \mathbb{R}^{w \times h \times d}$ be the binarized merged atlas, given by:

$$
\mathbf{M}_{ijk} =
\begin{cases}
1, & \text{if } M^{*}_{ijk} > 0,\\
0, & \text{otherwise,}
\end{cases}
\qquad \forall \{ i, j, k \in \mathbb{N},\ 0 \le i < h,\ 0 \le j < w,\ 0 \le k < d \}
\tag{1}
$$

where $h$ is the slice image height, $w$ is the slice image width, and $d$ is the number of slices of the $\mathbf{M}$ volume.
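The fusion and binarization of Step-1 can be illustrated with a short sketch. It assumes that the mean-value map is the voxel-wise arithmetic mean of the selected atlas volumes and that all volumes share the same shape; the function names and the toy 2 × 2 × 2 annotation values are hypothetical, and nested lists stand in for the arrays a real pipeline would load from atlas files.

```python
def merge_atlases(atlases):
    """Voxel-wise mean of n atlas annotation volumes of identical shape."""
    n = len(atlases)
    first = atlases[0]
    return [
        [[sum(a[i][j][k] for a in atlases) / n
          for k in range(len(first[0][0]))]
         for j in range(len(first[0]))]
        for i in range(len(first))
    ]


def binarize(merged):
    """Equation 1: a voxel becomes 1 if its merged value is > 0, else 0."""
    return [[[1 if v > 0 else 0 for v in row] for row in plane]
            for plane in merged]


# Two toy atlases annotating the same ROI with different voxel content.
atlas_a = [[[0, 50], [0, 0]], [[0, 0], [0, 100]]]
atlas_b = [[[0, 0], [0, 0]], [[20, 0], [0, 100]]]

merged = merge_atlases([atlas_a, atlas_b])
mask = binarize(merged)
```

Since any positive mean is mapped to 1, the binarized mask is the union of the voxels annotated by at least one atlas, which is how the fusion compensates for differences in voxel content between atlases.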

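[Step-2] Atlas Volume Bounding. The volume ($\mathbf{M}_{whd}$) is traversed along each plane in ascending order to obtain the initial slice numbers ($x_{\alpha}$, $y_{\alpha}$, and $z_{\alpha}$) and in descending order to obtain the final slice numbers ($x_{\omega}$, $y_{\omega}$, and $z_{\omega}$) for the sagittal, coronal, and axial planes, respectively.

The slice number ($i$) of each image ($\mathbf{M}_{i}$) with a mean greater than zero is included in the boundaries list, $\mathbf{A}$ for initial slices and $\boldsymbol{\Omega}$ for final slices, using Equation 2:

$$
\mathbf{A}_{n+1} = i,\ \forall \{ i \in \mathbb{N},\ 0 \le i < d \}, \quad \text{and} \quad
\boldsymbol{\Omega}_{m+1} = i,\ \forall \{ i \in \mathbb{N},\ d > i \ge 0 \}, \quad
\text{if } \frac{\sum_{j=1}^{w} \sum_{k=1}^{h} \mathbf{M}_{ijk}}{w \times h} > 0, \quad
\text{with } x_{\alpha} = \mathbf{A}_{1} \text{ and } x_{\omega} = \boldsymbol{\Omega}_{1},
\tag{2}
$$

where $d$ is the number of slice images for a particular plane, $n$ and $m$ are the numbers of elements in the $\mathbf{A}$ and $\boldsymbol{\Omega}$ lists, $x_{\alpha}$ is the initial slice number, and $x_{\omega}$ is the final slice number.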
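A minimal sketch of the bounding step is given below, assuming the binarized volume is indexed slice-first along the plane being traversed; the name `slice_bounds` is hypothetical. The ascending and descending traversals collect the slices whose mean is greater than zero, and the first element of each list gives the initial and final slice numbers.

```python
def slice_bounds(mask):
    """Return (initial, final) slice numbers of slices containing the ROI.

    `mask` is a binarized volume indexed as mask[slice][row][column].
    Returns None when no slice contains ROI voxels.
    """
    def has_roi(plane):
        values = [v for row in plane for v in row]
        return sum(values) / len(values) > 0  # slice mean greater than zero

    # Ascending traversal: candidates for the initial slice.
    ascending = [i for i in range(len(mask)) if has_roi(mask[i])]
    # Descending traversal: candidates for the final slice.
    descending = [i for i in range(len(mask) - 1, -1, -1) if has_roi(mask[i])]

    if not ascending:
        return None
    return ascending[0], descending[0]


# Toy mask with four 2x2 slices; only slices 1 and 2 contain ROI voxels.
mask = [
    [[0, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[1, 1], [0, 0]],
    [[0, 0], [0, 0]],
]
bounds = slice_bounds(mask)
```

To obtain the bounds for the other two anatomical planes, the same function would be applied after re-indexing the volume so that the desired axis comes first.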

3.2.2 Image Preprocessing

The raw volumes undergo preprocessing, involving

skull stripping and registration. The skull-stripped

dataset volumes are then registered to the MNI152 T1

template MRI scan, ensuring uniformity in shape, po-

sition, and alignment. The resulting scans have nor-

malized intensity, dimensions of 182 × 218 × 182, and

a resolution of 1 mm.

3.2.3 Dataset Builder

[Step-1] Subject - Volume Dataset Building. The

volumes for each subject are arranged in chronologi-

cal order based on their visit dates, and the ﬁlename

of the T1-weighted MRI volume from the last visit is

incorporated into the subjects’ dataset. This approach

guarantees the inclusion of only one volume per sub-

ject in the dataset. The subject dataset is balanced

using a simple random sampling, including the same

(k) number of subjects per class, where (k) is less than

or equal to the number of samples from the minority

class, thus avoiding class imbalance problems.
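The subject-volume dataset construction can be sketched as follows. The record fields (`subject`, `visit`, `file`, `label`) and function names are hypothetical, chosen only to illustrate keeping the last-visit volume per subject and then drawing the same number `k` of subjects per class by simple random sampling.

```python
import random
from datetime import date


def last_visit_per_subject(records):
    """Keep only the most recent volume record for each subject."""
    latest = {}
    for rec in records:
        current = latest.get(rec["subject"])
        if current is None or rec["visit"] > current["visit"]:
            latest[rec["subject"]] = rec
    return list(latest.values())


def balance_classes(subjects, k, seed=0):
    """Draw k subjects per class by simple random sampling."""
    by_label = {}
    for s in subjects:
        by_label.setdefault(s["label"], []).append(s)
    if k > min(len(group) for group in by_label.values()):
        raise ValueError("k must not exceed the minority class size")
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], k))
    return balanced


records = [
    {"subject": "s1", "visit": date(2010, 1, 5), "file": "s1_a.nii", "label": "CN"},
    {"subject": "s1", "visit": date(2012, 3, 1), "file": "s1_b.nii", "label": "CN"},
    {"subject": "s2", "visit": date(2011, 6, 9), "file": "s2_a.nii", "label": "AD"},
    {"subject": "s3", "visit": date(2011, 7, 1), "file": "s3_a.nii", "label": "CN"},
]
subjects = last_visit_per_subject(records)
balanced = balance_classes(subjects, k=1)
```

Fixing the random seed makes the sampled subject list reproducible across runs.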

[Step-2] Dataset Splitting. The dataset-splitting

process ensures reproducible testing and prevents data

leakage by ﬁrst splitting the subject dataset to create

independent training, validation, and test sets. The

datasets are split randomly, ensuring that an MRI vol-

ume per subject is included in only one distribution

(training, validation, or test).
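A subject-level split of this kind might look like the sketch below. It is an assumption-laden illustration: the function name, the seed, and the subject identifiers are hypothetical, while the set sizes (300, 60, and 60 out of 420 subjects) follow the figures reported in the experimental setup. Splitting the list of subject IDs before any slice is extracted guarantees that all slices of a subject end up in exactly one set.

```python
import random


def split_subjects(subject_ids, n_val, n_test, seed=42):
    """Shuffle subject IDs once and cut them into disjoint train/val/test lists."""
    ids = sorted(set(subject_ids))
    if n_val + n_test >= len(ids):
        raise ValueError("validation and test sets leave no training subjects")
    random.Random(seed).shuffle(ids)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test


subject_ids = [f"subj_{i:03d}" for i in range(420)]
train, val, test = split_subjects(subject_ids, n_val=60, n_test=60)
```

Slice-level instance datasets would then be generated separately from each of the three subject lists.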

[Step-3] Dataset Metadata. Metadata information

is generated for each slice containing the ROI, cat-

egorized by plane and hemisphere. This metadata

includes details such as the group (ADNI, AIBL, or

OASIS), volume ﬁlename, label, and the position of

the ROI’s center. The ROI position is represented as

a voxel (x, y, z), which includes the slice number (z)

and the centroid position (x, y). This centroid position

is crucial in cropping the 2D slice image to extract the

most informative content from the ROI.

The process involves several steps, including the creation of a list of slice numbers containing the ROI ($\mathbf{S}$), the generation of lists of pixel positions along the x and y axes, and the determination of the centroid position (x, y) based on the mode. The ROI center position ($\Phi_{i}$, $\Lambda_{i}$) is calculated for each slice ($i$) included in the ROI slice list $\mathbf{S}$, using a mode-based approach. The function $f(\chi)$ obtains the mode from the $\mathbf{X}$ and $\boldsymbol{\Upsilon}$ lists of pixel positions, as follows:

$$
\Phi_{i} = f(\mathbf{X}_{i}), \quad \Lambda_{i} = f(\boldsymbol{\Upsilon}_{i}), \quad \forall \{ i \in \mathbb{N},\ i \in \mathbf{S} \}
\tag{3}
$$

where the function $f(\chi)$ returns the mode of the list $\chi$, and ($\Phi_{i}$, $\Lambda_{i}$) are the x and y mode values for each slice in the ROI slice list ($\mathbf{S}$).
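The mode-based centroid and the subsequent crop can be sketched as below. The helper names are hypothetical, and the clamping of the crop window to the image borders is an assumption made so that the example always returns a full-size patch; the source does not state how borders are handled.

```python
from statistics import mode


def roi_center(mask_slice):
    """Mode of the x and y positions of the ROI pixels in one 2D mask slice."""
    xs, ys = [], []
    for y, row in enumerate(mask_slice):
        for x, value in enumerate(row):
            if value > 0:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # slice does not contain the ROI
    # With several equally frequent values, statistics.mode (Python 3.8+)
    # returns the first one encountered.
    return mode(xs), mode(ys)


def crop_around(image, center, size):
    """Crop a size x size window around `center`, clamped to the image borders."""
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    cx, cy = center
    half = size // 2
    x0 = min(max(cx - half, 0), width - size)
    y0 = min(max(cy - half, 0), height - size)
    return [row[x0:x0 + size] for row in image[y0:y0 + size]]


mask_slice = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
image = [[10 * y + x for x in range(5)] for y in range(5)]

center = roi_center(mask_slice)
patch = crop_around(image, center, size=3)
```

In the framework the crop size would be the 32 × 32 window used for the ROI images rather than the 3 × 3 window of this toy example.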

3.2.4 Data Generator

This component prepares training data for batch load-

ing by extracting metadata from the instance dataset.

It uses the image preprocessor to create an instance

image batch in memory based on speciﬁed output

preferences. The dataset batch size is limited by com-

putational resources like GPU and memory.
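One way to picture the data generator is a Python generator that walks the instance metadata and materializes one batch at a time. The sketch below is illustrative only: `batch_generator`, the metadata fields, and the stand-in loader are hypothetical, and a real implementation would call the image preprocessor to read and crop each slice.

```python
def batch_generator(metadata, load_instance, batch_size):
    """Yield (images, labels) batches so only one batch is held in memory."""
    for start in range(0, len(metadata), batch_size):
        chunk = metadata[start:start + batch_size]
        images = [load_instance(entry) for entry in chunk]
        labels = [entry["label"] for entry in chunk]
        yield images, labels


def fake_loader(entry):
    """Stand-in for the image preprocessor: returns a blank 32x32 image."""
    return [[0] * 32 for _ in range(32)]


# Hypothetical metadata rows: one per ROI slice instance.
metadata = [{"file": f"vol_{i}.nii", "slice": 40 + i, "label": i % 2}
            for i in range(25)]

batches = list(batch_generator(metadata, fake_loader, batch_size=10))
```

The last batch is smaller when the number of instances is not a multiple of the batch size.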

3.2.5 Model Builder

This component builds diverse Transformer classi-

ﬁcation models using datasets that include ROIs,

planes, and hemispheres. This task also includes eval-

uating model performance, as follows:

Swin Transformers. The 2D Swin Transformer

(Liu et al., 2021) serves as a weak learner, boosting

training speed and diagnostic accuracy. It augments

the number of instances and facilitates the use of pre-

trained models through transfer learning.

Model Performance Evaluation. The Transformer

model’s performance in Alzheimer’s case classiﬁ-

cation is evaluated through average accuracy and

its standard deviation. Assessment is conducted at

the subject-patient level by combining classiﬁcations

from a subject’s slice level using majority voting.
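The subject-level evaluation by majority voting can be sketched as follows. The function names and the toy predictions are hypothetical; ties are resolved in favour of the label seen first, which is an assumption since the source does not specify a tie-breaking rule.

```python
from collections import Counter, defaultdict


def subject_level_predictions(slice_predictions):
    """Majority vote over the slice-level labels of each subject."""
    votes = defaultdict(list)
    for subject_id, label in slice_predictions:
        votes[subject_id].append(label)
    # most_common(1) returns the first-seen label when counts are tied.
    return {s: Counter(labels).most_common(1)[0][0]
            for s, labels in votes.items()}


def accuracy(predicted, truth):
    """Fraction of subjects whose voted label matches the true label."""
    correct = sum(predicted[s] == truth[s] for s in truth)
    return correct / len(truth)


slice_predictions = [
    ("s1", "AD"), ("s1", "AD"), ("s1", "CN"),
    ("s2", "CN"), ("s2", "CN"), ("s2", "AD"),
]
truth = {"s1": "AD", "s2": "AD"}

predicted = subject_level_predictions(slice_predictions)
subject_accuracy = accuracy(predicted, truth)
```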

3.2.6 Weighted Ensemble Builder

This component selects the highest-accuracy single

base classiﬁers (2D Swin Transformer), trained on di-

verse datasets, to form a homogeneous weighted en-

semble model. The goal is to improve performance,

robustness, and reliability in AD classiﬁcation.
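A weighted combination of member predictions could be implemented along the lines below. This is a sketch under stated assumptions: each member contributes a vote weighted by its accuracy normalized over all members, and the class with the largest total weight wins. The function name and the toy accuracies are hypothetical, and the source does not spell out the exact combination rule.

```python
def weighted_ensemble_predict(member_predictions, member_accuracies):
    """Combine per-member subject labels using accuracy-proportional weights."""
    total = sum(member_accuracies)
    weights = [a / total for a in member_accuracies]
    final = {}
    for subject in member_predictions[0]:
        scores = {}
        for predictions, weight in zip(member_predictions, weights):
            label = predictions[subject]
            scores[label] = scores.get(label, 0.0) + weight
        final[subject] = max(scores, key=scores.get)
    return final


# Three toy members; the most accurate one disagrees with the other two on s1.
member_predictions = [
    {"s1": "AD", "s2": "CN"},
    {"s1": "CN", "s2": "CN"},
    {"s1": "CN", "s2": "AD"},
]
member_accuracies = [0.9, 0.5, 0.3]

ensemble = weighted_ensemble_predict(member_predictions, member_accuracies)
```

In this toy case an unweighted majority vote would label `s1` as CN, whereas the weighted vote follows the more accurate member.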

4 EXPERIMENTAL SETUP

The proposed Multiple ROI and Multiple Atlas-Based

Instance Selection method is rigorously examined

through four experiments.

4.1 Instance Selection

This experiment compares our framework to existing

instance selection methods, using a single base classi-

ﬁer (Swin Transformer). The 2D slice image datasets

are uniformly derived from the same volumes.

[Method 1] - Percentile Fixed Number. Following

(Altay et al., 2021) and (Hu et al., 2023), 96 MRI

slices from the middle of all anatomical planes, pre-

cisely at the 50th percentile, are chosen to evaluate the

instance selection method.
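The percentile baseline can be illustrated with a small helper that returns a fixed number of consecutive slice indices centered at a percentile position of the plane. The function name and the integer rounding of the center are assumptions; the slice counts 182 and 218 correspond to the registered volume dimensions given in the preprocessing description.

```python
def centered_slice_range(n_slices, n_selected, percentile=50):
    """Indices of n_selected consecutive slices centered at a percentile position."""
    if n_selected > n_slices:
        raise ValueError("cannot select more slices than the plane contains")
    center = (n_slices - 1) * percentile // 100
    start = center - n_selected // 2
    # Keep the window inside the volume.
    start = min(max(start, 0), n_slices - n_selected)
    return list(range(start, start + n_selected))


# 96 slices around the middle of a 182-slice plane and a 218-slice plane.
indices_182 = centered_slice_range(n_slices=182, n_selected=96)
indices_218 = centered_slice_range(n_slices=218, n_selected=96)
```

The window position depends only on the slice count, not on where the ROI lies, which is the key difference from the atlas-based selection of Method 2.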

[Method 2] - Multiple ROI and Multiple

Atlas-Based Instance Selection (Our Proposal).

Atlas annotations were amalgamated to form merged

ROIs. The position and the number of slice instances

(n) for this experiment vary based on the ROI,

anatomical plane, and hemisphere.

4.2 Regions of Interest Datasets

Diverse perspectives are attained by consolidating

information from various sources, including ROIs,

brain hemispheres, and anatomical planes. This ex-

periment aims to identify the most informative ROI

datasets using the proposed framework.

4.3 Weighted Ensemble

Diverse instance datasets, combining data from var-

ious sources like ROIs, brain hemispheres, and

anatomical planes, train a single 2D Swin Trans-

former classiﬁer for enhanced diversity. The weighted

ensemble model in this experiment selects the most

accurate and diverse models, employing a weighted

approach for ﬁnal classiﬁcation based on each mem-

ber’s performance.

4.4 Performance Comparison

This experiment compares the model performance

obtained using the proposed Multiple ROI and

Multiple Atlas-based Instance Selection framework

with that of state-of-the-art methods. The related

works analyzed in this experiment use diverse

datasets (ADNI, AIBL, OASIS), input types (2D

and 3D), model architectures (Vision Transformer,

Swin Transformer, and mixed models combining

Convolutional Neural Networks with Transformers),

ROIs, and instance selection techniques (Cropping,

ROI extraction).

All experiments are reported at the subject level. The volumetric dataset comprises 420 subjects, divided into approximately 70% (300) for training, 15% (60) for validation, and 15% (60) for testing. This subject-volume dataset was randomly selected and partitioned beforehand. The instance datasets exclusively contain slices of specific ROIs, planes, and hemispheres. Cropped ROI images consistently have a size of 32 × 32 × 3 (width, height, channels). The number of slices per subject varies depending on the specific ROI, plane, and hemisphere.

A single base classifier model, trained on various datasets, is utilized to evaluate their impact on the proposed framework. The 2D Swin Transformer (Liu et al., 2021) was selected for its effectiveness in image classification.

Hyperparameter optimization was conducted us-

ing Hyperband. The values employed to train the pro-

posed framework are as follows: Optimizer Name:

Adam, Learning Rate: 1e − 04, Clip Value Rate: 0.5,

Dropout: 0.15, Batch Size: 10, and Epochs: 100.

Python libraries NiBabel, TorchIO, PIL, and

NumPy preprocess the images. The FreeSurfer tools

are used for skull stripping and MRI registration us-

ing the MNI152 template. The Keras library is used

to build the classiﬁcation models. All experiments are

repeated three times. We carry out the experiments

using ten workstations with an Intel Core i9 9900K

processor, 32 GB RAM, and 11 GB NVIDIA RTX

2080Ti GPU.

5 RESULTS AND DISCUSSION

Four experiments are conducted to test the pro-

posed framework: I) Compares state-of-the-art in-

stance selection techniques with the proposed Mul-

tiple ROI and Multiple Atlas-Based Instance Selec-

tion; II) Evaluates the effect of diverse datasets trained

with the same base classiﬁer; III) Tests the weighted

ensemble model, combining a single base classiﬁer

trained on multiple-view datasets; and IV) Compares

the models’ performance of the proposed method

with state-of-the-art related works. The presented ex-

perimental results correspond to the model accuracy

mean.

5.1 Instance Selection

Since selecting the most informative slices from the

original dataset may improve the overall performance

of the prediction model (Khan et al., 2019), this exper-

iment compares the proposed instance selection based

on a multiple region of interest and multiple atlas with

techniques based on percentiles, as shown in Table 2.

Table 2: Accuracy summary from different instance selection techniques, using the same subject-volume dataset (ALL-420). Multi-ROI-Atlas is our proposal.

Technique                               Sagittal %      Coronal %       Axial %
Percentile                              92.99 ± 0.96    91.43 ± 0.79    91.94 ± 0.79
Multi-ROI-Atlas, Hippocampus (Right)    96.67 ± 0.00    94.44 ± 0.79    95.56 ± 0.79

Table 2 shows that all instance selection tech-

niques achieve the highest average accuracy values

for the sagittal plane, capturing the most critical in-

formation about the regions affected by AD. On the

other hand, the average accuracy values for each tech-

nique show that our proposed Multiple ROI and Mul-

tiple Atlas-Based Instance Selection technique en-

sures higher accuracy for all three planes (sagittal,

coronal, and axial).

5.2 Region of Interest Datasets

In homogeneous ensembles, the main difﬁculty is

generating diversity, despite using the same learning

algorithm (Sabzevari et al., 2022). This work experi-

mentally evaluates different MRI datasets trained us-

ing a single base classiﬁer.

Table 3 demonstrates that the most informative ROIs are as follows: a) The Parietal Lobe and the Hippocampus exhibit the highest mean accuracies, with values of 94.63% and 94.35%, respectively. This is attributed to these ROIs capturing crucial information from regions affected by AD. b) The right hemisphere (93.70%) and the left hemisphere (92.81%) display a marginal difference of 0.89 percentage points in their mean accuracy across all planes. This disparity may reflect structural asymmetry between the left and right brain hemispheres. c) The right sagittal plane achieves the highest mean accuracy at 94.26%, which can be attributed to the sagittal plane effectively capturing critical information from regions affected by AD.

These multiple-view dataset models provide vari-

ety to build the weighted ensemble model.

Table 3: Summary of accuracy (%) from different ROIs, planes, and hemispheres.

                     Left                            Right
ROI                  Sagittal   Coronal   Axial      Sagittal   Coronal   Axial      Mean
Entorhinal Cortex    92.78      93.89     95.00      93.89      90.00     95.00      93.43
Fornix               94.44      96.11     93.33      93.33      93.89     92.78      93.98
Frontal Lobe         91.11      93.33     91.67      91.11      90.56     91.67      91.58
Hippocampus          91.67      93.89     93.89      96.67      94.44     95.56      94.35
Parietal Lobe        95.56      97.22     91.67      95.56      95.56     92.22      94.63
Temporal Lobe        88.33      88.33     88.33      95.00      94.44     95.00      91.57
Mean                 92.32      93.80     92.32      94.26      93.15     93.71      93.26

5.3 Weighted Ensemble

The optimal ensemble composition is problem-

dependent, and determining the number of classiﬁers

for each type remains an open question (Sabzevari

et al., 2022). Models from the previous experiments

(II) with the highest accuracy and variety are used to

create a weighted ensemble, as shown in Table 4.

The contribution of each ensemble member is

weighted proportionally to the member’s performance

to obtain the ﬁnal classiﬁcation, creating a weighted

ensemble.
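As a worked check, and assuming each weight is simply the member's accuracy normalized by the sum of the accuracies of all twelve members (an interpretation that reproduces the weights listed in Table 4), the weight of member $m$ with accuracy $a_m$ is:

$$
w_m = \frac{a_m}{\sum_{j=1}^{12} a_j} \times 100\%, \qquad
\sum_{j=1}^{12} a_j = 11 \times 96.67 + 98.33 = 1161.70
$$

$$
\frac{96.67}{1161.70} \times 100\% \approx 8.32\%, \qquad
\frac{98.33}{1161.70} \times 100\% \approx 8.46\%
$$

The eleven members with 96.67% accuracy therefore receive a weight of about 8.32% each, and the single member with 98.33% accuracy receives about 8.46%.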

Finally, Table 4 shows that the weighted ensemble reaches an accuracy of 98.33%, higher than the 96.67% obtained by eleven of its twelve members and equal to that of the best individual member (Parietal Lobe, coronal plane, left hemisphere).

5.4 Performance Comparison

Table 5 presents a comparison of the proposed method

with the state-of-the-art related works in terms of CN

versus AD classiﬁcation performance.

The experimental results indicate that the proposed Multiple ROI and Multiple Atlas-based Instance Selection method, using a weighted ensemble (98.33%) and a single base classifier such as the 2D Swin Transformer, slightly outperforms the state-of-the-art instance selection methods regarding overall results. This behavior can be attributed to the careful assembly of the subject and slice distribution sets, the selection of the most significant slice instances, and the extraction of the most informative content from the ROIs.

Table 4: Summary of accuracy from different model members of the weighted ensemble.

Ensemble model members accuracy
Model (ROI)          Plane      Hemisphere   Accuracy %   Weight %
Hippocampus          Sagittal   Right        96.67        8.32
Parietal Lobe        Sagittal   Right        96.67        8.32
Temporal Lobe        Sagittal   Left         96.67        8.32
Temporal Lobe        Sagittal   Right        96.67        8.32
Entorhinal Cortex    Coronal    Left         96.67        8.32
Fornix               Coronal    Left         96.67        8.32
Hippocampus          Coronal    Left         96.67        8.32
Parietal Lobe        Coronal    Left         98.33        8.46
Parietal Lobe        Coronal    Right        96.67        8.32
Entorhinal Cortex    Axial      Right        96.67        8.32
Hippocampus          Axial      Right        96.67        8.32
Temporal Lobe        Axial      Right        96.67        8.32
Ensemble                                     98.33

Table 5: Performance comparison of the proposed method with other related works for the classification tasks (AD vs. CN).

Author                            Model                    Dataset            Accuracy (%)
(Lyu et al., 2022)                Vision Transformer       ADNI               96.80
(Hu et al., 2023)                 2D CNN+Transformer       ADNI, OASIS        93.56
(Altay et al., 2021)              Vision Transformer       OASIS              91.18
(Xin et al., 2023)                CNN+Swin-Transformer     ADNI, AIBL         93.90
(Huang and Li, 2023)              Swin Transformer         ADNI+AIBL          94.05
(Mora-Rubio et al., 2023)         Vision Transformer       ADNI+OASIS         89.02
Hybrid Ensemble (Our Proposal)    Swin Transformer         ADNI+AIBL+OASIS    98.33

6 CONCLUSIONS AND FUTURE WORK

This work introduces a novel framework for strate-

gically identifying and selecting the most informa-

tive 2D image slices based on the fusion of multiple

regions of interest annotations from multiple atlases

(Multiple ROI and Multiple Atlas-based Instance Se-

lection).

The proposed framework’s impact on

Transformer-based classiﬁcation models is ex-

perimentally explored. The performance of the 2D

Swin Transformer model varies with the dataset

(region of interest, plane, and hemisphere), and

using 2D slices increases instances, allowing for

training with transfer learning or from scratch. The

classiﬁcations obtained at the slice level are fused to

obtain a classiﬁcation at the subject level. Finally,

the weighted ensemble improves the classiﬁcation

model performance and reliability by combining

homogeneous ensemble methods.

For future work, researchers should consider us-

ing multiple inputs, mixed data, and 3D transformer

model ensembles based on multiple ROIs to enhance

the classiﬁcation model performance and reliability.

REFERENCES

ADNI (2023). Alzheimer’s Disease Neuroimaging Initia-

tive. http://adni.loni.usc.edu.

AIBL (2023). Australian Imaging, Biomarker & Lifestyle

Flagship Study of Ageing. https://aibl.csiro.au.

Altay, F., Sánchez, G. R., James, Y., Faraone, S. V., Velipasalar, S., and Salekin, A. (2021). Preclinical stage Alzheimer’s disease detection using magnetic resonance image scans. Proceedings of the AAAI Conference on Artificial Intelligence, 35(17):15088–15097.

Bae, J. B., Lee, S., Jung, W., Park, S., Kim, W., Oh, H., Han,

J. W., Kim, G. E., Kim, J. S., Kim, J. H., and Kim,

K. W. (2020). Identiﬁcation of Alzheimer’s disease

using a convolutional neural network model based on

T1-weighted magnetic resonance imaging. Scientiﬁc

Reports, 10(1):1–10.

Castro-Silva, J. A., Moreno-García, M. N., Guachi-Guachi, L., and Peluffo-Ordóñez, D. H. (2022). Instance selection on CNNs for Alzheimer’s disease classification from MRI. In Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods - ICPRAM, pages 330–337. INSTICC, SciTePress.

Choi, J. Y. and Lee, B. (2020). Combining of multiple deep networks via ensemble generalization loss, based on MRI images, for Alzheimer’s disease classification. IEEE Signal Processing Letters, 27:206–210.

FSL (2023). Templates and Atlases included with FSL. https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/Atlases.

Hu, Z., Li, Y., Wang, Z., Zhang, S., and Hou, W. (2023). Conv-swinformer: Integration of CNN and shift window attention for Alzheimer’s disease classification. Computers in Biology and Medicine, 164.

Huang, Y. and Li, W. (2023). Resizer swin transformer-based classification using sMRI for Alzheimer’s disease. Applied Sciences (Switzerland), 13.

Khan, N. M., Abraham, N., and Hon, M. (2019). Transfer Learning with Intelligent Training Data Selection for Prediction of Alzheimer’s Disease. IEEE Access, 7:72726–72735.

Li, C., Cui, Y., Luo, N., Liu, Y., Bourgeat, P., Fripp, J., and Jiang, T. (2022). Trans-ResNet: Integrating transformers and CNNs for Alzheimer’s disease classification. In 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), pages 1–5.

Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021). Swin transformer: Hierarchical vision transformer using shifted windows. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 9992–10002.

Lyu, Y., Yu, X., Zhu, D., and Zhang, L. (2022). Classification of Alzheimer’s disease via vision transformer. ACM International Conference Proceeding Series, pages 463–468.

Mora-Rubio, A., Bravo-Ortíz, M. A., Arredondo, S. Q., Torres, J. M. S., Ruz, G. A., and Tabares-Soto, R. (2023). Classification of Alzheimer’s disease stages from magnetic resonance images using deep learning. PeerJ Computer Science, 9.

OASIS (2023). Open Access Series of Imaging Studies. http://www.oasis-brains.org.

Pan, D., Luo, G., Zeng, A., Zou, C., Liang, H., Wang, J., Zhang, T., Yang, B., and the Alzheimer’s Disease Neuroimaging Initiative (2022). Adaptive 3DCNN-based interpretable ensemble model for early diagnosis of Alzheimer’s disease. IEEE Transactions on Computational Social Systems, pages 1–20.

Qin, Z., Liu, Z., Guo, Q., and Zhu, P. (2022). 3D convolutional neural networks with hybrid attention mechanism for early diagnosis of Alzheimer’s disease. Biomedical Signal Processing and Control, 77:103828.

Sabzevari, M., Martínez-Muñoz, G., and Suárez, A. (2022). Building heterogeneous ensembles by pooling homogeneous ensembles. International Journal of Machine Learning and Cybernetics, 13(2):551–558.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need.

Wen, J., Thibeau-Sutre, E., Diaz-Melo, M., Samper-González, J., Routier, A., Bottani, S., Dormont, D., Durrleman, S., Burgos, N., and Colliot, O. (2020). Overview of classification of Alzheimer’s disease. Medical Image Analysis, 63.

WHO (2023). World Health Organization. https://www.who.int/news-room/fact-sheets/detail/dementia.

Xin, J., Wang, A., Guo, R., Liu, W., and Tang, X. (2023). CNN and swin-transformer based efficient model for Alzheimer’s disease diagnosis with sMRI. Biomedical Signal Processing and Control, 86.

Young, S., Abdou, T., and Bener, A. (2018). Deep super learner: A deep ensemble for classification problems. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10832 LNAI:84–95.

Yu, J. and Lee, T. M. (2020). Verbal memory and hippocampal volume predict subsequent fornix microstructure in those at risk for Alzheimer’s disease. Brain Imaging and Behavior, 14:2311–2322.

Zaabi, M., Smaoui, N., Derbel, H., and Hariri, W. (2020). Alzheimer’s disease detection using convolutional neural networks and transfer learning based methods. In 2020 17th International Multi-Conference on Systems, Signals & Devices (SSD), pages 939–943.

Zhang, Z. and Khalvati, F. (2022). Introducing vision transformer for Alzheimer’s disease classification task with 3D input.

ICPRAM 2024 - 13th International Conference on Pattern Recognition Applications and Methods
