
5 CONCLUSION
In summary, this work demonstrates the effectiveness of combining multi-view clustering with ensemble classification for analyzing high-dimensional gene expression data, specifically for Colon Tumor classification. The best results were obtained with a configuration of 1000 clusters and 200 features per view. Under this configuration, both the Weighted Voting and Majority Voting ensemble methods achieved an accuracy of 84.21%, with a precision of 0.88, a recall of 0.84, and an F1-score of 0.83. These results indicate that aggregating predictions from diverse classifiers through ensemble techniques improves performance, which makes the approach well suited to biomedical applications.
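The difference between the two voting schemes named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the class labels, and the per-classifier weights are hypothetical, and this section does not state how the weights were derived (validation accuracy per view is one common choice).

```python
from collections import defaultdict


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers.

    Ties are broken in favor of the label that appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    counts = defaultdict(int)
    for label in predictions:
        counts[label] += 1
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    `weights[i]` is the (assumed) reliability of the classifier that
    produced `predictions[i]`. Ties are broken as in `majority_vote`.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    best = max(scores.values())
    for label in predictions:
        if scores[label] == best:
            return label


# Hypothetical predictions from three per-view classifiers for one sample.
view_predictions = ["tumor", "normal", "normal"]
# Hypothetical weights; the first classifier is assumed to be the most reliable.
view_weights = [0.9, 0.3, 0.4]

majority_label = majority_vote(view_predictions)
weighted_label = weighted_vote(view_predictions, view_weights)
```

In this made-up case the two schemes disagree: majority voting follows the two classifiers that agree, while weighted voting follows the single classifier with the largest weight. When all weights are equal, weighted voting reduces to majority voting, which is one situation in which the two methods give the same predictions.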
This research provides a framework for handling high-dimensional datasets, using clustering and ensemble methods to reduce feature redundancy while retaining meaningful patterns. In future work, the methodology will be extended to datasets beyond Colon Tumor to test its generalizability and adaptability across different biological and biomedical problems. Alternative clustering techniques and ensemble strategies may further improve the classification accuracy and computational efficiency of the approach. These directions would strengthen the potential of machine learning to advance precision medicine and bioinformatics.
A Novel Multi-View Partitioning and Ensembled-Based Cancer Classification Using Gene Expression Data