approach, covering data preprocessing, feature
selection, and model development. Section 4 analyzes
the experimental results and compares classifier
performance, and Section 5 wraps up the study with
key insights and directions for future research.
2 RELATED WORKS
Nozad Bonab et al. proposed SSL-WSC, a semi-
supervised method for categorizing web services
using service performance metrics. Their approach
used self-training, integrating both labeled and
unlabeled data to enhance the accuracy of
categorization. On the QWS dataset, the
proposed method achieved improvements in F1-score
(11.26%), accuracy (9.43%), and precision (9.53%)
compared to conventional supervised learning
techniques. By dynamically selecting and pseudo-
labeling unlabeled data, SSL-WSC reduced reliance
on manually labeled datasets and improved
scalability. Crasso et al. developed the Automated
Web Service Classification (AWSC) framework,
which leverages machine learning and text mining to
enhance web service discovery. Their research
showed that Support Vector Machine (SVM) and
Naïve Bayes classifiers effectively categorized
services based on semantic descriptions, improving
retrieval precision and classification accuracy.
Shafiq et al. proposed a hybrid classification
model that combined lightweight semantics with a
Bayesian classifier to enhance web service discovery.
Their approach adaptively categorized web services
using non-functional attributes, leading to fewer
misclassification errors and improved retrieval
accuracy. Wong and Liu applied text mining methods
to generate feature vector representations of web
services, which were then clustered based on
similarity measures.
Wang et al. developed a hierarchical classification
model based on the standardized coding framework
used for categorizing products and services globally.
Their framework used SVM to categorize services
within a multi-level tree
structure, improving classification precision and
reducing misclassification errors. Chipa et al.
examined various supervised learning approaches
that utilize pattern recognition and statistical analysis
to classify web services effectively. Their findings
highlighted the effectiveness of these classifiers in
accurately categorizing services based on QoS
metrics, enabling better service ranking and selection.
El-Sayyad et al. proposed a semantic similarity-based
classification algorithm utilizing domain ontology to
improve service categorization. Their method
reduced ambiguity in service descriptions and
significantly improved classification accuracy by
considering contextual relationships between
services.
Li et al. developed a Graph Convolutional Neural
Network (GCN) using residual learning and an
attention mechanism for web service classification.
Their approach dynamically assigned weights to
features, enhancing classification accuracy in large-
scale web service environments. Kamath et al.
proposed a crawler-based system that automatically
labeled web services based on similarity analysis
techniques. Their method optimized search efficiency
and classification precision using machine learning-
based hierarchical clustering. Moreno-Vallejo et al.
leveraged Artificial Neural Networks (ANNs) for
detecting fraudulent and low-quality web services.
Their study demonstrated that deep learning models
could efficiently classify web services based on
behavioral patterns, highlighting the need for
continuous monitoring and adaptive classification
models.
3 METHODOLOGY
The proposed framework employs a machine
learning-driven approach to classify online services
according to performance-related attributes. It
integrates clustering techniques with supervised
learning models, including the Extra Trees Classifier,
Logistic Regression, SVM, K-Nearest Neighbors
(KNN), and Gaussian Naïve Bayes (GNB). By
applying clustering, the system assigns web services
to predefined quality categories, evaluating service
performance based on attributes such as response
time, availability, and reliability.
The system combines feature selection with
clustering-based pseudo-labeling to improve
classification accuracy and scalability. Pseudo-labeling
enables the model to process dynamic and unlabeled
data efficiently, supporting accurate classification
even as datasets evolve.
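The clustering-based pseudo-labeling idea can be illustrated with a minimal sketch. This is not the authors' implementation: the data values, the choice of two clusters, and the helper names `kmeans` and `knn_predict` are assumptions made for illustration, and the example uses only the Python standard library (in practice a library such as scikit-learn would typically supply both steps). Unlabeled QoS vectors are first clustered, the cluster indices are treated as pseudo-labels, and a simple KNN classifier then uses those pseudo-labels to categorize a new service.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(
        zip(train_x, train_y), key=lambda xy: math.dist(xy[0], query)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalized QoS vectors: (response time, availability).
services = [
    (0.10, 0.98), (0.15, 0.95), (0.12, 0.99), (0.08, 0.97),  # fast, available
    (0.85, 0.60), (0.90, 0.55), (0.80, 0.65), (0.95, 0.50),  # slow, less available
]

# Step 1: cluster the unlabeled services; cluster indices act as pseudo-labels.
centroids, pseudo_labels = kmeans(services, k=2)

# Step 2: use the pseudo-labeled set to classify a previously unseen service.
new_service = (0.11, 0.96)
predicted = knn_predict(services, pseudo_labels, new_service)
```

The cluster indices carry no meaning on their own. Mapping them to named quality categories (for example, by ranking centroids on response time) is a separate step that this sketch leaves out.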
To ensure reliable performance, the system applies
general preprocessing steps: handling missing data,
normalizing QoS metrics, and encoding categorical
features. These steps prepare the data for analysis
and improve the model's ability to generalize across
various web services and QoS scenarios.
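The three preprocessing steps can be sketched as follows. The specific choices here (mean imputation, min-max scaling, one-hot encoding) are common defaults assumed for illustration, since the text does not state which variants the system uses. The record fields, including the `protocol` column, and the function names are likewise hypothetical.

```python
from statistics import mean


def impute_mean(column):
    """Replace missing (None) entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def min_max(column):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def one_hot(column):
    """Encode a categorical column as 0/1 indicators over sorted categories."""
    categories = sorted(set(column))
    encoded = [[1 if v == c else 0 for c in categories] for v in column]
    return encoded, categories


def preprocess(records):
    """Impute, normalize, and encode; returns (feature rows, category order)."""
    rt = min_max(impute_mean([r["response_time"] for r in records]))
    av = min_max(impute_mean([r["availability"] for r in records]))
    proto, categories = one_hot([r["protocol"] for r in records])
    rows = [[a, b, *c] for a, b, c in zip(rt, av, proto)]
    return rows, categories


# Hypothetical records: response time (ms), availability (%), protocol.
records = [
    {"response_time": 100.0, "availability": 99.0, "protocol": "REST"},
    {"response_time": None, "availability": 95.0, "protocol": "SOAP"},
    {"response_time": 300.0, "availability": None, "protocol": "REST"},
]

rows, categories = preprocess(records)
```

Each output row holds the scaled response time, the scaled availability, and one indicator per protocol category, in the order given by `categories`.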
This comprehensive machine learning framework
provides an adaptive and scalable solution for