
defined categories like cars, buses, motorcycles, and
trucks. By counting the objects detected within a
frame, the system assesses and predicts traffic levels.
Based on predefined thresholds of object counts,
traffic can be classified as either heavy or light,
enabling traffic management systems to make real-time
decisions. The YOLO11 (Khanam and Hussain, 2024) model
needs to be trained on a well-curated dataset that
contains a wide range of traffic scenarios. It is
essential to capture images from diverse environments
to account for factors such as weather conditions,
lighting, and camera perspective. These factors affect
how objects appear in images, so training the model
on varied data helps improve its robustness.
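The count-and-threshold rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class names, the threshold value of 15 vehicles per frame, and the function name `classify_traffic` are assumptions chosen for the example, and the label list stands in for the output of a detector.

```python
from collections import Counter

# Assumed values for illustration; the paper does not state its thresholds.
VEHICLE_CLASSES = {"car", "bus", "motorcycle", "truck"}
HEAVY_THRESHOLD = 15  # hypothetical vehicles-per-frame cutoff


def classify_traffic(detected_labels, threshold=HEAVY_THRESHOLD):
    """Count vehicle detections in one frame and label the frame.

    Returns (per-class counts, "heavy" or "light").
    """
    # Ignore detections that are not in the vehicle categories.
    counts = Counter(lbl for lbl in detected_labels if lbl in VEHICLE_CLASSES)
    total = sum(counts.values())
    level = "heavy" if total >= threshold else "light"
    return counts, level


# Labels as a detector might emit them for a single frame.
frame_labels = ["car"] * 9 + ["bus"] * 2 + ["motorcycle"] * 5 + ["person"]
counts, level = classify_traffic(frame_labels)
print(dict(counts), level)
```

In practice the threshold would likely depend on the camera's field of view and the road's capacity, so a single global value is only a starting point.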
To increase dataset diversity and improve the model’s
adaptability to real-world scenarios, data augmentation
techniques such as flipping, rotating, and altering
brightness are crucial. These techniques enable models
to manage variations, such as differences between day
and night traffic conditions. Additionally, transfer
learning, which involves fine-tuning a pre-trained
model on a smaller, domain-specific dataset, helps
speed up training and enhances accuracy. This method
utilizes the generalized knowledge of pre-trained
models, tailoring it to address specific challenges,
such as India’s heterogeneous traffic conditions.
Customizing models for regional differences helps
ensure precise vehicle detection and classification in
intricate environments. By integrating these
strategies, systems like YOLO11 can deliver real-time
insights, facilitating smarter traffic management,
reducing congestion, and improving road safety in
urban settings (Chen, 2021; Howard, 2018).
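Two of the augmentations named above, horizontal flipping and brightness change, can be sketched in plain Python. This is an illustrative sketch under stated assumptions rather than the training pipeline used in the paper: images are represented as nested lists of grayscale pixel values, labels are assumed to be in the normalized (class_id, x_center, y_center, width, height) format, and all function names are hypothetical. The point it shows is that a geometric augmentation must transform the bounding boxes together with the pixels.

```python
def hflip_image(image):
    """Mirror a grayscale image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def hflip_boxes(boxes):
    """Mirror normalized (class_id, x_center, y_center, width, height) boxes.

    Only the x-center changes; width, height and y-center are preserved.
    """
    return [(c, 1.0 - xc, yc, w, h) for c, xc, yc, w, h in boxes]


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor`, clamping to the 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


# Tiny 2x3 "image" and one hypothetical label on its left side.
image = [[10, 50, 200],
         [0, 120, 250]]
boxes = [(0, 0.25, 0.5, 0.2, 0.4)]

flipped = hflip_image(image)
flipped_boxes = hflip_boxes(boxes)        # box moves to the right side
darker = adjust_brightness(image, 0.5)    # rough stand-in for dusk
brighter = adjust_brightness(image, 1.5)  # rough stand-in for glare
```

Brightness changes leave the labels untouched, whereas flips and rotations do not, which is why the two kinds of augmentation are usually handled separately.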
The paper is organized into five sections. The first
section presents the motivation behind the problem
statement and provides a detailed introduction. The
second section reviews the work previously carried out
in this field. The third section covers the
methodology, beginning with the data collection phase,
followed by an in-depth discussion of the model
architecture and model development, and concluding
with the evaluation and performance analysis. This
section also provides a detailed explanation of the
YOLO11 model architecture. The fourth section
showcases the results obtained from the model, while
the fifth section concludes the paper by summarizing
the key features that contributed to the improved
accuracy of vehicle detection.
2 LITERATURE SURVEY
Intelligent Transportation Systems (ITS) have
adopted various methods to detect and predict traffic
flow, aiming to enhance traffic management, reduce
congestion, and improve mobility (Qureshi and
Abdullah, 2013). One widely used method involves
edge computing combined with YOLOv4 for vehicle
detection and DeepSORT for tracking multiple
objects (Bin Zuraimi and Kamaru Zaman, 2021).
This approach processes video feeds locally at edge
nodes, minimizing delays and reducing dependence
on cloud-based systems. While it improves real-time
accuracy in vehicle detection and tracking, it faces
challenges such as occlusions, poor lighting conditions,
and environmental variability, particularly
in high-density traffic scenarios. Additionally, this
method primarily focuses on detecting vehicles and
lacks the capability to effectively predict traffic flow,
limiting its utility for congestion forecasting.
A recent study uses a combined CNN-LSTM deep learning
model to predict traffic flow, utilizing cellular
automata-based simulations to generate training
datasets. This approach addresses the issue of
insufficient real-world data for model training. The
CNN captures spatial patterns, while the LSTM detects
temporal trends, making it effective for short-term
traffic forecasting. However, this model is heavily
dependent on simulated data, which may not fully
reflect real-world conditions. Additionally, it does not
incorporate real-time vehicle detection, which is
essential for adaptive and dynamic traffic management
(Yang and Jerath, 2024).
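To make the idea of generating traffic data from a cellular automaton concrete, the sketch below uses Rule 184, a standard elementary cellular automaton for single-lane traffic. This is an assumed stand-in: the text above does not specify which automaton the cited study uses, and the function names here are hypothetical. Each cell is 1 if it holds a car and 0 otherwise, and a car advances one cell only when the cell ahead is empty.

```python
def rule184_step(road):
    """Advance a circular single-lane road by one time step (cars move right)."""
    n = len(road)
    nxt = [0] * n
    for i, occupied in enumerate(road):
        if occupied:
            ahead = (i + 1) % n
            if road[ahead]:
                nxt[i] = 1       # blocked by the car ahead: stay in place
            else:
                nxt[ahead] = 1   # free cell ahead: advance by one
    return nxt


def moving_cars(road):
    """Number of cars that will advance on the next step (a simple flow proxy)."""
    n = len(road)
    return sum(1 for i, c in enumerate(road) if c and not road[(i + 1) % n])


def simulate(road, steps):
    """Return the list of road states, starting with the initial one."""
    history = [road]
    for _ in range(steps):
        road = rule184_step(road)
        history.append(road)
    return history


initial = [1, 1, 0, 1, 0, 0, 0, 1]
history = simulate(initial, 2)
flows = [moving_cars(state) for state in history]
for state, f in zip(history, flows):
    print(state, "moving:", f)
```

Sequences of states and their flow values produced this way could serve as synthetic input-target pairs for a forecasting model, with the caveat already noted above that simulated data may not reflect real-world conditions.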
Studies have also investigated the integration of
YOLOv4 with DeepSORT for vehicle detection and
tracking in traffic footage. This combination
improves the robustness of tracking, ensuring accurate
detection even when vehicles are partially obscured
or overlapping. Although it performs well for
counting and tracking vehicles, it lacks the
functionality to predict traffic density or trends,
limiting its effectiveness for proactive traffic
management and congestion forecasting (Ranjitha et
al., 2023).
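The role of tracking in vehicle counting can be illustrated with a deliberately simplified tracker. The sketch below is not DeepSORT: DeepSORT additionally uses Kalman-filter motion prediction and appearance embeddings, whereas this example only matches boxes between consecutive frames greedily by intersection over union (IoU). The class name, the 0.3 threshold, and the (x1, y1, x2, y2) box format are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Assigns persistent ids by greedy IoU matching with the previous frame."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # track id -> box seen in the previous frame
        self.next_id = 0   # also equals the number of distinct vehicles seen

    def update(self, detections):
        """Match this frame's boxes to tracks; return {track_id: box}."""
        unmatched = dict(self.tracks)
        current = {}
        for box in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid, prev in unmatched.items():
                score = iou(prev, box)
                if score >= best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:
                best_id = self.next_id      # no overlap: start a new track
                self.next_id += 1
            else:
                del unmatched[best_id]      # each track matches at most once
            current[best_id] = box
        # Tracks without a match are dropped immediately in this sketch.
        self.tracks = current
        return current


tracker = GreedyIouTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(2, 0, 12, 10), (100, 100, 110, 110)])
print(sorted(frame1), sorted(frame2), "vehicles seen:", tracker.next_id)
```

Because unmatched tracks are dropped at once, this sketch loses a vehicle's identity after a single missed detection, which is exactly the occlusion weakness that more robust trackers are designed to reduce.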
Another study examines the Single Shot MultiBox
Detector (SSD) algorithm (Su and Shu, 2024) and
provides insights into traffic flow detection. The SSD
algorithm has gained popularity for its ability to
perform object detection in real time. However, its
traditional implementation struggles with feature
extraction for small or occluded vehicles, primarily
due to its shallow convolutional structure and lack of
adaptability to diverse traffic scenarios. Combining
the improved SSD
with the DeepSort tracking algorithm allows for ro-
INCOFT 2025 - International Conference on Futuristic Technology