Design of DRL Based Adaptive Routing Protocol for Bidirectional
Communication Between UAVs and UGVs
Prabhakar Saxena and Gayatri M. Phade
Sandip Institute of Technology and Research Centre, Nashik, India
Keywords: Deep Reinforcement Learning (DRL), Greedy Perimeter Stateless Routing (GPSR), Mobile Ad-Hoc
Networks (MANETs), Unmanned Aerial Vehicle (UAV), Unmanned Ground Vehicle (UGV)
Abstract: Recent developments in coordinated networks of Unmanned Aerial Vehicles (UAVs) and
Unmanned Ground Vehicles (UGVs) are a game changer for autonomous systems. Existing routing
protocols are typically designed for UAV networks or for UGV networks separately. Seamless integration
of these networks is essential to enhance situational awareness, as UAVs can provide a bird’s-eye view of the
surroundings and UGVs can gather detailed ground-level data. Deploying such networks requires
a customized routing protocol that enables reliable communication between the coordinated UAV
and UGV network platforms, and a simulation framework to test it. This paper presents the design,
implementation, and optimization of a routing protocol engineered for the specific requirements of a coordinated
network consisting of UAVs and UGVs. The protocol integrates Greedy Perimeter Stateless
Routing (GPSR), for position-based forwarding, with Deep Reinforcement Learning (DRL) to optimize
packet routing. A simulator is developed in Python to simulate and test the proposed protocol. Simulation
results confirm that the proposed protocol establishes the shortest and most efficient paths, making it suitable
for many applications. By addressing the critical challenges in routing strategies for integrated UAV-UGV
networks, this research work paves the way for intelligent and adaptive communication solutions in dynamic
environments.
1 INTRODUCTION
UAVs (Unmanned Aerial Vehicles) are widely
referred to as drones (Laghari, Jumani, et al., 2023)
and UGVs (Unmanned Ground Vehicles) are known as
mobile robots. A network of UAVs is commonly known
as a FANET (Flying Ad-Hoc Network) and a network
of UGVs is called a RANET (Robotic Ad-Hoc
Network). FANETs and RANETs are advanced
iterations of Mobile Ad-Hoc Networks (MANETs)
(Ahmed, Mohanta, et al., 2022). These platforms
leverage a variety of wireless communication
technologies, including Bluetooth, Wi-Fi, and
cellular networks, to communicate with each other
and form ad-hoc networks autonomously (Sharma, et
al., 2020), (Hussein, Yaw, et al., 2022). UGVs are
widely employed in reconnaissance, surveillance,
traffic monitoring, border patrol, and
search and rescue operations. UAVs have become
essential tools in various fields, ranging from military
and law enforcement operations to civilian
applications such as disaster response, agriculture,
and filmmaking. Their versatility and adaptability
make them invaluable assets for tasks requiring aerial
surveillance, data collection, and monitoring in both
urban and remote environments (Altshuler, Pentland,
et al., 2018).
Integrating UAVs and UGVs will revolutionize
various industries, ranging from surveillance and
reconnaissance to disaster response and
transportation. In dynamic, three-dimensional
environments, traditional routing protocols often fail
to adapt to the unique requirements of UAV-UGV
networks, leading to inefficient communication,
increased latency, and potential safety hazards.
One of the main advantages of integrated UAV
and UGV communication is improved situational
awareness and decision-making capability. UAVs
equipped with sensors can be used to collect and
transmit data to UGVs, which can then use the data to
generate maps and models of the affected area.
Mobile UGVs equipped with sensors and actuators can
be deployed to perform tasks such as delivering
Saxena, P. and Phade, G. M.
Design of DRL Based Adaptive Routing Protocol for Bidirectional Communication Between UAVs and UGVs.
DOI: 10.5220/0013588400004664
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 3rd International Conference on Futuristic Technology (INCOFT 2025) - Volume 2, pages 144-154
ISBN: 978-989-758-763-4
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
medical supplies or clearing debris, based on the
information received from the UAV (Rubio, Valero,
et al. , 2019), (Gielis, Shankar, et al. , 2022).
Integrated UAV-UGV communication can also
enhance network coverage. UAVs can serve as flying
stations, providing aerial communication to UGVs and
other ground-based devices that may not have direct
line-of-sight communication with each other. This
can help to overcome obstacles and terrain challenges
that may limit the range and performance of
individual UGV networks.
Another significant advantage of integrated UAV-UGV
communication is the ability to optimize
resource usage and energy efficiency. UAVs can use
their mobility to optimize the network topology and
cover a larger area, while UGVs can be used to offload
computational tasks from UAVs, reducing their
workload and energy consumption (Hua, Wang, et al.,
2019).
Integrated UAV-UGV communication will
enable new and innovative applications and services
that were previously impossible or impractical to
achieve. Combined UAV and UGV
communication allows efficient coordination of
UAVs and UGVs, enabling several uses, including
landmine map generation on the battlefield, package
delivery monitoring, and rescue operations (García,
Reina, et al., 2018).
A literature review is conducted to investigate the
potential of existing routing protocols to manage
integrated communication between UAVs
and UGVs effectively. The review shows that
routing protocols for communication between UAVs
and UGVs are not yet available, which is an emerging
challenge in the field of autonomous and robotic
systems. This gap arises from the unique
characteristics and operational
environments of UAVs and UGVs, which create
distinct networking requirements that are not fully
addressed by existing routing protocols.
In this work, we aim to address the issue by
enhancing the existing position-based routing
protocol, GPSR (Greedy Perimeter Stateless
Routing), originally used for UAV-UAV
communication. The newly developed protocol will
enable seamless data transfer in integrated UAV-
UGV networks, meeting the unique demands of these
heterogeneous systems. Furthermore, the proposed
protocol will be adaptive, energy-efficient, and
capable of managing dynamic topologies effectively.
In summary, this work will focus on integrating
advanced technologies such as Machine Learning
(ML) and Artificial Intelligence (AI) with position-
based protocols like GPSR. The proposed solutions
will be validated in practical scenarios to optimize the
performance, energy efficiency, and adaptability of
networks comprising UAVs and UGVs.
Section 2 discusses related work on modifying
routing protocols for UAV networks or UGV networks
individually. Section 3 provides a brief introduction
to the GPSR routing protocol and DRL. Section 4
outlines the network model and methodology
employed. The results are presented in Section 5,
followed by a discussion of the results in Section 6.
Finally, conclusions are drawn in Section 7.
2 RELATED WORK
A literature review is conducted to assess the suitability
of position-based routing protocols for integrated
UAV-UGV networks and to explore the integration
of machine learning algorithms into existing
protocols.
Salazar (Salazar, et al., 2023) recommends
exploring additional ad hoc routing protocols, such
as AODV and DSDV, alongside OLSR to compare
their efficiency in a FANET for traffic
monitoring. Conducting this comparative analysis
would provide a more comprehensive understanding
of the performance of different protocols in various
scenarios. Future research could focus on
incorporating advanced technologies like machine
learning algorithms or artificial intelligence to
enhance the decision-making processes within the
FANET network. These technologies have the
potential to optimize route planning, resource
allocation, and overall network performance.
Additionally, further studies could involve real-world
implementation and testing of the proposed FANET
network in a smart city environment. This would help
validate the simulation results and evaluate its
practical feasibility and performance under realistic
conditions.
Kumar (Kumar, Raw, et al., 2023) modified the
Greedy Perimeter Stateless Routing (GPSR) protocol
to develop the Utility Function-based Greedy
Perimeter Stateless Routing (UF-GPSR) protocol for
Flying Ad-hoc Networks (FANETs). This
modification aimed to optimize the greedy
forwarding strategy by considering multiple crucial
parameters of UAVs, such as residual energy ratio,
distance degree, movement direction, link risk
degree, and speed. Future research could involve
integrating machine learning techniques to enhance
the routing decision process in UF-GPSR. By
leveraging machine learning algorithms, the protocol
could adapt to dynamic network conditions more
effectively, leading to improved routing performance.
Additionally, future studies could focus on deploying
UF-GPSR in practical FANET environments to
assess its performance under realistic conditions.
Field trials and experiments would provide valuable
feedback on the protocol's effectiveness and
feasibility for diverse applications.
Charles (Charles, 2022) proposed a method to
enhance network quality and throughput while
simultaneously providing an energy-saving approach
with excellent quality of service. That work
introduces an energy-efficient protocol designed to
provide a faster, miniaturized, and more efficient
routing method compared to existing protocols. The
proposed routing protocol, EEQRP, is evaluated and
compared using Network Simulator-2 (NS2). The
results demonstrate that EEQRP provides lower
average latency, greater power savings, and a higher
packet delivery rate than current protocols. Future
research could explore integrating machine learning
or artificial intelligence techniques into the EEQRP
protocol to enhance its decision-making capabilities
and further optimize energy efficiency.
Karp and Kung (Karp, Kung, et al. , 2003)
introduced GPSR (Greedy Perimeter Stateless
Routing) as an innovative routing protocol for
wireless datagram networks. GPSR makes
forwarding decisions based on the positions of routers
and the destinations of packets, utilizing greedy
forwarding with local topology information. When
greedy forwarding is not feasible, the protocol
switches to perimeter routing. GPSR is designed to
scale efficiently with the number of network
destinations, outperforming shortest-path and ad-hoc
routing protocols in this regard. In environments with
frequent topology changes due to mobility, GPSR
leverages local topology information to quickly adapt
and establish new routes. Extensive simulations in
mobile wireless networks demonstrate the scalability
of GPSR, especially in densely deployed wireless
networks. Future research could focus on
investigating the implementation and performance of
GPSR in real-world wireless network scenarios to
validate its effectiveness beyond simulations.
Additionally, exploring enhancements to GPSR to
improve its adaptability to dynamic network
conditions and optimize routing decisions in diverse
environments using machine learning techniques
could be highly beneficial.
Namdev (Namdev, Goyal, et al. , 2021) proposed
a Whale Optimization Algorithm-based Optimized
Link State Routing (WOA-OLSR) for Flying Ad-hoc
Networks (FANET) to address the challenges of
energy efficiency and communication security. The
study concluded that the WOA-OLSR
communication scheme offers a more efficient and
secure solution for FANETs, enhancing performance
and reliability in communication networks involving
drones and UAVs. Future improvements could
involve integrating machine learning or artificial
intelligence techniques with the WOA-OLSR
algorithm to further optimize routing decisions,
enhance network performance, and adapt to evolving
network dynamics in FANET environments.
Cappello (Cappello, et al., 2022) presented a
comprehensive framework that integrates Flying Ad
Hoc Networks (FANET) with 5G networks to
provide interconnected services that can be
sequenced, taking into account physical device
constraints and traffic flow requirements. Future
research could explore the integration of machine
learning algorithms to enhance the optimization
model for Virtual Function placement and chaining,
with the aim of further improving energy efficiency
and service satisfaction probabilities.
Hosseinzadeh (Hosseinzadeh, et al., 2023)
proposed a position forecast-based greedy perimeter
stateless routing approach called GPSR+ for Flying
Ad Hoc Networks (FANETs). This approach consists
of two main steps: neighbor discovery and a greedy
forwarding algorithm. In the neighbor discovery
phase, GPSR+ employs a position prediction
mechanism based on historical data. To predict
positions, weighted linear regression is used. Future
research could focus on enhancing this prediction
technique by incorporating lightweight machine
learning methods. Machine learning could provide
more accurate and adaptive position forecasts,
thereby improving the overall performance of GPSR+
in dynamic FANET environments. By leveraging
advanced machine learning algorithms, GPSR+ could
better handle the mobility and variability inherent in
FANETs, leading to more efficient routing decisions
and increased network reliability.
Future research should prioritize the development
of ad-hoc networks tailored for integrated UAV-UGV
systems. This includes the integration of advanced
technologies such as machine learning and artificial
intelligence with position-based protocols like GPSR.
Emphasis should be placed on validating these
solutions in practical scenarios to enhance the
network's performance, energy efficiency, and
adaptability. Given the highly dynamic nature of networks
involving UAVs and UGVs, position-based routing
protocols like GPSR are particularly well-suited to
such applications.
3 BRIEF ABOUT GPSR AND DRL
GPSR is a routing protocol that offers Greedy and
Perimeter Forwarding modes for efficient packet
delivery. By selecting the neighbor nearest to the
destination and employing the right-hand rule when
necessary, GPSR aims to optimize routing paths.
However, its reliance solely on distance metrics can
result in elevated delivery failures. While GPSR
assigns global routes to nodes through a greedy
algorithm, it is associated with high overhead and a
diminished delivery ratio beyond a certain threshold.
Notably, GPSR's suitability diminishes in urban
settings characterized by local loops, which limits its
effectiveness in such environments (Abbas, Ahmed,
et al., 2022), (Choi, Hussen, et al., 2018), (Sugranes,
Razi, et al., 2022), (Wen, Huang, et al., 2018). The
GPSR algorithm includes two packet
forwarding techniques. Greedy
forwarding is the widely applied technique,
while perimeter forwarding is best suited to cases where
greedy forwarding cannot be applied (Houssaini,
Zaimi, et al., 2017).
In GPSR, each packet is marked with its
destination's location by its originator. When a node
receives a packet, it examines the geographic location
of the destination and compares it with the positions
of its neighboring nodes. The node then makes a
locally optimal choice of the next hop by selecting,
among its radio neighbors, the one that is
geographically closest to the packet's destination. The process
continues iteratively, as each node forwards the packet
to the next hop that is closer to the destination, until
the packet reaches its intended destination. This
method aims to reduce the number of hops needed
for packet delivery and to refine the routing path based
on geographic location. Traditional routing, by contrast,
relies on three basic algorithms: Distance Vector (DV),
Link State (LS), and Path Vector. In the DV method, every node
learns its route to a desired destination from
information shared at regular intervals by its
neighbors. In the LS approach, nodes
broadcast link status changes across the whole
network topology. As stated by researchers, both the
DV and LS strategies can suffer from minor
inaccuracies in a router's perceived network
state, which can lead to routing loops or
connectivity issues. Moreover, the message complexity
of the DV and LS routing algorithms
is affected by two factors: the rate of change of the
network topology and the number of routers
operating within the routing zone (Karp, Kung, et al.,
2003). Greedy forwarding is an effective technique,
but it may generate suboptimal routes in network
topologies where packets must temporarily move farther
from the destination (Feng, Zhang, et al., 2016).
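The greedy step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' simulator code: the function name `greedy_next_hop`, the dictionary of neighbor positions and the sample coordinates are all assumptions. It returns `None` at a local maximum, which is the point where GPSR would fall back to perimeter forwarding.

```python
import math

def greedy_next_hop(current_pos, neighbor_positions, dest_pos):
    """Return the ID of the neighbor closest to the destination.

    Returns None when no neighbor is closer to the destination than the
    current node (a local maximum), where GPSR would switch to
    perimeter forwarding.
    """
    best_id = None
    best_dist = math.dist(current_pos, dest_pos)
    for node_id, pos in neighbor_positions.items():
        d = math.dist(pos, dest_pos)
        if d < best_dist:
            best_id, best_dist = node_id, d
    return best_id

# Hypothetical coordinates in metres, for illustration only.
neighbors = {
    "A": (40.0, 10.0, 0.0),
    "B": (70.0, 5.0, 30.0),
    "C": (-20.0, 0.0, 0.0),
}
print(greedy_next_hop((0.0, 0.0, 0.0), neighbors, (100.0, 0.0, 0.0)))
```

With these sample positions, neighbor B is geometrically closest to the destination, so it is selected; if only C were in range, the function would return `None`.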
DRL (Azar, et al., 2021) trains a
computational agent to interact with an
environment so as to maximize cumulative reward through
an iterative process. In the context of UAV and
UGV networks, DRL models are trained to make
real-time packet routing decisions based on routing
tables and sensor data. The model considers factors
such as the velocities and positions of UAVs and UGVs,
obstacles, and other relevant
environmental factors. It requires a suitable
representation of the state space. For real-time
packet routing decisions, the state space consists of the
current positions, velocities, and orientations of the UAVs
and UGVs.
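One possible way to represent such a state is a flat vector that concatenates position, velocity and orientation for each node. The sketch below is an assumption made for illustration: the paper does not specify the encoding, and the `Kinematics` class, the use of a single yaw angle for orientation, and the field order are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Kinematics:
    position: Tuple[float, float, float]  # (x, y, z) in metres; z = 0 for a UGV
    velocity: Tuple[float, float, float]  # (vx, vy, vz) in m/s
    heading: float                        # orientation as a yaw angle in radians

def build_state(nodes: List[Kinematics]) -> List[float]:
    """Flatten position, velocity and heading of every node into one list."""
    state: List[float] = []
    for n in nodes:
        state.extend(n.position)
        state.extend(n.velocity)
        state.append(n.heading)
    return state

# Hypothetical UAV and UGV, for illustration only.
uav = Kinematics((10.0, 20.0, 80.0), (15.0, 0.0, 0.0), 0.0)
ugv = Kinematics((50.0, 5.0, 0.0), (0.0, 5.0, 0.0), 1.57)
state = build_state([uav, ugv])
```

Each node contributes seven values, so the state for a network of N nodes has length 7N under this encoding.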
In summary, GPSR's dual approach of greedy
forwarding and perimeter forwarding offers a
comprehensive solution for routing in dynamic
ad-hoc networks. GPSR can be enhanced
by integrating DRL methods. DRL
enables GPSR to learn optimal
routing schemes adaptively from the dynamic network state,
which in turn improves decision-making and
routing performance. This unification of DRL with
GPSR forms a routing protocol suitable for
bidirectional communication between UAVs and
UGVs.
4 PROPOSED METHODOLOGY
In designing a bidirectional network model for
UAV and UGV communication, the unique
requirements of both networks are considered.
The main objective is to create an integrated system that
facilitates seamless communication and collaboration
between UAVs and UGVs. The design
considerations are as follows:
- Consider a network consisting of UAVs and UGVs, each equipped with a Global Positioning System (GPS) and a processor capable of data processing, together with a transceiver, cache and storage.
- The proposed network consists of a set of n flying nodes (UAV_i, i = 1, 2, 3, …, n) and m mobile ground nodes (UGV_j, j = 1, 2, 3, …, m).
- Let ID_UAVi and ID_UGVj be the unique identifiers assigned to UAV_i and UGV_j respectively.
- The transmission radius of UAV_i and UGV_j is taken as R_i.
- Communication within the network of UAVs and UGVs is initiated using a Hello packet designed specifically for the network.
- The positions of the flying nodes and the unmanned ground vehicles are (x_i, y_i, z_i) and (x_j, y_j) respectively.
Seamless communication between UAV and
UGV networks requires periodic updates of network
topology information, which can be achieved using
Hello packets. A Hello packet is typically used in a
network to establish and maintain neighbor
relationships between nodes and to exchange
information continuously (Aljabry and Suhail, 2022). In
the proposed network, nodes are initially categorized
and identified as GP-AV for UAV nodes and GP-GV
for UGV nodes. The GPSR method starts by
identifying neighboring nodes through the
transmission of hello packets. During the hello packet
handshake, each node collects data about
its neighbors, such as geographic position, validity
time and node type (GP-AV or GP-GV). This data
helps the GPSR protocol form a local map of the
network topology, which is essential for packet
forwarding decisions. The neighbor
discovery process ensures that every node maintains
an up-to-date table of nearby nodes and their
positions, as required for effective greedy
forwarding. In greedy forwarding, every node
uses this position information to forward data packets to
the neighbor closest to the destination. If greedy
forwarding fails (e.g., when a packet reaches a local
maximum with no neighbor closer to the destination),
the protocol switches to perimeter routing to navigate
around obstacles and continue forwarding the packet
towards its destination. Each UAV and UGV,
identified by ID_UAVi and ID_UGVj respectively,
disseminates a hello message within its
transmission radius (R_i) to inform neighboring
nodes, both UAVs and UGVs, of its
remaining energy and position. The format of the
hello message is shown in Figure 1.
| Source ID | Node Type (NT) | Coordinates: UAV_i (x_i, y_i, z_i) / UGV_j (x_j, y_j) | Time Stamp (TS) | Current Energy (E) | ID Hear (IDH) | Validity Time (VT) | SI (Sharing Information) | Hello ID (IDH) |
Figure 1: Hello Packet Format for Proposed Network.
The fields of the Hello packet are as follows:
- Source ID (IDS): Unique identifier of the UAV/UGV sending the Hello packet.
- ID Hear (IDH): Identification information of the other UAVs/UGVs heard by the sender.
- Node Type (NT): Indicates whether the hello packet was generated by a UAV or a UGV (IDP).
- Validity Time (VT): Time for which the hello packet is valid.
- Position (P): Current geographic coordinates (latitude, longitude, altitude) of the UAV/UGV.
- Sharing Information (SI): Information shared with other nodes.
- Time Stamp (TS): Time at which the Hello packet was sent.
- Energy (E): Residual energy of the UAV or UGV.
- Hello ID (IDH): Hello identification number.
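The Hello packet fields listed above map naturally onto a small record type. The following is a minimal sketch, assuming Python dataclasses; the class name `HelloPacket`, the attribute names, the field types and the `is_valid` helper are illustrative assumptions and are not taken from the authors' simulator.

```python
from dataclasses import dataclass, field
from typing import Tuple

@dataclass
class HelloPacket:
    source_id: str                        # Source ID (IDS)
    node_type: str                        # Node Type (NT): "UAV" or "UGV"
    position: Tuple[float, float, float]  # Position (P); z = 0 for a UGV
    timestamp: float                      # Time Stamp (TS), in seconds
    energy: float                         # Energy (E), residual energy
    validity_time: float                  # Validity Time (VT), in seconds
    hello_id: int                         # Hello identification number
    id_hear: Tuple[str, ...] = ()         # ID Hear: IDs of nodes already heard
    sharing_info: dict = field(default_factory=dict)  # Sharing Information (SI)

    def is_valid(self, now: float) -> bool:
        """A packet is treated as valid until its validity time has elapsed."""
        return now - self.timestamp <= self.validity_time

# Hypothetical packet sent by a UAV at t = 100 s with a 5 s validity time.
pkt = HelloPacket("UAV1", "UAV", (0.0, 0.0, 50.0), 100.0, 0.8, 5.0, hello_id=1)
print(pkt.is_valid(104.0), pkt.is_valid(106.0))
```

Keeping the validity check on the packet itself lets a receiver discard stale entries without extra bookkeeping.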
Table 1: Neighbor table format of nodes.
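| ID | NT | Hello ID | Reg Time | Position | RE | PP | VT |
|---|---|---|---|---|---|---|---|
| ID | UAV / UGV | ID_H0 | t_0 | (x_0j, y_0j, z_0j) | RE_j | (x_j(t), y_j(t), z_j(t)) | VT_j |
| | | ID_H1 | t_1 | (x_1j, y_1j, z_1j) | RE_j | | |
| | | ... | | | | | |
| | | ID_Hn | t_n | (x_nj, y_nj, z_nj) | RE_j | | |
The Hello ID, Reg Time, Position and RE columns are grouped under the heading "Hello Message Information".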
These hello packets, containing node
information such as identity, position, type, residual
energy, and neighbors, are broadcast at regular
intervals to notify other nodes of the sender's presence.
Based on the information extracted from the hello
messages received from adjacent UAVs and UGVs,
each node compiles a list of its neighboring nodes. This
information is then registered in its neighbor
table, in the format shown in
Table 1. The neighbor table holds the following
information about adjacent nodes:
- ID: Unique ID of the adjacent UAV/UGV.
- NT: Type of the adjacent neighbor, i.e. UAV or UGV.
- Hello ID: Hello packet identification number.
- Reg Time: Time at which the corresponding hello packet was transmitted.
- Position: Geographical position of the UAV/UGV.
- RE: Residual energy of the neighboring UAV/UGV.
- Predicted Position (PP): Position of the UAV/UGV predicted with the help of the DRL model.
- Validity Time (VT): Time for which the Hello packet is valid. It is an indicator of network life.
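Maintaining the neighbor table can be sketched as an insert-or-refresh followed by the removal of entries whose validity time has expired. The sketch below uses plain dictionaries; the function name `update_neighbor_table`, the key names and the expiry rule (age greater than VT) are assumptions for illustration, and the predicted position is left empty because the DRL predictor is not specified here.

```python
def update_neighbor_table(table, hello, now):
    """Insert or refresh the sender's entry, then drop expired entries."""
    table[hello["source_id"]] = {
        "node_type": hello["node_type"],
        "hello_id": hello["hello_id"],
        "reg_time": hello["timestamp"],
        "position": hello["position"],
        "residual_energy": hello["energy"],
        "predicted_position": None,  # would be filled in by the DRL model
        "validity_time": hello["validity_time"],
    }
    expired = [
        node_id for node_id, entry in table.items()
        if now - entry["reg_time"] > entry["validity_time"]
    ]
    for node_id in expired:
        del table[node_id]
    return table

# Hypothetical hello messages, for illustration only.
table = {}
update_neighbor_table(table, {
    "source_id": "UGV3", "node_type": "UGV", "hello_id": 7,
    "timestamp": 0.0, "position": (10.0, 5.0, 0.0),
    "energy": 0.9, "validity_time": 5.0,
}, now=1.0)
update_neighbor_table(table, {
    "source_id": "UAV1", "node_type": "UAV", "hello_id": 3,
    "timestamp": 10.0, "position": (0.0, 0.0, 60.0),
    "energy": 0.7, "validity_time": 5.0,
}, now=10.0)
print(sorted(table))
```

After the second update, the entry for UGV3 is ten seconds old and exceeds its five-second validity time, so only UAV1 remains in the table.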
Developing an ad-hoc network for bidirectional
communication between coordinated UAVs and UGVs
requires a systematic approach to ensure effective
collaboration between these autonomous systems. In
a network with UAVs and UGVs as nodes,
communication needs to be established over
several kinds of links: UAV-UAV, UGV-UGV,
and UAV-UGV. To establish communication
between the nodes, a combination of Greedy Perimeter
Stateless Routing (GPSR) and Deep Reinforcement
Learning (DRL) is used. Initially, GPSR
determines the routing paths based on the geographic
positions of the nodes and the destination. Each node in
the network is represented by the ‘Node class’,
which holds attributes such as a unique ID,
coordinates (x, y, z) in 3D space, and a node type
indicating whether it is a UAV (i) or a UGV (j).
The communication process is initiated by
transmitting a hello packet. The routing
function then identifies the nearest neighbors based on
Euclidean distance, which is essential for establishing
communication paths. To optimize routing decisions,
the network uses an integration of GPSR and DRL.
GPSR makes greedy forwarding decisions, choosing
the nearest neighbor as the next hop. In cases where
direct greedy forwarding is not possible due to
obstacles or unreachable areas, GPSR switches to
perimeter forwarding mode. This determines the
routing path, total distance, hop count, forwarding
strategy, and node types involved in the
communication process. DRL augments GPSR by
continuously learning from the network conditions
and environment. It adapts to changes in topology,
traffic, and obstacles, in turn enhancing
communication efficiency and reliability. The DRL
model uses a Sequential architecture with dense
layers to learn and fine-tune routing decisions.
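The node representation and the Euclidean nearest-neighbor search described above can be sketched as follows. The attribute names, the `neighbors_in_range` helper and the sample coordinates are assumptions for illustration rather than the authors' actual ‘Node class’; the default radius of 150 m matches the communication radius used in the simulation.

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    node_id: int
    x: float
    y: float
    z: float = 0.0          # altitude; 0 for a ground vehicle
    node_type: str = "UGV"  # "UAV" or "UGV"

    def distance_to(self, other: "Node") -> float:
        """Euclidean distance in 3D space between this node and another."""
        return math.dist((self.x, self.y, self.z), (other.x, other.y, other.z))

def neighbors_in_range(node, nodes, radius=150.0):
    """Nodes within `radius` metres of `node`, nearest first."""
    in_range = [
        n for n in nodes
        if n.node_id != node.node_id and node.distance_to(n) <= radius
    ]
    return sorted(in_range, key=node.distance_to)

# Hypothetical placement, for illustration only.
uav2 = Node(2, 0.0, 0.0, 100.0, "UAV")
others = [
    Node(12, 60.0, 0.0, 0.0, "UGV"),    # about 116.6 m away
    Node(5, 30.0, 40.0, 100.0, "UAV"),  # 50 m away
    Node(7, 300.0, 0.0, 0.0, "UGV"),    # out of range
]
print([n.node_id for n in neighbors_in_range(uav2, others)])
```

Because the distance is computed in three dimensions, a UGV directly below a high-flying UAV can be out of range even when its horizontal offset is small.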
A simulator is developed to test the proposed
communication protocol. An action-selection function
chooses an action index based on the action
probabilities predicted by the DRL model, which supports
decision-making during routing. The ‘angle function’
calculates angles between vectors and supports
Perimeter Forwarding, which is triggered if Greedy
Forwarding fails. The ‘packet handling function’ divides
the message into individual words and generates a
packet for each word. Meanwhile, the ‘network update
functions’ dynamically update the node coordinates
and emulate the communication process,
printing information about the
nodes within communication range.
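The three helper roles described above can be sketched with the Python standard library alone. The function names `angle_between`, `make_packets` and `choose_action`, and the packet dictionary layout, are hypothetical; the simulator's own functions may differ in naming and detail.

```python
import math
import random

def angle_between(u, v):
    """Angle in radians between two non-zero vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def make_packets(message):
    """Split a message into words and wrap each word as a numbered packet."""
    return [{"seq": i, "data": word}
            for i, word in enumerate(message.split(), start=1)]

def choose_action(probabilities, rng=random):
    """Sample an action index according to the given action probabilities."""
    indices = range(len(probabilities))
    return rng.choices(indices, weights=probabilities, k=1)[0]

packets = make_packets(
    "hello test message is transmitted between node two and node twelve"
)
print(len(packets), packets[0], packets[-1])
```

Sampling from the probabilities, rather than always taking the most likely action, is one common design choice in reinforcement learning because it keeps some exploration; whether the simulator samples or takes the maximum is not stated in the text.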
Through the combined operation of GPSR and
DRL, UAVs and UGVs establish robust
communication links, enabling seamless data
exchange and coordination within the network. Simulating
coordinated communication between
UAV and UGV networks involves designing and
testing a framework that can interact
and collaborate with both aerial and ground vehicles.
5 RESULTS
Testing of the proposed protocol is conducted in two
modes using the purpose-built simulator.
In the first mode, a data string is forwarded through
the network, and network parameters such as
hop count and latency are measured. In the second mode,
the number of transmitted data packets is varied, and the
corresponding network parameters are calculated.
For simulation in mode one, node 2 (a UAV)
communicates with node 12 (a UGV),
as illustrated in Figure 2. The message
"hello test message is transmitted between node two
and node twelve" is transmitted, segmented
as: “hello”, “test”, “message”, “is”, “transmitted”,
“between”, “node”, “two”, “and”, “node” and
“twelve”. After the network is initialized, each node
sends a hello packet to share information about itself.
When a packet is transmitted, the script prints routing
details, including the forwarding strategy used (either
'Greedy' or 'Perimeter') and the types of nodes
involved in the forwarding process.
| Parameter | Value |
|---|---|
| Simulator | Python |
| No. of UAV nodes | 6 |
| No. of UGV nodes | 6 |
| Mobility model | Random Way Point |
| Communication radius | 150 m |
| Speed of UGV nodes | 15-25 km/h |
| Speed of UAV nodes | 60-90 km/h |
Figure 2: Simulation environment for integrated UAV and
UGV network.
For each transmitted packet, the script reports
the packet number and its
data, the path taken by the packet, and the hop count,
as shown in Table 2. It is
observed that packets with sequence numbers 4 and 6
require the highest number of hops to travel
from node 2 to node 12. This indicates that these packets
become trapped in a routing loop, referred to here
as a deadlock loop. After each
packet transmission completes, network parameters are
measured, as shown in
Table 3. From Table 4 it is seen that out of 11
packets, 8 were successfully delivered, while 3
failed. The failures occurred for
packets 4, 7, and 10.
Table 2: Path of test message indicating nodes involved in
the network
| Packet | Packet Data | Path Chosen | Hop count |
|---|---|---|---|
| 1 | hello | [2, 4, 6, 4, 11, 9, 2, 1, 6, 2, 3, 10] | 11 |
| 2 | test | [2, 7, 1, 5, 6, 11, 9, 2, 8, 10] | 9 |
| 3 | message | [2, 10] | 1 |
| 4 | is | [2, 3, 9, 7, 4, 8, 6, 12, 2, 3, 12, 1, 11, 8, 9, 2, 12, 7, 12, 9, 3, 12, 11, 12, 3, 1, 5, 6, 1, 11, 8, 2, 7, 6, 12, 8, 4, 12, 6, 7, 12, 7, 2, 9, 6, 9, 7, 9, 12, 4, 7, 1, 12, 5, 12, 1, 12, 9, 4, 11, 2, 6, 12, 1, 11, 5, 10] | 66 |
| 5 | transmitted | [2, 1, 6, 3, 11, 2, 11, 1, 3, 2, 12, 9, 5, 10] | 13 |
| 6 | between | [2, 3, 6, 12, 7, 1, 9, 7, 9, 1, 8, 12, 8, 11, 1, 12, 11, 6, 1, 4, 9, 12, 11, 3, 2, 12, 5, 4, 3, 6, 9, 8, 5, 1, 8, 12, 1, 2, 8, 3, 7, 9, 12, 8, 7, 6, 10] | 46 |
| 7 | node | [2, 9, 4, 9, 2, 9, 12, 7, 3, 7, 8, 7, 1, 5, 9, 11, 12, 11, 4, 7, 4, 3, 12, 10] | 23 |
| 8 | two | [2, 3, 4, 12, 2, 7, 10] | 6 |
| 9 | and | [2, 9, 11, 2, 12, 11, 2, 11, 6, 4, 1, 11, 2, 11, 10] | 14 |
| 10 | node | [2, 3, 11, 2, 10] | 4 |
| 11 | twelve | [2, 9, 2, 11, 3, 12, 2, 5, 11, 12, 11, 1, 10] | 12 |
The delivery time varies with distance and
other network conditions. Packets that travelled longer
distances generally
took more time to deliver. Among successfully delivered
packets, the longest delivery time
was for packet 6 (5.25 sec) over a distance of 4.20 km.
Latency is generally equal to or slightly greater than
the delivery time for successful packets. Failed
packets showed higher latency values, indicating
potential retransmission. After the
simulation concludes, the output provides the final path
traversed by the packets.
Table 3: Performance of routing protocol in transmission of
test message for each packet
| Packet | Packet Data | Total Distance (km) | Delivery Status | Delivery time (sec) | Latency (sec) |
|---|---|---|---|---|---|
| 1 | hello | 1.26 | Success | 1.62 | 1.62 |
| 2 | test | 0.942 | Success | 1.06 | 1.06 |
| 3 | message | 0.107 | Success | 0.15 | 0.157 |
| 4 | is | 5.75 | Failure | 7.04 | 7.04 |
| 5 | transmitted | 1.39 | Success | 1.51 | 1.51 |
| 6 | between | 4.20 | Success | 5.25 | 5.25 |
| 7 | node | 1.82 | Failure | 2.65 | 2.65 |
| 8 | two | 0.375 | Success | 0.89 | 0.89 |
| 9 | and | 0.982 | Success | 1.83 | 1.84 |
| 10 | node | 0.315 | Failure | 0.61 | 0.61 |
| 11 | twelve | 1.22 | Success | 1.54 | 1.55 |
Figure 3: Delivery Time Statistics
Additionally, the simulator provides delivery time statistics
for successful packet transmissions, highlighting the
minimum, maximum, and average times required for
packet delivery, as shown in Figure 3. The figure
illustrates that the average time for successful packet
delivery is less than 2 seconds, with the maximum
delivery time being 5.25 seconds for packet sequence
number 6, and the minimum delivery time being 0.61
seconds for packet sequence number 10. Finally, the
packet delivery results, including successful
deliveries, total number of packets, and packet
delivery ratio, are obtained. Table 4 summarizes the
packet delivery results, showing that out of 11
transmitted packets, 8 were successfully delivered,
resulting in a packet delivery ratio of 0.72.
Table 4: Packet Delivery Results for Greedy Node
Forwarding Strategy including UGV and UAV
| Parameter | Result |
|---|---|
| Successful Deliveries | 8 |
| Total Packets Transmitted | 11 |
| Packet Delivery Ratio | 0.72 |
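The packet delivery ratio is the number of successful deliveries divided by the number of packets transmitted, here 8/11 ≈ 0.727, which Table 4 reports as 0.72. The sketch below recomputes it, together with the minimum, maximum and mean delivery time of the packets marked Success, directly from the rows of Table 3. The function names are hypothetical, and statistics computed this way from the table need not coincide with values read off Figure 3.

```python
from statistics import mean

# (packet number, delivered?, delivery time in seconds), copied from Table 3.
results = [
    (1, True, 1.62), (2, True, 1.06), (3, True, 0.15), (4, False, 7.04),
    (5, True, 1.51), (6, True, 5.25), (7, False, 2.65), (8, True, 0.89),
    (9, True, 1.83), (10, False, 0.61), (11, True, 1.54),
]

def packet_delivery_ratio(rows):
    """Fraction of transmitted packets that were delivered successfully."""
    delivered = sum(1 for _, ok, _ in rows if ok)
    return delivered / len(rows)

def delivery_time_stats(rows):
    """Minimum, maximum and mean delivery time over successful packets only."""
    times = [t for _, ok, t in rows if ok]
    return min(times), max(times), mean(times)

pdr = packet_delivery_ratio(results)
t_min, t_max, t_avg = delivery_time_stats(results)
print(f"PDR = {pdr:.3f}, min = {t_min} s, max = {t_max} s, mean = {t_avg:.2f} s")
```

Restricting the statistics to successful packets keeps the failed transmissions, such as packet 4, from dominating the maximum.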
To evaluate the performance of the routing protocol
in the second mode, different numbers of packets are
transmitted over the network and the network parameters
are evaluated. For experimentation, 5, 10, 15, 20, and 25
packets are transmitted over the network
and the performance is evaluated with respect to maximum hop
count, average hop count, maximum latency, average
latency and PDR between node 2 and node
12. Node 2 is a UAV node and node 12 is a UGV
node. From Figure 4, it is observed that the maximum
hop count reaches 34 when transmitting 25 data
packets, whereas it falls to 18 when transmitting
only 5 data packets. The average hop count lies between
5 and 7.
Figure 4: Maximum and average hop count under different
number of data packets.
Figure 5: Maximum and average latency under different
number of data packets
Figure 5 shows that the maximum latency is 4.2 seconds when transmitting 25 data packets, while it decreases to 2.2 seconds for 5 data packets. The average latency is approximately 1 second. The network achieves a PDR of 0.75 or higher across the different numbers of packets, as shown in Figure 6.
Figure 6: PDR under different number of data packets
The results demonstrate that the average hop count for transmitting data between UAV and UGV nodes lies between 5 and 7 across the different numbers of packets, and the average latency is less than 2 seconds.
6 DISCUSSION
Designing a bidirectional network for UAVs and UGVs involves integrating GPSR with DRL to enhance bidirectional communication. The network includes UAVs and UGVs equipped with GPS and data processing capabilities, each identified by a unique ID. GPSR uses hello packets to establish neighbor relationships and build a local topology map for routing. DRL augments GPSR by adapting to network changes, optimizing routing paths, and improving communication efficiency. This integration supports seamless data exchange and coordination in UAV-UGV networks. The technique proves efficient, offering paths with fewer hops and thereby reducing end-to-end delay. Final path selection involves evaluating each node for overall end-to-end delay, considering its position, current energy, and timestamp. Upon network initialization, every node transmits a hello packet to exchange its own information. The trace of the test message through the network reports the nodes involved, the packet data, the path taken, and the hop count.
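The hello-packet exchange and greedy forwarding described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Hello` fields mirror the position, energy, and timestamp mentioned in the text, while the class `NeighborTable`, the function `greedy_next_hop`, and the `max_age` staleness threshold are hypothetical names and parameters introduced here. Only the greedy mode of GPSR is shown; perimeter mode is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class Hello:
    """Self-information a node advertises to its neighbors (assumed fields)."""
    node_id: int
    position: tuple   # (x, y, z) coordinates
    energy: float     # remaining energy, arbitrary units
    timestamp: float  # time the hello packet was sent


class NeighborTable:
    """Local topology map built from received hello packets."""

    def __init__(self, max_age=3.0):
        self.max_age = max_age  # hypothetical staleness threshold in seconds
        self.entries = {}

    def update(self, hello):
        # Keep only the most recent hello packet per neighbor.
        self.entries[hello.node_id] = hello

    def fresh(self, now):
        # Ignore neighbors whose last hello packet is too old.
        return [h for h in self.entries.values()
                if now - h.timestamp <= self.max_age]


def greedy_next_hop(own_position, dest_position, table, now):
    """Return the neighbor ID strictly closer to the destination than this node.

    Returns None at a local maximum, where GPSR would switch to perimeter mode.
    """
    best_id = None
    best_dist = math.dist(own_position, dest_position)
    for hello in table.fresh(now):
        d = math.dist(hello.position, dest_position)
        if d < best_dist:
            best_id, best_dist = hello.node_id, d
    return best_id
```

A node calls `update` whenever a hello packet arrives and `greedy_next_hop` whenever it has a packet to forward. Energy is stored but not used in this sketch; a scoring function that also weighs energy or delay would replace the pure distance comparison.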
Moreover, the routing protocol's performance in transmitting each packet of the test message is assessed using several parameters (Table 3). The total distance covered varies across packets; among successfully delivered packets, the “between” packet traverses the longest distance at 4.20 km. The maximum hop count for successful deliveries is 46, corresponding to the “between” packet, and the least hop count is 1, corresponding to the “message” packet. The delivery status confirms successful delivery for all packets except the “is” packet and the two “node” packets. Delivery time also varies, with the “between” packet recording the longest delivery time among successfully delivered packets at 5.25 seconds. From Table 3, a few packets show latency values marginally higher than their delivery times (for example, 1.84 seconds versus 1.83 seconds for packet 9), indicating potential retransmissions. The performance of the routing protocol during each packet transmission is illustrated in Figure 6.
Delivery time statistics highlight the minimum, maximum, and average time required for packet delivery. The network achieved a packet delivery ratio of 0.72 (refer to Table 4). These results indicate that integrating the GPSR protocol with the DRL technique is efficient in the coordinated UAV and UGV network, offering paths with fewer hops and thereby reducing end-to-end delay.
To further assess the performance of the routing protocol, different numbers of packets are transmitted over the network and various network parameters are analyzed. For the experiment, 5, 10, 15, 20, and 25 packets are transmitted across the network, and the performance is evaluated for each case. Figure 4 and Figure 5 indicate that an increase in the number of packets, which corresponds to a larger data size or to retransmissions due to packet loss, can result in a higher hop count and increased latency. Figure 6 shows that the Packet Delivery Ratio (PDR) is 0.88 for 25 data packets, while it is 0.8 and 0.75 for 15 and 20 data packets, respectively. This indicates that retransmissions are higher for 15 and 20 data packets than for 25 data packets. Additionally, the increase in PDR for 25 data packets can be attributed to the network acquiring more detailed information about the topology through the exchange of hello packets.
7 CONCLUSIONS
In this paper, a network is designed to test the proposed UAV-to-UGV communication protocol. A simulator is designed, developed, and tested to establish bidirectional communication between UAVs and UGVs. A test message is segmented into eleven packets and transmitted between nodes 2 and 12 using the integrated GPSR-DRL strategy to evaluate the performance of the simulated network comprising UAVs and UGVs. By integrating DRL, UAVs and UGVs can learn optimal routing strategies that adapt to changing environments and network conditions, improving packet delivery rates and reducing communication latency. The network's performance was evaluated based on the hop count, delivery time, and success or failure of each packet transmission. Out of the 11 packets, 8 were successfully delivered, resulting in a packet delivery ratio of 0.72. The analysis also provided detailed statistics on delivery times, reporting minimum, maximum, and average values. By analyzing delivery time data, network operators and engineers can predict a range of critical performance parameters such as latency, congestion, and QoS. This data helps to assess the reliability of the routing protocol in the simulated environment: if the delivery time consistently increases for certain routes, it may indicate suboptimal path selection. Moreover, sudden spikes in delivery time may indicate underlying issues with network stability.
A benefit of unifying GPSR with DRL is the ability to evaluate historical and real-time data to predict traffic flow, changes in network density, and node mobility, enabling GPSR to make decisions and choose optimal routes.
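How a node might learn next-hop preferences from observed delays can be illustrated with a much simpler learner than a deep network. The sketch below substitutes a tabular Q-routing style update for the DRL model; it is not the authors' model, and the class `QRouter`, its methods, and the learning rate `alpha` are hypothetical. Each table entry estimates the remaining delivery delay to a destination via a given neighbor, and the estimate is nudged toward the observed hop delay plus the neighbor's own best estimate.

```python
from collections import defaultdict


class QRouter:
    """Per-node table: q[dest][neighbor] estimates delay to dest via neighbor."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha          # learning rate (assumed value)
        self.q = defaultdict(dict)  # unseen entries default to 0.0 (optimistic)

    def best_neighbor(self, dest, neighbors):
        # Choose the neighbor with the lowest estimated delay to dest.
        return min(neighbors, key=lambda n: self.q[dest].get(n, 0.0))

    def update(self, dest, neighbor, hop_delay, neighbor_estimate):
        """Move the estimate toward hop_delay + the neighbor's own best estimate."""
        old = self.q[dest].get(neighbor, 0.0)
        target = hop_delay + neighbor_estimate
        new = old + self.alpha * (target - old)
        self.q[dest][neighbor] = new
        return new
```

In a combined scheme, the candidate set passed to `best_neighbor` could be restricted to the neighbors that GPSR considers valid, so that the learned estimates only rank geographically sensible choices. The optimistic initial value of 0.0 makes untried neighbors attractive, which gives a simple form of exploration.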
Table 2 indicates that some packets take five times more hops than the total number of UAV or UGV nodes, suggesting that these packets are getting trapped in a loop, a phenomenon known as a deadlock loop. This leads to significantly higher network latency. There is scope for minimizing deadlock loops to improve the efficiency of network resource utilization.
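One common way to bound such loops is to combine a hop limit with a check for revisited nodes. The following sketch shows the idea under stated assumptions: the function `trace_route`, its status strings, and the `max_hops` parameter are hypothetical and not taken from the authors' simulator. Treating every revisit as a loop is a conservative choice, since perimeter-mode forwarding in GPSR can legitimately pass through a node more than once.

```python
def trace_route(next_hop, source, dest, max_hops):
    """Follow next_hop(node) from source until dest, a loop, or the hop limit.

    next_hop is any callable returning the next node ID, or None if no
    neighbor is available. Returns (path, status).
    """
    path = [source]
    visited = {source}
    current = source
    while current != dest:
        if len(path) - 1 >= max_hops:
            return path, "hop_limit"      # packet dropped: too many hops
        nxt = next_hop(current)
        if nxt is None:
            return path, "no_route"       # no neighbor to forward to
        if nxt in visited:
            return path + [nxt], "loop"   # node revisited: packet is cycling
        path.append(nxt)
        visited.add(nxt)
        current = nxt
    return path, "delivered"


if __name__ == "__main__":
    looping = {2: 5, 5: 7, 7: 5}  # hypothetical forwarding table with a cycle
    print(trace_route(looping.get, source=2, dest=12, max_hops=20))
```

Setting `max_hops` to a small multiple of the node count would cap the hop counts well below the values that indicate a trapped packet, at the cost of dropping packets on unusually long but valid paths.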
The integration of GPSR with DRL in designing a routing protocol for bidirectional communication between UAVs and UGVs marks an innovative development in network communication, bringing intelligence and improved adaptability. This method not only addresses the limitations of conventional routing protocols in dynamic and unpredictable environments but also opens new opportunities for more efficient and reliable communication in UAV-UGV networks. By continuously learning from the environment and adapting to changing conditions, this DRL-based routing protocol supports resilient communication between aerial and ground vehicles, even in dynamic scenarios.