Design of DRL Based Adaptive Routing Protocol for Bidirectional

Communication Between UAVs and UGVs

Prabhakar Saxena and Gayatri M. Phade

Sandip Institute of Technology and Research Centre, Nashik, India

Keywords: Deep Reinforcement Learning (DRL), Greedy Perimeter Stateless Routing (GPSR), Mobile Ad-Hoc
Networks (MANETs), Unmanned Aerial Vehicle (UAV), Unmanned Ground Vehicle (UGV)

Abstract: Recent developments in coordinated networks of Unmanned Aerial Vehicles (UAVs) and
Unmanned Ground Vehicles (UGVs) are a game changer for autonomous systems. Existing routing
protocols are typically designed for UAV networks or for UGV networks separately. Seamless integration
of these networks is essential to enhance situational awareness, since UAVs provide a bird's-eye view of the
surroundings while UGVs gather detailed ground-level data. Deploying such networks requires
a customized routing protocol that enables reliable communication between coordinated UAV
and UGV network platforms, together with a simulation framework to test it. This paper presents the design,
implementation, and optimization of a routing protocol engineered for the specific requirements of a coordinated
network of UAVs and UGVs. The protocol integrates Greedy Perimeter Stateless
Routing (GPSR), which combines greedy and perimeter forwarding strategies, with Deep Reinforcement Learning (DRL) to optimize
packet routing. A simulator is developed in Python to simulate and test the proposed protocol. Simulation
results indicate that the proposed protocol establishes short, efficient paths, making it suitable
for many applications. By addressing the critical challenges in routing strategies for integrated UAV-UGV
networks, this research paves the way for intelligent and adaptive communication solutions in dynamic
environments.

1 INTRODUCTION

UAVs (Unmanned Aerial Vehicles) are widely referred to as drones (Laghari, Jumani, et al., 2023),
and UGVs (Unmanned Ground Vehicles) are known as mobile robots. A network of UAVs is commonly known
as a FANET (Flying Ad-Hoc Network), and a network of UGVs is called a RANET (Robotic Ad-Hoc
Network). FANETs and RANETs are advanced iterations of Mobile Ad-Hoc Networks (MANETs)
(Ahmed, Mohanta, et al., 2022). These platforms leverage a variety of wireless communication
technologies, including Bluetooth, Wi-Fi, and cellular networks, to communicate among themselves
and form ad-hoc networks autonomously (Sharma, et al., 2020), (Hussein, Yaw, et al., 2022). UGVs are
widely employed in reconnaissance, surveillance, traffic monitoring, border patrol, and
search and rescue operations. UAVs have become essential tools in various fields, ranging from military
and law enforcement operations to civilian applications such as disaster response, agriculture,
and filmmaking. Their versatility and adaptability make them invaluable assets for tasks requiring aerial
surveillance, data collection, and monitoring in both urban and remote environments (Altshuler, Pentland,
et al., 2018).

Integrating UAVs and UGVs can revolutionize

various industries, ranging from surveillance and

reconnaissance to disaster response and

transportation. In dynamic and three-dimensional

environments, traditional routing protocols often fail

to adapt to the unique requirements of UAV-UGV

networks, leading to inefficient communication,

increased latency, and potential safety hazards.

One of the main advantages of integrated UAV and UGV communication is improved situational
awareness and decision-making capability. UAVs equipped with sensors can collect and
transmit data to UGVs, which can then use the data to generate maps and models of the affected area.
Mobile UGVs equipped with sensors and actuators can be deployed to perform tasks such as delivering
medical supplies or clearing debris, based on the information received from the UAVs (Rubio, Valero,
et al., 2019), (Gielis, Shankar, et al., 2022).

Saxena, P. and Phade, G. M.
Design of DRL Based Adaptive Routing Protocol for Bidirectional Communication Between UAVs and UGVs.
DOI: 10.5220/0013588400004664
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 3rd International Conference on Futuristic Technology (INCOFT 2025) - Volume 2, pages 144-154
ISBN: 978-989-758-763-4

Integrated UAV-UGV communication can also

enhance network coverage. UAVs can serve as flying

stations, providing aerial communication to UGV and

other ground-based devices that may not have direct

line-of-sight communication with each other. This

can help to overcome obstacles and terrain challenges

that may limit the range and performance of the

individual UGV networks.

Another significant advantage of integrated UAV-UGV communication is the ability to optimize
resource usage and energy efficiency. UAVs can use their mobility to optimize the network topology and
cover a larger area, while UGVs can be used to offload computational tasks from UAVs, reducing their
workload and energy consumption (Hua, Wang, et al., 2019).

Integrated UAV-UGV communication will enable new and innovative applications and services
that were previously impossible or impractical to achieve. Combining UAV and UGV
communication allows efficient coordination of the two platforms, enabling several uses, including
landmine map generation on the battlefield, package delivery monitoring, and rescue operations (García,
Reina, et al., 2018).

A literature review was conducted to investigate the potential of existing routing protocols to
manage integrated communication between UAVs and UGVs effectively. The review reveals a gap:
no routing protocols are available for communication between UAVs and UGVs, which is an
emerging challenge in the field of autonomous and robotic systems. This issue arises
due to the unique characteristics and operational environments of UAVs and UGVs, which create
distinct networking requirements that are not fully addressed by existing routing protocols.

In this work, we aim to address the issue by

enhancing the existing position-based routing

protocol, GPSR (Greedy Perimeter Stateless

Routing), originally used for UAV-UAV

communication. The newly developed protocol will

enable seamless data transfer in integrated UAV-

UGV networks, meeting the unique demands of these

heterogeneous systems. Furthermore, the proposed

protocol will be adaptive, energy-efficient, and

capable of managing dynamic topologies effectively.

In summary, this work will focus on integrating

advanced technologies such as Machine Learning

(ML) and Artificial Intelligence (AI) with position-

based protocols like GPSR. The proposed solutions

will be validated in practical scenarios to optimize the

performance, energy efficiency, and adaptability of

networks comprising UAVs and UGVs.

Section 2 discusses related work on modifying network routing protocols for UAV networks or
UGV networks individually. Section 3 provides a brief introduction
to the GPSR routing protocol and DRL. Section 4 outlines the network model and methodology
employed. The results are presented in Section 5, followed by a discussion of the results in Section 6. Finally,
conclusions are drawn in Section 7.

2 RELATED WORK

A literature review is conducted to assess the suitability

of position-based routing protocols for integrated

UAV-UGV networks and to explore the integration

of machine learning algorithms into existing

protocols.

Salazar (Salazar, et al. , 2023) recommend

exploring additional Ad Hoc routing protocols, such

as AODV and DSDV, alongside OLSR to compare

their efficiency in a FANET network for traffic

monitoring. Conducting this comparative analysis

would provide a more comprehensive understanding

of the performance of different protocols in various

scenarios. Future research could focus on

incorporating advanced technologies like machine

learning algorithms or artificial intelligence to

enhance the decision-making processes within the

FANET network. These technologies have the

potential to optimize route planning, resource

allocation, and overall network performance.

Additionally, further studies could involve real-world

implementation and testing of the proposed FANET

network in a smart city environment. This would help

validate the simulation results and evaluate its

practical feasibility and performance under realistic

conditions.

Kumar (Kumar, Raw, et al. , 2023), modified the

Greedy Perimeter Stateless Routing (GPSR) protocol

to develop the Utility Function-based Greedy

Perimeter Stateless Routing (UF-GPSR) protocol for

Flying Ad-hoc Networks (FANETs). This

modification aimed to optimize the greedy

forwarding strategy by considering multiple crucial

parameters of UAVs, such as residual energy ratio,

distance degree, movement direction, link risk

degree, and speed. Future research could involve

integrating machine learning techniques to enhance

the routing decision process in UF-GPSR. By

leveraging machine learning algorithms, the protocol


could adapt to dynamic network conditions more

effectively, leading to improved routing performance.

Additionally, future studies could focus on deploying

UF-GPSR in practical FANET environments to

assess its performance under realistic conditions.

Field trials and experiments would provide valuable

feedback on the protocol's effectiveness and

feasibility for diverse applications.

Charles (Charles, 2022) proposed a method to

enhance network quality and throughput while

simultaneously creating an energy-saving approach

with excellent quality of service. This paper

introduces an energy-efficient protocol designed to

develop a faster, miniaturized, and more efficient

routing method compared to existing protocols. The

proposed routing protocol, EEQRP, is evaluated and

compared using Network Simulator-2 (NS2). The

results demonstrate that EEQRP provides lower

average latency, greater power savings, and a higher

packet delivery rate than current protocols. Future

research could explore integrating machine learning

or artificial intelligence techniques into the EEQRP

protocol to enhance its decision-making capabilities

and further optimize energy efficiency.

Karp and Kung (Karp, Kung, et al. , 2003)

introduced GPSR (Greedy Perimeter Stateless

Routing) as an innovative routing protocol for

wireless datagram networks. GPSR makes

forwarding decisions based on the positions of routers

and the destinations of packets, utilizing greedy

forwarding with local topology information. When

greedy forwarding is not feasible, the protocol

switches to perimeter routing. GPSR is designed to

scale efficiently with the number of network

destinations, outperforming shortest-path and ad-hoc

routing protocols in this regard. In environments with

frequent topology changes due to mobility, GPSR

leverages local topology information to quickly adapt

and establish new routes. Extensive simulations in

mobile wireless networks demonstrate the scalability

of GPSR, especially in densely deployed wireless

networks. Future research could focus on

investigating the implementation and performance of

GPSR in real-world wireless network scenarios to

validate its effectiveness beyond simulations.

Additionally, exploring enhancements to GPSR to

improve its adaptability to dynamic network

conditions and optimize routing decisions in diverse

environments using machine learning techniques

could be highly beneficial.

Namdev (Namdev, Goyal, et al. , 2021) proposed

a Whale Optimization Algorithm-based Optimized

Link State Routing (WOA-OLSR) for Flying Ad-hoc

Networks (FANET) to address the challenges of

energy efficiency and communication security. The

study concluded that the WOA-OLSR

communication scheme offers a more efficient and

secure solution for FANETs, enhancing performance

and reliability in communication networks involving

drones and UAVs. Future improvements could

involve integrating machine learning or artificial

intelligence techniques with the WOA-OLSR

algorithm to further optimize routing decisions,

enhance network performance, and adapt to evolving

network dynamics in FANET environments.

Cappello (Cappello, , et al. , 2022) presented a

comprehensive framework that integrates Flying Ad

Hoc Networks (FANET) with 5G networks to

provide interconnected services that can be

sequenced, taking into account physical device

constraints and traffic flow requirements. Future

research could explore the integration of machine

learning algorithms to enhance the optimization

model for Virtual Function placement and chaining,

with the aim of further improving energy efficiency

and service satisfaction probabilities.

Hosseinzadeh (Hosseinzadeh, , et al. , 2023)

proposed a position forecast-based greedy perimeter

stateless routing approach called GPSR+ for Flying

Ad Hoc Networks (FANETs). This approach consists

of two main steps: neighbor discovery and a greedy

forwarding algorithm. In the neighbor discovery

phase, GPSR+ employs a position prediction

mechanism based on historical data. To predict

positions, weighted linear regression is utilized. Future

research could focus on enhancing this prediction

technique by incorporating lightweight machine

learning methods. Machine learning could provide

more accurate and adaptive position forecasts,

thereby improving the overall performance of GPSR+

in dynamic FANET environments. By leveraging

advanced machine learning algorithms, GPSR+ could

better handle the mobility and variability inherent in

FANETs, leading to more efficient routing decisions

and increased network reliability.

Future research should prioritize the development

of ad-hoc networks tailored for integrated UAV-UGV

systems. This includes the integration of advanced

technologies such as machine learning and artificial

intelligence with position-based protocols like GPSR.

Emphasis should be placed on validating these

solutions in practical scenarios to enhance the

network's performance, energy efficiency, and

adaptability. Given the highly dynamic nature of networks
involving UAVs and UGVs, position-based routing
protocols like GPSR are particularly well suited to
such applications.

INCOFT 2025 - International Conference on Futuristic Technology


3 BRIEF ABOUT GPSR AND DRL

GPSR is a routing protocol that offers Greedy and Perimeter Forwarding modes for efficient packet
delivery. By selecting the neighbor nearest to the destination and applying the right-hand rule when
necessary, GPSR aims to optimize routing paths. However, its sole reliance on distance metrics can
result in elevated delivery failures. Although GPSR assigns routes to nodes through a greedy
algorithm, it is associated with high overhead and a diminished delivery ratio beyond a certain threshold.
Notably, GPSR's suitability diminishes in urban settings characterized by local loops, limiting its
effectiveness in such environments (Abbas, Ahmed, et al., 2022), (Choi, Hussen, et al., 2018), (Sugranes,
Razi, et al., 2022), (Wen, Huang, et al., 2018). The GPSR algorithm includes two packet
forwarding techniques: greedy forwarding is the widely applied technique, while perimeter forwarding
is used where greedy forwarding cannot be applied (Houssaini, Zaimi, et al., 2017).

In GPSR, each packet is marked with its destination's location by its originator. When a node
receives a packet, it examines the geographic location of the destination and compares it with the positions
of its neighbouring nodes. The node then makes a locally optimal choice of next hop by selecting,
among its radio neighbors, the neighbor that is geographically closest to the packet's
destination. The process continues iteratively, with each node forwarding the packet
to a next hop that is closer to the destination, until the packet reaches its intended destination. This
method aims to reduce the number of hops needed for packet delivery and to refine the routing path based
on geographic location. In contrast, conventional routing relies on three basic
algorithms: Distance Vector (DV), Link State (LS), and Path Vector. In the DV method, every node
learns its route to a desired destination from details shared at regular intervals by its
neighbours, whereas in the LS approach nodes broadcast status changes across the whole
network topology. As stated by researchers, both the DV and LS strategies can suffer from minor
inaccuracies in a router's perceived network state, which may lead to routing loops or
connectivity issues. Moreover, the message complexity of the DV and LS routing algorithms
is affected by two factors: the rate of change of the network topology and the number of routers
operating within the routing domain (Karp, Kung, et al., 2003). Greedy forwarding is an effective technique,
but it may generate suboptimal routes in network topologies where packets must temporarily move farther
from the destination (Feng, Zhang, et al., 2016).
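The greedy step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator code: the function names `distance` and `greedy_next_hop`, the dictionary layout of the neighbor positions, and the sample coordinates are all assumptions made for the example.

```python
import math

def distance(a, b):
    """Euclidean distance between two (x, y, z) positions."""
    return math.dist(a, b)

def greedy_next_hop(current_pos, neighbor_positions, dest_pos):
    """Return the ID of the neighbor closest to the destination.

    Returns None when no neighbor is closer to the destination than the
    current node, i.e. the local maximum at which GPSR would switch to
    perimeter forwarding.
    """
    best_id = None
    best_dist = distance(current_pos, dest_pos)
    for node_id, pos in neighbor_positions.items():
        d = distance(pos, dest_pos)
        if d < best_dist:
            best_id, best_dist = node_id, d
    return best_id

# Hypothetical positions in meters: node IDs 4 and 7 are radio neighbors.
neighbors = {4: (40.0, 10.0, 30.0), 7: (90.0, 20.0, 0.0)}
next_hop = greedy_next_hop((0.0, 0.0, 30.0), neighbors, (120.0, 0.0, 0.0))
```

Returning `None` rather than raising keeps the fallback decision with the caller, which can then invoke whatever perimeter routine the protocol uses.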

DRL (Azar, et al., 2021) trains a computational agent to interact with an
environment so as to maximize cumulative reward through an iterative process. In the context of UAV and
UGV networks, DRL models are trained to make real-time packet routing decisions based on routing
tables and sensor data. The model considers factors such as the velocities and positions of the UAVs and UGVs,
obstacles, and other relevant environmental factors. It requires a suitable
representation of the state space. For real-time packet routing decisions, the state space consists of the
current positions, velocities, and orientations of the UAVs and UGVs.
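One possible encoding of such a state is a flat feature vector. The sketch below is an assumption for illustration only: the paper does not specify the layout, so the per-node dictionary keys (`position`, `velocity`, `heading`) and the ordering of features are hypothetical.

```python
def build_state(nodes):
    """Flatten position, velocity and orientation of each node into one list.

    Each node contributes 7 features: x, y, z, vx, vy, vz, heading.
    The layout is an illustrative assumption, not taken from the paper.
    """
    state = []
    for node in nodes:
        state.extend(node["position"])   # x, y, z in meters
        state.extend(node["velocity"])   # vx, vy, vz in meters per second
        state.append(node["heading"])    # orientation in radians
    return state

# Hypothetical UAV and UGV snapshots.
uav = {"position": (10.0, 20.0, 50.0), "velocity": (15.0, 0.0, 0.0), "heading": 0.0}
ugv = {"position": (12.0, 25.0, 0.0), "velocity": (0.0, 5.0, 0.0), "heading": 1.57}
state = build_state([uav, ugv])
```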

In summary, GPSR's dual approach of greedy forwarding and perimeter forwarding offers a
comprehensive solution for routing in dynamic ad-hoc networks. GPSR can be enhanced
by integrating DRL methodologies. DRL enables GPSR to learn optimal
routing schemes adaptively from the dynamic network state, which in turn
improves decision-making and routing performance. This unification of DRL with
GPSR forms a routing protocol suitable for bidirectional communication between UAVs and
UGVs.

4 PROPOSED METHODOLOGY

In designing a bidirectional network model for UAV and UGV communication, the unique
requirements of both networks are considered. The main objective is to create an integrated system that
facilitates seamless communication and collaboration between UAVs and UGVs. The design
considerations are as follows:

• Consider a network consisting of UAVs and UGVs, each equipped with a
Global Positioning System (GPS) receiver and a processor capable of data processing,
a transceiver, caching and storage.
• The proposed network consists of a set of ‘n’ flying nodes (UAV_i, i = 1, 2, 3, …, n)
and ‘m’ mobile ground nodes (UGV_j, j = 1, 2, 3, …, m).
• Let ID_UAVi and ID_UGVj be the unique identifiers assigned to UAV_i and
UGV_j respectively.
• The transmission radius of UAV_i and UGV_j is considered as R.
• Communication within the network of UAVs and UGVs is initiated using a
specially designed Hello packet tailored for the network.
• The positions of the flying nodes and the unmanned ground vehicles are
(x_i, y_i, z_i) and (x_j, y_j) respectively.

Seamless communication between UAV and UGV networks requires periodic updates of network
topology information, which can be achieved using Hello Packets. A Hello Packet is typically used in a
network to establish and maintain neighbor relationships between nodes and to exchange
information continuously (Aljabry and Suhail, 2022). In the proposed network, nodes are initially categorized
and identified as GP-AV for UAV nodes and GP-GV for UGV nodes. The GPSR method starts by
identifying neighbouring nodes through the transmission of hello packets. During the hello-packet
handshake, each node collects data about its neighbors, such as geographic position, validity
time and node type (GP-AV or GP-GV). This data helps the GPSR protocol form a local map of the
network topology, which is essential for packet-forwarding decisions. The neighbor
discovery process ensures that every node maintains an up-to-date table of nearby nodes and their
positions, as required for effective greedy forwarding. In greedy forwarding, every node
uses position details to forward data packets to the neighbor closest to the destination. If greedy
forwarding fails (e.g., when a packet reaches a local maximum with no neighbor closer to the destination),
the protocol switches to perimeter routing to navigate around obstacles and continue forwarding the packet
towards its destination. Each UAV and UGV, identified by ID_UAVi and ID_UGVj
respectively, disseminates a hello message within its transmission radius (R) to inform neighbouring
nodes, including UAVs and UGVs, of its remaining energy and position. The format of the
hello message is shown in Figure 1.

Source ID (IDS) | Node Type (NT) | Coordinate: UAV (x, y, z) / UGV (x, y) | Time Stamp (TS) | Current Energy (E) | ID Hear (IDH) | Validity Time (VT) | SI (Sharing Information) | Hello ID (IDH)

Figure 1: Hello Packet Format for Proposed Network.

The fields of the Hello Packet are as follows:

• Source ID (IDS): Unique identifier of the UAV/UGV sending the "Hello" packet.
• ID Hear (IDH): Identification information of the other UAVs/UGVs heard by the sender.
• Node type (NT): Indicates whether the hello packet was generated by a UAV or a UGV (IDP).
• Validity Time (VT): Time for which the hello packet is valid.
• Position (P): The current geographic coordinates (latitude, longitude, altitude) of the UAV/UGV.
• Sharing Information (SI): Information shared with other nodes.
• Time Stamp (TS): The time at which the "Hello" packet was sent.
• Energy (E): Residual energy of the UAV or UGV.
• IDH: Hello identification number.
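The field list above maps naturally onto a small record type. The following is a minimal sketch under stated assumptions: the class name `HelloPacket`, the attribute names, the Python types, and the `is_valid` helper are illustrative choices, since the paper lists the fields but not their encoding.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class HelloPacket:
    source_id: int                          # Source ID (IDS)
    node_type: str                          # Node type (NT): "UAV" or "UGV"
    position: Tuple[float, float, float]    # Position (P)
    time_stamp: float                       # Time Stamp (TS), in seconds
    energy: float                           # Energy (E), residual energy
    validity_time: float                    # Validity Time (VT), in seconds
    hello_id: int                           # Hello identification number
    heard_ids: List[int] = field(default_factory=list)  # ID Hear (IDH)
    sharing_info: str = ""                  # Sharing Information (SI)

    def is_valid(self, now: float) -> bool:
        """True while the packet is within its validity time (assumed semantics)."""
        return now - self.time_stamp <= self.validity_time

# Hypothetical hello packet from UAV node 2.
hello = HelloPacket(source_id=2, node_type="UAV", position=(10.0, 20.0, 50.0),
                    time_stamp=100.0, energy=0.8, validity_time=5.0, hello_id=1)
```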

Table 1: Neighbor table format of nodes.

Hello Message Information (ID, NT, Hello ID, Reg Time, Position, RE) | PP | VT
… | … | …

These hello packets contain node information such as identity, position, type, residual
energy, neighbours, and time stamp, and are broadcast at regular
intervals to notify other nodes of the sender's presence.
Based on the information extracted from the hello
messages received from adjacent UAVs and UGVs, each node
compiles a list of its neighbouring nodes. This
information is then registered in the node's neighbour
table, in the format shown in
Table 1. The neighbour table holds the following information about
adjacent nodes:

• ID: Unique ID of adjacent UAV/UGV.

• NT: Type of adjacent neighbor i.e.

UAV/UGV.


• Hello ID: Hello packet identification

no.

• Reg Time: Time at which

corresponding hello packet transmitted.

• Position: Geographical Position of

UAV/UGV.

• RE: Residual Energy of the

neighbouring UAV/UGV.

• Predicted Position (PP): Predicted

Position of UAV/UGV with the help of

DRL model.

• Validity Time (VT): Time for which

Hello Packet is valid. It is an indicator

of network life.
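A neighbour table with these columns could be maintained as in the sketch below. It is only an illustration under assumptions: the class name `NeighborTable`, the dictionary keys of the incoming hello message, and the expiry rule (an entry is dropped once its validity time has elapsed since registration) are not specified in the paper, and the predicted position is left empty because the DRL predictor is outside this example.

```python
class NeighborTable:
    """Per-node table of adjacent UAVs/UGVs, keyed by neighbor ID."""

    def __init__(self):
        self.entries = {}

    def update(self, hello):
        """Insert or refresh the entry for the sender of a hello message."""
        self.entries[hello["source_id"]] = {
            "NT": hello["node_type"],
            "Hello ID": hello["hello_id"],
            "Reg Time": hello["time_stamp"],
            "Position": hello["position"],
            "RE": hello["energy"],
            "PP": None,  # predicted position; a DRL model would fill this in
            "VT": hello["validity_time"],
        }

    def purge(self, now):
        """Remove entries whose validity time has elapsed; return their IDs."""
        expired = [nid for nid, e in self.entries.items()
                   if now - e["Reg Time"] > e["VT"]]
        for nid in expired:
            del self.entries[nid]
        return expired

# Hypothetical hello messages from a UGV (ID 12) and a UAV (ID 3).
table = NeighborTable()
table.update({"source_id": 12, "node_type": "UGV", "hello_id": 1,
              "time_stamp": 0.0, "position": (50.0, 0.0, 0.0),
              "energy": 0.9, "validity_time": 5.0})
table.update({"source_id": 3, "node_type": "UAV", "hello_id": 1,
              "time_stamp": 4.0, "position": (30.0, 40.0, 100.0),
              "energy": 0.7, "validity_time": 5.0})
expired = table.purge(now=6.0)
```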

Developing an ad-hoc network for bidirectional communication between coordinated UAVs and UGVs
involves a systematic approach to ensure effective collaboration between these autonomous systems. In a
network comprising UAVs and UGVs as nodes, communication needs to be established through
various channels, including UAV-UAV, UGV-UGV, and UAV-UGV. To establish communication
between the nodes, an integration of Greedy Perimeter Stateless Routing (GPSR) and Deep Reinforcement
Learning (DRL) is used. Initially, GPSR determines the routing paths based on the geographic
positions of the nodes and the destination. Each node in the network is represented by the ‘Node class’,
which comprises attributes such as a unique ID, coordinates (x, y, z) in 3D space, and a node type
indicating whether it is a UAV (i) or UGV (j).
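A node representation along these lines might look as follows. This is a sketch rather than the simulator's actual ‘Node class’: the attribute names, the helper `neighbors_in_range`, and the sample coordinates are assumptions, while the 150 m default radius is taken from the simulation parameters reported in the results section.

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    node_id: int
    x: float
    y: float
    z: float
    node_type: str  # "UAV" or "UGV"

    def distance_to(self, other):
        """Euclidean distance to another node in 3D space."""
        return math.dist((self.x, self.y, self.z), (other.x, other.y, other.z))

def neighbors_in_range(node, nodes, radius=150.0):
    """Nodes within the transmission radius, nearest first."""
    in_range = [n for n in nodes
                if n.node_id != node.node_id and node.distance_to(n) <= radius]
    return sorted(in_range, key=node.distance_to)

# Hypothetical topology: one UAV source and three candidate neighbors.
source = Node(2, 0.0, 0.0, 100.0, "UAV")
others = [Node(12, 50.0, 0.0, 0.0, "UGV"),
          Node(5, 200.0, 0.0, 100.0, "UAV"),
          Node(3, 30.0, 40.0, 100.0, "UAV")]
neighbor_ids = [n.node_id for n in neighbors_in_range(source, others)]
```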

The communication process is initiated by transmitting a hello packet. The routing
function then identifies the nearest neighbours based on Euclidean distance, which is essential for establishing
communication paths. To optimize routing decisions, the network uses an integration of GPSR and DRL.
GPSR makes greedy forwarding decisions, choosing the neighbor nearest to the destination as the next hop. In cases where
greedy forwarding is not possible due to obstacles or unreachable areas, GPSR switches to
perimeter forwarding mode. This determines the routing path, total distance, hop count, forwarding
strategy, and node types involved in the communication process. DRL augments GPSR by
continuously learning from the network conditions and environment. It adapts to changes in topology,
traffic, and obstacles, in turn enhancing communication efficiency and reliability. The DRL
model uses a Sequential architecture with dense layers to learn and fine-tune routing decisions.
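The paper does not give the layer sizes, activations, or training procedure of the DRL model, so the following dependency-free sketch only illustrates what a stack of dense layers producing action probabilities looks like. The layer widths (4, 8, 3), the ReLU and softmax activations, and the random untrained weights are all assumptions made for the example.

```python
import math
import random

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: outputs[k] = sum_i weights[k][i] * inputs[i] + biases[k]."""
    outputs = [sum(w * x for w, x in zip(row, inputs)) + b
               for row, b in zip(weights, biases)]
    if activation == "relu":
        outputs = [max(0.0, v) for v in outputs]
    return outputs

def softmax(values):
    """Numerically stable softmax that turns scores into probabilities."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
w1, b1 = init_layer(4, 8, rng)   # hidden dense layer
w2, b2 = init_layer(8, 3, rng)   # output layer: one score per candidate next hop

def policy(state):
    """Map a 4-feature state to probabilities over 3 candidate actions."""
    hidden = dense(state, w1, b1, activation="relu")
    return softmax(dense(hidden, w2, b2))

probs = policy([0.1, 0.4, 0.2, 0.9])
```

In a trained model the weights would come from the reinforcement learning loop; here they only demonstrate the forward pass.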

A simulator is developed to test the proposed communication protocol. A predefined function
chooses an action index based on the action probabilities predicted by the DRL model, supporting
decision-making during routing. The ‘angle function’ calculates angles between vectors and supports
Perimeter Forwarding, which is triggered if Greedy Forwarding fails. The ‘packet handling function’ divides
the message into individual words and generates a packet for each word. Meanwhile, the ‘network update
functions’ dynamically update the node coordinates and emulate the communication process by
printing information about nodes within the communication range.
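The three helpers described above could take roughly the following shape. The names `choose_action`, `angle_between` and `make_packets`, as well as the packet dictionary layout, are hypothetical; the simulator's real signatures are not given in the paper.

```python
import math
import random

def choose_action(action_probs, rng=random):
    """Sample an action index in proportion to the predicted probabilities."""
    return rng.choices(range(len(action_probs)), weights=action_probs, k=1)[0]

def angle_between(v1, v2):
    """Angle in radians between two non-zero vectors of equal dimension."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def make_packets(message):
    """Split a message into words and wrap each word in a numbered packet."""
    return [{"seq": i, "data": word}
            for i, word in enumerate(message.split(), start=1)]

packets = make_packets(
    "hello test message is transmitted between node two and node twelve")
right_angle = angle_between((1.0, 0.0), (0.0, 1.0))
action = choose_action([0.0, 1.0, 0.0])
```

Sampling from the probabilities, rather than always taking the most likely action, is one common way to keep some exploration in a reinforcement learning policy; whether the simulator samples or takes the maximum is not stated.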

Through the combined operation of GPSR and DRL, UAVs and UGVs establish robust
communication links, enabling seamless data exchange and coordination within the network. Simulating
coordinated communication between UAV and UGV networks involves designing and
testing a framework capable of interacting
and collaborating with both aerial and ground vehicles.

5 RESULTS

Testing of the proposed protocol is conducted in two

modes through the specifically developed simulator.

In the first mode, a data string is forwarded through

the network, and various network parameters, such as

hop count, latency are measured. In the second mode,

the number of transmitted data packets is varied, and

corresponding network parameters are calculated.

For simulation in mode one, suppose node 2 (a UAV) wishes to communicate with node
12 (a UGV), as illustrated in figure 2. The message
"hello test message is transmitted between node two and node twelve" is transmitted, segmented
as: “hello”, “test”, “message”, “is”, “transmitted”, “between”, “node”, “two”, “and”, “node” and
“twelve”. After the network is initialized, each node sends a hello packet to share information about itself.
When a packet is transmitted, the script prints routing details, including the forwarding strategy used (either
'Greedy' or 'Perimeter') and the types of nodes involved in the forwarding process.

Simulator Python

No of UAV Nodes 6

No of UGV Nodes 6

Mobility Model Random Way Point

Communication radius 150 meter

Speed of UGV Nodes 15-25 Km/hour

Speed of UAV nodes 60-90 Km/hour


Figure 2: Simulation environment for integrated UAV and

UGV network.

For each transmitted packet, the script reports the packet number and its
data, the path taken by the packet, and the hop count,
as shown in Table 2. It is
observed that the packets with sequence numbers 4 and 6
require the highest number of hops to transmit data
from node 2 to node 12. This indicates that these packets
become trapped in a loop, leading to what is known
as a deadlock loop. After each
packet transmission completes, the network parameters are
measured, as shown in
Table 3. Table 4 shows that, out of 11
packets, 8 were successfully delivered, while 3
experienced failures. The failures occurred with
the packets having Sr. No. 4, 7, and 10.

Table 2: Path of test message indicating nodes involved in the network

Packet | Data | Path Chosen | Hop count
1 | hello | [2, 4, 6, 4, 11, 9, 2, 1, 6, 2, 3, 10] |
2 | test | [2, 7, 1, 5, 6, 11, 9, 2, 8, 10] | 9
3 | message | [2, 10] | 1
4 | is | [2, 3, 9, 7, 4, 8, 6, 12, 2, 3, 12, 1, 11, 8, 9, 2, 12, 7, 12, 9, 3, 12, 11, 12, 3, 1, 5, 6, 1, 11, 8, 2, 7, 6, 12, 8, 4, 12, 6, 7, 12, 7, 2, 9, 6, 9, 7, 9, 12, 4, 7, 1, 12, 5, 12, 1, 12, 9, 4, 11, 2, 6, 12, 1, 11, 5, 10] |
5 | transmitted | [2, 1, 6, 3, 11, 2, 11, 1, 3, 2, 12, 9, 5, 10] |
6 | between | [2, 3, 6, 12, 7, 1, 9, 7, 9, 1, 8, 12, 8, 11, 1, 12, 11, 6, 1, 4, 9, 12, 11, 3, 2, 12, 5, 4, 3, 6, 9, 8, 5, 1, 8, 12, 1, 2, 8, 3, 7, 9, 12, 8, 7, 6, 10] |
7 | node | [2, 9, 4, 9, 2, 9, 12, 7, 3, 7, 8, 7, 1, 5, 9, 11, 12, 11, 4, 7, 4, 3, 12, 10] |
8 | two | [2, 3, 4, 12, 2, 7, 10] | 6
9 | and | [2, 9, 11, 2, 12, 11, 2, 11, 6, 4, 1, 11, 2, 11, 10] |
10 | node | [2, 3, 11, 2, 10] | 4
11 | twelve | [2, 9, 2, 11, 3, 12, 2, 5, 11, 12, 11, 1, 10] |
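In the rows of Table 2 that print a hop count, the value equals the number of nodes in the path minus one. The sketch below checks that relation on those rows and flags paths that revisit a node, which is how the looping behaviour discussed in the text shows up in the data. The function names are illustrative and the rule is inferred from the printed rows, not stated by the authors.

```python
def hop_count(path):
    """Number of hops in a path given as a list of node IDs."""
    return len(path) - 1

def revisits_node(path):
    """True when some node appears more than once, i.e. the path loops."""
    return len(set(path)) < len(path)

# Paths copied from the Table 2 rows that list a hop count.
paths = {
    2: [2, 7, 1, 5, 6, 11, 9, 2, 8, 10],
    3: [2, 10],
    8: [2, 3, 4, 12, 2, 7, 10],
    10: [2, 3, 11, 2, 10],
}
hops = {seq: hop_count(path) for seq, path in paths.items()}
loops = {seq: revisits_node(path) for seq, path in paths.items()}
```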

The delivery time fluctuates based on distance and

other network conditions. Packets with longer

distances (e.g., packet 6: 4.20 km, 5.25 sec) generally

took more time to deliver. The longest delivery time

was for packet 6 (5.25 sec) over a distance of 4.2 km.

Latency is generally equal to or slightly greater than

the delivery time for successful packets. Failed

packets showed higher latency values, indicating

potential retransmission. After concluding the

simulation, the outputs provide the final path

traversed by the packets.

Table 3: Performance of routing protocol in transmission of test message for each packet

Packet | Packet Data | Total Distance (km) | Delivery Status | Delivery time (sec) | Latency (sec)
1 | hello | 1.26 | Success | 1.62 | 1.62
2 | test | 0.94 | Success | 1.06 | 1.06
3 | message | 0.10 | Success | 0.15 | 0.157
4 | is | 5.75 | Failure | 7.04 | 7.04
5 | transmitted | 1.39 | Success | 1.51 | 1.51
6 | between | 4.20 | Success | 5.25 | 5.25
7 | node | 1.82 | Failure | 2.65 | 2.65
8 | two | 0.37 | Success | 0.89 | 0.89
9 | and | 0.98 | Success | 1.83 | 1.84
10 | node | 0.31 | Failure | 0.61 | 0.61
11 | twelve | 1.22 | Success | 1.54 | 1.55


Figure 3: Delivery Time Statistics

Additionally, it provides delivery time statistics

for successful packet transmissions, highlighting the

minimum, maximum, and average times required for

packet delivery, as shown in figure 3. The figure

illustrates that the average time for successful packet

delivery is less than 2 seconds, with the maximum

delivery time being 5.25 seconds for packet sequence

number 6, and the minimum delivery time being 0.61

seconds for packet sequence number 10. Finally, the

packet delivery results, including successful

deliveries, total number of packets, and packet

delivery ratio are obtained. Table 4 summarizes the

packet delivery results, showing that out of 11

transmitted packets, 8 were successfully delivered,

resulting in a packet delivery ratio of 0.72.

Table 4: Packet Delivery Results for Greedy Node Forwarding Strategy including UGV and UAV

Parameter                   Result
Successful Deliveries       8
Total Packets Transmitted   11
Packet Delivery Ratio       0.72
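
The figures in Table 4 follow directly from the Delivery Status column of Table 3. As a minimal sketch, the calculation could look like the following; the record layout and the function names are illustrative assumptions and are not taken from the authors' simulator. The exact ratio 8/11 is about 0.727, which the paper reports as 0.72.

```python
from statistics import mean

# (packet number, delivered?, delivery time in seconds), transcribed from Table 3.
TABLE_3 = [
    (1, True, 1.62), (2, True, 1.06), (3, True, 0.15), (4, False, 7.04),
    (5, True, 1.51), (6, True, 5.25), (7, False, 2.65), (8, True, 0.89),
    (9, True, 1.83), (10, False, 0.61), (11, True, 1.54),
]


def packet_delivery_ratio(records):
    """Fraction of transmitted packets whose status is Success."""
    if not records:
        raise ValueError("no packets were transmitted")
    delivered = sum(1 for _, ok, _ in records if ok)
    return delivered / len(records)


def successful_delivery_times(records):
    """Delivery times (sec) of the packets that were delivered."""
    return [t for _, ok, t in records if ok]


pdr = packet_delivery_ratio(TABLE_3)
times = successful_delivery_times(TABLE_3)
print(f"delivered {len(times)} of {len(TABLE_3)} packets, PDR = {pdr:.3f}")
print(f"mean successful delivery time = {mean(times):.2f} s, "
      f"max = {max(times):.2f} s")
```

The mean over the eight successful packets stays below 2 seconds, in line with the statement made for Figure 3.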

To evaluate the performance of the routing protocol in the second mode,
different numbers of packets are transmitted over the network and the
network parameters are evaluated. For experimentation, 5, 10, 15, 20, and 25
packets are transmitted between node 2 (a UAV node) and node 12 (a UGV node),
and the performance is evaluated with respect to maximum hop count, average
hop count, maximum latency, average latency, and PDR. From Figure 4, it is
observed that the maximum hop count reaches 34 when transmitting 25 data
packets, whereas it reduces to 18 when transmitting only 5 data packets. The
average hop count lies between 5 and 7.

Figure 4: Maximum and average hop count under different

number of data packets.

Figure 5: Maximum and average latency under different

number of data packets

Figure 5 shows that the maximum latency is 4.2 seconds when transmitting 25
data packets, while it decreases to 2.2 seconds for 5 data packets. The
average latency is approximately 1 second. The network achieves a PDR of at
least 0.75 across the varying numbers of packets, as shown in Figure 6.

Figure 6: PDR under different number of data packets

The results demonstrate that, across the varying numbers of packets, the
average hop count for transmitting data between UAV and UGV nodes lies
between 5 and 7 and the average latency remains below 2 seconds.

6 DISCUSSION

Designing a bidirectional network for UAVs and UGVs involves integrating GPSR
with DRL to enhance bidirectional communication. The network includes UAVs
and UGVs equipped with GPS and data processing capabilities, each identified
by a unique ID. Upon network initialization, every node transmits a hello
packet to exchange self-information; GPSR uses these hello packets to
establish neighbor relationships and build a local topology map for routing.
DRL augments GPSR by adapting to network changes, optimizing routing paths,
and improving communication efficiency. This integration supports seamless
data exchange and coordination in UAV-UGV networks. The technique proves
efficient, offering paths with fewer hops and thereby reducing end-to-end
delay. Final path selection involves evaluating each node for overall
end-to-end delay, considering its position, current energy, and timestamp.
The test message's path within the network indicates the nodes involved, the
packet data, the path taken, and the hop count.
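
The greedy part of GPSR described above can be sketched as follows. This is an illustrative reading under stated assumptions, not the authors' implementation: the `Neighbor` fields mirror the position, energy, and timestamp said to be exchanged in hello packets, while the 3-D coordinates, the `max_age` freshness threshold, and all names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Neighbor:
    node_id: int
    position: tuple   # (x, y, z) in metres; a UGV would have z = 0 (assumption)
    energy: float     # remaining energy advertised in the hello packet
    timestamp: float  # time (sec) at which the hello packet was received


def greedy_next_hop(own_position, destination, neighbors, now, max_age=2.0):
    """Return the fresh neighbor geographically closest to the destination.

    Returns None at a local maximum, i.e. when no fresh neighbor is closer to
    the destination than the current node; GPSR would then switch to
    perimeter forwarding. Energy is carried in the table but is not used by
    this purely geographic choice.
    """
    best, best_dist = None, math.dist(own_position, destination)
    for n in neighbors:
        if now - n.timestamp > max_age:  # stale hello entry, ignore it
            continue
        d = math.dist(n.position, destination)
        if d < best_dist:
            best, best_dist = n, d
    return best


# Hypothetical neighbor table of a UAV at (0, 0, 50) routing towards a UGV.
table = [
    Neighbor(3, (100.0, 0.0, 50.0), 0.9, timestamp=10.0),
    Neighbor(7, (300.0, 0.0, 0.0), 0.6, timestamp=10.5),
    Neighbor(9, (450.0, 0.0, 0.0), 0.8, timestamp=5.0),  # too old at now=11
]
hop = greedy_next_hop((0.0, 0.0, 50.0), (500.0, 0.0, 0.0), table, now=11.0)
stuck = greedy_next_hop((490.0, 0.0, 0.0), (500.0, 0.0, 0.0), table, now=11.0)
print(hop.node_id, stuck)
```

In this example node 9 is nearest to the destination but is skipped because its hello entry is stale, so node 7 is selected; the second call returns None because no fresh neighbor improves on the current node.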

Moreover, the routing protocol's performance in transmitting the test message
is assessed for each packet using diverse parameters. The evaluation reveals
variations in the total distance covered for successful delivery, with the
“between” packet traversing the longest distance at 4.2 km. The maximum hop
count for successful deliveries is 46, corresponding to the “between” packet,
and the least hop count is 1, corresponding to the “message” packet. Delivery
Status confirms successful delivery for all packets except the “is” packet
and the two “node” packets. Delivery Time exhibits variability, with the
“between” packet recording the longest delivery time among successful
packets at 5.25 seconds. From Table 3 it is seen that failed packets showed
higher latency values as compared to delivery time, indicating potential
retransmissions. The performance of the routing protocol during each packet
transmission is illustrated in Figure 6. Delivery time statistics highlight
the minimum, maximum, and average time required for packet delivery. The
network achieved a packet delivery ratio of 0.72 (refer to Table 4). The
integration of the GPSR protocol with the DRL technique demonstrates its
efficiency in the coordinated UAV and UGV network.
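
To make the idea of a learning component refining the next-hop choice more concrete, the sketch below uses tabular Q-learning as a deliberately simplified stand-in for a deep network. The discussion here does not specify the DRL architecture or reward function, so the class, the learning parameters, the reward values (-1 per hop, +10 on reaching the destination), and the intermediate node IDs are all assumptions made for illustration only.

```python
from collections import defaultdict


class NextHopQTable:
    """Tabular Q-learning over (current node, next hop) pairs."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.q = defaultdict(float)  # (node, next_hop) -> estimated value

    def best_value(self, node, neighbors):
        """Highest Q-value available from `node`; 0.0 if it has no neighbors."""
        return max((self.q[(node, n)] for n in neighbors), default=0.0)

    def update(self, node, next_hop, reward, next_neighbors):
        """Standard Q-learning update after forwarding node -> next_hop."""
        target = reward + self.gamma * self.best_value(next_hop, next_neighbors)
        key = (node, next_hop)
        self.q[key] += self.alpha * (target - self.q[key])

    def choose(self, node, neighbors):
        """Pick the neighbor with the highest learned value (no exploration)."""
        return max(neighbors, key=lambda n: self.q[(node, n)])


agent = NextHopQTable()
# Source is node 2 and destination is node 12, as in the experiments;
# nodes 3, 5 and 8 are hypothetical relays.
agent.update(node=2, next_hop=5, reward=-1.0, next_neighbors=[12])
agent.update(node=5, next_hop=12, reward=10.0, next_neighbors=[])  # delivered
agent.update(node=2, next_hop=5, reward=-1.0, next_neighbors=[12])
agent.update(node=2, next_hop=8, reward=-1.0, next_neighbors=[3])
print(agent.choose(2, [5, 8]))
```

After these four updates the relay that led to a delivery (node 5) carries a higher value than the one that did not (node 8), so it is preferred at node 2. A DRL agent would replace the table with a neural network over richer state such as position, energy, and timestamp.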

To further assess the performance of the routing protocol, different numbers
of packets are transmitted over the network and various network parameters
are analyzed. For the experiment, 5, 10, 15, 20, and 25 packets are
transmitted across the network, and the performance is evaluated in each
case. Figures 4 and 5 indicate that an increase in the number of packets,
which corresponds to a larger data size or to retransmissions due to packet
loss, can result in a higher hop count and increased latency. Figure 6 shows
that the Packet Delivery Ratio (PDR) is 0.88 for 25 data packets, while it is
0.8 and 0.75 for 15 and 20 data packets, respectively. This indicates that
retransmissions are higher for 15 and 20 data packets than for 25 data
packets. Additionally, the increase in PDR for 25 data packets can be
attributed to the network acquiring more detailed information about the
topology through the exchange of hello packets.

7 CONCLUSIONS

In this paper, a network is designed to test the proposed UAV-to-UGV
communication protocol. A simulator is designed, developed, and tested to
establish bidirectional communication between UAVs and UGVs. A test message
is segmented into eleven packets and transmitted between nodes 2 and 12 using
the integrated GPSR-DRL strategy to evaluate the performance of the simulated
network comprising UAVs and UGVs. By integrating DRL, UAVs and UGVs can learn
optimal routing strategies that adapt to changing environments and network
conditions, improving packet delivery rates and reducing communication
latency. The network's performance was evaluated based on the hop count,
delivery time, and success or failure of each packet transmission. Out of the
11 packets, 8 were successfully delivered, resulting in a packet delivery
ratio of 0.72. The analysis also provided detailed statistics on delivery
times, including minimum, maximum, and average values. By analyzing delivery
time data, network operators and engineers can predict a range of critical
performance parameters such as latency, congestion, and QoS. This data helps
to assess the reliability of the routing protocol in the simulated
environment: if the delivery time consistently increases for certain routes,
it may indicate suboptimal path selection. Moreover, sudden spikes in
delivery time may indicate underlying issues with network stability. A
benefit of unifying GPSR with DRL is the ability to evaluate historical and
real-time data to predict traffic flow, changes in network density, and

node mobility, enabling GPSR to make routing decisions and choose optimal
routes.

Table 2 indicates that some of the packets take five times more hops than the
total number of UAV or UGV nodes, suggesting that these packets are getting
trapped in a loop, a phenomenon known as a deadlock loop. This leads to
significantly higher network latency. There is scope for minimizing deadlock
loops to improve the efficiency of network resource utilization.
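
One simple way to see such loops in a recorded path is to look for repeated node IDs. The sketch below is illustrative only and is not the authors' method; the function names and the example path are hypothetical, apart from the source node 2 and destination node 12 used in the experiments. Note that revisiting a node is not always an error in perimeter-style forwarding, so a check like this is a diagnostic aid rather than a complete remedy.

```python
def find_first_loop(path):
    """Return (first_index, repeat_index) of the first revisited node, or None.

    `path` is the ordered list of node IDs that a packet has traversed.
    """
    last_seen = {}
    for i, node in enumerate(path):
        if node in last_seen:
            return last_seen[node], i
        last_seen[node] = i
    return None


def strip_loops(path):
    """Return the path with every cycle cut out, keeping a loop-free route."""
    cleaned = []
    index = {}  # node ID -> position in `cleaned`
    for node in path:
        if node in index:
            # Cut back to the earlier visit and forget the nodes on the cycle.
            cut = index[node]
            for dropped in cleaned[cut + 1:]:
                del index[dropped]
            del cleaned[cut + 1:]
        else:
            index[node] = len(cleaned)
            cleaned.append(node)
    return cleaned


recorded = [2, 5, 7, 5, 7, 5, 9, 12]  # hypothetical looping path
loop = find_first_loop(recorded)
route = strip_loops(recorded)
print(loop, route, len(recorded) - 1, len(route) - 1)
```

For this example the packet bounces between nodes 5 and 7 before escaping; removing the cycles reduces the hop count from 7 to 3.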

The integration of GPSR with DRL in designing

a routing protocol for bidirectional communication

between UAVs and UGVs marks an innovative

development in network communication, bringing

intelligence and improved adaptability. This method

not only deals with the limitations of conventional

routing protocols in dynamic and unpredictable

environments but also unlocks new opportunities for

more efficient and reliable communication in UAV-

UGV networks. By continuously learning from the

environment and adapting to the changing conditions,

this DRL-based routing protocol ensures flawless and

resilient communication between aerial and ground

vehicles, even in dynamic scenarios.

REFERENCES

A. A. Laghari, A. K. Jumani, R. A. Laghari, and H. Nawaz,

“Unmanned aerial vehicles: A review,” Cognitive

Robotics, vol. 3, pp. 8–22, 2023, doi:

10.1016/j.cogr.2022.12.004.

F. Ahmed, J. C. Mohanta, A. Keshari, and P. S. Yadav,

“Recent Advances in Unmanned Aerial Vehicles: A

Review,” Arab J Sci Eng, vol. 47, no. 7, pp. 7963–7984,

Jul. 2022, doi: 10.1007/s13369-022-06738-0.

A. Sharma et al., “Communication and networking

technologies for UAVs: A survey,” Journal of Network

and Computer Applications, vol. 168, p. 102739, Oct.

2020, doi: 10.1016/j.jnca.2020.102739.

N. H. Hussein, C. T. Yaw, S. P. Koh, S. K. Tiong, and K.

H. Chong, “A Comprehensive Survey on Vehicular

Networking: Communications, Applications,

Challenges, and Upcoming Research Directions,” IEEE

Access, vol. 10, pp. 86127–86180, 2022, doi:

10.1109/ACCESS.2022.3198656.

Smt. T. R, Dr. S. D. K. A, and S. P M., “Energy Efficient

Routing Protocols for Mobile Robots: A Review,”

Saudi J. Eng. Technol., vol. 8, no. 07, pp. 163–170, Jul.

2023, doi: 10.36348/sjet.2023.v08i07.002.

Y. Altshuler, A. Pentland, and A. M. Bruckstein, “Optimal
Dynamic Coverage Infrastructure for Large-Scale
Fleets of Reconnaissance UAVs,” in Swarms and
Network Intelligence in Search, Studies in
Computational Intelligence, vol. 729, Cham: Springer
International Publishing, 2018, pp. 207–238. doi:
10.1007/978-3-319-63604-7_8.

F. Rubio, F. Valero, and C. Llopis-Albert, “A review of

mobile robots: Concepts, methods, theoretical

framework, and applications,” International Journal of

Advanced Robotic Systems, vol. 16, no. 2, p.

172988141983959, Mar. 2019, doi:

10.1177/1729881419839596.

J. Gielis, A. Shankar, and A. Prorok, “A Critical Review of

Communications in Multi-robot Systems,” Curr Robot

Rep, vol. 3, no. 4, pp. 213–225, Aug. 2022, doi:

10.1007/s43154-022-00090-9.

M. Hua, Y. Wang, Q. Wu, H. Dai, Y. Huang, and L. Yang,

“Energy-Efficient Cooperative Secure Transmission in

Multi-UAV-Enabled Wireless Networks,” IEEE Trans.

Veh. Technol., vol. 68, no. 8, pp. 7761–7775, Aug.

2019, doi: 10.1109/TVT.2019.2924180.

J. Sánchez-García, D. G. Reina, and S. L. Toral, “A

distributed PSO-based exploration algorithm for a

UAV network assisting a disaster scenario,” Future

Generation Computer Systems, vol. 90, pp. 129–148,

Jan. 2019, doi: 10.1016/j.future.2018.07.048.

F. Salazar et al., “Drone Collaboration Using OLSR
Protocol in a FANET Network for Traffic Monitoring
in a Smart City Environment,” in CSEI: International
Conference on Computer Science, Electronics and
Industrial Engineering (CSEI), M. V. Garcia
and C. Gordón-Gallegos, Eds., Lecture Notes in
Networks and Systems, vol. 678, Cham: Springer
Nature Switzerland, 2023, pp. 278–295. doi:
10.1007/978-3-031-30592-4_20.

S. Kumar, R. S. Raw, A. Bansal, and P. Singh, “UF-GPSR:
Modified geographical routing protocol for flying
ad-hoc networks,” Trans Emerging Tel Tech, vol. 34, no.
8, p. e4813, Aug. 2023, doi: 10.1002/ett.4813.

A. Charles, “Energy efficient quality routing protocol for

WSNs,” JWCN, vol. 11, no. 1, p. 9, 2022, doi:

10.26634/jwcn.11.1.18935.

B. Karp and H. T. Kung, “GPSR: greedy perimeter stateless

routing for wireless networks,” in Proceedings of the

6th annual international conference on Mobile

computing and networking, Boston Massachusetts

USA: ACM, Aug. 2000, pp. 243–254. doi:

10.1145/345910.345953.

M. Namdev, S. Goyal, and R. Agarwal, “An Optimized

Communication Scheme for Energy Efficient and

Secure Flying Ad-hoc Network (FANET),” Wireless

Pers Commun, vol. 120, no. 2, pp. 1291–1312, Sep.

2021, doi: 10.1007/s11277-021-08515-y.

G. M. Cappello et al., “Optimizing FANET Lifetime for 5G

Softwarized Network Provisioning,” IEEE Trans.

Netw. Serv. Manage., vol. 19, no. 4, pp. 4629–4649,

Dec. 2022, doi: 10.1109/TNSM.2022.3193883.

M. Hosseinzadeh et al., “A greedy perimeter stateless

routing method based on a position prediction

mechanism for flying ad hoc networks,” Journal of

King Saud University - Computer and Information

Sciences, vol. 35, no. 8, p. 101712, Sep. 2023, doi:

10.1016/j.jksuci.2023.101712.

A. H. Abbas, A. J. Ahmed, and S. A. Rashid, “A Cross-

Layer Approach MAC/NET with Updated-GA

(MNUG-CLA)-Based Routing Protocol for VANET

Network,” WEVJ, vol. 13, no. 5, p. 87, May 2022, doi:

10.3390/wevj13050087.

S.-C. Choi, H. R. Hussen, J.-H. Park, and J. Kim,

“Geolocation-Based Routing Protocol for Flying Ad

Hoc Networks (FANETs),” in 2018 Tenth International

Conference on Ubiquitous and Future Networks

(ICUFN), Prague, Czech Republic: IEEE, Jul. 2018, pp.

50–52. doi: 10.1109/ICUFN.2018.8436724.

A. Rovira-Sugranes, A. Razi, F. Afghah, and J. Chakareski,

“A review of AI-enabled routing protocols for UAV

networks: Trends, challenges, and future outlook,” Ad

Hoc Networks, vol. 130, p. 102790, May 2022, doi:

10.1016/j.adhoc.2022.102790.

S. Wen, C. Huang, X. Chen, J. Ma, N. Xiong, and Z. Li,

“Energy-efficient and delay-aware distributed routing

with cooperative transmission for Internet of Things,”

Journal of Parallel and Distributed Computing, vol.

118, pp. 46–56, Aug. 2018, doi:

10.1016/j.jpdc.2017.08.002.

Z. S. Houssaini, I. Zaimi, M. Oumsis, and S. E. A. Ouatik,

“GPSR+Predict: An Enhancement for GPSR to Make

Smart Routing Decision by Anticipating Movement of

Vehicles in VANETs,” Adv. sci. technol. eng. syst. j.,

vol. 2, no. 3, pp. 137–146, Apr. 2017, doi:

10.25046/aj020318.

Z. Liu, X. Feng, J. Zhang, T. Li, and Y. Wang, “An

Improved GPSR Algorithm Based on Energy Gradient

and APIT Grid,” Journal of Sensors, vol. 2016, pp. 1–

7, 2016, doi: 10.1155/2016/2519714.

A. T. Azar et al., “Drone Deep Reinforcement Learning: A

Review,” Electronics, vol. 10, no. 9, p. 999, Apr. 2021,

doi: 10.3390/electronics10090999.

I. A. Aljabry and G. A. Al-Suhail, “Improving the Route
Selection for Geographic Routing Using Fuzzy-Logic
in VANET,” in Intelligent Computing & Optimization,
P. Vasant, I. Zelinka, and G.-W. Weber, Eds.,
Lecture Notes in Networks and Systems, vol. 371,
Cham: Springer International Publishing, 2022, pp.
958–967. doi: 10.1007/978-3-030-93247-3_91.
