
as networks grow increasingly complex, traditional
methods struggle to converge due to their non-linear
and non-convex characteristics (Li et al., 2021). Non-
linear ACOPF problems are often approximated us-
ing linearized DCOPF solutions to derive real power
outcomes, where voltage angles and reactive power
flows are eliminated through substitution (thus re-
moving Alternating Current (AC) electrical behav-
ior). This approximation, however, becomes invalid
under heavy loading conditions in power grids (Ow-
erko et al., 2020). Additionally, the OPF problem
is inherently non-convex because of the sinusoidal
nature of electrical generation (Wood et al., 2013).
Alternative techniques seek to approximate the OPF
solution by relaxing this non-convex constraint, employing
methods such as Second Order Cone Programming (SOCP)
(Wood et al., 2013). In daily operations, where OPF must
be solved within a minute every five minutes, TSOs are
compelled to depend on
linear approximations. The solutions derived from
these approximations tend to be inefficient, resulting
in power wastage and the overproduction of hundreds
of megatons of CO2-equivalent annually. Today, fifty
years after the problem was first formulated, we still
lack a fast, robust solution technique for the complete
Alternating Current Optimal Power Flow (Mary et al.,
2012). For large and intricate power system networks
with numerous variables and constraints, achieving
the optimal solution for real-time OPF in a timely
manner demands substantial computing power (Pan
et al., 2022), which continues to pose a significant
challenge.
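To make the linearization discussed above concrete, the following minimal sketch solves a DC power flow on a hypothetical 3-bus network and then compares the linear flow term with the sinusoidal term it approximates. The network, the susceptance and injection values, and the names `dc_power_flow` and `linearization_error` are illustrative assumptions, not taken from the cited works. Line resistances and voltage magnitudes are ignored, as in the standard DC approximation.

```python
import numpy as np

def dc_power_flow(n_bus, lines, b, p_inj, slack=0):
    """Solve B * theta = P for bus angles (rad) and return line flows (p.u.).

    lines: list of (from_bus, to_bus); b: susceptance of each line (p.u.);
    p_inj: net real power injection at each bus (p.u.), summing to zero.
    """
    # Build the bus susceptance matrix of the linearized model.
    B = np.zeros((n_bus, n_bus))
    for (i, j), bij in zip(lines, b):
        B[i, i] += bij
        B[j, j] += bij
        B[i, j] -= bij
        B[j, i] -= bij
    # Fix the slack angle at zero and solve the reduced linear system.
    keep = [k for k in range(n_bus) if k != slack]
    theta = np.zeros(n_bus)
    theta[keep] = np.linalg.solve(B[np.ix_(keep, keep)], np.asarray(p_inj)[keep])
    # Linearized flow on each line: b_ij * (theta_i - theta_j).
    flows = np.array([bij * (theta[i] - theta[j]) for (i, j), bij in zip(lines, b)])
    return theta, flows

# Hypothetical 3-bus triangle: two generators (buses 0, 1) and one load (bus 2).
lines = [(0, 1), (0, 2), (1, 2)]
b = [10.0, 10.0, 10.0]
p_inj = [0.5, 0.5, -1.0]
theta, flows = dc_power_flow(3, lines, b, p_inj)

def linearization_error(delta):
    """Relative error of replacing sin(delta) by delta in the flow equation."""
    return abs(delta - np.sin(delta)) / np.sin(delta)

light = linearization_error(0.05)  # small angle difference (light loading)
heavy = linearization_error(0.8)   # large angle difference (heavy loading)
```

The error of the small-angle substitution grows with the angle difference across a line, which is one way to see why the linear model degrades as lines become heavily loaded.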
In power systems, as in many other fields, machine learning
(ML) algorithms have recently begun to be used. The latest
proposals employ Graph Neural Networks (GNNs), a class of
neural networks that naturally process graph data
(Liao et al., 2022). An increasing number of tasks in
power systems are being addressed with GNNs, including
time series prediction of loads and RES, fault diagnosis,
scenario generation, operational control, and more
(Diehl, 2019). The primary
advantage is that, by treating power grids as graphs,
GNNs can be trained on specific grid topologies and
subsequently applied to different ones, thereby
generalizing results (Liao et al., 2022). Deep
Reinforcement Learning (DRL), in turn, is recognized
for its ability to tackle complex decision-making problems in a
computationally efficient, scalable, and flexible man-
ner—problems that would otherwise be numerically
intractable (Li et al., 2021). It is regarded as one of the
state-of-the-art frameworks in Artificial Intelligence
(AI) for addressing sequential decision-making challenges
(Munikoti et al., 2024). The DRL-based approach seeks to
progressively learn how to optimize power flow in
electrical networks and dynamically identify the optimal
operating point. While some approaches use various DRL
algorithms, none have integrated them with GNNs, which
limits their ability to generalize and to fully exploit
information about the
connections between buses and the properties of
the electrical lines that connect them. Given this con-
text, and considering that the combination of DRL
and GNN has demonstrated improvements in general-
izability and reductions in computational complexity
in other domains (Munikoti et al., 2024), we explore
their implementation in this work.
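The claim that a GNN trained on one topology can be applied to another follows from weight sharing: the layer parameters depend only on the feature dimensions, not on the number of buses or lines. The sketch below is a generic mean-aggregation message-passing layer written with NumPy, not the architecture of this paper. The feature choice (two values per bus), the layer sizes, and the function name `message_passing_layer` are assumptions made for illustration.

```python
import numpy as np

def message_passing_layer(x, edges, w_self, w_nbr):
    """One message-passing step with mean aggregation over neighbours.

    x: (n_nodes, d_in) node features; edges: list of undirected (i, j) pairs;
    w_self, w_nbr: (d_in, d_out) weights shared by every node.
    """
    n = x.shape[0]
    agg = np.zeros_like(x)
    deg = np.zeros(n)
    for i, j in edges:
        # Each bus receives the features of the buses it is connected to.
        agg[i] += x[j]
        agg[j] += x[i]
        deg[i] += 1
        deg[j] += 1
    agg /= np.maximum(deg, 1)[:, None]  # mean; guards against isolated nodes
    return np.maximum(0.0, x @ w_self + agg @ w_nbr)  # ReLU activation

rng = np.random.default_rng(0)
w_self = rng.normal(size=(2, 4))  # the same weights are reused on both grids
w_nbr = rng.normal(size=(2, 4))

# Hypothetical 3-bus ring and 4-bus chain, two features per bus.
x3, edges3 = rng.normal(size=(3, 2)), [(0, 1), (0, 2), (1, 2)]
x4, edges4 = rng.normal(size=(4, 2)), [(0, 1), (1, 2), (2, 3)]

h3 = message_passing_layer(x3, edges3, w_self, w_nbr)
h4 = message_passing_layer(x4, edges4, w_self, w_nbr)
```

Both calls use identical weights, so adding or removing a bus or a line changes only the inputs, not the model.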
Contribution: This paper proposes a novel architecture
that integrates the Proximal Policy Optimization (PPO)
algorithm with Graph Neural Networks to address the
Optimal Power Flow problem. To the best of our
knowledge, this architecture has not been
previously applied to this problem. Our objective
is to rigorously test the design of our architecture,
demonstrating its capability to solve the optimization
problem by effectively learning the internal dynam-
ics of the power network. Additionally, we aim to
evaluate its ability to generalize to new scenarios that
were not encountered during the training process. We
compare our solution against the DCOPF in terms of cost
after training our DRL agent on the IEEE 30-bus system.
Across various modifications of the base network,
including changes in the number of edges and loads, our
approach yields lower generation costs than the DCOPF,
with reductions of up to 30%.
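For readers unfamiliar with Proximal Policy Optimization, the sketch below computes its standard clipped surrogate objective for a batch of actions. It is a generic NumPy illustration, not the training code of this work. The function name `ppo_clip_objective`, the toy inputs, and the clipping value of 0.2 (a commonly used default) are assumptions.

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective of PPO, averaged over a batch.

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    ratio = np.exp(np.asarray(logp_new) - np.asarray(logp_old))
    adv = np.asarray(advantages)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # Taking the minimum removes the incentive to move the policy far
    # from the one that collected the data.
    return float(np.mean(np.minimum(unclipped, clipped)))

# Toy batch: probability ratios of 1.5 and 0.5 with opposite-sign advantages.
logp_old = np.zeros(2)
logp_new = np.log([1.5, 0.5])
advantages = [1.0, -1.0]
objective = ppo_clip_objective(logp_new, logp_old, advantages)
```

In an OPF setting the advantages would be derived from a reward related to generation cost, but the reward design is not specified in this section and is left out of the sketch.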
2 RELATED WORK
To the best of our knowledge, no prior solution to the
OPF problem has combined GNNs, which handle graph-type
data, with DRL to enable generalization and to capture
the internal dynamics of the power grid.
Nevertheless, methods can be found in the literature
that employ each of the approaches independently.
Data-driven methods based on deep learning have
been introduced to solve OPF in approaches such as
(Owerko et al., 2020), (Donon et al., 2019), (Donon
et al., 2020), (Pan et al., 2022), and (Donnot et al.,
2017), among others. However, these approaches re-
quire a substantial amount of historical data for train-
ing and necessitate the collection of extensive data
whenever there is a change in the grid. Conversely,
the DRL-based approach aims to gradually learn how
to optimize power flow in electrical networks and dy-
namically identify the optimal operating point. Ap-
proaches like (Zhen et al., 2022), (Li et al., 2021),
DATA 2025 - 14th International Conference on Data Science, Technology and Applications