
5.2 Comparison of Loss in DQN and Rainbow DQN
The optimized policy π* learned with Rainbow DQN significantly outperforms the traditional DQN scheduler, reducing loss as shown in Figure 4. This translates into improved execution times, CPU/memory utilization, and job completion times. By modeling the distribution of execution times P(T | s, a) rather than a point estimate, Rainbow DQN handles execution-time uncertainty and adapts dynamically to real-world variability, surpassing deterministic approaches. Its reward curve is also promising, indicating effective learning: the agent recovers from the negative rewards of early training toward near-optimal performance (Figure 5).
Figure 5: Reward for each step for γ = 1.0
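To make the distributional view concrete, the following is a minimal PyTorch sketch of the kind of categorical (C51-style) value head that Rainbow DQN combines with its other extensions: instead of the single scalar Q(s, a) that plain DQN regresses with a mean-squared or Huber loss, the network predicts a probability mass over a fixed support of return values and is trained with a cross-entropy loss against the projected Bellman target. The state dimension, layer widths, atom count, and support range below are illustrative assumptions, not the configuration used in these experiments, and the Bellman projection step is omitted for brevity.

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters (assumptions, not the paper's settings).
N_ATOMS, V_MIN, V_MAX = 51, -10.0, 10.0
support = torch.linspace(V_MIN, V_MAX, N_ATOMS)  # fixed return atoms z_i

class DistributionalQ(nn.Module):
    """Outputs a probability mass over N_ATOMS return values per action,
    instead of a single expected Q(s, a) as in vanilla DQN."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.n_actions = n_actions
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions * N_ATOMS),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        logits = self.net(s).view(-1, self.n_actions, N_ATOMS)
        return torch.softmax(logits, dim=-1)  # per-action distribution

def categorical_loss(pred_dist: torch.Tensor, target_dist: torch.Tensor):
    # Cross-entropy between the Bellman-projected target distribution and
    # the predicted one; this replaces the MSE/Huber loss of plain DQN.
    # Both tensors are (batch, N_ATOMS) for the actions actually taken.
    return -(target_dist * pred_dist.clamp(min=1e-8).log()).sum(-1).mean()

def q_values(dist: torch.Tensor) -> torch.Tensor:
    # Expected return recovered from the distribution, used to pick actions.
    return (dist * support).sum(-1)
```

It is this distributional output that lets the learned policy represent uncertainty over outcomes, such as variable job execution times, rather than committing to a single point estimate.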
6 CONCLUSION AND FUTURE DIRECTION
This study demonstrates the effectiveness of DRL in enhancing Apache Spark scheduling within cloud environments. By modeling uncertainties in job execution times and dynamically adapting resource allocations, DDRL-based policies significantly improved cost efficiency and performance metrics. The reward pattern highlights Rainbow DQN's ability to optimize both immediate and long-term rewards, despite occasional drops caused by the random exploration steps of the epsilon-greedy policy (sketched below). Experiments showed that DDRL outperformed DQN in minimizing loss and improving resource utilization. The practical implications include cost savings through efficient resource allocation, operational efficiency via automated management, and scalability to meet evolving computational demands. Leveraging DRL for Spark scheduling thus offers a promising path toward more efficient distributed computing in cloud environments.

Future work will improve algorithm convergence, refine reward functions, handle dynamic workloads, and incorporate factors such as network bandwidth and disk I/O. Automated hyperparameter optimization will reduce tuning complexity and enhance DRL performance. Additionally, further exploration of DRL algorithms, deeper integration with cloud-native features, and broader applicability across diverse Spark workloads and infrastructures will be prioritized.
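As an illustration of the exploration mechanism behind the occasional reward drops noted above, the following is a minimal sketch of epsilon-greedy action selection with an assumed exponential decay schedule; the epsilon parameters used in the actual experiments are not specified here.

```python
import random

def epsilon_greedy(q_values, epsilon: float, n_actions: int) -> int:
    """With probability epsilon, explore a random action; such random
    scheduling choices can momentarily pick a poor resource allocation,
    which is the source of the occasional reward dips."""
    if random.random() < epsilon:
        return random.randrange(n_actions)  # explore
    return max(range(n_actions), key=lambda a: q_values[a])  # exploit

def decayed_epsilon(step: int, eps_start=1.0, eps_end=0.05, decay=0.995) -> float:
    # Exponential decay toward eps_end (assumed schedule), so exploration
    # and the reward dips it causes become rarer as training progresses.
    return max(eps_end, eps_start * decay ** step)
```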