be more appropriate to add support for, and/or switch to,
a different format for specifying job properties over
time? Support for user functionality at the compiled-code
level, typically in the form of user objects
implementing a defined programming interface, is not
currently implemented in the generator, nor was this
form of support anticipated when the generator was
designed. Implementing it would require major changes
to the generator's basic structure and would bring
inherent disadvantages, such as the need to write user
functionality directly as Java code and the resulting
dependence of user code on programming interfaces
defined by the generator. Possible programming
support will therefore be the subject of future
in-depth analysis.
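As an illustration of the coupling concern raised above, compiled-code extensibility would most likely revolve around a small plug-in interface defined by the generator. The names below (JobPropertySampler, LogNormalRuntimeSampler) are hypothetical and are not part of the current generator; the sketch only shows why user code would have to be written in Java and depend on generator-defined interfaces:

```java
import java.util.Random;

// Hypothetical plug-in interface a user class would implement to supply
// custom values for one job property (e.g. runtime or GPU-memory demand).
interface JobPropertySampler {
    double sample(Random rng);
}

// Example user implementation: runtimes drawn from a log-normal
// distribution, exp(mu + sigma * Z) with Z standard normal.
class LogNormalRuntimeSampler implements JobPropertySampler {
    private final double mu;
    private final double sigma;

    LogNormalRuntimeSampler(double mu, double sigma) {
        this.mu = mu;
        this.sigma = sigma;
    }

    @Override
    public double sample(Random rng) {
        return Math.exp(mu + sigma * rng.nextGaussian());
    }
}
```

Note that the user class compiles against the generator's interface, so any change to that interface would break existing user extensions; this is exactly the dependence described in the text.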
5 CONCLUSIONS
This paper presents a new dataset generator designed
to evaluate and compare the performance of job
scheduling algorithms on modern GPU clusters with
MIG technology support. The properties of the
generated datasets, including their sizes, can be easily
set via the generator's configuration parameters. The
generated datasets are reproducible and publicly
available, as is the generator's source code.
At present, the generator supports only a few basic
probability distributions. In the future, we are
considering extending the generator to support more
complex specifications of the generated jobs'
properties, including the definition of histograms,
user-defined configuration items and functionality,
and the specification of how the rate of incoming
jobs varies over the monitored period. An open
problem is how to implement such support while
preserving the generator's user-friendliness, good
maintainability, and reasonable complexity.
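For instance, histogram support could plausibly be realized as weighted-bin sampling: the user specifies bin bounds and weights, a bin is chosen in proportion to its weight, and a value is drawn uniformly within it. The class and configuration shape below are an assumption for illustration, not the generator's actual design:

```java
import java.util.Random;

// Minimal sketch of sampling a job property from a user-defined histogram.
// Each bin i covers [lower[i], upper[i]) and carries a relative weight;
// bins are selected via the cumulative weight array, then a value is
// drawn uniformly inside the selected bin.
class HistogramSampler {
    private final double[] lower;
    private final double[] upper;
    private final double[] cumWeight;

    HistogramSampler(double[] lower, double[] upper, double[] weight) {
        this.lower = lower;
        this.upper = upper;
        this.cumWeight = new double[weight.length];
        double sum = 0;
        for (int i = 0; i < weight.length; i++) {
            sum += weight[i];
            cumWeight[i] = sum;
        }
    }

    double sample(Random rng) {
        // Pick a bin in proportion to its weight...
        double u = rng.nextDouble() * cumWeight[cumWeight.length - 1];
        int i = 0;
        while (cumWeight[i] < u) i++;
        // ...then draw uniformly within that bin.
        return lower[i] + rng.nextDouble() * (upper[i] - lower[i]);
    }
}
```

The same mechanism could approximate time-varying job-arrival rates by letting the bins partition the monitored period and the weights encode the arrival frequency in each interval.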
ACKNOWLEDGEMENTS
This article was written with the financial support of
the Grant Agency of the University of South
Bohemia.