Table 4: Number of requests pertaining to the proposed method.

Page            Requests    Page                        Requests
backslash1      3           js6_sq_combo1               4
basic           2           js_script_close             13
basic_in_tag    2           oneclick1                   24
doubq1          6           onmouseover                 9
enc2            33          onmouseover_div_unquoted    6
full1           2           onmouseover_unquoted        8
js3             1           rs1                         2
js3_notags      1           textarea1                   4
js4_dq          6           textarea2                   4
js6_sq          4
be a solution. More work is still needed to enhance
training environments.
5 CONCLUSION
This paper presents an XSS vulnerability testing method based on RL,
together with a training environment, XSS Gym. The proposed method
trains an RL agent to autonomously compose test strings by replacing
fragments of known test strings and observing how the target web page
is parsed. Because RL learns an efficient policy for composing test
strings, the number of requests needed to test a web page decreases
drastically. The experimental results demonstrate that an RL agent can
be trained using XSS Gym and that the proposed method discovers
vulnerabilities in web pages with fewer requests than the existing
vulnerability testing tools it was compared with.
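
The fragment-replacement loop summarized above can be illustrated with a minimal sketch. It is not the interface of XSS Gym: the fragment lists, the toy target page, the parse check, the reward values, and all names (`FRAGMENTS`, `toy_page`, `escaped_attribute`, `ToyXssEnv`) are hypothetical and chosen only for illustration. The learning algorithm itself is omitted; a trained policy would select the `(slot, index)` actions.

```python
from dataclasses import dataclass, field

# Hypothetical fragment vocabulary: each slot of a test string can be
# replaced independently. Values are illustrative, not from the paper.
FRAGMENTS = {
    "prefix": ["", '">', "'>"],
    "body": ["<script>alert(1)</script>", "<img src=x onerror=alert(1)>"],
}
SLOTS = list(FRAGMENTS)


def toy_page(user_input: str) -> str:
    """Stand-in target: reflects input inside a double-quoted attribute."""
    return f'<input value="{user_input}">'


def escaped_attribute(html: str) -> bool:
    """Crude parse check: did the input close the attribute and open a tag?"""
    return '"><script>' in html or '"><img ' in html


@dataclass
class ToyXssEnv:
    # Index of the currently selected fragment for every slot.
    choice: dict = field(default_factory=lambda: {s: 0 for s in SLOTS})
    requests: int = 0

    def test_string(self) -> str:
        return "".join(FRAGMENTS[s][self.choice[s]] for s in SLOTS)

    def step(self, slot: str, index: int):
        """Replace one fragment, send one request, observe the response."""
        self.choice[slot] = index
        self.requests += 1
        html = toy_page(self.test_string())
        done = escaped_attribute(html)
        reward = 1.0 if done else -0.1  # assumed cost for each request
        return html, reward, done
```

Here the per-request penalty is what pushes a policy toward short episodes: an agent that has learned to pick the `">` prefix first finishes this toy page in a single request, whereas a poor choice costs an extra request before the attribute is escaped.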
Automating XSS Vulnerability Testing Using Reinforcement Learning