
N., Tang, J., Babuschkin, I., Balaji, S., Jain, S.,
Saunders, W., Hesse, C., Carr, A. N., Leike, J., Achiam,
J., Misra, V., Morikawa, E., Radford, A., Knight, M.,
Brundage, M., Murati, M., Mayer, K., Welinder, P.,
McGrew, B., Amodei, D., McCandlish, S., Sutskever,
I., and Zaremba, W. (2021). Evaluating Large
Language Models Trained on Code. arXiv:2107.03374
[cs.LG].
Clauss, A. (2024). Facilitating Competence-Oriented
Qualification in New Work: Evaluation of a Platform
Prototype. In Proceedings of the 16th International
Conference on Computer Supported Education, pages
659–668, Angers, France. SCITEPRESS - Science
and Technology Publications.
Duong, H.-T. and Chen, H.-M. (2024). ProgEdu4Web: An
automated assessment tool for motivating the learning
of web programming course. Computer Applications
in Engineering Education, 32(5):e22770.
Fu, X., Peltsverger, B., Qian, K., Tao, L., and Liu, J.
(2008). APOGEE: Automated project grading and
instant feedback system for web based computing. ACM
SIGCSE Bulletin, 40(1):77–81.
Gabbay, H. and Cohen, A. (2024). Combining
LLM-Generated and Test-Based Feedback in a MOOC for
Programming. In Proceedings of the Eleventh ACM
Conference on Learning @ Scale, pages 177–187,
Atlanta, GA, USA. ACM.
Hevner, A. and Chatterjee, S. (2010). Design Research in
Information Systems: Theory and Practice, volume 22
of Integrated Series in Information Systems. Springer
US, Boston, MA.
Hollingsworth, J. (1960). Automatic graders for
programming classes. Communications of the ACM,
3(10):528–529.
Hou, X., Zhao, Y., Liu, Y., Yang, Z., Wang, K., Li, L., Luo,
X., Lo, D., Grundy, J., and Wang, H. (2024). Large
Language Models for Software Engineering: A
Systematic Literature Review. arXiv:2308.10620 [cs.SE].
Insa, D., Pérez, S., Silva, J., and Tamarit, S. (2021).
Semi-automatic generation and assessment of Java exercises
in engineering education. Computer Applications in
Engineering Education, 29(5):1034–1050.
Leenknecht, M., Wijnia, L., Köhlen, M., Fryer, L., Rikers,
R., and Loyens, S. (2021). Formative assessment as
practice: The role of students’ motivation. Assessment
& Evaluation in Higher Education, 46(2):236–255.
Li, A., Wu, J., and Bigham, J. P. (2023). Using LLMs
to Customize the UI of Webpages. In Adjunct
Proceedings of the 36th Annual ACM Symposium on User
Interface Software and Technology, pages 1–3, San
Francisco, CA, USA. ACM.
Liu, M. and M’Hiri, F. (2024). Beyond Traditional
Teaching: Large Language Models as Simulated Teaching
Assistants in Computer Science. In Proceedings of the
55th ACM Technical Symposium on Computer Science
Education V. 1, pages 743–749, Portland, OR, USA.
ACM.
Messer, M., Brown, N. C. C., Kölling, M., and Shi, M.
(2023). Automated Grading and Feedback Tools
for Programming Education: A Systematic Review.
ACM Transactions on Computing Education, page
3636515.
Muuli, E., Papli, K., Tõnisson, E., Lepp, M., Palts, T.,
Suviste, R., Säde, M., and Luik, P. (2017). Automatic
Assessment of Programming Assignments Using
Image Recognition. In Lavoué, É., Drachsler, H.,
Verbert, K., Broisin, J., and Pérez-Sanagustín, M.,
editors, Data Driven Approaches in Digital Education,
volume 10474, pages 153–163. Springer International
Publishing, Cham.
Naveed, H., Khan, A. U., Qiu, S., Saqib, M., Anwar, S.,
Usman, M., Akhtar, N., Barnes, N., and Mian, A. (2024).
A Comprehensive Overview of Large Language
Models. arXiv:2307.06435 [cs.CL].
Neo, G., Moura, J., Almeida, H., Neo, A., and
Freitas Júnior, O. (2024). User Story Tutor (UST) to
Support Agile Software Developers. In Proceedings
of the 16th International Conference on Computer
Supported Education, pages 51–62, Angers, France.
SCITEPRESS - Science and Technology Publications.
Nguyen, B.-A., Ho, K.-Y., and Chen, H.-M. (2020).
Measure Students’ Contribution in Web Programming
Projects by Exploring Source Code Repository. In
2020 International Computer Symposium (ICS), pages
473–478, Tainan, Taiwan. IEEE.
Peffers, K., Tuunanen, T., Rothenberger, M. A., and
Chatterjee, S. (2007). A Design Science Research
Methodology for Information Systems Research. Journal of
Management Information Systems, 24(3):45–77.
Sharmin, S. (2022). Creativity in CS1: A Literature Review.
ACM Transactions on Computing Education, 22(2):1–26.
Shute, V. J. (2008). Focus on Formative Feedback. Review
of Educational Research, 78(1):153–189.
Siochi, A. C. and Hardy, W. R. (2015). WebWolf: Towards
a Simple Framework for Automated Assessment of
Webpage Assignments in an Introductory Web
Programming Class. In Proceedings of the 46th ACM
Technical Symposium on Computer Science Education,
pages 84–89, Kansas City, Missouri, USA. ACM.
Smolić, E., Pavelić, M., Boras, B., Mekterović, I., and
Jagušt, T. (2024). LLM Generative AI and Students’
Exam Code Evaluation: Qualitative and Quantitative
Analysis. In 2024 47th MIPRO ICT and Electronics
Convention (MIPRO), pages 1261–1266, Opatija,
Croatia. IEEE.
CSEDU 2025 - 17th International Conference on Computer Supported Education