assesses the students’ program source code. In the 
testing phase, the system gives the lecturer the 
opportunity to view the students’ source code. The 
system also asks the lecturer questions regarding the students’ program code as part of the assessment process. The examiner answers each question by choosing one of the listed options, such as awful, poor, fair or very good, and the assessment process then continues. The system also applies software metrics to the students’ programs. Lastly, students receive feedback on their exercises.
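As an illustration of the metrics step, the following is a minimal sketch in Python of how simple software metrics might be computed for a submission. The metric choices (non-blank lines of code and an approximate cyclomatic complexity) are illustrative assumptions, not the actual measures used by the system described above.

    import ast

    BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try,
                    ast.With, ast.BoolOp, ast.ExceptHandler)

    def simple_metrics(source):
        """Return rough size and complexity metrics for one submission."""
        tree = ast.parse(source)
        loc = sum(1 for line in source.splitlines() if line.strip())
        # Cyclomatic complexity approximated as 1 + number of decision points.
        complexity = 1 + sum(isinstance(node, BRANCH_NODES)
                             for node in ast.walk(tree))
        return {"loc": loc, "cyclomatic_complexity": complexity}

    student_code = """
    def grade(mark):
        if mark >= 70:
            return "first"
        elif mark >= 60:
            return "upper second"
        return "other"
    """
    print(simple_metrics(student_code))  # {'loc': 6, 'cyclomatic_complexity': 3}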
In the system developed by Joy et al. (2005), the correctness, style and authenticity of the students’ program code are assessed. It is designed for
programming exercises. Students can submit their 
programs using the BOSS system (a submission and 
assessment system) (Joy et al., 2005). In the 
feedback process, a lecturer tests and marks the 
students’ submissions using BOSS. The system also allows lecturers to obtain information on students’ results according to the automatic tests applied and to view the original source code. The examiner can then give further feedback in addition to the system’s feedback. At the end of the assessment, the student receives feedback that includes comments and a score, rather than just a score.
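BOSS itself is a Java-based system whose internals are not reproduced here; the following Python sketch only illustrates the general pattern described above, in which automatic test results and examiner comments are combined into a score plus comments. All function names and test data are hypothetical.

    from dataclasses import dataclass, field

    @dataclass
    class Feedback:
        score: float
        comments: list = field(default_factory=list)

    def run_tests(submission, tests):
        """Run each (args, expected) case against the student's function."""
        comments, passed = [], 0
        for args, expected in tests:
            actual = submission(*args)
            if actual == expected:
                passed += 1
            else:
                comments.append(f"Test {args}: expected {expected!r}, got {actual!r}")
        return Feedback(score=100.0 * passed / len(tests), comments=comments)

    def student_abs(x):          # a student submission with a sign bug
        return x if x >= 0 else x

    fb = run_tests(student_abs, [((5,), 5), ((-3,), 3), ((0,), 0)])
    fb.comments.append("Examiner: handle negative inputs; see the built-in abs().")
    print(round(fb.score, 1), fb.comments)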
2.1  Discussion of Related Work 
In the related work section, five studies were introduced in terms of their strengths and weaknesses. Although some of them may provide sufficient feedback if correctly applied, they are not designed to significantly alleviate the examiner’s workload. That is, providing feedback may increase the time the examiner takes to assess student work. The examiner’s workload depends largely on the approach taken by the assessment system; it also depends on the length of the code script, which may be short or long.
The systems of Wang et al. (2007), Sharma et al. (2014) and Saikkonen et al. (2001) focus on the structure of students’ program code, which can be useful but is limiting in terms of feedback. While the systems of Wang et al. (2007) and Saikkonen et al. (2001) address the whole code structure, the system of Sharma et al. (2014) covers only the ordering of conditions in the ‘else-if’ structure. The aim of standardising the code structure in the system of Wang et al. (2007) is to reduce the number of model answers. Furthermore, the code structure is standardised in order to grade students’ code rather than to comment on (provide feedback about) the code structure. On the other hand, the system of Saikkonen et al. (2001) assesses return values instead of actual output strings because of the differences in wording in students’ answers; in other words, the system focuses on evaluating the abstract syntax tree. In this case, the system of Saikkonen et al. (2001) may not provide comprehensive feedback for students.
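To make the return-value idea concrete, here is a minimal sketch in Python rather than in Saikkonen et al.’s (2001) Scheme setting; the functions and test cases are illustrative assumptions.

    def reference_interest(principal, rate):
        return principal * rate

    def student_interest(principal, rate):
        amount = principal * rate
        # Students word their printed output differently ...
        print(f"You owe an interest of {amount} pounds!")
        # ... but the value they return can be compared directly.
        return amount

    def value_based_check(student_fn, reference_fn, cases):
        return all(student_fn(*c) == reference_fn(*c) for c in cases)

    print(value_based_check(student_interest, reference_interest,
                            [(100, 0.05), (250, 0.1)]))  # True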
Also, the system of Sharma et al. (2014) can be used only for ‘else-if’ structures; this is effective for providing feedback to novice programmers, but only on that construct. However, the theory behind the system could allow it to handle other control structures, loops and functions in the future; a rough illustration of the kind of condition-ordering check involved is given below. Moreover, the quality of the feedback could have been enhanced by including a human in the assessment process.
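The condition-ordering check might, very roughly, look like the following Python sketch using the standard ast module; Sharma et al.’s (2014) actual system targets a different language and representation.

    import ast

    def condition_chain(source):
        """Return the conditions of the top-level if/elif chain, in order."""
        node = ast.parse(source).body[0]
        chain = []
        while isinstance(node, ast.If):
            chain.append(ast.unparse(node.test))
            # An 'elif' appears as a single nested If in the orelse list.
            node = node.orelse[0] if len(node.orelse) == 1 else None
        return chain

    code = """
    if mark >= 70:
        band = "first"
    elif mark >= 60:
        band = "upper second"
    elif mark >= 50:
        band = "lower second"
    """
    print(condition_chain(code))  # ['mark >= 70', 'mark >= 60', 'mark >= 50']

A marker could then compare the extracted list against an expected ordering (for example, descending thresholds) and comment when conditions are out of order.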
The systems developed by Jackson (2000) and Joy et al. (2005) highlight the importance of a human in the assessment process for providing comprehensive feedback. In the system of Jackson (2000) the examiner is part of the assessment process, whereas in the system of Joy et al. (2005) the examiner is involved after the automatic assessment. In both systems, a human checks each student’s code separately. Therefore, the systems cannot reduce the examiner’s workload significantly, although they can provide sufficient feedback using these approaches.
One significant drawback of the system of Joy et al. (2005) is that while examiners can give students additional comments, those comments may be inconsistent, as they are not checked by the system. Automatically reusing the feedback given for particular segments of code would have allowed greater consistency and efficiency to be achieved; a hypothetical sketch of such reuse is given below. In the system of Jackson (2000), on the other hand, the examiner chooses one comment from those suggested. However, that system cannot provide comprehensive feedback because the examiner cannot add his or her own comments to the student’s code.
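The feedback-reuse idea could be realised, hypothetically, along the following lines: comments are stored against a normalised form of a code segment and offered again whenever an equivalent segment recurs. None of the discussed systems implements exactly this sketch.

    import hashlib

    class FeedbackBank:
        """Stores examiner comments keyed by a normalised code segment."""

        def __init__(self):
            self._bank = {}

        @staticmethod
        def _key(segment):
            # Collapse whitespace so trivially different layouts match.
            canonical = " ".join(segment.split())
            return hashlib.sha256(canonical.encode()).hexdigest()

        def record(self, segment, comment):
            self._bank.setdefault(self._key(segment), []).append(comment)

        def lookup(self, segment):
            return self._bank.get(self._key(segment), [])

    bank = FeedbackBank()
    bank.record("while True: pass", "Infinite loop: add a termination condition.")
    print(bank.lookup("while  True:   pass"))  # the comment is reused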
To conclude, the assessment studies discussed here aimed to provide sufficient feedback and to reduce the examiner’s workload. However, they have generally focused on whole code segments rather than on individual control structures, loops, functions, etc. Thus, they have generally provided superficial feedback, although some of them do reduce the examiner’s workload. Moreover, the discussed studies are generally based on semantic similarity. The proposed approach is also related to semantic and structural similarity; the main difference is that the proposed approach does not need model answer(s), whereas the discussed studies do. Therefore, the proposed approach parses the whole