Slide-recommendation System: A Strategy for Integrating Instructional
Feedback into Online Exercise Sessions
Victor Obionwu¹, Vincent Toulouse¹, David Broneske²ᵃ and Gunter Saake¹ᵇ
¹University of Magdeburg, Magdeburg, Germany
²German Centre for Higher Education Research and Science Studies, Hannover, Germany
ᵃ https://orcid.org/0000-0002-9580-740X
ᵇ https://orcid.org/0000-0001-9576-8474
Keywords: Text Mining, Filtering, Instructional Feedback, Learning Analytics, Natural Language Processing.
Abstract: A structured learning behavior requires an understanding of the learning environment, the feedback it returns, and a comprehension of the task requirements. However, as observed in the activity logs of our SQLValidator, students spend most of their time in trial and error until they arrive at the correct answer. While most students resort to consulting their colleagues, only a few eventually acquire a comprehension of the rules of the SQL language. With instructional feedback in the form of a recommendation, however, we can reduce the time penalty of ineffective engagement. To this end, we have extended our SQLValidator with a recommendation subsystem that provides automatic instructional feedback during online exercise sessions. We show that a mapping between SQL exercises and lecture slides based on their cosine similarity can be used to provide useful recommendations. Our prototype reaches a precision value of 0.767 and an F_{β=0.5} value of 0.505, which justifies our strategy of aiding students with lecture slide recommendations.
1 INTRODUCTION
The beauty of traditional task exercise sessions is the
on-hand availability of instructors to provide guid-
ance, corrections to tasks, and immediate feedback to
questions that arise in the course of learning. How-
ever, owing to new regulations that require online
dispensation of lectures and corresponding exercises,
the use of online tools and platforms, such as Moo-
dle (Costello, 2013), Blackboard (Machado and Tao,
2007), and the SQLValidator (Obionwu et al., 2021),
for lecture administration, and collaborative learning
(Obionwu et al., 2022) is the new standard. This
switch to online learning platforms, while making
learning flexible, mostly erases direct instructional
feedback, which is necessary for computer science
and engineering-related disciplines. Moreover, for
learners that are new to programming languages like
the structured query language, SQL, absence of in-
structional feedback results in a trial and error-based
form of learning engagement which results in a high
count of easily avoidable errors and an inconvenient
time penalty that leads to frustration. Thus, instruc-
tional feedback is necessary to aid novice students in
bringing their task engagement behavior into conformity with the rules of SQL. Ergo, we integrated automatic recommendations into the exercise task engagement workflow. Hence, students need not get frustrated, as the feedback is persistent, automatic, and efficient compared to traditional settings. Our implementation relies on content-based filtering, since the pedagogical aspects of the database concepts course, the corresponding exercise tasks, and the students' proficiency levels are already known (Charu, 2016). The feedback provided to students is thus enriched with recommendations that point to the sections of our lecture material containing the SQL topics and sample queries necessary to successfully solve the exercise task in question. Compared to related efforts at improving student SQL learning engagement and skill acquisition, our method:
- provides meaningful instructional feedback during individual online exercise sessions.
- fosters SQL skill acquisition by linking SQL theory with corresponding exercise tasks.
In the following, we first present related work in Section 2. In Section 3, we present our conceptual idea, followed by its implementation in Section 4. We evaluate our implementation in Section 5 and discuss challenges in Section 6. Finally, in Section 7, we conclude our work.
2 RELATED WORK
Several systems have been developed to foster SQL learning. An educational tool developed at Arizona State University enables students to learn about relational query languages by letting them solve their homework assignments with the tool (Dietrich, 1993). However, the students receive their feedback directly from the interpreter, and we thus assume the feedback to be too technical to be understood by class participants with little knowledge of SQL. In comparison, our recommendation points students to the lecture material that explains the correct usage of SQL concepts, thus reducing the time penalty of the trial-and-error pattern. Another tool that supports students in learning SQL is the SQL-Tutor developed at the University of Canterbury (Mitrović, 1998) and used at many universities around the globe. The tool prompts students to solve SQL-related exercises and, while doing so, provides different stages of feedback to the students. The feedback, however, is restricted to the task at hand and does not utilize the learning materials of the course. Another tool, the
SQL Tester (Kleerekoper and Schofield, 2018), is an online assessment tool based on a 50-minute test that comprises ten questions. These questions are based on a table with different categories of SQL concepts. There is no limit to the number of times a student can attempt the test. However, the students only receive the regular error messages of the RDBMS as feedback. We are not aware of any research that tries to improve the SQL learning experience by automatically connecting exercises to the material of the lecture, as our method of instructional feedback does.
3 CONCEPTUAL FRAMEWORK
Several systems have been developed to facilitate SQL skill acquisition. The SQLValidator (Obionwu et al., 2021) is one such web-based tool. It is an integral part of the database courses held at our faculty and encompasses exercises, questionnaires, and tests to assess students' SQL programming skills. In the next subsection, we describe the concept of analyzing the lecture's slides.
3.1 Analyzing the Lecture’s Slides
Since our aim is to support students in solving SQL exercises by recommending slides from the lecture that relate to the exercise they engage with, we restrict the analysis of the lecture slides to the chapters that contain the SQL query-related information required in the exercises. The lecture slide analysis workflow consists of the following activity blocks: slide conversion, preprocessing, keyword recognition, and keyword analysis, as shown in Figure 1. These activities are briefly described in the subsections below.
Figure 1: Lecture slide analysis workflow.
3.1.1 Slide Conversion & Preprocessing
The objective of the slide conversion step is the ex-
traction of relevant information from lecture slides.
To achieve this, we first convert a selected chapter
from PDF file format into a string format, enabling
us to further examine the textual contents of the chap-
ter. From here, we transition to the preprocessing activity shown in Figure 1. The objective of the preprocessing stage is to clean the string data produced by the slide conversion step (Famili et al., 1997; Nadkarni et al., 2011). To this end, the strings are normalized and tokenized, splitting them into individual words, which are then reduced to word stems. After this, the stop words are removed. Also at this stage, the further separation into subchapters and the partitioning into pages are done, as shown in Figure 1.
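To make the preprocessing step concrete, the following is a minimal PHP sketch of normalization, tokenization, and stop-word removal. The function name, the character handling, and the stop-word list are our own illustrative assumptions, not the actual SQLValidator code, and stemming is omitted for brevity:

// Hypothetical preprocessing sketch: normalize the extracted chapter
// string, tokenize it, and remove stop words.
function preprocess(string $chapterText): array {
    // Normalization: lower-case and strip characters that are neither
    // alphanumeric nor SQL symbols such as =, <, >, *, ., _.
    $normalized = strtolower($chapterText);
    $normalized = preg_replace('/[^a-z0-9=<>.*_\s]/', ' ', $normalized);
    // Tokenization: split on whitespace into individual tokens.
    $tokens = preg_split('/\s+/', $normalized, -1, PREG_SPLIT_NO_EMPTY);
    // Stop-word removal with a small illustrative list (SQL keywords
    // such as AND or IN must not be part of this list).
    $stopWords = ['the', 'a', 'an', 'of', 'to', 'is'];
    return array_values(array_diff($tokens, $stopWords));
}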
3.1.2 Keyword Recognition
This task aims to detect a slide's topic designation. Thus, the subchapters are broken down into pages, and their contents are parsed and stored for further analysis. Then, for each page, the number of occurring SQL keywords is counted against a predefined list of relevant keywords. A sample of detected keywords is shown in the keyword recognition block of Figure 1.
tion block. We observed that the frequency of the key-
word in a page is insufficient for deriving the slide’s
designation, as some keywords like the ”SELECT”
occur too often. Hence, we further perform keyword
analysis. Requisite for keyword analysis is the com-
putation of term frequency tf of a term, which can
easily be done by taking a count of all occurrences
of a keyword in a document (Manning et al., 2008).
We further observed that larger pages are more likely to repeat terms, which results in higher tf values in comparison to smaller pages. To eliminate this effect of page length, we employ the maximum tf normalization technique, i.e., we divide each keyword count by the count of the most frequent keyword on the page (Leskovec et al., 2014).
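The counting and normalization could look as follows in PHP (a sketch under our own naming; the keyword list is the predefined list mentioned above):

// Sketch: count the predefined SQL keywords on a page and apply
// maximum tf normalization, i.e. tf(t) = count(t) / max count.
function keywordTf(array $pageTokens, array $sqlKeywords): array {
    $counts = [];
    foreach ($sqlKeywords as $kw) {
        $counts[$kw] = count(array_keys($pageTokens, $kw));
    }
    $counts = array_filter($counts);   // keep only occurring keywords
    if (empty($counts)) {
        return [];
    }
    $maxCount = max($counts);
    return array_map(fn($c) => $c / $maxCount, $counts);
}

With the counts of Page 1 in Figure 1 (select: 2, join: 1, where: 1), this yields the tf values 1, 0.5, and 0.5 shown there.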
3.1.3 Keyword Analysis
Having evaluated and normalized the term frequency, we discriminate the pages from each other based on term occurrences using a heuristic approach, the inverse document frequency (idf). The mere occurrence of a term in a document gives little or no insight into the document with respect to its topic. However, when a term appears seldom in the collection but occurs frequently in a specific document, this term provides a hint on how to distinguish the observed document from the other documents of the collection. The idf value for each term t_i is calculated by dividing the size of the corpus (the number of documents), N, by the number of documents n_i in which the term appears, and applying the logarithm:

idf(t_i) = log(N / n_i).

The term frequency multiplied by this idf value yields the tf*idf value.
The inverse document frequency thus assigns a weight to each keyword based on the number of pages the keyword appears in. The calculation block in Figure 1 shows the idf values for the three keywords WHERE, SELECT, and JOIN. The WHERE keyword appears less often than the SELECT keyword, and JOIN is the least featured of the three keywords in the slides. Thus, in the lower part showing the calculations, the JOIN keyword receives a weight of 1.2, while the WHERE and SELECT keywords receive idf values of 0.5 and 0.2, respectively. Next, we multiply the tf and idf values for each keyword to compute the tf*idf value.
The tf*idf value (term frequency times inverse document frequency) is used as a term weighting scheme that expresses the influence a keyword has on a page. Figure 1 shows that the JOIN keyword has the highest tf*idf value with 0.6, compared to 0.25 for the WHERE keyword and 0.2 for the SELECT keyword (Ramos et al., 2003; Wu et al., 2008). Thus, we estimate the topic of the page to be mostly about joins. As a result, we can use the tf*idf values of the typical SQL keywords to extract the topic of a slide, which can then be compared to the topic of an SQL exercise task. This process (Figure 1) is repeated for every lecture slide.
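A sketch of the idf and tf*idf computation in PHP (our naming; the base-10 logarithm is inferred from the reported values, e.g. idf = 0.301 ≈ log10(2) for WHERE in Table 1):

// Sketch: idf over a collection of pages, where $pagesTf is a list of
// tf arrays as produced by keywordTf(), plus the tf*idf combination.
function idf(array $pagesTf, string $keyword): float {
    $n = 0;                      // number of pages containing the keyword
    foreach ($pagesTf as $pageTf) {
        if (isset($pageTf[$keyword])) {
            $n++;
        }
    }
    $N = count($pagesTf);        // size of the collection
    return $n > 0 ? log10($N / $n) : 0.0;
}

function tfIdf(array $pageTf, array $pagesTf): array {
    $weights = [];
    foreach ($pageTf as $kw => $tf) {
        $weights[$kw] = $tf * idf($pagesTf, $kw);
    }
    return $weights;
}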
3.2 Comparing Slides and Exercises
Having processed the slides and SQL exercises (Section 3.1) into a comparable format, we use the cosine similarity between the lecture slides and the SQL exercises as a comparison metric. A depiction of this process for a hypothetical Exercise 1 is shown in Figure 2. It consists of the following steps: merging of the previous analyses, computation of the cosine similarity, and mapping of exercises to the pages with the highest cosine similarity.
Figure 2: Concept of combining the analysis results for the
lecture slides and SQL exercises.
In the merge stage, the first step of the comparison process, the slide and exercise analyses are merged into a list. Thus, for each exercise, we now have access to the tf*idf values of every page with respect to this exercise. Figure 2 shows the mapping between Exercise 1 and each page, alongside the result of their keyword analysis. Having performed the merging, we evaluate the cosine similarity, i.e., the angle between two word vectors (Sidorov et al., 2014). In our use case, the word vectors consist of the keywords recognized in a query or lecture slide. A low angle between those vectors means that the content of the vectors is similar, while a high angle expresses dissimilar content. An advantage of using the cosine angle as a similarity metric is that the length of the vectors is not relevant. Accordingly, the cosine similarity of each page of the lecture is calculated with regard to Exercise 1. The result shows that Page 1 has the highest cosine similarity with 0.8, followed by Page 2 with 0.631, and Page n with 0. The page with the best cosine value, in this case Page 1, is then mapped to Exercise 1 and hence recommended to students having a problem with Exercise 1.
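A minimal PHP sketch of this comparison step (our naming; missing keywords are treated as weight 0, and ties for the best cosine value are all returned, as described in Section 5.2):

// Sketch: cosine similarity between two tf*idf keyword vectors.
function cosine(array $a, array $b): float {
    $dot = 0.0;
    foreach ($a as $kw => $w) {
        $dot += $w * ($b[$kw] ?? 0.0);
    }
    $normA = sqrt(array_sum(array_map(fn($w) => $w * $w, $a)));
    $normB = sqrt(array_sum(array_map(fn($w) => $w * $w, $b)));
    return ($normA * $normB) > 0 ? $dot / ($normA * $normB) : 0.0;
}

// Map an exercise to the page(s) with the highest cosine similarity.
function bestPages(array $exerciseVec, array $pageVecs): array {
    $scores = array_map(fn($p) => cosine($exerciseVec, $p), $pageVecs);
    $best = max($scores);
    return array_keys($scores, $best);
}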
4 KEYWORD AND JOIN
RECOGNITION
Since keywords are a vital part of our system, being used every time we search for the occurrence of certain SQL keywords, we implemented a tool to create custom keyword lists that contain specific SQL keywords selected by the administrator. For example,
there are SQL keywords available in our implemen-
tation, such as CONCAT, that are currently not men-
tioned in the lecture. Since this could also change
in the future, our implementation already supports a
wide variety of keywords. We now show the process
of mapping slides to exercises by walking through a
recommendation for Exercise E that is about the up-
date operation:
UPDATE wine SET vintage = vintage + 1
WHERE color = 'red';
The keyword analysis for Exercise E is displayed in Table 1, with WHERE and UPDATE recognized once and the equality sign twice. The idf values reveal that the UPDATE keyword is the rarest of these three keywords, with an idf of 1.82 compared to 0.301 for the WHERE keyword and 0.519 for the equality sign. A tf*idf value of 0.91 is assigned to the UPDATE keyword, which makes it the most important SQL keyword for Exercise E.
Table 1: Keyword analysis for Exercise E.

keyword   count   tf    idf     tf*idf
WHERE     1       0.5   0.301   0.151
UPDATE    1       0.5   1.82    0.91
=         2       1     0.519   0.519
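The tf*idf column of Table 1 is simply the product of the tf and idf columns; as a quick check (our arithmetic):

\[
0.5 \cdot 1.82 = 0.91, \qquad 0.5 \cdot 0.301 \approx 0.151, \qquad 1 \cdot 0.519 = 0.519.
\]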
Table 2: Keyword analysis of the page recommended for Exercise E (Chapter 2, Page 34, Cosine: 0.984).

keyword   count   tf    idf     tf*idf
where     1       0.5   0.507   0.254
update    2       1     1.109   1.109
in        1       0.5   0.273   0.137
=         2       1     0.556   0.556
as        1       0.5   0.316   0.158
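As a consistency check (our arithmetic, not part of the paper's tables), the reported cosine of 0.984 can be reproduced from the tf*idf vectors of Tables 1 and 2, with keywords missing from the exercise vector contributing zero to the dot product:

\[
\cos(E, P_{34}) = \frac{0.151 \cdot 0.254 + 0.91 \cdot 1.109 + 0.519 \cdot 0.556}{\sqrt{0.151^2 + 0.91^2 + 0.519^2} \cdot \sqrt{0.254^2 + 1.109^2 + 0.137^2 + 0.556^2 + 0.158^2}} \approx \frac{1.336}{1.058 \cdot 1.284} \approx 0.984.
\]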
Our system declares the 34th page of the second chapter to be the most similar slide of the lecture. The keyword analysis for this slide is shown in Table 2, with the keywords WHERE, IN, and AS recognized once, and UPDATE and the equality sign twice. The cosine similarity between Exercise E and the recommended page is 0.984, which is close to one and indicates a high similarity.

Compared to keyword recognition, the recognition of joins requires a more sophisticated effort, because joins in SQL can be created either with the JOIN keyword or by usage of the WHERE keyword. In the case of a join in the form of the WHERE keyword, simply counting the number of keywords would lead to an increase of recognized equality signs and WHERE keywords, but a JOIN would not be recognized. We have therefore implemented a way to recognize those disguised joins. An example of a join formulated using the WHERE keyword is shown in Query 4 below:

select Name, Vintage, PRODUCER.Vineyard
from WINES, PRODUCER
where WINES.Vineyard = PRODUCER.Vineyard;
Table 3: Keyword analysis of Query 4 without join detection.

keyword   select   from   where   =
count     1        1      1       1
Table 4: Method to recognize joins in the WHERE clause.

1  $join_counter = 0;
2  $join_stack = ['/where/i', '/\./', '/=/', '/\./'];
3  foreach ($unigrams as $entry) {
4      if (preg_match($join_stack[0], $entry)) {
5          array_shift($join_stack);
6          if (sizeof($join_stack) == 0) {
7              $join_stack = ['/where/i', '/\./', '/=/', '/\./'];
8              $join_counter++;
9          }
10     }
11 }
12 return $join_counter;
Table 5: Keyword recognition of Query 4 after enabling join detection.

keyword   select   from   join
count     1        1      1
The keyword analysis of this query without join detection is displayed in Table 3: the SQL keywords SELECT, FROM, WHERE, and = are each recognized once. The desired behavior, however, would be to recognize the keywords SELECT, FROM, and JOIN once each.
Table 4 shows the conceptual implementation of the join recognition. We employ an array $join_stack that stores patterns for the elements 'where', '.', '=', and '.' in this exact order. In Line 3, the content of a page is read in word by word as unigrams. For each word, we test whether it matches the first element of $join_stack; if it does, we remove that element from the stack. This is repeated until the stack is empty, indicating that each keyword necessary to describe a join has been found, and we therefore increment the $join_counter variable. The $join_stack is then refilled with the patterns, and we resume the join detection until every word of the page has been read. In the end, we return the number of detected joins stored in the $join_counter variable in Line 12.
Considering Query 4, join detection now lets us correctly declare the recognized keywords in Table 5. One JOIN keyword is detected, while the WHERE keyword and the equality sign no longer appear.
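As a usage illustration (our sketch; it assumes the snippet from Table 4 is wrapped in a function countJoins($unigrams) and reuses the hypothetical preprocess() from Section 3.1.1):

// Applying the join recognition to Query 4: the stack patterns
// 'where', '.', '=', '.' match the tokens "where", "wines.vineyard",
// "=", "producer.vineyard" in this exact order.
$query = "select Name, Vintage, PRODUCER.Vineyard
          from WINES, PRODUCER
          where WINES.Vineyard = PRODUCER.Vineyard;";
$joins = countJoins(preprocess($query));   // $joins == 1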
5 EVALUATION
Since our objective is the integration of automatic instructional feedback into online exercise sessions, we added the recommendation feature to our exercise administration tool, the SQLValidator. This required the analysis of our lecture slides and SQL exercise tasks. The keyword list has a significant influence on the mapping between slides and exercises. Thus, based on performance, we collated a list of relevant keywords from a pool of 58 SQL keywords.
Furthermore, to evaluate the usefulness of a mapping between slides and exercises, a ground truth is required. Thus, we manually labeled the SQL exercises so that for each exercise there is a stored selection of recommendable lecture slides. The decision of which slides should be recommended for a particular exercise may be a topic of discussion: depending on the expert's view, the selection of suitable slides can be either flexible or strict. As a result, the evaluation is greatly influenced by the experts' labeling. We mitigate this problem by using multiple experts in the labeling process instead of a single expert.
5.1 Performance Measures
We evaluate the performance of our implementation by computing the standard performance metrics (Sokolova and Lapalme, 2009; Bradley et al., 2006; Alvarez, 2002):
Accuracy: the fraction of slides correctly predicted as recommendable plus the slides correctly not recommended to the students,
Precision: an estimate of the exactness of the recommendation,
Recall: an estimate of how many slides labeled as positive were recommended to the students,
F-measure: a combination of precision and recall.
We also use the F_β measure, which enables us to put more emphasis on one of the two metrics; since we chose to weigh precision higher, we set the β parameter to 0.5.
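For reference, a PHP sketch of these metrics computed from a confusion matrix (our naming; the F_β formula is the standard F_β = (1+β²)·P·R / (β²·P + R)):

// Sketch: accuracy, precision, recall, F-measure, and F_beta from the
// four confusion matrix cells; beta = 0.5 weighs precision higher.
function metrics(int $tp, int $fp, int $fn, int $tn, float $beta = 0.5): array {
    $accuracy  = ($tp + $tn) / ($tp + $fp + $fn + $tn);
    $precision = $tp / ($tp + $fp);
    $recall    = $tp / ($tp + $fn);
    $fMeasure  = 2 * $precision * $recall / ($precision + $recall);
    $b2        = $beta * $beta;
    $fBeta     = (1 + $b2) * $precision * $recall / ($b2 * $precision + $recall);
    return compact('accuracy', 'precision', 'recall', 'fMeasure', 'fBeta');
}

For example, metrics(38, 28, 178, 11636) reproduces the idf_col baseline row of Table 8 below: accuracy 0.983, precision 0.576, recall 0.176, F-measure 0.27, and F_{β=0.5} 0.396.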
5.2 Baseline Evaluation
The baseline evaluation of our system derives a mapping between slides and SQL exercises by purely computing the cosine similarity, without using any additional parameters. Only the preferred way of calculating the idf values has to be selected. Using the baseline approach means that only the slide with the highest cosine similarity is selected for recommendation; if multiple slides share the best cosine value, they are all recommended.
Table 6: Confusion matrix of the baseline approach with idf_sub.

              pred. pos.   pred. neg.
actual pos.   38 (TP)      178 (FN)
actual neg.   32 (FP)      11.632 (TN)
total         70           11.810
Table 7: Confusion matrix of the baseline approach with idf_col.

              pred. pos.   pred. neg.
actual pos.   38 (TP)      178 (FN)
actual neg.   28 (FP)      11.636 (TN)
total         66           11.814
Table 8: Performance metrics for the baseline approach.

          Accuracy   Precision   Recall   F-Measure   F_{β=0.5}
idf_sub   0.982      0.543       0.176    0.266       0.383
idf_col   0.983      0.576       0.176    0.27        0.396
The confusion matrix of the baseline approach using the idf_sub computation is shown in Table 6. In our implementation, there are 180 slides and 66 SQL exercises, resulting in 11.880 entries in the confusion matrix. Out of the 70 entries predicted as positive, 38 were actually positive. There are 11.810 entries in the column predicted negative, with 178 counted as false negatives and 11.632 as actually negative. The cell of true negative entries is of interest, since it contains a little more than 98% of all entries. This imbalance of the instance distribution is expected, because there are 180 possible recommendations for each exercise but usually only a few slides per exercise are labeled as recommendable. Suppose there is an exercise for which we selected three pages as appropriate. If our implementation recommended a random slide for this exercise which turns out to be inappropriate, there would still be 176 slides correctly classified as not recommendable, and hence 176 entries are added to the true negative cell. This effect is reinforced by the baseline approach, since the recommendation is restricted to the pages with the highest cosine value, further decreasing the number of slides that are recommended. Furthermore, the idf value of a keyword can be calculated either per subchapter or by considering all available pages at once. We refer to the subchapter-wise idf calculation as idf_sub and to the collection-wise idf calculation as idf_col.
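In terms of the idf() helper sketched in Section 3.1.3, the two variants differ only in the page set passed in (a sketch with assumed input structures $allPages and $subchapters):

// idf_col: one idf table over all pages of the lecture.
// idf_sub: a separate idf table per subchapter.
function idfTable(array $pagesTf, array $keywords): array {
    $table = [];
    foreach ($keywords as $kw) {
        $table[$kw] = idf($pagesTf, $kw);
    }
    return $table;
}

$idfCol = idfTable($allPages, $keywords);        // collection-wise
$idfSub = [];
foreach ($subchapters as $name => $pages) {      // subchapter-wise
    $idfSub[$name] = idfTable($pages, $keywords);
}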
The result for the baseline approach in combination with the idf_col calculation is shown in Table 7. The difference between the two idf variants is barely visible: the idf_col method predicts 66 instances as positive, compared to 70 positive predictions of the idf_sub computation. The difference in negative predictions is also negligible, with 11.810 negative predictions in Table 6 and 11.814 in Table 7.
Table 8 shows the performance metrics with respect to the confusion matrices from Tables 6 and 7. The accuracy of both idf calculations is rather high, with 0.982 using idf_sub and 0.983 using idf_col. This is mostly due to the previously described fact that most of the pages are correctly classified as true negatives. The precision of idf_sub (0.543) is slightly lower than the precision of the idf_col method (0.576), which means that slightly more than half of our baseline's recommendations are correct. The remaining metrics, recall, F-measure, and F_{β=0.5}, are rather similar for both idf computations. The recall value of both methods is 0.176, which implies that only around 17% of the slides labeled as recommendable are actually recommended by our system. The F-measure, which is influenced by precision and recall equally, reaches 0.266 with idf_sub and 0.27 with idf_col. The most important metric in our use case is the F_{β=0.5}, which equates to 0.383 for the idf_sub calculation and 0.396 for idf_col. Since the performance metrics of the collection-wise idf approach are slightly better than those of the subchapter-wise idf, we focus on the collection-wise idf calculation in the remainder, and by the baseline approach we henceforth mean the baseline approach using the collection-wise idf. In the following sections, we improve the recommendation performance by utilizing our already introduced optimizations of join detection, clustering of keywords, and minimal cosine values.
5.3 Detecting Joins
The join detection also has a beneficial effect on the false negative and true negative predictions, although these profited, percentage-wise, significantly less than the positive predictions. Table 9 shows the confusion matrix of our join detection alongside the rate of change compared to the baseline approach using collection-wise idf values. Applying the join detection yields a positive effect on the classification results: the number of true positive predictions increased by 15.8%, while the number of false positive predictions decreased by 21.4%. The performance metrics of the baseline approach with and without join detection are shown in Table 10. Each of the performance metrics increased with activated join detection. The accuracy increased almost negligibly from 0.983 to 0.984. The recall and F-measure improved more, from 0.176 to 0.204 and from 0.27 to 0.312, respectively. Especially noteworthy is the increase of the precision value from 0.576 to 0.667 through enabling the join detection.
Table 9: Results of activated join detection compared with the baseline approach (idf_col).

              pred. pos.     pred. neg.
actual pos.   44 (+15.8%)    172 (−3.4%)
actual neg.   22 (−21.4%)    11.642 (+0.052%)
total         66 (±0)        11.814 (±0)
Table 10: Performance with and without join detection.

                    Accuracy   Precision   Recall   F-Measure   F_{β=0.5}
without join det.   0.983      0.576       0.176    0.27        0.396
with join det.      0.984      0.667       0.204    0.312       0.459
5.4 Clustering Keywords
To improve our slide recommendation, we analyzed our data set and identified specific keywords that need to be clustered. This cluster consists of the keywords <, >=, and SELECT; the process of choosing suitable keywords is manual. Table 11 depicts the confusion matrix for this clustering approach. The clustering leads to 4.6% more true positive predictions, while the false positive recommendations are lowered by 9.1%. The performance metrics are displayed in Table 12, alongside the comparison to the former best approach without cluster usage but with join detection. The application of the cluster keeps the accuracy at 0.984. More notably, the precision rises from 0.667 to 0.697, and the recall increases slightly from 0.204 to 0.213. The improvement of both the recall and precision values causes the F_{β=0.5} value to increase from 0.459 to 0.479. Especially the improved precision and F_{β=0.5} metrics imply that the clustering of keywords enables our system to recommend useful slides to the students.
The improved performance is due to two more mappings between exercises and slides that are now done correctly. One of the exercises for which the prototype now finds the correct recommendation will be referred to as task E:

SELECT job, MIN(ALL salary) AS min_salary
FROM employee
GROUP BY job;
The keyword analysis for task E yields the results shown in Table 13, with the recognized keywords SELECT, GROUP BY, GROUP, AS, MIN, and ALL. The MIN and ALL keywords have the highest tf*idf value of 1.217 and are therefore the most important keywords for this exercise.
Table 11: Confusion matrix of the cluster application.

              pred. pos.    pred. neg.
actual pos.   46 (+4.6%)    170 (−1.2%)
actual neg.   20 (−9.1%)    11.644 (+0.02%)
total         66 (±0)       11.814 (±0)
Table 12: Performance with and without clustering.

                  Accuracy   Precision   Recall   F-Measure   F_{β=0.5}
without cluster   0.984      0.667       0.204    0.312       0.459
with cluster      0.984      0.697       0.213    0.326       0.479
Table 13: Keyword analysis for task E from our system.

keyword   select   group by   group   as      min     all
count     1        1          1       1       1       1
tf        1        1          1       1       1       1
idf       0.087    1.121      1.121   0.405   1.217   1.217
tf*idf    0.087    1.121      1.121   0.405   1.217   1.217
The recommendation before clustering is incorrect, since the recommended page is not helpful to the students. The chosen page is the twenty-fourth page of the ninth chapter, "Views and Access Control". Page 24 contains information about the problems with aggregation views, although task E does not involve views at all. Hence, the recommendation of page 24 is not useful for students who are challenged by task E. Table 14 displays the keyword analysis for page 24. The keywords WHERE and HAVING are recognized once, and the keywords SELECT, GROUP BY, GROUP, <, and MIN twice. The highest tf*idf values are reached by the keywords MIN with 1.556 and < with 1.352.
The recommendation for task E should contain information on how the GROUP BY keyword can be used to aggregate data. Instead of page 24 of the ninth chapter, the independent labelers chose page 61 of the sixth chapter, displayed in Figure 4, as a good fit for task E, since it visualizes the process of using the GROUP BY clause.
The keyword analysis for our desired recommendation is shown in Table 15. Page 61 contains the three SQL keywords GROUP BY, GROUP, and AND, with the GROUP BY keyword reaching a tf*idf of 1.109 and the GROUP keyword following at 1.051. The comparison between Table 14 and Table 15 shows that the cosine similarity of page 61, 0.64, is lower than the cosine value of 0.682 of the current recommendation. In order to change the recommendation from page 24 to page 61, we need to influence the cosine similarity between task E and the slides by creating a suitable cluster.
Figure 3: Incorrect recommendation of page 24 from chap-
ter nine to task E before clustering.
Table 14: Keyword analysis of the incorrectly recommended page 24 from chapter nine (Chapter 9, Page 24, Cosine: 0.682).

keyword   select   where   group by   group   having   <       min
count     2        1       2          2       1        2       2
tf        1        0.5     1          1       0.5      1       1
idf       0.347    0.484   1.109      1.051   1.301    1.352   1.556
tf*idf    0.347    0.242   1.109      1.051   0.651    1.352   1.556
Figure 4: Page 61 of the sixth chapter, which should be
chosen for recommendation.
Page 24 contains the < and SELECT keywords, the SELECT keyword also being shared with task E. Our cluster contains the SELECT and < keywords and thus changes the tf*idf values of page 24. The keyword analysis of page 24 after clustering is shown in Table 16, with the SELECT keyword now having a tf*idf value of 1.352 instead of 0.347. The increased tf*idf value causes the similarity between task E and page 24 to shrink, and thus the new cosine value equals 0.625. The cosine similarity of page 61 with respect to task E does not change, because our desired recommendation does not share any keywords with the cluster. Its unchanged similarity of 0.64 is now sufficient for page 61 to be chosen as the recommendation for task E, since the former cosine value of page 24 decreased. Our cluster contains three keywords in total, with the >= keyword not mentioned yet. In our research, we observed a performance decrease when using a cluster that contains only the SELECT and < keywords. We believe that using the cluster without >= yields side effects on the other exercises, which is why we chose to include >= in our cluster.
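The paper's clustering mechanism is not spelled out in code; one reading consistent with the numbers above (our assumption, not the confirmed implementation) is that every keyword in a cluster inherits the largest idf weight among the cluster's members, so that the frequent SELECT (idf 0.347) is weighted like the rare < (idf 1.352):

// Hypothetical sketch: let all members of a keyword cluster share the
// largest idf weight occurring among them, e.g. for {select, <, >=}.
function clusterIdf(array $idf, array $cluster): array {
    $present = array_intersect_key($idf, array_flip($cluster));
    if (empty($present)) {
        return $idf;
    }
    $shared = max($present);
    foreach ($cluster as $kw) {
        if (isset($idf[$kw])) {
            $idf[$kw] = $shared;
        }
    }
    return $idf;
}

$idf = clusterIdf($idf, ['select', '<', '>=']);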
6 DISCUSSION
The evaluation shows that we achieved our goal of providing a method to optimize student learning engagement during online SQL exercise sessions. However, there are still challenges with SQL keywords that also occur in ordinary English. Keywords such as AND, IN, or AS
Table 15: Keyword analysis for page 61 from chapter six (Chapter 6, Page 61, Cosine: 0.64).

keyword    count   tf   idf     tf*idf
group by   1       1    1.109   1.109
group      1       1    1.051   1.051
and        2       1    0.499   0.499
Table 16: Keyword analysis for page 24 from chapter nine after clustering (Chapter 9, Page 24, Cosine: 0.625).

keyword   select   where   group by   group   having   <       min
count     2        1       2          2       1        2       2
tf        1        0.5     1          1       0.5      1       1
idf       1.352    0.484   1.109      1.051   1.301    1.352   1.556
tf*idf    1.352    0.242   1.109      1.051   0.651    1.352   1.556
are regularly used in the English language without any SQL context. Since the lecture slides are written in English, problems arise when counting how often these keywords are used in an SQL context. This is challenging for our recommendation process if the encountered keyword rarely appears in the corpus, as it then receives a high idf weight and thus a strong influence on the derived topic of the page. On a page populated with many SQL keywords, the negative effect of one incorrectly recognized keyword may be mitigated by the tf*idf values of the other keywords. If the page contains only a few keywords, however, a word like IN or AND that is not used in any SQL context may mislead the recommendation process for this page.
7 CONCLUSION
This research is aimed at improving students' performance by reducing the unsystematic trial-and-error behavior during online SQL exercise task engagement in our SQLValidator. To this end, we have implemented a strategy in which suitable slides from the lecture materials are mapped to the respective SQL exercise tasks and recommended to students in the form of hints during exercise task engagement. We have described, evaluated, and further optimized our strategy via join detection and keyword clustering. As shown in the evaluation, our implementation reaches a precision value of 0.767 and an F_{β=0.5} value of 0.505, thus justifying our strategy. The next stage in this recommendation system track is the assessment of the impact of the recommendations on students' engagement and SQL skill acquisition. Furthermore, students tend to share solution code; while such sharing cannot be stopped, and is indeed part of learning, a future direction is the implementation of a plagiarism discouragement feature.
ACKNOWLEDGMENTS
This work was supported by the German Federal
Ministry of Education and Research [grant number
16DHB 3008].
REFERENCES
Alvarez, S. A. (2002). An exact analytical relation among
recall, precision, and classification accuracy in infor-
mation retrieval. Boston College, Boston, Technical
Report BCCS-02-01, pages 1–22.
Bradley, A., Duin, R., Paclik, P., and Landgrebe, T.
(2006). Precision-recall operating characteristic (p-
roc) curves in imprecise environments. In ICPR, vol-
ume 4, pages 123–127. IEEE.
Charu, C. A. (2016). Recommender Systems: The Textbook.
Costello, E. (2013). Opening up to open source: look-
ing at how moodle was adopted in higher education.
Open Learning: The Journal of Open, Distance and
e-Learning, 28(3):187–200.
Dietrich, S. W. (1993). An educational tool for formal re-
lational database query languages. Computer Science
Education, 4(2):157–184.
Famili, A., Shen, W.-M., Weber, R., and Simoudis, E.
(1997). Data preprocessing and intelligent data anal-
ysis. Intelligent data analysis, 1(1):3–23.
Kleerekoper, A. and Schofield, A. (2018). SQL tester: an
online SQL assessment tool and its impact. In Pro-
ceedings of the Annual ACM Conference on Innova-
tion and Technology in Computer Science Education,
pages 87–92.
Leskovec, J., Rajaraman, A., and Ullman, J. D. (2014). Min-
ing of Massive Datasets. Cambridge University Press.
Machado, M. and Tao, E. (2007). Blackboard vs. moodle:
Comparing user experience of learning management
systems. In FIE, pages S4J–7. IEEE.
Manning, C. D., Raghavan, P., and Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.
Mitrović, A. (1998). Experiences in implementing constraint-based modeling in SQL-Tutor. In ITS, pages 414–423.
Nadkarni, P. M., Ohno-Machado, L., and Chapman, W. W.
(2011). Natural language processing: an introduction.
JAMIA, 18(5):544–551.
Obionwu, V., Broneske, D., Hawlitschek, A., Köppen, V., and Saake, G. (2021). SQLValidator: An online student playground to learn SQL. Datenbank-Spektrum, pages 1–9.
Obionwu, V., Broneske, D., and Saake, G. (2022). Topic
maps as a tool for facilitating collaborative work ped-
agogy in knowledge management systems. Interna-
tional Journal of Knowledge Engineering.
Ramos, J. et al. (2003). Using tf-idf to determine word rele-
vance in document queries. In Proceedings of the first
instructional conference on machine learning, volume
242, pages 29–48. Citeseer.
Sidorov, G., Gelbukh, A., Gómez-Adorno, H., and Pinto, D. (2014). Soft similarity and soft cosine measure: Similarity of features in vector space model. Computación y Sistemas, 18(3):491–504.
Sokolova, M. and Lapalme, G. (2009). A systematic analy-
sis of performance measures for classification tasks.
Information processing & management, 45(4):427–
437.
Wu, H. C., Luk, R. W. P., Wong, K. F., and Kwok, K. L.
(2008). Interpreting tf-idf term weights as making rel-
evance decisions. TOIS, 26(3):1–37.