
Applying Causal Inference in Educational Data Mining: A Pilot Study
Walisson Ferreira de Carvalho¹,², Bráulio Roberto Gonçalves Marinho Couto³, Ana Paula Ladeira¹,
Osmar Ventura Gomes¹ and Luiz Enrique Zarate²
¹ Centro Universitário UNA, Av. Professor Mário Werneck, 1685, Belo Horizonte, Brazil
² Pontifícia Universidade Católica de Minas Gerais, Rua Walter Ianni, 255, Belo Horizonte, Brazil
³ Centro Universitário de Belo Horizonte - UniBH, Av. Professor Mário Werneck, 1685, Belo Horizonte, Brazil
Keywords:  Causal Inference, Educational Data Mining, e-Learning.  
Abstract: Understanding the reasons that lead students to succeed in their courses is a challenge for every
educational institution, regardless of the teaching and learning modality adopted. In this paper we
use the theory of Causal Inference to analyze the main factors that cause the success, or failure, of an
engineering student enrolled in an online Algorithms course. We used data extracted from the Moodle
Learning Management System and, after preprocessing the dataset, analyzed the actions performed by the
students during the six months (20 weeks) that the online course lasted. We concluded that, before
submitting an evaluation activity for assessment, it is important that students analyze the problem
thoroughly. Students who took a little longer to submit their work were more likely to pass.
1  INTRODUCTION 
Over the last few years, a new application of Data
Mining has emerged and has become the object of
study for many researchers: Educational Data Mining
(EDM). The main goal of this interdisciplinary area
of Data Mining is to analyze data from the
education sector in order to solve problems related
to education. According to Romero and Ventura
(2010), although EDM focuses on educational data, it
uses traditional Data Mining techniques.
The Handbook of Educational Data Mining,
organized by Romero et al. in 2011, presents some
applications of EDM. Among them are improving
the quality of courses, modeling student profiles,
improving and predicting student performance,
and others that can improve the quality
of the teaching and learning process.
Baker and Carvalho (2011) present a taxonomy
of EDM divided into five subareas: i) prediction; ii)
clustering; iii) relationship mining; iv) distillation of
data for human judgment; and v) discovery with
models. In the third subarea, relationship mining,
the goal according to the authors is to discover
relationships between variables, the most common
kinds being association, correlation,
sequential pattern and causal mining. This article
focuses on causal relationships among
variables.
Besides the taxonomy, Baker and Carvalho (2011)
point out the opportunity for researchers to
combine online education and Educational Data
Mining in order to improve the teaching and
learning process. This opportunity emerges from
the growth of this modality of education and the
use of Learning Management Systems (LMS), or
e-learning systems, such as Moodle
(https://moodle.com/), Eliademy
(https://eliademy.com/) and others.
In 2011 Judea Pearl won the Alan Turing Award
“For fundamental contributions to artificial
intelligence through the development of a calculus
for probabilistic and causal reasoning.” By causal
reasoning, Pearl means that it is necessary to look for
the root causes of an event and to dissociate
correlation from causality. After all,
correlation does not imply causation.
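The gap between correlation and causation can be illustrated with a small simulation. The sketch below is not taken from the study: it assumes a hypothetical toy model in which a hidden variable Z influences both X and Y, while X has no effect on Y. The probabilities, sample size and function names are illustrative assumptions only.

```python
import random

def simulate(n, do_x=None, seed=0):
    """Sample (x, y) pairs from a toy model with edges Z -> X and Z -> Y only.

    If do_x is given, X is fixed by intervention, which cuts the Z -> X edge.
    All probabilities are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.random() < 0.5                      # hidden common cause
        if do_x is None:
            x = rng.random() < (0.9 if z else 0.1)  # observational: X follows Z
        else:
            x = do_x                                # interventional: X is forced
        y = rng.random() < (0.8 if z else 0.2)      # Y depends on Z, never on X
        samples.append((x, y))
    return samples

def p_y_given_x(samples, x_value):
    """Estimate P(Y = 1) among the samples where X equals x_value."""
    ys = [y for x, y in samples if x == x_value]
    return sum(ys) / len(ys)

# Observational data: X and Y look strongly related because both follow Z.
obs = simulate(20000)
observational_gap = p_y_given_x(obs, True) - p_y_given_x(obs, False)

# Interventional data: forcing X leaves Y essentially unchanged.
forced_on = simulate(20000, do_x=True, seed=1)
forced_off = simulate(20000, do_x=False, seed=2)
interventional_gap = p_y_given_x(forced_on, True) - p_y_given_x(forced_off, False)

print(f"observational gap:  {observational_gap:.3f}")
print(f"interventional gap: {interventional_gap:.3f}")
```

Under these assumptions the observational gap is large while the interventional gap stays close to zero, so the association between X and Y in the observed data does not reflect a causal effect of X on Y.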
The three pillars of Causal Inference theory are
the Bayesian network, also created by Pearl in 1985,
the structural equation model and the "do" operator,
which makes it possible to perform interventions and
to simulate the model. From these pillars and using
Ferreira de Carvalho, W., Roberto Gonçalves Marinho Couto, B., Ladeira, A., Ventura Gomes, O. and Zarate, L.
Applying Causal Inference in Educational Data Mining: A Pilot Study.
DOI: 10.5220/0006792504540460
In Proceedings of the 10th International Conference on Computer Supported Education (CSEDU 2018), pages 454-460
ISBN: 978-989-758-291-2
Copyright © 2019 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved