Providing analytical tools would be a valuable
functionality for CEBA users.
Hence,  in  this  paper,  we  aim  to  investigate  using 
Elasticsearch  as  a  data  warehouse  and  Kibana  as  a 
Spatial  OLAP  visualisation  tool.  Data  warehouses 
support managers for decision-making (Jarke, 2002), 
(Inmon,  2005),  (Pinet,  2010).  Traditionally,  data 
warehouses are based on relational data models, but
this type of model is not the most efficient for
real-time sensor streams. The ELK stack is more suitable
for stream management, but this approach does not
provide the analytical features offered by data
warehouses. The authors of (Bicevska, 2017)
discussed NoSQL-based data warehouse
solutions and identified several advantages of this
approach. They noted, however, the lack of reporting
tools compatible with NoSQL systems.
In this paper, we propose a method to model a
spatial data warehouse with the ELK stack. We
present the main structure of a component called IAT
(Integration and Aggregation Tool) that allows users
to define mappings (Lenzerini, 2002) and aggregation
options between sensor sources and a target index in
Elasticsearch. IAT acts as a streaming ETL (Sabtu,
2017). It continuously extracts records from Logstash,
aggregates them, and transforms and maps them
according to the output schema. The output data is in
JSON format and is stored in an Elasticsearch index.
Elasticsearch (ES) is powerful for search and
aggregation queries but less so for join queries (Pilato,
2017). Hence, we store the data output by IAT
in a single ES index.
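To make the map, aggregate and output steps described above more concrete, the following is a minimal sketch in Python. It is not the IAT implementation: the field names, the mapping dictionary and the hourly averaging are all assumptions chosen for illustration, and the records are plain dictionaries rather than a live Logstash stream.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from source field names to target index field names.
FIELD_MAPPING = {"sid": "station_id", "ts": "timestamp", "temp": "temperature"}


def map_record(record, mapping=FIELD_MAPPING):
    """Rename source fields to the target schema; unmapped fields are dropped."""
    return {target: record[source]
            for source, target in mapping.items() if source in record}


def aggregate_hourly(records):
    """Average the temperature per (station, hour).

    Timestamps are assumed to be ISO 8601 strings, so the first 13
    characters ("YYYY-MM-DDTHH") identify the hour.
    """
    groups = defaultdict(list)
    for r in records:
        hour = r["timestamp"][:13]
        groups[(r["station_id"], hour)].append(r["temperature"])
    return [
        {"station_id": sid, "hour": hour,
         "avg_temperature": mean(values), "count": len(values)}
        for (sid, hour), values in sorted(groups.items())
    ]


def to_json_lines(docs):
    """Serialise each aggregated document as one JSON string."""
    return [json.dumps(d, sort_keys=True) for d in docs]
```

Because each output document already carries its dimensions (station, hour) next to its measure, all documents can live in one index and be queried without joins.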
The paper is organised as follows. In the next 
section, we present some related work. In section 3, 
we present our work and the architecture composed 
of the ELK stack, as well as the use case for analytical 
queries.  We  present  the  functionalities  of  IAT 
components through the use case. Finally, we present
an example of a measurement station dashboard for our
use case and we conclude.
2  BACKGROUND AND RELATED 
WORK 
In this section, we present the main related work and
concepts related to the topic of our paper, i.e., sensor data,
spatial data warehouses, the ETL process and the ELK stack.
2.1  Sensor Data 
Sensors are a popular technology for collecting
environmental data. As the technology develops,
many kinds of environmental sensors have become
available, e.g., (Werner-Allen, 2006), (Yick, 2008),
(Richter, 2009), (Noury, 2018).
Usually, sensor data are georeferenced. The
records consist of measurements or observations obtained
at a specific location (geo-point) or within a specific
area (geo-shape). The geographical information in a
measurement is usually the physical location of the
sensor. In CEBA, the data collected from sensors are
georeferenced.
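As an illustration of a georeferenced record, the sketch below builds one observation as a Python dictionary together with an Elasticsearch-style index mapping that declares the location as a `geo_point`. The field names (`station_id`, `location`, `measure`, `value`, `timestamp`) are hypothetical and are not taken from the CEBA schema.

```python
def make_observation(station_id, lat, lon, measure, value, timestamp):
    """Build one georeferenced observation (hypothetical field names)."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude/longitude out of range")
    return {
        "station_id": station_id,
        # Object form accepted by the Elasticsearch geo_point field type.
        "location": {"lat": lat, "lon": lon},
        "measure": measure,
        "value": value,
        "timestamp": timestamp,
    }


# Index mapping sketch: the sensor position is indexed as a geo_point so
# that it can be used in spatial queries and map visualisations.
INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "station_id": {"type": "keyword"},
            "location": {"type": "geo_point"},
            "measure": {"type": "keyword"},
            "value": {"type": "float"},
            "timestamp": {"type": "date"},
        }
    }
}
```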
2.2  Data Warehouse and Spatial Data 
Warehouse  
In  principle,  data  warehouses  are  designed  for 
analytical queries (Inmon, 2005). Data are
arranged as either facts or dimensions and mainly
modelled in a star or snowflake schema. Facts
consist mainly of measures or metrics (i.e., the data to
analyse), while dimensions are mainly descriptive and
serve as the axes along which aggregations are computed (Jarke,
2002). Data warehouses can be represented in a
multidimensional  conceptual  model.  The 
multidimensional data structures are also called data 
cubes. Users can analyse data using online analytical 
processing  (OLAP)  tools.  The  most  popular  OLAP 
operations are roll up, roll down, slicing, and dicing 
(Matei, 2014).  
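The operations named above can be illustrated on a tiny in-memory fact table. This is only a sketch: the facts, the dimension names (`station`, `region`, `day`) and the `rainfall` measure are invented for the example, and a real OLAP tool would evaluate such operations against a data cube rather than a Python list.

```python
from collections import defaultdict

# Hypothetical fact table: three dimensions and one measure (rainfall).
FACTS = [
    {"station": "A", "region": "North", "day": "2020-01-01", "rainfall": 2.0},
    {"station": "B", "region": "North", "day": "2020-01-01", "rainfall": 3.0},
    {"station": "C", "region": "South", "day": "2020-01-01", "rainfall": 1.0},
    {"station": "A", "region": "North", "day": "2020-01-02", "rainfall": 4.0},
]


def roll_up(facts, dims, measure):
    """Sum the measure over every combination of the given dimensions."""
    totals = defaultdict(float)
    for f in facts:
        totals[tuple(f[d] for d in dims)] += f[measure]
    return dict(totals)


def slice_cube(facts, dim, value):
    """Slice: fix a single dimension to a single value."""
    return [f for f in facts if f[dim] == value]


def dice(facts, criteria):
    """Dice: keep facts whose dimensions fall in the allowed value sets."""
    return [f for f in facts
            if all(f[d] in allowed for d, allowed in criteria.items())]
```

Calling `roll_up(FACTS, ["region"], "rainfall")` aggregates from the station level to the coarser region level; calling it with `["station"]` instead returns the finer-grained view, which corresponds to moving back down the hierarchy.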
Spatial data warehouses and OLAP  tools  extend 
these concepts. In particular, they provide support for
storing, aggregating and analysing geographical data
(Nipun Garg, 2011). In spatial data warehouses, facts
and dimensions may be spatial objects. 
2.3  Batch and Streaming ETL (Extract 
Transform Load) 
Traditionally, ETL is a process for (i) extracting data
from multiple sources, (ii) transforming them and (iii)
loading them into a data warehouse (Bansal, 2015).
Batch ETL corresponds to an ETL process that is
triggered at a specific time and processes a
large volume of data at once.
Streaming ETL is an enhanced form of
the ETL process. It executes the ETL process in near
real time. This approach overcomes the limitations of
batch ETL for streaming data and allows data to be
analysed shortly after it is produced by the sources.
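The difference between the two modes can be sketched as follows. The record layout, the Fahrenheit-to-Celsius transformation and the use of a Python list as the "warehouse" are assumptions made purely for illustration; the sketch does not reproduce any particular ETL tool.

```python
def transform(record):
    """Example transformation (hypothetical): rename and convert units."""
    celsius = (record["fahrenheit"] - 32) * 5 / 9
    return {"sensor": record["id"], "celsius": round(celsius, 2)}


def batch_etl(source, warehouse):
    """Batch mode: extract everything first, then transform and load at once."""
    records = list(source)  # waits until the whole source has been read
    warehouse.extend(transform(r) for r in records)
    return len(records)


def streaming_etl(source, warehouse):
    """Streaming mode: transform and load each record as soon as it arrives.

    Yields the warehouse size after each load, so a consumer can query the
    warehouse while the source is still producing records.
    """
    for record in source:
        warehouse.append(transform(record))
        yield len(warehouse)
```

In the batch variant nothing is visible in the warehouse until the whole source has been consumed, whereas the streaming variant makes each record available for analysis right after it is read.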
2.4  ELK Stack 
The ELK stack (Elasticsearch, 2020) is composed of four
main  open-source  projects:  Beats,  Logstash, 
Elasticsearch,  and  Kibana.  Beats  are  data  shippers.