
5.2  Integration with the Operational Data Base
The operational data base is used for storing the existing
data histories of market prices as well as the newly 
incoming  market  data  from  the  data  feed.  In 
addition, it stores the return values calculated by the 
DSE as the percentage change from one price value 
to the next. These data are passed directly from the 
DSE. Hence, each record of market data consists of 
4 attributes: 
(symbol_ID, time_stamp, price, return),
where symbol_ID is the provider-specific identifier for the trading instrument, e.g. a 6-letter representation in the case of currencies (e.g. EURUSD, USDJPY), time_stamp is a numeric value for Unix time, and price and return are both of type double.
Trade data are also stored in the data base. These consist of lists of order data, indicating time-stamped buy or sell information:
(symbol_ID, time_stamp, price, amount|position size, buy|sell)
The final data to be stored are the lists of current assets, i.e. the time-stamped lists of the symbol IDs of the assets contained in the portfolio at a certain time:
(time_stamp, symbol_ID_1, …, symbol_ID_n)
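To make the three record types concrete, the following sketch mirrors them as plain data classes; the class and field names (MarketRecord, TradeRecord, PortfolioSnapshot) are illustrative assumptions, not the actual schema of the operational data base.

```python
# Illustrative sketch of the three record types stored in the operational
# data base; class and field names are assumed, not taken from the project.
from dataclasses import dataclass
from typing import List

@dataclass
class MarketRecord:
    symbol_id: str    # provider-specific identifier, e.g. "EURUSD"
    time_stamp: int   # Unix time
    price: float
    ret: float        # percentage change from the previous price value

@dataclass
class TradeRecord:
    symbol_id: str
    time_stamp: int
    price: float
    amount: float     # amount / position size
    side: str         # "buy" or "sell"

@dataclass
class PortfolioSnapshot:
    time_stamp: int
    symbol_ids: List[str]   # assets contained in the portfolio at this time
```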
5.3  Integration with ML Algorithms 
The correlation discovery algorithm is used to build the correlation matrix. The decisive requirement for the CDBA project is real-time computation. For the calculation of VaR, one of the computationally most complex steps is the calculation of the variance-covariance matrix and/or the correlation matrix. In this use case, the time series correlation discovery algorithm will be used to set up the correlation matrix by pairwise measurement of the correlations of all return vectors corresponding to the assets in the portfolio.
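As a rough illustration (not the CDBA implementation itself), the pairwise setup of the correlation matrix from synchronised return vectors can be sketched as follows; the function name and data layout are assumptions.

```python
# Minimal sketch: correlation matrix from pairwise correlations of
# synchronised return vectors (one row per asset in the portfolio).
import numpy as np

def correlation_matrix(returns: np.ndarray) -> np.ndarray:
    """returns has shape (n_assets, window_length); rows are the return
    vectors of the portfolio assets on a common 1-second time grid."""
    return np.corrcoef(returns)

# Example: 3 assets, 10 synchronised return observations each
rng = np.random.default_rng(0)
corr = correlation_matrix(rng.normal(size=(3, 10)))
print(corr.shape)  # (3, 3)
```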
As a precondition for the matrix setup, the time 
series  have  to  be  synchronised.  We  agreed  to 
synchronise  all  of  the  time  series  on  a  1-second 
basis.  This  is  done  by  the  DSE  as  described 
previously. In the most frequent update case, this computationally intensive calculation is therefore repeated every second over a time window that is shifted forward by one observation per second, so that the window always contains the most recent data. The sliding length equals the update frequency (here one second, i.e. pseudo real time).
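A minimal sketch of this sliding-window mode, assuming synchronised 1-second returns are delivered by the DSE, could look as follows; the window length, symbols and callback name are illustrative only.

```python
# Illustrative sliding-window recomputation: once per second the newest
# synchronised returns are appended, the oldest drop out, and the
# correlation matrix is recomputed over the current window contents.
from collections import deque
import numpy as np

WINDOW = 3600  # assumed window length in seconds
buffers = {sym: deque(maxlen=WINDOW) for sym in ("EURUSD", "USDJPY")}

def on_second(latest_returns: dict):
    """Called once per second with {symbol_ID: return} for all assets."""
    for sym, r in latest_returns.items():
        buffers[sym].append(r)
    if any(len(q) < WINDOW for q in buffers.values()):
        return None  # window not yet filled
    return np.corrcoef(np.array([list(q) for q in buffers.values()]))
```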
The  computational  complexity,  of  course,  will 
depend on the size of the portfolio, i.e. the number 
of different assets contained  therein. Therefore, for 
large portfolios, the preferred  mode of deployment 
may be based on the trigger alarm of the DSE, with the correlation matrix re-calculated only when an alarm is raised or on user request, e.g. for a what-if scenario as pre-trade analysis.
5.4  Fast Analytics Engine 
Since the financial sector, and risk management in particular, is one of the main application areas of the fast analytics engine, support for this use case already exists on several levels. In particular, an open source project for the calculation of VaR is available, which was adopted and extended to the needs of the use case.
The  VaR  project  receives  the  P&L  vectors  as 
generated by the scenario engine and calculates the 
VaR risk measure, which is then returned to the GUI. The analysis has been extended with the ES calculation, which is based on the same set of scenarios.
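The VaR/ES step itself is standard and can be sketched as below from a vector of scenario P&L values; this is a generic loss-quantile / tail-mean calculation, not the extended open source project used in the use case.

```python
# Hedged sketch: VaR and ES from a P&L vector (one simulated profit/loss
# per scenario), following the usual quantile and tail-mean definitions.
import numpy as np

def var_es(pnl: np.ndarray, confidence: float = 0.99):
    losses = -pnl                          # convert P&L to losses
    var = np.quantile(losses, confidence)  # loss not exceeded with prob. `confidence`
    es = losses[losses >= var].mean()      # Expected Shortfall: mean loss beyond VaR
    return float(var), float(es)

# Example with 10,000 scenario P&L values
rng = np.random.default_rng(1)
var_99, es_99 = var_es(rng.normal(size=10_000))
```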
5.5  Usage 
The use case application is configured and the risk assessment is run from a central GUI.
The login dialog offers two user roles, distinguishing between trader and risk controller.
While  the  trader  can  only  enter  one  or  several 
(potential)  new  trades  and  then  push  the  “VaR-
button”,  the  risk  controller  is  also  able  to  enter  or 
change parameters such as confidence level, sliding 
window length, number of scenarios to be generated, 
training sample length and forecasting period length. 
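For illustration, the parameters exposed to the risk controller could be grouped into a single configuration object as sketched below; the names and default values are assumptions, not the actual GUI fields.

```python
# Illustrative configuration sketch; names and defaults are assumed.
from dataclasses import dataclass

@dataclass
class RiskConfig:
    confidence_level: float = 0.99     # VaR/ES confidence level
    window_length_s: int = 3600        # sliding window length in seconds
    n_scenarios: int = 10_000          # number of scenarios to be generated
    training_sample_length: int = 250  # training sample length
    forecast_period_length: int = 1    # forecasting period length
```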
6  CONCLUSION 
The  presented  risk  monitoring  use  case  is  a  data-
intensive  application  in  a  critical  infrastructure.  It 
does  not require many different functionalities, but 
focusses  on  a  central  aspect  in  the  daily  risk 
management procedures of banks and financial institutions.
The  challenge  of  the  application  lies  in  the 
computational  complexity  of  the  calculation  of  the 
risk  measures.  This  is  where  it  exploits  the 