Tallinn University of Technology, Estonia
Keywords: Data Warehouse, Data Lineage, Dependency Analysis, Data Flow Visualization.
Business Intelligence Applications
Knowledge Discovery and Information Retrieval
Visual Data Mining and Data Visualization
We present a method for calculating component dependencies and data lineage from the database structure and a large set of associated procedures and queries, independently of the actual data in the data warehouse. The method relies on probabilistic estimation of the impact of data in queries. We also describe a rule system that supports efficient calculation of the transitive closure of these dependencies. The dependencies are categorized, aggregated and visualized to address various planning and decision support problems. System performance is evaluated and analysed over several real-life datasets.
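To make the idea of a transitive closure over weighted dependencies concrete, the following is a minimal sketch and not the rule system of this paper. It assumes that each direct dependency carries an impact weight in [0, 1], that weights multiply along a path, and that the strongest path determines the derived dependency. The function name `lineage_closure`, the object names and the weights are all hypothetical and chosen only for illustration.

```python
def lineage_closure(edges):
    """Return the strongest derived dependency weight for every reachable pair.

    `edges` maps (source, target) -> weight in [0, 1]. Combining weights by
    multiplication and keeping the maximum over paths is an assumption of
    this sketch, not a rule taken from the paper.
    """
    nodes = sorted({name for pair in edges for name in pair})
    best = dict(edges)
    # Floyd-Warshall style relaxation using (max, *) instead of (min, +).
    for k in nodes:
        for i in nodes:
            w_ik = best.get((i, k))
            if w_ik is None:
                continue
            for j in nodes:
                w_kj = best.get((k, j))
                if w_kj is None or i == j:
                    continue  # no path through k, or a self-dependency
                candidate = w_ik * w_kj
                if candidate > best.get((i, j), 0.0):
                    best[(i, j)] = candidate
    return best


# Hypothetical direct dependencies between warehouse objects.
direct = {
    ("src.orders", "stg.orders"): 1.0,
    ("stg.orders", "dw.sales"): 0.5,
    ("dw.sales", "rpt.revenue"): 0.5,
    ("src.orders", "dw.sales"): 0.25,
}

closure = lineage_closure(direct)
for (source, target), weight in sorted(closure.items()):
    print(f"{source} -> {target}: {weight}")
```

In this toy graph the indirect path from `src.orders` to `dw.sales` through `stg.orders` outweighs the direct edge, so the closure keeps the stronger value. The cubic loop is adequate only for small graphs; a large set of procedures and queries would need an incremental or rule-driven evaluation.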