Improving Statistical Reporting Data Explainability via Principal Component Analysis

Shengkun Xie, Clare Chua-Chow

Abstract

The study of high-dimensional data for decision-making is growing rapidly, since such data often yield the more accurate information needed to make reliable decisions. Visualizing and interpreting statistical reporting data, so as to better understand their natural variation and patterns, remains an ongoing challenge, particularly in complex statistical data analysis. In this work, we propose a novel approach to dimension reduction and feature extraction, based on principal component analysis, for analyzing auto insurance statistical reporting data. We investigate the functional relationship between loss relative frequency and size-of-loss, as well as the pattern and variability of the extracted features, to better understand the nature of auto insurance loss data. The proposed method improves data explainability and provides an in-depth analysis of the overall pattern of the size-of-loss relative frequency. Our findings will help insurance regulators make better rate filing decisions in auto insurance, to the benefit of both insurers and their clients. The approach is also applicable to similar data analysis problems in other business applications.
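As a rough illustration of the kind of analysis described above, the following minimal sketch applies principal component analysis to a matrix of size-of-loss relative frequencies. It is not the authors' implementation: the data are synthetic, and the layout (one row per reporting group, one column per size-of-loss interval), the function name pca_features, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def pca_features(freq, n_components=2):
    """PCA via SVD on a (groups x size-of-loss bins) relative frequency matrix.

    Hypothetical helper for illustration; not taken from the paper.
    """
    freq = np.asarray(freq, dtype=float)
    mean_curve = freq.mean(axis=0)           # average relative frequency per bin
    centered = freq - mean_curve             # remove the common pattern
    # Rows of vt are the principal directions, ordered by explained variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T         # extracted features per group
    explained_ratio = (s ** 2) / np.sum(s ** 2)
    return mean_curve, components, scores, explained_ratio[:n_components]


# Synthetic stand-in for reporting data (assumption): 12 groups, 20 loss bins.
rng = np.random.default_rng(0)
n_groups, n_bins = 12, 20
base = np.exp(-0.2 * np.arange(n_bins))      # claim frequency decays with loss size
noise = rng.uniform(0.8, 1.2, size=(n_groups, n_bins))
counts = rng.poisson(lam=500 * base * noise)

# Convert claim counts to relative frequencies so that each row sums to 1.
rel_freq = counts / counts.sum(axis=1, keepdims=True)

mean_curve, components, scores, explained = pca_features(rel_freq, n_components=2)
print("explained variance ratio:", explained)
print("first two features of group 0:", scores[0])
```

Here the rows of components describe the dominant ways in which a group's relative frequency curve deviates from the average curve, and scores gives each group's low-dimensional features, which can then be plotted or compared across groups.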
