Large-scale Retrieval of Bayesian Machine Learning Models for Time Series Data via Gaussian Processes

Fabian Berns, Christian Beecks


Gaussian Process Models (GPMs) are widely regarded as a prominent tool for learning statistical data models that enable time series interpolation, regression, and classification. These models are frequently instantiated by a Gaussian Process with a zero-mean function and a radial basis covariance function. While these default instantiations yield acceptable model accuracy, GPM retrieval algorithms automatically search for an application-specific model that fits a particular dataset. State-of-the-art methods for automatic GPM retrieval search the space of possible models in a rather intricate way, which results in super-quadratic time complexity for model selection and evaluation. Since this restricts them to small datasets with low statistical versatility, we propose the Timeseries Automatic GPM Retrieval (TAGR) algorithm for efficient retrieval of large-scale GPMs. The resulting model is composed of independent statistical representations for non-overlapping segments of the given data and reduces computation time by orders of magnitude. Our performance analysis indicates that our proposal outperforms state-of-the-art algorithms for automatic GPM retrieval with respect to efficiency, scalability, and accuracy.
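
To make the default instantiation and the segment-wise idea concrete, the following is a minimal sketch and not the TAGR algorithm itself. It fits a zero-mean Gaussian Process with a radial basis covariance function independently on each non-overlapping segment of a sorted time series. The function names, the fixed segment size, and the fixed hyperparameters (`length_scale`, `variance`, `noise`) are assumptions made for illustration; the abstract does not specify how segments or covariance functions are chosen. For exact inference, fitting n points at once costs O(n^3), whereas n/m independent segments of size m cost O(n m^2) in total.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Radial basis (squared exponential) covariance between 1-D inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior_mean(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean of a zero-mean GP with an RBF covariance function."""
    k_train = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_cross = rbf_kernel(x_test, x_train)
    # Solving the n x n system is the cubic-cost step for n training points.
    return k_cross @ np.linalg.solve(k_train, y_train)


def segmented_gp_mean(x, y, x_test, segment_size):
    """Predict with one independent GP per non-overlapping segment.

    Assumes x is sorted in increasing order. Fixed-size segmentation is an
    illustrative assumption, not the segmentation used by TAGR.
    """
    pred = np.empty(len(x_test))
    n_segments = int(np.ceil(len(x) / segment_size))
    for s in range(n_segments):
        lo = s * segment_size
        hi = min((s + 1) * segment_size, len(x))
        # Route each test point to the segment whose input range contains it;
        # the first and last segments also cover points outside the data range.
        left = -np.inf if s == 0 else x[lo]
        right = np.inf if s == n_segments - 1 else x[hi]
        mask = (x_test >= left) & (x_test < right)
        if mask.any():
            pred[mask] = gp_posterior_mean(x[lo:hi], y[lo:hi], x_test[mask])
    return pred
```

Calling `segmented_gp_mean(x, y, x_test, segment_size=50)` on a series of 200 points solves four 50 x 50 systems instead of one 200 x 200 system. The trade-off is that each segment's model ignores the data in neighbouring segments.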

