Authors:
Narges Manouchehri and Nizar Bouguila
Affiliation:
Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montréal, Canada
Keyword(s):
Mixture Models, Multivariate Beta Distribution, Maximum Likelihood, Clustering.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Industrial Applications of Artificial Intelligence; Sensor Networks; Signal Processing; Soft Computing
Abstract:
Model-based approaches, specifically finite mixture models, are widely applied as inference engines in machine learning, data mining, and related disciplines. They have proved to be effective tools for discovering, extracting, and analyzing critical knowledge from data, providing better insight into the nature of the data and uncovering hidden patterns. In recent research, some distributions, such as the Beta distribution, have demonstrated more flexibility in modeling asymmetric and non-Gaussian data. In this paper, we introduce an unsupervised learning algorithm for a finite mixture model based on the multivariate Beta distribution, which can be applied to various challenging real-world problems such as texture analysis, spam detection, and software module defect prediction. Parameter estimation is one of the critical challenges when deploying mixture models. To tackle this issue, deterministic and efficient techniques such as maximum likelihood (ML), expectation maximization (EM), and the Newton-Raphson method are applied. The feasibility and effectiveness of the proposed model are assessed through experiments on real datasets. The performance of our framework is compared with that of the widely used Gaussian mixture model (GMM).
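
To make the estimation loop named in the abstract concrete, here is a minimal sketch of EM for a mixture of Beta components. It is not the authors' algorithm. It assumes univariate Beta components rather than the paper's multivariate Beta distribution, and its M-step uses weighted moment matching as a simple stand-in for the maximum likelihood / Newton-Raphson update. The function names (`beta_logpdf`, `moment_match`, `fit_beta_mixture`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random


def beta_logpdf(x, a, b):
    """Log density of a univariate Beta(a, b) at x, with x clipped into (0, 1)."""
    x = min(max(x, 1e-12), 1.0 - 1e-12)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def moment_match(xs, ws):
    """Weighted method-of-moments estimate of (a, b).

    Assumption: this replaces the ML / Newton-Raphson update for simplicity.
    """
    total = max(sum(ws), 1e-12)
    mean = sum(w * x for w, x in zip(ws, xs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / total
    var = max(var, 1e-8)  # guard against a degenerate component
    common = max(mean * (1.0 - mean) / var - 1.0, 1e-3)
    return mean * common, (1.0 - mean) * common


def fit_beta_mixture(xs, k=2, n_iter=100):
    """EM-style fit of a k-component univariate Beta mixture."""
    n = len(xs)
    # Initialise each component from an equal-sized slice of the sorted data.
    ordered = sorted(xs)
    chunks = [ordered[j * n // k:(j + 1) * n // k] for j in range(k)]
    params = [moment_match(c, [1.0] * len(c)) for c in chunks]
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point,
        # computed in log space for numerical stability.
        resp = []
        for x in xs:
            logs = [math.log(weights[j]) + beta_logpdf(x, *params[j])
                    for j in range(k)]
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            norm = sum(probs)
            resp.append([p / norm for p in probs])

        # M-step: update mixing weights and component parameters.
        for j in range(k):
            col = [r[j] for r in resp]
            weights[j] = max(sum(col) / n, 1e-12)
            params[j] = moment_match(xs, col)

    return weights, params


# Synthetic demo data (illustrative only, not from the paper).
rng = random.Random(0)
data = [rng.betavariate(2, 10) for _ in range(300)]
data += [rng.betavariate(10, 2) for _ in range(300)]
weights, params = fit_beta_mixture(data, k=2)
```

Swapping `moment_match` for a Newton-Raphson maximization of the weighted log-likelihood would bring the sketch closer to the approach the abstract describes, at the cost of needing digamma and trigamma evaluations.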