Keyword(s): Non-negative Matrix Factorization, Binary Data, Binary Matrix Factorization, Text Modelling.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Business Analytics; Computational Intelligence; Data Analytics; Data Engineering; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems

Abstract: We propose Logistic Non-negative Matrix Factorization for the decomposition of binary data. Binary data arise frequently in, for example, text analysis, sensory data, and market basket data. Non-negative Matrix Factorization is a common method for analysing non-negative data, but it is in theory not appropriate for binary data; we therefore propose a novel Non-negative Matrix Factorization based on the logistic link function. Furthermore, we generalize the method to handle missing data. The formulation of the method is compared to a previously proposed logistic matrix factorization without a non-negativity constraint on the features. We compare the performance of Logistic Non-negative Matrix Factorization to Least Squares Non-negative Matrix Factorization and Kullback-Leibler (KL) Non-negative Matrix Factorization on three sets of binary data: a synthetic dataset, a set of student comments on their professors collected in a binary term-document matrix, and a sensory dataset. We find that choosing the number of components is an essential part of the modelling and interpretation, and that it remains unresolved.
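To make the idea of a logistic link with non-negative factors concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It rests on several assumptions that the abstract does not confirm: both factors `W` and `H` are constrained to be non-negative, an unconstrained scalar offset `b` is added so that probabilities below 0.5 are reachable, missing entries are encoded as `NaN`, and the fit uses plain projected gradient descent on the Bernoulli negative log-likelihood. The function name `logistic_nmf` and all parameter names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Logistic function written via tanh, which avoids overflow in exp().
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def logistic_nmf(X, rank, n_iter=500, lr=0.01, seed=0):
    """Sketch: model P(X_ij = 1) = sigmoid((W @ H)_ij + b) with W, H >= 0.

    Assumptions (not taken from the paper): NaN marks a missing entry,
    b is a free scalar offset, and the optimiser is projected gradient descent.
    Returns W, H, b and the loss recorded before each update.
    """
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)            # True where an entry is observed
    Xo = np.where(mask, X, 0.0)    # placeholder value for missing entries
    n_obs = mask.sum()

    rng = np.random.default_rng(seed)
    W = rng.uniform(0.0, 0.1, size=(X.shape[0], rank))
    H = rng.uniform(0.0, 0.1, size=(rank, X.shape[1]))
    b = 0.0
    losses = []

    for _ in range(n_iter):
        Z = W @ H + b
        # Bernoulli negative log-likelihood over observed entries only:
        # log(1 + exp(z)) - x * z
        losses.append(float(np.sum(mask * (np.logaddexp(0.0, Z) - Xo * Z))))
        # Gradient of the loss with respect to Z; zero where data are missing.
        G = mask * (sigmoid(Z) - Xo)
        # Gradient steps followed by projection onto the non-negative orthant.
        W_next = np.maximum(W - lr * (G @ H.T), 0.0)
        H_next = np.maximum(H - lr * (W.T @ G), 0.0)
        # Offset step, scaled by the number of observed entries.
        b -= lr * G.sum() / n_obs
        W, H = W_next, H_next

    return W, H, b, losses

# Toy usage: a binary matrix with two blocks and two missing entries.
X_demo = np.zeros((30, 20))
X_demo[:15, :10] = 1.0
X_demo[15:, 10:] = 1.0
X_demo[0, 0] = np.nan
X_demo[20, 5] = np.nan
W_demo, H_demo, b_demo, loss_demo = logistic_nmf(X_demo, rank=2, n_iter=300)
```

The mask multiplies both the loss and the gradient, so missing entries contribute nothing to the fit; this is one simple way to handle missing data and may differ from the generalization described in the paper. The `rank` argument is the number of components, which the abstract identifies as a choice that remains unresolved.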



Larsen, J. and Clemmensen, L. (2015). Non-negative Matrix Factorization for Binary Data. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1 KDIR: SSTM, (IC3K 2015) ISBN 978-989-758-158-8, pages 555-563. DOI: 10.5220/0005614805550563

@conference{sstm15,
  author={Jacob Søgaard Larsen and Line Katrine Harder Clemmensen},
  title={Non-negative Matrix Factorization for Binary Data},
  booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1 KDIR: SSTM, (IC3K 2015)},
  year={2015},
  pages={555-563},
  publisher={SciTePress},
  organization={INSTICC},
  doi={10.5220/0005614805550563},
  isbn={978-989-758-158-8},
}

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1 KDIR: SSTM, (IC3K 2015)
TI - Non-negative Matrix Factorization for Binary Data
SN - 978-989-758-158-8
AU - Larsen, J.
AU - Clemmensen, L.
PY - 2015
SP - 555
EP - 563
DO - 10.5220/0005614805550563