Accelerating Matrix Factorization by Overparameterization

Pu Chen, Hung-Hsuan Chen


This paper studies overparameterization on the matrix factorization (MF) model. We confirm that overparameterization can significantly accelerate the optimization of MF with no change in the expressiveness of the learning model. Consequently, modern applications on recommendations based on MF or its variants can largely benefit from our discovery. Specifically, we theoretically derive that applying the vanilla stochastic gradient descent (SGD) on the overparameterized MF model is equivalent to employing gradient descent with momentum and adaptive learning rate on the standard MF model. We empirically compare the overparameterized MF model with the standard MF model based on various optimizers, including vanilla SGD, AdaGrad, Adadelta, RMSprop, and Adam, using several public datasets. The experimental results comply with our analysis – overparameterization converges faster. The overparameterization technique can be applied to various learning-based recommendation models, including deep learning-based recommendation models, e.g., SVD++, nonnegative matrix factorization (NMF), factorization machine (FM), NeuralCF, Wide&Deep, and DeepFM. Therefore, we suggest utilizing the overparameterization technique to accelerate the training speed for the learning-based recommendation models whenever possible, especially when the size of the training dataset is large.


Paper Citation