# Combining GAN And Matrix Factorization (GANMF)

3 main points
✔️ Identified two major problems in using cGAN for collaborative filtering and solved them to develop a novel GAN-based recommendation system, GANMF
✔️ Evaluated the recommendation system on three datasets and achieved better performance than all previous methods
✔️ Improved the uniqueness of generated user profiles by using feature matching.

Written by Ervin Dervishaj, Paolo Cremonesi
(Submitted on 20 Jan 2022)
Comments: Accepted at the 37th ACM/SIGAPP Symposium on Applied Computing (SAC '22)

Subjects:  Information Retrieval (cs.IR); Machine Learning (cs.LG)

Code:

The images used in this article are from the paper, the introductory slides, or were created based on them.

## Background

A recommendation system is a system that recommends items that a customer might like, based on past customer behavior and other customer data.

Recommendation systems are very important in providing personalized experiences in various domains. Collaborative Filtering (CF) is one of the most accurate recommendation methods.

Matrix Factorization (MF) is a well-known method for collaborative filtering. By projecting users and items into the latent space, we can extract user- or item-specific features from their mutual relationships.

One of the methods to generate data indistinguishable from real data is Generative Adversarial Networks (GANs), but only a few studies have applied GANs to recommendation systems.

The GANMF introduced in this article proposes a method to combine GAN with matrix factorization and incorporate it into collaborative filtering.

## Existing studies using GANs for recommendation

IRGAN and CFGAN are representative examples of existing studies that have applied GAN to recommendations.

IRGAN is the first application of GANs to information retrieval, not limited to recommendation. It adversarially trains a Generator, which models the probability $p(d|q,r)$ of selecting a document $d$ given a query $q$, against a Discriminator, which judges whether a query and a document are relevant. After training, the Generator can be used to sample documents relevant to a given query.

CFGAN observes that the discrete sampling in IRGAN limits the model's generating capacity, and instead modifies the Generator to produce the user's full interaction profile directly. The drawback is that the Generator must produce high-dimensional output when the number of items is large.

### Setting up a recommendation problem

The recommendation problem is: given a set of users $U$ and a set of items $I$, recommend the top-$N$ items to each user based on their past feedback.

The feedback information is given as a matrix called User Rating Matrix (URM), where the elements of the matrix are binary values of whether the user expressed interest in the item or not.
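As a concrete illustration, a binary URM can be built from (user, item) interaction pairs. This is a minimal NumPy sketch; the data and dimensions are made up:

```python
import numpy as np

n_users, n_items = 4, 5
interactions = [(0, 1), (0, 3), (1, 0), (2, 2), (2, 4), (3, 1)]

urm = np.zeros((n_users, n_items), dtype=np.int8)
for u, i in interactions:
    urm[u, i] = 1  # 1 = the user expressed interest in the item
```

In practice the URM is extremely sparse, so a sparse matrix format (e.g. CSR) would be used instead of a dense array.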

### Challenges in applying cGAN to the recommendation

The authors state that there are two major problems in using conditional GANs (cGANs), like CFGAN, for generating user profiles.

The first problem is that it is difficult to propagate a correct gradient to the Generator. Because the Discriminator maps a high-dimensional input to a single scalar, its gradient conveys little information about how far the generated profile actually is from a real one.

The second problem is the number of samples per class. Generating data conditioned on class labels normally requires many samples per class, but in the recommendation setting each user is a class label, so there is only a single sample per class.

The authors have developed a new model called GANMF that addresses both of these issues. We will look at it in more detail in the next section.

## Overview of GANMF

The overall picture of GANMF is shown in the figure below: the Generator receives the user's ID and generates a profile, and the Discriminator discriminates between real and generated profiles.

Unlike a conventional cGAN, the Discriminator is an autoencoder. Its reconstruction loss, e.g. $\|x - \mathrm{AE}(x)\|_2^2$ for an input profile $x$, is used as the score for discriminating real from generated profiles. This reconstruction loss is treated as an energy function: the model is trained so that real profiles receive low energy and fake profiles high energy.

The Discriminator's loss function is a hinge loss over these reconstruction energies.
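A minimal sketch of this energy-based discriminator, assuming an EBGAN-style hinge objective. The linear autoencoder, random weights, and margin value are purely illustrative and do not reproduce the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruction_energy(x, W_enc, W_dec):
    """Energy = squared reconstruction error of a (linear) autoencoder."""
    z = x @ W_enc          # encode the profile into a latent space
    x_hat = z @ W_dec      # decode it back to profile space
    return np.sum((x - x_hat) ** 2)

def discriminator_hinge_loss(e_real, e_fake, m=1.0):
    """Push real energies down, fake energies up, until a margin m."""
    return e_real + max(0.0, m - e_fake)

n_items, latent = 8, 3
W_enc = rng.normal(size=(n_items, latent))
W_dec = rng.normal(size=(latent, n_items))
real = rng.integers(0, 2, size=n_items).astype(float)
fake = rng.random(n_items)

loss = discriminator_hinge_loss(
    reconstruction_energy(real, W_enc, W_dec),
    reconstruction_energy(fake, W_enc, W_dec),
)
```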

On the other hand, the Generator does not accept noise as input, unlike conventional cGANs. This is because it generates a single profile for a single user.

The Generator is a simple structure consisting of two embedding layers, and generates a user profile as

$$G(y) = \Sigma_y V^{\top}$$

which is exactly a matrix factorization. Here $y$ is the ID of the user whose profile is generated, $V$ is the item latent matrix, and $\Sigma$ is the user latent matrix ($\Sigma_y$ denotes the row for user $y$).
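The two-embedding Generator can be sketched in a few lines of NumPy (untrained random embeddings; the names `Sigma`, `V` and the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, latent = 4, 6, 3

Sigma = rng.normal(size=(n_users, latent))  # user embedding matrix
V = rng.normal(size=(n_items, latent))      # item embedding matrix

def generate_profile(y):
    """Look up user y's latent vector and project it onto all items."""
    return Sigma[y] @ V.T  # one score per item, i.e. a full user profile

profile = generate_profile(2)
```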

The Generator's loss combines a reconstruction loss on the generated profile with a feature matching loss, which counters the collapse into generating nearly the same profile for every user.

The feature matching loss is closely related to the Maximum Mean Discrepancy (MMD), which is often used as a distance between probability distributions and has also been used as a loss function for GANs.
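A common form of feature matching, matching the batch mean of intermediate Discriminator (encoder) features between real and generated data, can be sketched as follows. This is the standard formulation; the paper's exact variant may differ:

```python
import numpy as np

def feature_matching_loss(real_feats, fake_feats):
    """Squared distance between mean encoder features of real vs. fake.

    real_feats, fake_feats: (batch, feat_dim) arrays of intermediate
    discriminator (encoder) activations.
    """
    return np.sum((real_feats.mean(axis=0) - fake_feats.mean(axis=0)) ** 2)
```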

As can be seen from the above, GANMF is a very simple model that combines matrix factorization and GAN.

### Recommendation experiment by GANMF

Let's look at the results of actually using GANMF as a recommendation system.

In the problem set-up of this experiment, we rank items that the user has not previously interacted with: for GANMF, we generate a profile for each user and sort the resulting scores to obtain the top-$N$ items.
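That ranking step can be sketched as follows (illustrative scores; already-seen items are masked out so they are never re-recommended):

```python
import numpy as np

def top_n(profile, seen_items, n=5):
    """Rank unseen items by generated score and return the top-n item ids."""
    scores = profile.copy()
    scores[seen_items] = -np.inf  # exclude items the user already has
    return np.argsort(-scores)[:n]

profile = np.array([0.9, 0.1, 0.7, 0.4, 0.8])
recs = top_n(profile, seen_items=[0], n=2)  # item 0 already interacted
```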

Three datasets are used: MovieLens 1M, MovieLens HetRec, and LastFM. Values that reflect the strength of user preference, such as ratings and play counts, are converted to binary values of 0 and 1.

The model is evaluated using Normalized Discounted Cumulative Gain (NDCG) and Mean Average Precision (MAP), which are calculated for the top 5 or top 20 recommended items. These metrics take into account the relevance of the items as well as their ranking.
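For reference, binary-relevance versions of these per-user metrics can be computed as follows (a minimal sketch; the paper's evaluation code may differ in normalization and tie-breaking details; MAP is the mean of AP over all users):

```python
import numpy as np

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k for one user's ranked item list."""
    dcg = sum(1.0 / np.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

def ap_at_k(ranked, relevant, k):
    """Average precision@k for one user."""
    hits, score = 0, 0.0
    for i, item in enumerate(ranked[:k]):
        if item in relevant:
            hits += 1
            score += hits / (i + 1)  # precision at each hit position
    return score / min(len(relevant), k) if relevant else 0.0
```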

The results of comparing GANMF with existing methods of collaborative filtering on three different datasets are as follows: u in GANMF stands for user-based and i for item-based.

These results show that GANMF outperforms the other methods on all three datasets. In particular, GANMF outperforms CFGAN by 24% on average, even though both use the same approach of generating user profiles with a GAN.

There is little performance difference among ItemKNN, SLIM, and $P^3\alpha$, which rely on item-item similarity, and GANMF achieves on average 10% better performance than these models. These similarity-based models appear to be somewhat inferior to matrix factorization-based methods.

### Importance of Feature Matching in GANMF

The authors investigate the extent to which the feature matching-derived terms incorporated into the Generator's loss in GANMF affect the profiles generated.

The figure below shows the results of generating user profiles with and without feature matching in GANMF and calculating the cosine similarity between each user.

Without feature matching, the cosine similarities are concentrated near 1 on both datasets, meaning the Generator produces nearly identical profiles for all users. This shows that feature matching is essential for generating a unique profile for each user.
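This collapse check can be reproduced in miniature: when all generated profiles are identical, every pairwise cosine similarity is exactly 1 (the profile data here is illustrative):

```python
import numpy as np

def pairwise_cosine(profiles):
    """Cosine similarity between every pair of generated user profiles."""
    unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    return unit @ unit.T

# A collapsed Generator: the same profile repeated for three users.
collapsed = np.tile([0.2, 0.5, 0.3], (3, 1))
sim = pairwise_cosine(collapsed)
```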

## In conclusion

I found the problem setting quite distinctive: selecting which items to recommend by generating a user-conditioned profile with a GAN. Also, since the generated data is the same kind of profile as in CFGAN, I suspect it would still be difficult to handle data with many more items (and hence higher dimensionality), or data that includes rating values.

It seems that the use of GAN for recommendation is still in the development stage, so we expect further progress in the future.
