Balázs Hidasi

Data Mining Researcher

Recommender systems

This page is outdated!

Please refer to the publications page for recent work until the page is updated!

Overview

Recommender systems are information filtering tools that help people cope with information overload by suggesting content or products relevant to their taste. They are becoming more and more popular as the number of available choices keeps growing.
The most important part of a recommender system is the recommender algorithm, which decides whether an item is relevant to a user and provides the personalized recommendation list for them. There are many possible ways to classify recommender algorithms, one of which is whether they use implicit or explicit feedback data. Explicit feedback means that information about the taste of the users is explicitly available, for example in the form of ratings: low ratings mean negative, high ratings positive preference. Implicit feedback, on the other hand, does not explicitly describe user preference. An example of implicit feedback is the purchase history of a user on a website. The presence of an event does not necessarily mean positive preference (e.g. bought it, then got disappointed in it), the absence of an event hardly means negative preference (e.g. never even heard about the product), and there is really no negative feedback available in the data. Explicit feedback is much harder to gather, as in most domains users are not motivated to submit ratings after a purchase.
Although the implicit feedback case is more important in practical applications, its research is lagging way behind the research of explicit feedback based algorithms. There are multiple reasons for that: the implicit feedback case is a much harder problem; the evaluation of implicit feedback based algorithms is not as exact as that of explicit algorithms; starting with the Netflix Prize, higher impact contests usually revolve around explicit data, as they are sponsored by large companies/websites; etc. However, I think that the implicit feedback case is more important, because (a) with the exception of the largest websites, most websites don't have enough explicit feedback for accurate personalized recommendations; (b) the amount of implicit feedback is several orders of magnitude higher than that of explicit feedback; (c) every user provides implicit feedback, while only a small fraction provides explicit feedback. And our goal is to provide relevant recommendations to ALL users.
My research revolves around implicit feedback based algorithms, usually some kind of model-based collaborative filtering method. I also research context-awareness in implicit feedback based algorithms. Context-awareness basically means that in different situations we recommend different items to the same user (e.g. on an IPTV service, family films are recommended on weekend afternoons and horror movies on weekday nights). Including several kinds of context information can greatly increase the accuracy of recommenders. I also do research on transparently including other information (e.g. item metadata, user demographic information, etc.) in the factorization based collaborative filtering framework.

Main research areas

  • Implicit feedback based collaborative filtering algorithms
  • Context-awareness
  • Including additional information in the factorization framework

Solutions

The research is still in its early stages and the publication of the results is lagging behind it. Therefore there is no point in summarizing the methods here at the moment, because most of them are still work in progress (or unpublished) and I can't talk about them. But check back later, as I will update this section once a block of the research is finished!
Meanwhile you can read the draft versions of my published papers on the subject.

Resources

iTALS: context-aware tensor method on implicit feedback (paper)

iTALS is a context-aware recommender algorithm for implicit feedback data. The user-item-context(s) setup is modelled as a binary tensor. Weights are also assigned to the cells based on the certainty of their information. In this paper we derive an ALS-based algorithm that is capable of efficiently factorizing this tensor. Additionally, we introduce a novel kind of context information: sequentiality. This context allows us to incorporate association rule like information into the factorization framework and to differentiate between items with different repetitiveness patterns, and thus to make recommendations more accurate. The paper was presented at ECML/PKDD in September 2012.
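To make the tensor setup more concrete, here is a minimal sketch of how a binary preference tensor and a weight tensor could be built from implicit events, together with the weighted squared error that a factorization would try to minimize. It is only an illustration, not the algorithm from the paper: the weighting rule (1 + alpha * event count), the parameter alpha, the three-way product model and all function names are assumptions made for this example, and no ALS optimization is performed.

```python
import numpy as np

def build_tensors(events, shape, alpha=1.0):
    """Build a binary preference tensor and a weight tensor from implicit events.

    events: iterable of (user, item, context_state) index triples.
    shape:  (n_users, n_items, n_context_states).
    alpha:  hypothetical scaling of the certainty of observed cells (assumption).
    """
    counts = np.zeros(shape)
    for u, i, c in events:
        counts[u, i, c] += 1
    preference = (counts > 0).astype(float)  # 1 where at least one event was observed
    weights = 1.0 + alpha * counts           # cells without events keep the low weight 1
    return preference, weights

def weighted_squared_error(preference, weights, U, I, C):
    """Weighted squared error of an assumed three-way product model.

    The prediction for cell (u, i, c) is sum_k U[u, k] * I[i, k] * C[c, k].
    """
    predicted = np.einsum("uk,ik,ck->uic", U, I, C)
    return float(np.sum(weights * (preference - predicted) ** 2))

# Toy data: user 0 interacted with item 1 twice in context-state 0,
# user 1 interacted with item 0 once in context-state 1.
events = [(0, 1, 0), (0, 1, 0), (1, 0, 1)]
preference, weights = build_tensors(events, shape=(2, 2, 2), alpha=2.0)

# Small random feature matrices (3 latent features) as a starting point.
rng = np.random.default_rng(0)
U, I, C = (rng.normal(scale=0.1, size=(2, 3)) for _ in range(3))
loss = weighted_squared_error(preference, weights, U, I, C)
```

Every cell of the tensor gets a weight, including the ones without events, which reflects the fact that missing events are weak evidence rather than negative feedback.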

iTALS: context-aware tensor method on implicit feedback (slideshow)

The slideshow of the presentation of iTALS at ECML/PKDD 2012.
SlideShare version

iTALS: context-aware tensor method on implicit feedback (poster)

The poster version of the paper about iTALS for ECML/PKDD 2012. It's a great overview of the main properties of the method, but some of the details are not presented on it.

Initialization of matrix factorization (paper)

This paper describes why the initialization of matrix factorization methods is important and presents an interesting initialization method (coined SimFactor). Context-based initialization is introduced as well. Presented at the 2nd workshop on Context-awareness in Retrieval and Recommendations (CaRR 2012).
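As a rough illustration of the idea of similarity-based initialization, the sketch below derives initial item feature vectors from a similarity matrix so that the scalar products of the feature vectors approximate the similarities. This is a generic eigendecomposition approach and not necessarily the SimFactor algorithm from the paper; the item descriptions, the cosine similarity and the function names are assumptions chosen for the example.

```python
import numpy as np

def cosine_similarity_matrix(descriptions):
    """Pairwise cosine similarity between the rows of a description matrix."""
    norms = np.linalg.norm(descriptions, axis=1, keepdims=True)
    unit = descriptions / np.maximum(norms, 1e-12)  # avoid division by zero
    return unit @ unit.T

def init_factors_from_similarity(similarity, n_factors):
    """Return F (n_items x n_factors) such that F @ F.T approximates similarity."""
    eigenvalues, eigenvectors = np.linalg.eigh(similarity)  # symmetric input assumed
    top = np.argsort(eigenvalues)[::-1][:n_factors]         # largest eigenvalues first
    scale = np.sqrt(np.clip(eigenvalues[top], 0.0, None))   # drop tiny negative values
    return eigenvectors[:, top] * scale

# Hypothetical item descriptions, e.g. metadata term weights (one row per item).
descriptions = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.0],
])
S = cosine_similarity_matrix(descriptions)
F = init_factors_from_similarity(S, n_factors=2)
```

The matrix F could then replace the usual random starting point of the item features before training the factorization.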

Initialization of matrix factorization (slideshow)

The slideshow of the presentation at CaRR 2012.
SlideShare version

Initialization of matrix factorization (extended journal version)

This paper is an extended version of the CaRR 2012 paper, published in the J.UCS journal in October 2013. Major updates: (1) Sim2Factor algorithm: initialization from similarity-based similarities; (2) examination of additional similarity functions; (3) additional experiments on new databases.

Context-aware similarities in the factorization framework (paper)

This paper is about an interesting side project of my main research in recommender systems: a preliminary examination of context-aware similarities in the factorization framework. This work is at the intersection of the following areas: (1) implicit feedback based recommendations; (2) context / context-awareness; (3) item-to-item recommendations; (4) matrix / tensor factorization. The aim of this work is to examine whether context can be used to compute more accurate item similarities based on their feature vectors. Two levels of context-aware similarities are introduced: (1) context is only used during training, but not for computing the similarity; (2) context is used during the training and for the similarity computations as well. It was presented at the 3rd workshop on Context-awareness in Retrieval and Recommendations (CaRR 2013).
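The difference between the two levels can be sketched as follows, assuming that item and context-state feature vectors have already been learned by some context-aware factorization. The cosine similarity, the element-wise reweighting of the item features by the context features and all names are assumptions made for this illustration; the paper may define the similarities differently.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denominator = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denominator) if denominator > 0 else 0.0

def item_similarity(item_features, i, j, context_vector=None):
    """Similarity of items i and j based on their feature vectors.

    Level 1: context_vector is None, so context only influenced training.
    Level 2: the item features are reweighted element-wise by the feature
             vector of the current context-state before comparing them.
    """
    a, b = item_features[i], item_features[j]
    if context_vector is not None:
        a, b = a * context_vector, b * context_vector
    return cosine(a, b)

# Hypothetical learned features: 3 items, 2 latent features.
item_features = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
# Hypothetical feature vector of the current context-state.
context_vector = np.array([1.0, 0.0])

level1 = item_similarity(item_features, 0, 1)
level2 = item_similarity(item_features, 0, 1, context_vector)
```

In this toy example the context-state only emphasizes the first latent feature, so items 0 and 1 look more alike in that context than they do in general.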

Context-aware similarities in the factorization framework (slideshow)

The slideshow of the presentation at CaRR 2013.
SlideShare version

Approximate modeling of continuous context dimensions in factorization algorithms (paper)

This paper describes two ways to model continuous context dimensions in any factorization algorithm. Contrary to the currently applied approach, these methods don't suffer from the rigidness of context-states or from disregarding the order of context-states. The experimental results of these approaches are rather outstanding. Presented at the 4th workshop on Context-awareness in Retrieval and Recommendations (CaRR 2014).
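The rigidness problem can be illustrated with time of day as a continuous context dimension. The sketch below contrasts a hard split into context-states with a simple smoothed alternative that shares the weight of an event between the two nearest context-states. It is not one of the two methods from the paper, only a generic example; the number of states, the linear weighting and the function names are assumptions.

```python
import numpy as np

def hard_context_state(hour, n_states=4):
    """Rigid approach: every hour belongs to exactly one context-state."""
    return int(hour // (24 / n_states)) % n_states

def soft_context_weights(hour, n_states=4):
    """Smoothed approach: split the weight linearly between the two nearest
    context-state centres, treating the day as circular."""
    width = 24 / n_states
    position = (hour / width - 0.5) % n_states  # position measured in state centres
    lower = int(np.floor(position)) % n_states
    upper = (lower + 1) % n_states
    fraction = position - np.floor(position)
    weights = np.zeros(n_states)
    weights[lower] += 1.0 - fraction
    weights[upper] += fraction
    return weights

# Two events only 12 minutes apart fall into different rigid context-states...
before, after = hard_context_state(5.9), hard_context_state(6.1)
# ...while their smoothed weights are almost identical.
soft_before, soft_after = soft_context_weights(5.9), soft_context_weights(6.1)
```

With the hard split, the model has no notion that neighbouring context-states are related; the smoothed weights keep that ordering information.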

Approximate modeling of continuous context dimensions in factorization algorithms (slideshow)

The slideshow of the presentation at CaRR 2014.
SlideShare version
