Semi-supervised Learning for Stream Recommender Systems

Recommender systems suffer from extreme data sparsity, which results from the large number of items and the limited capacity of users to perceive them: a single user can rate only a small fraction of the items. Consequently, there is plenty of unlabelled information that semi-supervised methods can leverage. We propose the first semi-supervised framework for stream recommender systems that leverages this information incrementally on a stream of ratings. We design several novel components, such as a sensitivity-based reliability measure, and extend a state-of-the-art matrix factorization algorithm with the ability to grow the dimensions of the matrix incrementally as new users and items appear in the stream. We show that our framework improves the quality of recommendations at nearly all time points in a stream.
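
The idea of growing the factor matrices as new users and items arrive can be sketched as follows. This is a minimal illustration, not the algorithm from the paper: the class name GrowingMF, the plain SGD update, and all hyperparameters (k, lr, reg) are assumptions chosen for the example.

```python
import numpy as np


class GrowingMF:
    """Illustrative incremental matrix factorization (hypothetical sketch).

    The user and item factor matrices start empty and gain one row
    whenever a previously unseen user or item appears in the stream.
    """

    def __init__(self, k=8, lr=0.05, reg=0.01, seed=0):
        self.k, self.lr, self.reg = k, lr, reg
        self.rng = np.random.default_rng(seed)
        self.P = np.empty((0, k))  # user factors, one row per known user
        self.Q = np.empty((0, k))  # item factors, one row per known item
        self.users = {}            # user id -> row index in P
        self.items = {}            # item id -> row index in Q

    def _row(self, index, key, matrix_name):
        # Append a small random row the first time a key is seen.
        if key not in index:
            index[key] = len(index)
            new_row = self.rng.normal(0.0, 0.1, size=(1, self.k))
            grown = np.vstack([getattr(self, matrix_name), new_row])
            setattr(self, matrix_name, grown)
        return index[key]

    def predict(self, user, item):
        # No prediction is possible for an unknown user or item.
        if user not in self.users or item not in self.items:
            return None
        return float(self.P[self.users[user]] @ self.Q[self.items[item]])

    def update(self, user, item, rating):
        # One regularized SGD step on a single (user, item, rating) triple.
        u = self._row(self.users, user, "P")
        i = self._row(self.items, item, "Q")
        p, q = self.P[u].copy(), self.Q[i].copy()
        err = rating - p @ q
        self.P[u] += self.lr * (err * q - self.reg * p)
        self.Q[i] += self.lr * (err * p - self.reg * q)


model = GrowingMF()
for user, item, rating in [("u1", "i1", 5.0), ("u2", "i1", 3.0), ("u1", "i2", 4.0)]:
    model.update(user, item, rating)
print(model.P.shape, model.Q.shape)
```

Stacking a new row on every first occurrence keeps the sketch short; a practical implementation would more likely preallocate capacity and grow in blocks to avoid repeated copying.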

Forgetting Methods for Incremental Matrix Factorization in Recommender Systems

Numerous stream mining algorithms are equipped with forgetting mechanisms, such as sliding windows or fading factors, that make them adaptive to changes. In recommender systems, these techniques have not been investigated thoroughly, despite the highly volatile nature of the users’ preferences they deal with. We developed five new forgetting techniques for incremental matrix factorization in recommender systems […]
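
The two generic mechanisms named above, sliding windows and fading factors, can be sketched in a few lines. This illustrates the general stream mining idea only; it is not one of the five techniques from the paper, and the function names sliding_window and faded_mean are assumptions made for this example.

```python
from collections import deque


def sliding_window(stream, size):
    """Yield, after each new observation, the `size` most recent observations."""
    window = deque(maxlen=size)  # the oldest element is dropped automatically
    for x in stream:
        window.append(x)
        yield list(window)


def faded_mean(stream, alpha):
    """Yield a running mean in which older observations lose weight.

    Each step multiplies the accumulated sum and weight by the fading
    factor `alpha` (0 < alpha <= 1), so an observation that is t steps
    old contributes with weight alpha ** t.
    """
    weighted_sum = 0.0
    weight = 0.0
    for x in stream:
        weighted_sum = alpha * weighted_sum + x
        weight = alpha * weight + 1.0
        yield weighted_sum / weight


ratings = [1.0, 1.0, 5.0, 5.0]
print(list(sliding_window(ratings, 2))[-1])
print(list(faded_mean(ratings, 0.5))[-1])
```

A window forgets abruptly, since everything older than `size` observations is discarded, whereas a fading factor forgets gradually; with alpha = 1 the faded mean reduces to the ordinary running mean.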

Selective Forgetting for Incremental Matrix Factorization in Recommender Systems

Recommender systems are used to build models of users’ preferences. Those models should reflect the current state of the preferences at any point in time. The preferences, however, are not static. They are subject to concept drift or even concept shift, as known from, e.g., stream mining: they change continually as the taste of users and their perception of items evolve over time. Therefore, it is crucial to train models on current data and to forget outdated data.
The problem of selective forgetting in recommender systems has not been addressed so far. Therefore, we propose two forgetting techniques for incremental matrix factorization and incorporate them into a stream recommender […]

Hoeffding-CF: Neighbourhood-Based Recommendations on Reliably Similar Users

Neighbourhood-based collaborative filtering recommenders exploit the common ratings among users to identify a user’s most similar neighbours. It is known that decisions based on a naive computation of user similarity are unreliable, because the number of co-ratings varies strongly among users. In this paper, we formalize the notion of reliable similarity between two users and propose a method that constructs a user’s neighbourhood by selecting only those users that are reliably similar to her. Our method combines a statistical test and the notion of a baseline user. We report our results on typical benchmark datasets.

Correcting the Usage of the Hoeffding Inequality in Stream Mining

Many stream classification algorithms use the Hoeffding Inequality [6] to identify the best split attribute during tree induction. We show that the prerequisites of the Inequality are violated by these algorithms, and we propose corrective steps. The new stream classification core, correctedVFDT, satisfies the prerequisites of the Hoeffding Inequality and thus provides the expected performance guarantees […]
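
For orientation, the bound that Hoeffding-based tree learners conventionally compute can be sketched as below. This shows the standard form of the bound only, epsilon = sqrt(R^2 ln(1/delta) / (2n)); it does not reproduce the corrective steps or correctedVFDT, and the function and parameter names are assumptions made for this example.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound for the mean of n bounded observations.

    With probability at least 1 - delta, the true mean of a random
    variable with range `value_range` lies within the returned epsilon
    of the mean observed over n independent samples.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


# The bound tightens as more examples are observed.
for n in (100, 1000, 10000):
    print(n, hoeffding_bound(1.0, 0.05, n))
```

In the conventional usage, a split is made once the observed difference between the two best attributes exceeds epsilon. The bound presupposes independent observations of a bounded variable whose mean is being estimated.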
