Recommender systems suffer from extreme data sparsity, which results from a large number of items and the limited capacity of users to perceive them. A single user can rate only a small fraction of the items. Consequently, there is plenty of unlabelled information that semi-supervised methods can leverage. We propose the first semi-supervised framework for stream recommender systems that can leverage this information incrementally on a stream of ratings. We design several novel components, such as a sensitivity-based reliability measure, and extend a state-of-the-art matrix factorization algorithm so that it can grow the dimensions of a matrix incrementally as new users and items occur in a stream. We show that our framework improves the quality of recommendations at nearly all time points in a stream.
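To illustrate what growing the dimensions of a matrix on a stream can look like, here is a minimal sketch of incremental matrix factorization trained by stochastic gradient descent. It is not the algorithm from the paper and leaves out the semi-supervised components entirely. The class name `IncrementalMF`, the dictionary-based storage, and all hyperparameter values are assumptions made for this example.

```python
import random


class IncrementalMF:
    """Toy incremental matrix factorization (illustrative sketch only)."""

    def __init__(self, k=8, lr=0.05, reg=0.02, seed=0):
        self.k, self.lr, self.reg = k, lr, reg
        self.rng = random.Random(seed)
        self.users = {}  # user id -> latent vector (one row of the user matrix)
        self.items = {}  # item id -> latent vector (one row of the item matrix)

    def _vector(self, table, key):
        # A previously unseen user or item gets a fresh latent vector,
        # which is how the factor matrices grow as the stream progresses.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]
        return table[key]

    def predict(self, user, item):
        p = self._vector(self.users, user)
        q = self._vector(self.items, item)
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, rating):
        # One SGD step on a single rating from the stream.
        p = self._vector(self.users, user)
        q = self._vector(self.items, item)
        err = rating - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        return err
```

Each call to `update` processes one rating, so the model never needs the full rating matrix in memory, and no retraining is required when a new user or item first appears.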
In October 2015 I presented my paper on semi-supervised learning in recommenders at the Discovery Science conference in Banff, Canada. Here are slides from my talk.
Numerous stream mining algorithms are equipped with forgetting mechanisms, such as sliding windows or fading factors, that make them adaptive to changes. In recommender systems, these techniques have not been investigated thoroughly, despite the highly volatile nature of the user preferences they deal with. We developed five new forgetting techniques for incremental matrix factorization in recommender systems […]
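The two generic mechanisms named above, sliding windows and fading factors, can be sketched as follows. This shows only the general idea applied to a running mean of ratings; it is not one of the five techniques from the paper. The class names and the parameters `size` and `alpha` are assumptions made for this example.

```python
from collections import deque


class SlidingWindowMean:
    """Mean over the most recent `size` ratings; older ratings are dropped."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # deque discards the oldest entry itself

    def add(self, rating):
        self.window.append(rating)

    def mean(self):
        return sum(self.window) / len(self.window)


class FadingMean:
    """Mean in which every older rating is down-weighted by `alpha` per step."""

    def __init__(self, alpha):
        self.alpha = alpha  # 0 < alpha <= 1; alpha = 1 means no forgetting
        self.weighted_sum = 0.0
        self.weight = 0.0

    def add(self, rating):
        self.weighted_sum = self.alpha * self.weighted_sum + rating
        self.weight = self.alpha * self.weight + 1.0

    def mean(self):
        return self.weighted_sum / self.weight
```

A sliding window forgets abruptly once a rating leaves the window, while a fading factor forgets gradually and never discards a rating completely.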
Recommender systems are used to build models of users’ preferences. Those models should reflect the current state of the preferences at any point in time. The preferences, however, are not static. They are subject to concept drift or even shift, as is known from, e.g., stream mining. They change continuously as the taste of users and the perception of items change over time. Therefore, it is crucial to select current data for training models and to forget outdated data.
The problem of selective forgetting in recommender systems has not been addressed so far. Therefore, we propose two forgetting techniques for incremental matrix factorization and incorporate them into a stream recommender […]
Neighbourhood-based collaborative filtering recommenders exploit the common ratings among users to identify a user’s most similar neighbours. It is known that decisions made on a naive computation of user similarity are unreliable, because the number of co-ratings varies strongly among users. In this paper, we formalize the notion of reliable similarity between two users and propose a method that constructs a user’s neighbourhood by selecting only those users that are reliably similar to her. Our method combines a statistical test and the notion of a baseline user. We report our results on typical benchmark datasets.
Many stream classification algorithms use the Hoeffding Inequality to identify the best split attribute during tree induction. We show that the prerequisites of the Inequality are violated by these algorithms, and we propose corrective steps. The new stream classification core, correctedVFDT, satisfies the prerequisites of the Hoeffding Inequality and thus provides the expected performance guarantees […]
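For context, the textbook Hoeffding bound, as it is commonly applied to split decisions in Hoeffding-tree learners, can be sketched as follows. This is the standard formula only and does not implement the corrective steps or correctedVFDT. The function names `hoeffding_bound` and `can_split` and the example values are assumptions made for this illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability at least 1 - delta, the mean of n
    independent observations with range `value_range` lies within epsilon of
    the true mean."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain, second_gain, value_range, delta, n):
    """Common split rule: split once the observed advantage of the best
    attribute over the runner-up exceeds the bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical example values: 1000 instances seen at a leaf, delta = 1e-7.
epsilon = hoeffding_bound(1.0, 1e-7, 1000)
decision = can_split(0.30, 0.15, 1.0, 1e-7, 1000)
```

The bound shrinks in proportion to 1/sqrt(n), so a leaf that has seen more instances can separate attributes whose quality scores are closer together.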