This is one of those answers that keep me going to check Quora from time to time. Not just because it’s totally true.
NewScientist reports on April 30th that Futureful, a Finnish start-up, is building a predictive iPad-based search engine that will use a recommender system. By harvesting information from social feeds such as Facebook and Twitter, its algorithm takes the topics that are trending, analyses the users’ interests and behaviour, and recommends new topics that might interest them.
Eric Schmidt is also quoted as having said “The ability to tell me things I didn’t know but am probably very interested in is the next great stage of search”.
I am possibly cynical about this topic and have blogged extensively (Who wants to be recommended?, May 2009) about the problem of appropriate recommendations and the ability of such systems to surprise.
The problems I see relate to how you are supposed to evaluate a system whose task is to generate surprising recommendations. Especially in academic research, the success of a recommendation engine is traditionally evaluated using a very simple metric: take a list of users’ choices in a given domain, hide a number of entries, and check whether the recommender system returns them after analysing the remaining ones. This is straightforward, although several other metrics have been proposed.
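The hide-and-check evaluation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the co-occurrence recommender, the function names `recommend` and `holdout_recall`, and the toy histories are all made up here, and real studies use richer models and metrics.

```python
from collections import Counter

def recommend(visible, other_histories, k=3):
    """Toy co-occurrence recommender (an assumption, not a real system):
    score unseen items by how much their owners' histories overlap
    with the items the user is known to have chosen."""
    scores = Counter()
    for history in other_histories:
        overlap = len(visible & history)
        if overlap == 0:
            continue
        for item in history - visible:
            scores[item] += overlap
    # Highest score first; ties are broken by name to stay deterministic.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]

def holdout_recall(user_history, hidden, other_histories, k=3):
    """Hide some of the user's choices and return the fraction of them
    that the recommender gives back in its top k suggestions."""
    visible = user_history - hidden
    suggestions = set(recommend(visible, other_histories, k))
    return len(suggestions & hidden) / len(hidden)

# Hypothetical histories of other users.
others = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"B", "C", "E"},
    {"E", "F"},
]

hit = holdout_recall({"A", "B", "C"}, {"C"}, others, k=1)   # hidden item is returned
miss = holdout_recall({"A", "B", "D"}, {"D"}, others, k=1)  # hidden item is not returned
```

Note that the score only rewards giving back what the user already chose, which is exactly why it says nothing about surprise.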
Now, how are you supposed to evaluate a system that doesn’t have a reference list? We can surely think of many metrics, some of them quantitative, some of them qualitative (or even social-based):
- the probability a user follows the suggested link
- the strength of the trust feeling towards the recommender
- the fact that a user suggests the recommender system to other users …
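The first of these metrics is the easiest to make quantitative. A minimal sketch, assuming a hypothetical log of (recommendation, followed) pairs; the log format and the name `follow_rate` are invented for illustration:

```python
def follow_rate(events):
    """Share of shown recommendations that the user actually followed.
    `events` is a list of (recommendation_id, followed) pairs."""
    if not events:
        return None  # nothing was shown, so the rate is undefined
    followed = sum(1 for _, was_followed in events if was_followed)
    return followed / len(events)

# Hypothetical log: four suggestions shown, two of them followed.
log = [("r1", True), ("r2", False), ("r3", False), ("r4", True)]
rate = follow_rate(log)
```

The number is easy to compute, but it cannot tell a relevant suggestion from a link followed at random.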
However, a metric needs to be meaningful, and qualitative metrics often are not. If I’m a user and I want to be surprised, I will probably follow any random link. I often do that in what I call my serendipitous Wikipedia crawls. My favourite recommender system is, above all, Twitter: I only follow people who make me learn something interesting. Not one of the people that Twitter’s “Who to follow” system recommended to me was relevant to me.
So I am a bit confused: what exactly is a predictive search engine really trying to achieve?
I’ve recently finished reading Hofstadter’s “Gödel, Escher, Bach”, after three years and a number of failed attempts and restarts. Of the main topics it touches, I found its approach to the problem of automatic natural language understanding and generation particularly interesting. And I feel that this problem is intrinsically related to that of generating recommendations for users (ok, this is not a great discovery, I must admit).
The problem can be put simply as follows. Imagine we have a language generator we can ask to create sentences. We could:
- ask it to create correct sentences (i.e. grammatically correct sentences – this is somewhat possible)
- ask it to create meaningful sentences
- ask it to create funny sentences
The three requests above carry different attributes, whose meaning is open to discussion. As you can imagine, funny implies meaningful and correct, and meaningful implies correct, which means that generating such sentences is increasingly hard. Moreover, while everyone can, within certain boundaries, generate a correct sentence, it is much less clear what characterises a meaningful sentence (e.g. what is meaningful to me might not be meaningful to you), and a funny sentence needs its real, underlying meaning to differ from its apparent meaning. You can also notice that the attribution of such attributes to a correct sentence is increasingly personal, too. The attribution of meaning is an intrinsically human activity, and this is well known to programming language developers and logicians, who deal with concepts such as syntax and semantics.
How all of this relates to the field of recommender systems should be obvious by now. An RS is a tool that, more or less, tries to understand what is meaningful to a user in order to provide him or her with suggestions. What a general-purpose RS should do is understand the meaning of objects and find similar objects. The thing is, the meaning of objects, especially when expressed in natural language, is not easy to establish, and in general cannot be established at all.
I recently reviewed a paper for a friend doing research in RS that reported an example similar to this: “I’m at home, and would like to get a restaurant in Angel Islington for tonight”. Contextual information (and the subsequent inference of activity and intent) is the interesting part of this request for a recommendation: it does not matter where I am now, but where I would like to go. This is a very simple issue to deal with, but how about all those situations in which context is implicit?
You will object that a general-purpose RS cannot exist and wouldn’t be that useful. The truth is, however, that even a limited-domain RS, such as one for books or DVDs, may encounter similar problems. I’ve been discussing the possibility of a “surprise me” button, proposed by Daniele Quercia. The idea is that sometimes, as a user, I would like to be offered something new rather than something similar to what I’ve done in the past or to what my friends like. But this concept raises a very deep question about how far a surprise should go. In other words: it’s not possible to understand what kind of recommendation the user would like to receive. What an RS can do is detect users’ habits or activities, and always provide a suggestion based on similarity.
So here’s my view of the limitation of current RS: they cannot, as of today, provide a recommendation to a user who likes to try something new. RS are for habitués.
A stupid example: I’ve read four books in a row by the English author Jonathan Coe. After that, Amazon kept recommending other books by Coe to me, whilst of course I wanted a break from them.
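To make the “how far should a surprise go” question concrete, here is a minimal sketch of a surprise knob. Everything in it is an assumption made for illustration: the Jaccard similarity over tags, the `surprise` parameter, and the tiny catalogue are not taken from any real system or from the proposal mentioned above.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank(profile_tags, catalogue, surprise=0.0):
    """Rank catalogue items for a user profile.
    surprise=0.0 favours the most similar items,
    surprise=1.0 favours the least similar ones."""
    def score(tags):
        sim = jaccard(profile_tags, tags)
        return (1 - surprise) * sim + surprise * (1 - sim)
    # Highest score first; ties are broken by name to stay deterministic.
    return sorted(catalogue, key=lambda name: (-score(catalogue[name]), name))

# Hypothetical profile and catalogue.
profile = {"british", "satire", "novel"}
catalogue = {
    "another satirical novel": {"british", "satire", "novel"},
    "travel memoir": {"british", "memoir"},
    "astronomy primer": {"science", "astronomy"},
}

habit = rank(profile, catalogue, surprise=0.0)
novelty = rank(profile, catalogue, surprise=1.0)
```

The knob only inverts the similarity ordering. It cannot say which dissimilar item would be a pleasant surprise rather than an irrelevant one, and the right value of `surprise` for a given user and moment remains a guess.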
Any objections? E.g.:
– meaning in current RS is not expressed by natural language: true, but nonetheless this is a limitation of the systems themselves. The result is that they cannot give suggestions other than those based on the stored values. For example, “rate your liking of the book from 1 to 5” will never be able to express whether the user would actually like to read it again, whether he or she would recommend it to others, or if it. Structured representation does not capture real meaning, and restricts the gamut of representable information about the user.
– no RS is general purpose: I think even limited domain RS suffer from the same problem, as no RS can infer a user’s feelings.
I’m not proposing silver bullets here, and of course not all research and applications in RS are to be trashed. Some possible research and development directions may be:
– use direct social suggestions: to whom would you suggest it? (similar to direct invitation in Facebook – where nonetheless all the limitations of this approach are evident)
– deal with changes in user tastes and try to predict them
– use more contextual information
– try inference from natural language, for example inferring a user’s tastes from his or her long reviews
– better user profiling based on psychological notions and time-variance: TweetPsych has for example tried profiling a user based on tweets, which are short and scattered across time.
There’s a lot of ongoing research on recommender systems, fostered by the Netflix Prize.
Recommender systems are basically software tools that offer users suggestions in a given domain. Usually they are specialised: Amazon’s recommender system recommends books, last.fm’s recommends songs, and the like.
Recommendation can rely on different signals. I may be suggested things similar to things I previously chose, or things my friends like. There’s a whole theory behind this, so I won’t bore you. To know more, use this site as a starting point.
My problem with RS is that of this post’s title: who wants/needs recommendation? Is it always true that I like the same kind of things? Surely, I’m a good counterexample to this. I love Star Trek. I have watched every single episode and would like to watch them all again. Nonetheless, I hate Star Wars. I find it boring. I don’t like sci-fi in general. No Terminator, no Robocop. I can’t even name other non-Trek sci-fi. So my hypothetical RS should know that I don’t like every kind of sci-fi film, but only Star Trek. Maybe my friends share this view (but as far as I know, no one really does), so it could try checking my friends’ profiles first.
If you take a look at my music library (or simply explore my Last.fm profile), you could call it eclectic, at the very least. Some would say it’s schizoid.
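Checking friends’ profiles can be sketched as a simple vote count. This is a minimal illustration, assuming made-up friend names and an invented helper `friends_recommend`. It is not how any particular service works.

```python
from collections import Counter

def friends_recommend(my_items, friends_items, k=2):
    """Suggest the items most of my friends like that I have not chosen yet.
    `friends_items` maps a friend's name to the set of items they like."""
    counts = Counter()
    for items in friends_items.values():
        for item in items - my_items:
            counts[item] += 1
    # Most popular among friends first; ties are broken by name.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return ranked[:k]

# Hypothetical profiles.
mine = {"Star Trek"}
friends = {
    "ann": {"Star Trek", "Star Wars"},
    "bob": {"Star Wars", "Robocop"},
    "cat": {"Star Wars", "Terminator"},
}

suggested = friends_recommend(mine, friends)
```

With these profiles the top suggestion is Star Wars, which is precisely the recommendation I would not want. A count like this has no notion of why I like Star Trek.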
Moreover, sometimes I might want to do different things from those of my friends. Negative recommendation could be part of the solution, but the underlying algorithm would just be the same.
So what would represent a good recommendation to me? Well, usually what is important to me is surprise. I like many different things. The parameters that determine what I like may be originality, quality, …, but maybe they are simply unknown. Some people suggested a “Surprise me” button to accomplish this task. But it’s not that easy, even if I know what I don’t like.
Hence, the final questions: how can I represent the tastes of a user? How can I represent his or her reactions (or feelings) towards something he or she expects or does not? How can I represent what I would like recommendations on, and what I wouldn’t?
Stay tuned to the RecSys conferences to see if someone comes up with an answer; my guess is that we’ll be seeing lots and lots of new recommender systems in the next few years, and each one will be confronted with these issues.