I’ve recently finished reading Hofstadter’s “Gödel, Escher, Bach”, after three years and a number of failed attempts and restarts. Of the main topics it touches, the one I found most interesting is its approach to the problem of automatic understanding and generation of natural language. And I feel that this problem is intrinsically related to that of generating recommendations for users (ok, this is not a great discovery, I must admit).
The problem can be put simply as follows. Imagine we have a language generator we can ask to create sentences. We could:
- ask it to create correct sentences (i.e. grammatically correct sentences – this is somewhat possible)
- ask it to create meaningful sentences
- ask it to create funny sentences
The three requests above rely on different attributes, and what each attribute means is itself open to discussion. As you can imagine, funny implies meaningful and correct, and meaningful implies correct, which means that generating such sentences gets increasingly hard. Moreover, while everyone can, within certain boundaries, generate a correct sentence, it is much less clear what characterises a meaningful sentence (e.g. what is meaningful to me may not be meaningful to you), and a funny sentence needs its real, underlying meaning to differ from its apparent meaning. You can also notice that assigning these attributes to a correct sentence is increasingly personal, too. The attribution of meaning is an intrinsically human activity, and this is well known to programming language developers and logicians, who deal with concepts such as syntax and semantics.
How all of this relates to the field of recommender systems (RS) should be obvious by now. An RS is a tool that, more or less, tries to understand what is meaningful to a user in order to provide him or her with suggestions. What a general purpose RS should do is understand the meaning of objects and find similar objects. The thing is, the meaning of objects, especially when expressed in natural language, is not easy to establish, and in general cannot be established at all.
I recently reviewed a paper for a friend doing research in RS that reported an example similar to this: “I’m at home, and would like to get a restaurant in Angel Islington for tonight”. Contextual information (and the subsequent inference of activity and intent) is the interesting part of this request for a recommendation: it does not matter where I am now, but where I would like to go. This is a very simple issue to deal with, but how about all those situations in which context is implicit?
You will object that a general purpose RS cannot exist and wouldn’t be that useful. The truth is, however, that even a limited domain RS, such as one for books or DVDs, may encounter similar problems. I’ve been discussing the possibility of a “surprise me” button, proposed by Daniele Quercia. The idea is that sometimes, as a user, I would like to be offered something new rather than something similar to what I’ve done in the past or to what my friends like. But this concept opens a very deep issue: how far should a surprise go? In other words: it’s not possible to understand what kind of recommendation the user would like to receive. What an RS can do is detect users’ habits or activities, and always provide a similarity-based suggestion.
So here’s my view of the limitation of current RS: they cannot – as of today – provide a recommendation to a user who likes to try something new. RS are for habitués.
A stupid example: I’ve read four books in a row by the English author Jonathan Coe. After that, Amazon kept on recommending me other books by Coe, whilst of course I wanted a break from them.
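To make the Coe example concrete, here is a minimal sketch of a similarity-based recommender next to a naive “surprise me” function. Everything in it is an assumption made for illustration: the catalogue, the tags, the item names and the use of Jaccard similarity are invented, and this is not how Amazon or any real RS works.

```python
# Hypothetical catalogue: item -> descriptive tags (all invented for illustration).
CATALOGUE = {
    "coe_novel_1": {"author:coe", "english", "satire"},
    "coe_novel_2": {"author:coe", "english", "satire"},
    "coe_novel_3": {"author:coe", "english", "family-saga"},
    "coe_novel_4": {"author:coe", "english", "satire"},
    "coe_novel_5": {"author:coe", "english", "satire"},
    "english_satire_other": {"author:other_a", "english", "satire"},
    "space_opera": {"author:other_b", "science-fiction", "translated"},
}


def jaccard(a, b):
    """Jaccard similarity of two tag sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def profile(history):
    """Build a user profile as the union of the tags of every item read."""
    tags = set()
    for item in history:
        tags |= CATALOGUE[item]
    return tags


def unread(history):
    """Items in the catalogue that the user has not read yet."""
    return [item for item in CATALOGUE if item not in history]


def recommend_similar(history):
    """Habit-driven suggestion: the unread item closest to the profile."""
    prof = profile(history)
    return max(unread(history), key=lambda item: jaccard(prof, CATALOGUE[item]))


def surprise_me(history):
    """Naive 'surprise': the unread item farthest from the profile."""
    prof = profile(history)
    return min(unread(history), key=lambda item: jaccard(prof, CATALOGUE[item]))


# Four Coe books in a row, as in the example above.
history = ["coe_novel_1", "coe_novel_2", "coe_novel_3", "coe_novel_4"]
print(recommend_similar(history))
print(surprise_me(history))
```

With this toy data the similarity function can only hand me yet another Coe, while the “surprise” is simply whatever is least like my history. Choosing the least similar item is an arbitrary definition of surprise, and that is exactly the open issue: nothing in the data says whether I want a small detour or a complete change.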
Any objections? E.g.:
– meaning in current RS is not expressed by natural language: true, but nonetheless this is a limitation of the systems themselves. The result is that they cannot give suggestions other than those based on the stored values. For example, “rate your liking of the book from 1 to 5” will never be able to express whether the user would actually like to read it again, whether he or she would recommend it to others, or if it. Structured representation does not capture real meaning, and restricts the gamut of representable information about the user.
– no RS is general purpose: I think even limited domain RS suffer from the same problem, as no RS can infer a user’s feelings.
I’m not proposing silver bullets here, and of course not all research and applications in RS are to be trashed. Some possible research and development directions may be:
– use direct social suggestions: to whom would you suggest it? (similar to direct invitations in Facebook – where nonetheless all the limitations of this approach are evident)
– deal with changes in user tastes and try to predict them
– use more contextual information
– try inference from natural language, for example inferring a user’s tastes from his or her long reviews
– better user profiling based on psychological notions and time-variance: TweetPsych, for example, has tried profiling a user based on tweets, which are short and scattered across time.