[This post was originally written in Italian]
UPDATE 2 (14/6/2010):
A page called “Caccia al tesoro” (“Treasure hunt”) already appears on the web archive in January 2006: http://web.archive.org/web/20060112195056/http://www.danieleluttazzi.it/?q=node/144. So it existed in January 2006. It was indexed about 2 months after its creation.
Note one detail: node=144 instead of node=285, with a fundamentally different URL format. In other words, the CMS was changed.
This clearly takes nothing away from the arguments about plagiarism, copying, and so on, but at least it deflates the conspiracy accusation, which I found annoying (and not useful to the “moral” purpose of the discussion, which is to establish whether and how far it is legitimate to “copy”/“quote”, with or without attribution). There was no backdating, at least for this post: it already existed in 2005.
UPDATE (14/6/2010):
- For the sake of fairness, the owner of the ntvox blog, the first to write about this affair, asked me to point out that the web.archive.org issue is not the key topic of his blog, which is more interested in the general discussion of whether copying jokes is legitimate, and in the sheer number of jokes apparently copied by Luttazzi. Although the argument is mentioned on the blog, it is true that it is not its central point.
- Just to repeat it ad nauseam: I am still forming a position on the whole matter, and that position is obviously personal. This, however, is a technical blog, and this post deals only with the technical aspects of a piece of evidence that has been used, in my opinion, in a technically wrong way. It is not meant to point to other evidence, real or alleged. There would be plenty to discuss about what counts as a clue and what counts as irrefutable proof, as well as about the technical requirements for something to be admitted as “evidence”. In this post I focus on why this specific item cannot be admitted as evidence, for lack of technical requirements. Full stop.
(end)
I won’t hide it: until yesterday morning I “was” a fan of Daniele Luttazzi.
After reading the news about the alleged “plagiarism” I became a disappointed ex-fan.
And yet something pushed me to check the information reported, in particular what is considered the “smoking gun” proving the bad faith of the comedian from Romagna.
I believe there are purely technical reasons that instead support his good faith, or at least show that the evidence brought against him is, at best, inconclusive.
A premise: I work in IT, I deal with the internet and networking, and I have some personal experience of running web sites.
The accusation: Luttazzi allegedly copied jokes from famous satirical authors and, to avoid being exposed as a plagiarist, wrote two posts on his blog inviting readers to a “treasure hunt for quotations”, backdating these two posts so as not to arouse “suspicion”.
Exhibits for the prosecution: the two posts in question can be retrieved from Luttazzi’s blog and are:
- http://www.danieleluttazzi.it/node/285 dated 9 June 2005
- http://www.danieleluttazzi.it/node/324 dated 10 January 2006
Evidence for the prosecution: the web site http://web.archive.org. This site lets you retrieve earlier versions of a web page. Searching web.archive.org for the two pages in question returns their supposed “creation dates”:
- for post 285, that date would be 9 October 2007 (over 2 years after the date given by Luttazzi)
- for post 324, that date would be 13 December 2007 (a little under 2 years after the date given by Luttazzi)
From a technical point of view, though, those two dates are misleading.
What the accusers miss is a small technical detail: the date reported by web.archive.org is NOT the creation date of the page. It is the date on which the page was first reached by web.archive.org’s “robots”. If a web page is created today, it will take some time, shorter or longer, before web.archive.org “finds” it. That can indeed take years.
One might therefore ask whether two years is a reasonable time for indexing a popular site like Luttazzi’s. Strictly speaking, there can be no certain answer. Statistically speaking, however, we have fairly serious clues that the posts were not backdated by Luttazzi. Just take a few pages at random from the blog and compare the date shown on the blog with the date on the web archive:
- http://www.danieleluttazzi.it/node/277, blog date: 3 April 2007, NEVER archived on the web archive (would that perhaps prove the page does not exist at all?)
- http://www.danieleluttazzi.it/node/286, blog date: 10 January 2006, first web archive date: 9 October 2007
- http://www.danieleluttazzi.it/node/289, blog date: 1 November 2006, first web archive date: 9 October 2007
- http://www.danieleluttazzi.it/node/291, blog date: 14 March 2007, first web archive date: 9 October 2007 (for this page a change dated 2 August 2008 is also reported, proof that from 2007 onwards Luttazzi’s site has been followed regularly by the web archive)
Note that many of these dates fall in October 2007. In fact, on the very same day: the 9th. The same date as the allegedly incriminating post. The reason? The whole site was indexed starting from October 2007. Before that it was not on web.archive.org.
As further confirmation, just look at http://web.archive.org/web/*/danieleluttazzi.it/* This page lists ALL the pages of danieleluttazzi.it held by web.archive.org. It is easy to verify that until 9 October 2007 the site was NOT indexed. So much so that on that date literally hundreds of pages were added to web.archive.org.
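The gap between a page's stated date and its first capture can be worked out mechanically. Here is a minimal Python sketch, using only the dates quoted in this post; the function names `capture_date` and `archive_lag_days` are my own, and the only assumption about the archive is that its URLs embed a 14-digit capture timestamp (YYYYMMDDhhmmss), as the link in the update at the top does.

```python
import re
from datetime import date, datetime

# A Wayback Machine URL embeds the capture time as 14 digits: YYYYMMDDhhmmss.
_CAPTURE_RE = re.compile(r"/web/(\d{14})/")


def capture_date(wayback_url: str) -> date:
    """Return the capture date encoded in a Wayback Machine URL."""
    match = _CAPTURE_RE.search(wayback_url)
    if match is None:
        raise ValueError(f"no capture timestamp in {wayback_url!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S").date()


def archive_lag_days(blog_date: date, first_capture: date) -> int:
    """Days between the date shown on the blog and the first archive capture."""
    return (first_capture - blog_date).days


# (blog date, first web archive date) exactly as listed in this post.
PAGES = {
    "node/286": (date(2006, 1, 10), date(2007, 10, 9)),
    "node/289": (date(2006, 11, 1), date(2007, 10, 9)),
    "node/291": (date(2007, 3, 14), date(2007, 10, 9)),
}

lags = {page: archive_lag_days(blog, first) for page, (blog, first) in PAGES.items()}
first_captures = {first for _, first in PAGES.values()}

for page, days in lags.items():
    print(f"{page}: first archived {days} days after its blog date")
print("distinct first-capture dates:", sorted(first_captures))
```

The lags differ from page to page while the first-capture date is identical, which is what you would expect from a crawler that discovered the whole site in one go rather than from pages created on that day.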
The same holds for other blogs.
Take, for example, another very well-known comedian, Beppe Grillo:
- http://www.beppegrillo.it/2005/01/il_papa_e_infal.html blog date: 31 January 2005, first web archive date: 7 February 2006 (over a year later)
or the blog of the “hoax hunter” Paolo Attivissimo:
- http://attivissimo.blogspot.com/2005/12/come-sta-valentin-bene-grazie-e-ha.html blog date: 31 December 2005, first web archive date: 16 January 2006
It also happens to the well-known online newspaper repubblica.it, albeit with a shorter wait:
- http://www.repubblica.it/ambiente/2010/04/27/news/marea-_nera-3646349/index.html?ref=search published on 27 April 2010 and not yet on the web archive; the web archive page, incidentally, reports a wait of about 6 months before pages enter the archive (that is in 2010; it may have been longer in 2007).
While this does not prove that the two posts were really written in 2005 and 2006, it is at least a fairly strong clue that the dates were not changed by hand. And in any case it clearly shows that web.archive.org cannot be used, as it has been, as evidence to accuse Luttazzi of defending himself in bad faith, because its indexing of the site starts too late.
Whether or not it is legitimate to use other people’s jokes is not for me to judge from a technical standpoint; it is up to Daniele’s audience. Of which, admiring above all his performing style, I am a “fan” once again, since this little technical fact-check has restored my trust in the good faith of his defence.
I would like bloggers, journalists and other accusers to check how a technical tool works, one they have evidently understood little of, before waving it around as proof of bad faith.
Vic Gundotra on Android 2.2:
- 2x-5x increase in speed (due to Just-in-time compilation)
- tethering and portable hotspot
- impressive voice recognition capabilities
- cloud/app communication with instant mobile/desktop synchronisation
- Adobe Flash (“It turns out that on the Internet, people use Flash.” is my favourite quote ever…)
Steve Jobs on iPhone 4G:
Do I need to add anything more?
I think I sometimes express my childhood dreams even when using mySql…

Some random thoughts after the “The possibilities of real-time data” event at City Hall.
Free your location: you’re already being photographed
I was not surprised to hear the typical objection (or rant, if you don’t mind) of institutions’ representatives when asked to release data: “We must comply with the Data Protection Act!”. Although this is technically true, I’d like to remind these bureaucrats that in the UK it is legal for a photographer to take your picture in a public place. In other words, if I’m in Piccadilly Circus and someone wants to take a portrait of me, and possibly use it for profit, he is legally allowed to do so without my authorization.
Hence, if we’re talking about releasing Oyster data, I can’t really see bigger problems than those related to photographs: where Oyster data makes it public where you are and, possibly, when, a photograph might give insight into where you are and what you are doing. I think that where+what is intrinsically more dangerous (and misleading, in most cases) than where+when, so what’s the fuss about?
Free our data: you will benefit from it!
Bryan Sivak, Chief Technology Officer of Washington DC (yes, they have a CTO!), showed it clearly in an impressive talk: freeing public data improves service levels and saves public money. This is a powerful concept: if an institution releases data, developers and businesses will start creating enterprises and applications on top of it. But more importantly, the institution itself will benefit from better accessibility, data standards, and fresh policies. That’s why the OCTO has released data and encouraged competition by offering cash prizes to developers: the government gets expertise and new ways of looking at data in return for technological free speech. It’s something the UK (local) government should seriously consider.
Free your comments: the case for partnerships between companies and users
Jonathan Raper, our Twitter’s @MadProf, is sure that partnerships between companies and users will become more and more popular. Companies, in his view, will let the cloud generate and manage a flow of information about their services and possibly integrate it in their reputation management strategy.
I wouldn’t be too optimistic, though. Although it’s true that many far-sighted companies have started engaging with the cloud and welcome autonomous, independently run Twitter service updates, most of them will try to dismiss any reference to bad service. There are also issues with data covered by licenses (see the case of FootyTweets).
I don’t know why I keep thinking about trains as an example, but would you really think that, say, Thameslink would welcome the cloud tweeting about constant delays on their Luton services? Not to mention the fact that NationalRail forced a developer to stop offering a free iPhone application with train schedules, only to start selling their own, non-free one (yes, charging £4.99 for data you can get from their own mobile web site for free, with the same ease of use, is indeed a stupid commercial strategy).
Ain’t it beautiful, that thing?
We’ve seen many fascinating visualizations of free data, both real-time and not. Some of these require a lot of work to develop. But are they useful? What I wonder is not just whether they have any commercial utility, but whether they can actually be useful to people, by improving their life experience. I have no doubt, for example, that itoworld’s visualizations of transport data, and especially those about Congestion Charging, are a great tool to help people understand policies and to help authorities plan better. But I’m not sure that MIT SenseLab’s graphs of phone calls during the World Cup Final, despite being beautiful to see, funny to think about, and technically accurate, bring any improvement to user experience. (Well, this may be the general difference between commercial and academic initiatives, but I believe it applies more generally in the area of data visualization.)
Unorthodox uses of locative technologies
MIT SenseLab’s Carlo Ratti used GSM cell association data to approximate the density of people in streets. This is an interesting use of technology. Nonetheless, unorthodox uses of technologies, especially locative technologies, must be approached carefully. Think about using the same technique to calculate road traffic density: you would have to account for single and multiple occupancy vehicles, and these can mean different things on city roads and on motorways. Using technology in unusual ways is fascinating and potentially useful, but the match between technique and problem must be carefully gauged.
Risks of not-so-deep research
This is generally true in research, but I would say it’s getting more evident in location-based services research and commercial activities: targeting marginally interesting areas of knowledge and enterprise. Ratti’s words: “One PhD student is currently looking at the correlations between Britons and parties in Barcelona… no results yet”. Of course, this was told as a half-joke. But in many contexts, it’s still a half-truth.
What a shame to have missed last year’s WhereCamp. The first WhereCampEU, in London, was great and I really want to be part of such events more often.
WhereCampEU is the European version of this popular unconference about all things geo. It’s a nonplace where you meet geographers, geo-developers, geo-nerds, businesses, the “evil” presence of OrdnanceSurvey (brave, brave OS guys!), geo-services, etc.
I’d just like to write a couple of lines to thank everyone involved in the organisation of this great event: Chris Osborne, Gary Gale, John Fagan, Harry Wood, Andy Allan, Tim Waters, Shaun McDonald, John McKerrell, Chaitanya Kuber. Most of them were people I had been following on Twitter for a while or whose blogs are amongst the ones I read daily; some of them I had already met at other meetups. However, it was nice to make eye contact again or for the first time!
Some thoughts about the sessions I attended:
- Chris Osborne’s Data.gov.uk – Maps, data and democracy. Mr Geomob gave an interesting talk on democracy and open data. His trust in democracy and transparency is probably quintessentially British, as in Italy I wouldn’t be that sure about openness and transparency as examples of democratic involvement (e.g. the typical “everyone knows things that are not changeable even when a majority don’t like them”). The talk was indeed mind-boggling, especially about the impact of the heavy deployment of IT systems to facilitate public service tasks: supposed to increase the level of service and transparency of such services, they had a strong negative impact on the perceived service level (cost and time).
- Gary Gale’s Location, LB(M)S, Hype, Stealth Data and Stuff and Location & Privacy; from OMG! to WTF?. Despite having the word “engineering” in his job title, Gary is very good at giving talks that make his audience think and feel involved. Two great talks on the value of privacy with respect to location. How much would you think your privacy is worth? Apparently, the average person would sell all of his or her location data for £30; Gary managed to spark controversy amidst uncontroversial claims that “£30 for all your data is actually nothing”, a very funny moment (some people should rethink their sense of value when talking about the UK, or at least postpone philosophical arguments to the pub).
- Cyclestreets’ Martin Lucas-Smith’s Cyclestreets Cycle Routing: a useful service developed by two very nice and inspired guys, providing cycling route maps over OpenStreetMap. Their strength is that the routes are calculated using rules that mimic what cyclists do (their motto being “For cyclists, By cyclists”). Being a community service, they tried (and partially managed) to get funding from councils. An example of an alternative, but still viable, business model.
- Steven Feldman’s Without a business model we are all fcuk’d. Apart from the lovely title, whoever starts a talk saying “I love the Guardian and hate Rupert Murdoch” gains my unconditional appreciation. Steven gave an interesting talk on what I might define as “viable business model detection techniques”. As in a “business surgery”, he let some of the people in the audience (OrdnanceSurvey, Cyclestreets, etc…) analyze their own business and see its weaknesses and strengths. A hands-on workshop that I hope he’s going to repeat at other meetings.
- OpenStreetMap: a Q&A session with a talk from Simone Cortesi (whom I finally managed to meet in person) showing that OSM can be a viable and profitable business model, and even stressing that they are partially funded by Google.
Overall level of presentations: very very good, much better organised than I was expecting. Unfortunately I missed the second day, due to an untimely booked trip.
Maybe some more involvement from big players would be interesting. Debating face to face about their strategy, especially when the geo-community is (constructively) critical of them, would benefit everyone.
I mean, something slightly more exciting than a bunch of Google folks using a session to say “we are not that bad”.
…I think I can define GeoMob this way and I fit this definition perfectly
Nice London Geo/Mobile Developers Meetup Group meeting yesterday at City University. High level of the talks, providing vision, reporting experiences, and showing technologies and nice uses of them. Here’s a short summary.
Andrew Eland – Mobile Team Lead for Google UK
A very Google-like talk, showing off pieces of tech along with their vision. Of course, disappointing if you were expecting more in-depth analysis of the market, novel ideas, or anything more than current publicly known work. But we’re used to that, and it was not a bad talk at all.
Best quote: “Tokyo is a vertical city”. That’s absolutely true, and this fact has a direct impact on geo-apps: since shops, clubs and bars are spread vertically across different levels of the buildings (this is a pic I took of the Keio Sky Garden, for example, and there are hundreds of beer gardens up on the roofs of several skyscrapers!) there’s a real need for accurate altitude information and 3D mapping, or at least altitude-enabled maps. The interesting question for me here is how we can show multi-floor information on the 2D maps currently in use.
Julianne Pearce, Blast Theory
An artists’ collective’s perspective on geo-development. Absolutely intriguing, as it was not the average techie talk you would expect from a GeoMob. I found this personally interesting, as I played the Can You See Me Now? game and even created a modified version of it at the UbiComp Spring School at Mixed Reality Lab, University of Nottingham in April 2009, during a workshop dealing with Locative Game Authoring.
PublicEarth
They introduced their concept of a web 2.0 site for creating a personal atlas. Basically it’s about putting photographs and commercial activities of interest on a personal map. They seem to be developing APIs and the possibility of creating widgets, and directly deal with small companies (hotels, b&b, restaurants, bars) to put them in their database. The idea here is that users will be allowed to tell the (possibly intelligent) system what categories of data they’re mostly interested in, leading to some kind of customised Michelin guide.
On monetization, they have a three-fold strategy:
- contextual advertisement, empowered by the fact that users are genuinely interested in what they put in their atlas
- share of profit on direct bookings
- [long-term] user base providing more content, improving quantity and quality of contextual data in a positive feedback loop, possibly making it interesting to other companies
Laurence Penney, SnapMap
My favourite talk of the night. Laurence has been longing for a way of precisely placing photographs on a map for more than 10 years.
I was astonished to see him doing many of the things I would have liked to see in web sites like Flickr and that I’ve been discussing for ages with my friends and colleagues! Using GPS data, a compass, waypoints, directions, focal length, and all the other data associated with a photograph, Laurence is developing a web site that lets users navigate those pictures, even creating 3D views of them like the guys at University of Washington with Rome wasn’t built in a day. Funnily enough, he started all of this before GPS/compass-enabled devices were available, writing down all of his data in a notebook, and he even had problems with the police asking why he was taking pictures at the Parliament (unfortunately, I have to say he’s not alone -_-).
Mikel Maron – Haiti Earthquake OpenStreetMap Response
Mikel explained what OpenStreetMap did to help in Haiti. Disaster response relies heavily on up-to-date maps of buildings, streets, and resources, and OSM quickly managed to get that done. Many thanks to him and to all the OSM guys for showing the world that mapping can be helpful to people even leaving profit considerations aside.
Ollie Parsley is a developer from Dorset I’ve been following with much interest since his first appearance at the London Twitter Devnest last May (you might remember I blogged about it), as his work often points out mind-boggling problems in a developer’s everyday life (read about his Cease&Desist experience, for example).
HootMonitor is his latest Twitter application, even if I would say it’s reductive to call it a “Twitter application”. As introduced at the last Devnest, HootMonitor is, simply put, a website monitoring tool that uses Twitter as a communication channel. I.e.:
- you get an account on HootMonitor linked to your Twitter account
- add a web site you want to be monitored
- HootMonitor will periodically monitor the web site for you
- the service will send you a Twitter direct message/e-mail/sms if the web site goes down
- you will also get aggregate status reports (uptime and downtime, average response time, etc…).
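The monitoring loop described in the list above can be sketched in a few lines of Python. This is only an illustration of the idea, not HootMonitor's actual implementation: `SiteMonitor`, `http_check` and the `notify` callback are hypothetical names, and the notification is a plain function call standing in for a Twitter direct message, e-mail or SMS.

```python
import time
import urllib.request
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


def http_check(url: str, timeout: float = 10.0) -> Tuple[bool, float]:
    """Fetch the URL once and return (is_up, response_time_in_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            up = 200 <= response.status < 400
    except OSError:  # URLError, HTTPError and socket timeouts all derive from OSError
        up = False
    return up, time.monotonic() - start


@dataclass
class SiteMonitor:
    url: str
    check: Callable[[str], Tuple[bool, float]] = http_check
    notify: Callable[[str], None] = print  # stand-in for a DM/e-mail/SMS sender
    history: List[Tuple[bool, float]] = field(default_factory=list)

    def poll(self) -> bool:
        """Run one check; notify only on an up -> down transition."""
        up, elapsed = self.check(self.url)
        was_up = self.history[-1][0] if self.history else True
        self.history.append((up, elapsed))
        if was_up and not up:
            self.notify(f"{self.url} is down")
        return up

    def report(self) -> dict:
        """Aggregate status: number of checks, uptime percentage, average response time."""
        up_times = [elapsed for up, elapsed in self.history if up]
        total = len(self.history)
        avg: Optional[float] = sum(up_times) / len(up_times) if up_times else None
        return {
            "checks": total,
            "uptime_pct": 100.0 * len(up_times) / total if total else 0.0,
            "avg_response_s": avg,
        }
```

A real service would call `poll()` on a schedule (for instance every few minutes) and persist the history; injecting `check` and `notify` keeps the sketch testable without touching the network.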
There has been much interest lately in the use of Twitter as a corporate tool, and a never-ending discussion about a business model that would let Twitter monetize its success. Ollie seems to have touched on these issues again, and has brought this service to users in a way that resembles a classic textbook case study. I believe that HootMonitor is going to be an interesting and possibly successful experiment for the following reasons:
- Mashup use of Web 2.0 technologies: HootMonitor is not the first attempt to build an application on top of Twitter, and many mashups have received extensive press coverage. Nonetheless, as I’m going to explain, HootMonitor is the very first application to deliver a service over Twitter that brings together intrinsic usefulness, a business model, and a good “marketing” strategy.
- Useful service: HootMonitor adds value to the user experience by solving a real problem without disrupting users’ lives. There are plenty of monitoring tools out there, but not many of them generate reports in a way that integrates seamlessly into users’ lives and jobs.
- Freemium model: this is the most interesting aspect of HootMonitor. It can be used for free, but it has premium functionalities that you can get by paying a (reasonably priced) subscription. As far as I’m aware, this is the first application with such a business model to have emerged on top of the Twitter API. There are plenty of ways to try the service for free: you can experience all its usefulness without paying a single penny. The functionalities you pay for, though, are worth the price (for example: personalised statistics or mobile text messages). Many other successful Twitter applications do not have a business model at all and it’s hard to imagine how they will ever generate profit (unless they’re used as an advertisement tool for other products/services).
- Marketing strategy: Ollie has been developing HootMonitor for some months, letting the users of his other apps and his Twitter followers know about this idea. The steps here were setting up a kind of “corporate” HootMonitor blog, a Twitter account to engage with potential users, and a small company (HootWare) under whose name to work. Moreover, HootMonitor was launched exactly the night after its presentation at the Devnest. I believe this was a smart marketing move that gave the service the highest possible level of exposure.
Naturally, I can’t forecast whether or not HootMonitor will be a successful venture, but I’m optimistic about it and of course I wish Ollie every success. As I’m finding it very useful for my websites, and I’m aware of many other people trying it, its strategy and model make it likely that we’ll be hearing more about it in the short (and maybe longer) term.
I’ve recently finished reading Hofstadter’s “Gödel, Escher, Bach”, after three years and a number of failed attempts and restarts. Of the main topics it touches, I found its approach to the problem of automatic natural language understanding and generation particularly interesting. And I feel that this problem is intrinsically related to that of generating recommendations for users (ok, this is not a great discovery, I must admit).
The way we can understand the problem can be simply put as follows. Imagine we have a language generator we can ask to create sentences. We could:
- ask it to create correct sentences (i.e. grammatically correct sentences – this is somewhat possible)
- ask it to create meaningful sentences
- ask it to create funny sentences
The three points above carry different attributes, whose meaning can be subject to discussion. As you can imagine, funny implies meaningful and correct, and meaningful implies correct, which means that generating such sentences gets increasingly hard. Moreover, while everyone can, within certain boundaries, generate a correct sentence, it is far less clear what characterises a meaningful sentence (e.g. what is meaningful to me might not be meaningful to you), and a funny sentence needs its real, underlying meaning to differ from its apparent meaning. You can also notice that assigning these attributes to a correct sentence is increasingly personal, too. The attribution of meaning is an intrinsically human activity, and this is well known to programming language designers and logicians who deal with concepts such as syntax and semantics.
How all of this relates to the field of recommender systems should be obvious by now. A RS is a tool that, more or less, tries to understand what is meaningful to a user to provide him or her with suggestions. What a general purpose RS should do is to understand the meaning of objects and find similar objects. The thing is, the meaning of objects, especially when expressed by natural language, is not easy to establish, and in general cannot be established at all.
I recently reviewed a paper for a friend doing research in RS that reported an example similar to this: “I’m at home, and would like to get a restaurant in Angel Islington for tonight”. Contextual information (and the subsequent activity and intent inference) is the interesting part of this request for a recommendation: it does not matter where I am now, but where I would like to go. This is a very simple issue to deal with, but how about all those situations in which context is implicit?
You will object that a general purpose RS cannot exist and wouldn’t be that useful. The truth is, however, that even a limited-domain RS, such as one for books or DVDs, may encounter similar problems. I’ve been discussing the possibility of a “surprise me” button, proposed by Daniele Quercia. The idea is that sometimes, as a user, I would like to be offered something new rather than something similar to what I’ve done in the past or to what my friends like. But this concept raises a very deep question about how surprising a surprise should be. In other words: it’s not possible to understand what kind of recommendation the user would like to receive. What an RS can do is detect users’ habits or activities, and always provide a similarity-based suggestion.
So here’s my view of the limitation of current RS: they cannot, as of today, provide a recommendation to a user who likes to try something new. RS are for habitués.
A stupid example: I’ve read four books in a row by the English author Jonathan Coe. After that, Amazon kept on recommending me other books by Coe, whilst of course I wanted a break from them.
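To make the contrast concrete, here is a toy Python sketch of a habit-based suggestion next to a naive “surprise me” one. Everything in it is hypothetical: the catalogue, the placeholder titles and the two functions are made up for illustration, and real recommenders (Amazon’s included) are far more sophisticated than a most-read-author rule.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical toy catalogue: title -> author (titles are placeholders).
CATALOGUE: Dict[str, str] = {
    "Coe #1": "Jonathan Coe",
    "Coe #2": "Jonathan Coe",
    "Coe #3": "Jonathan Coe",
    "Coe #4": "Jonathan Coe",
    "Coe #5": "Jonathan Coe",
    "Novel A": "Author A",
    "Novel B": "Author B",
}


def similar(history: List[str], catalogue: Dict[str, str]) -> List[str]:
    """Habit-based suggestion: unread titles by the reader's most-read author."""
    counts = Counter(catalogue[title] for title in history)
    if not counts:
        return []
    top_author, _ = counts.most_common(1)[0]
    read = set(history)
    return sorted(t for t, a in catalogue.items() if a == top_author and t not in read)


def surprise(history: List[str], catalogue: Dict[str, str]) -> List[str]:
    """Naive 'surprise me': titles by authors the reader has never tried."""
    known_authors = {catalogue[title] for title in history}
    return sorted(t for t, a in catalogue.items() if a not in known_authors)


history = ["Coe #1", "Coe #2", "Coe #3", "Coe #4"]
print("similar: ", similar(history, CATALOGUE))
print("surprise:", surprise(history, CATALOGUE))
```

The sketch also shows where the hard part lies: `surprise` has no way of knowing which unfamiliar author the reader would actually enjoy, or how far from their habits they want to stray.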
Any objections? E.g.:
- meaning in current RS is not expressed by natural language: true, but this is nonetheless a limitation of the systems themselves. The result is that they cannot give suggestions other than those based on the stored values. For example, “rate your liking of the book from 1 to 5” will never be able to express whether the user would actually like to read it again or would recommend it to others. Structured representation does not capture real meaning, and restricts the gamut of representable information about the user.
- no RS is general purpose: I think even limited domain RS suffer from the same problem, as no RS can infer a user’s feelings.
I’m not proposing silver bullets here, and of course not all research/applications in RS is to be trashed. Some possible research and development directions may be:
- use direct social suggestions: to whom would you suggest it? (similar to direct invitations on Facebook, where the limitations of this approach are nonetheless evident)
- deal with changes in user tastes and try to predict them
- use more contextual information
- try inference from natural language, for example inferring user tastes from his or her long reviews
- better user profiling based on psychological notions and time-variance: TweetPsych, for example, has tried profiling a user based on tweets, which are short and scattered across time.
Hey folks, long time I haven’t blogged – been very busy at work and home! Let me resume my techie stuff by summarising some of my thoughts after the #GeoMob night at the British Computer Society, last 30 July.
The #GeoMob is the London Geo/Mobile Developers Meetup Group, and it organises meetings of developers interested in the geo/social/mobile field, usually with participation from industry leaders (Yahoo!/Google), businesses, startups.
These are my thoughts about the night, grouped by talk:
Wes Biggs, CTO Adfonic
Henry Erskine Crum, @henryec, Co-founder of Spoonfed
- Spoonfed is a London based web startup (Sep. 2008) that focuses on location-based event listings
- 12 people work there – which makes it interestingly big for a startup
- very similar to an old idea of mine (geo-events but in a more social networking fashion) – which makes me realize I need to act fast when I have such ideas
- I would have liked the talk to dig deeper into details about user base, mobile apps and HCI issues, but it was not a bad talk and it provided a very operational and yet open minded view of how the service works and evolves
- oh, and Henry was congratulated as the only guy in a suit (:P lolcredits to Christopher Osborne)
Gary Gale, @vicchi, Director of Engineering at Yahoo! Geo Technologies, with a talk about Yahoo! Placemaker
- get here the slides for this talk
- Yahoo! Placemaker is a useful service to extract location data from virtually any document – also known as Geoparsing. As the website says: Provided with free-form text, the service identifies places mentioned in text, disambiguates those places, and returns unique identifiers for each, as well as information about how many times the place was found in the text, and where in the text it was found.
- I find it very interesting, especially as it is usable with Tweets and blog posts, and it can help create very interesting mashups
- only issue: its granularity is up to the neighbourhood – which is perfectly good for some applications, but I’m not sure it is also for real-time-location-intensive mobile apps
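To give a feel for what geoparsing returns, here is a toy Python sketch that looks up place names in free-form text and reports, for each place found, an identifier, how many times it occurs and where. It is not the Placemaker API: the gazetteer, the identifiers and the `geoparse` function are hypothetical, and the sketch does no disambiguation at all, which is the genuinely hard part of the real service.

```python
import re
from typing import Dict, List

# Hypothetical gazetteer: place name -> made-up identifier.
GAZETTEER: Dict[str, str] = {
    "London": "place:london",
    "Islington": "place:islington",
    "Dorset": "place:dorset",
}


def geoparse(text: str, gazetteer: Dict[str, str] = GAZETTEER) -> List[dict]:
    """Return id, name, count and character offsets for each known place in text."""
    found = []
    for name, place_id in gazetteer.items():
        pattern = rf"\b{re.escape(name)}\b"  # whole-word, case-sensitive match
        offsets = [match.start() for match in re.finditer(pattern, text)]
        if offsets:
            found.append(
                {"id": place_id, "name": name, "count": len(offsets), "offsets": offsets}
            )
    return found


places = geoparse("From London to Dorset and back to London.")
for place in places:
    print(place)
```

Feeding tweets or blog posts through something like this, and then plotting the results, is the kind of mashup the talk hinted at.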
Steve Coast, @SteveC, founder of OpenStreetMap and CloudMade, with a talk about Ubiquitous GeoContext
- OpenStreetMap can be somewhat considered the community response to Google Maps: free maps, community-created and maintained, freely usable – CloudMade being a company focusing on using map data to let developers go geo
- the motto from this talk is “map, please get me to the next penguin in this zoo” – that is, extreme geolocation and contextual information
- success of a geo app – but according to me also applicable to many Internet startups – summarized in 3 points:
- low cost to start
- no licensing problems
- openness / community driven effort
- it was an absolute delight to listen to this talk, as it was fun but also rich in content – the highly visual presentation was extremely cool, I hope Steve is going to put it online!
Oh, and many thanks to Christopher Osborne, @osbornec, for organising an amazing night!
UPDATE 27/08/09: The functionalities of my version of MarkerClusterer have been included in the official Google code project; you can find it in gmaps-utility-library-dev. The most interesting part was the so-called MarkerClusterer.
Imagine you need to show thousands of markers on a map. There may be many reasons for doing so, for example temperature data, unemployment distributions, and the like. You want to have a precise view, hence the need for a marker in every town or borough. What Xiaoxi and others developed is a marker able to group all the markers in a certain area. This is a MarkerClusterer. Your map gets split into clusters (of which you can specify the size – but hopefully more fine-grained ways of defining areas will be made available) and for every cluster you show a single marker, labelled with the total count of markers in that cluster.
I thought that this opened a way to get something more precise and able to support reasoning over map data. Once you have a ClusterMarker, wouldn’t it be wonderful if you could display some other data on it, rather than the simple count? For example, in the temperature distribution case, I would be interested in seeing the average temperature of the cluster.
That’s why I developed this fork of the original class (but I’ve applied to get it into the main project – fingers crossed!) that allows you to do the following:
- create a set of values to tag the locations (so that you technically attach a value to each marker)
- define a function that is able to return an aggregate value upon the values you passed, automatically for each cluster
That’s all. The result is very simple, but I believe it is a good way to start thinking about how the visualization of distributed data may affect the usability of a map and the understanding of the information it carries. Here’s a snapshot of the two versions, the old one on the left (bearing just the count) and the new one on the right (with average data). The data refer to NHS Hospital Death Rates, as published here. If you want to see the full map relating to this example, click here.
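The idea of tagging markers with values and labelling each cluster with an aggregate can be sketched in Python as follows. This is not the MarkerClusterer code (which is JavaScript and, as far as I know, clusters on a pixel grid at the current zoom level); it is a simplified illustration that assumes a fixed grid in degrees, and the names `cluster_markers`, `cell_size_deg` and `aggregate` are mine.

```python
import math
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Iterable, List, Tuple

Marker = Tuple[float, float, float]  # (latitude, longitude, value attached to the marker)
Cell = Tuple[int, int]


def cluster_markers(
    markers: Iterable[Marker],
    cell_size_deg: float,
    aggregate: Callable[[List[float]], float] = mean,
) -> Dict[Cell, dict]:
    """Group markers into square grid cells and label each cell with an aggregate value."""
    cells: Dict[Cell, List[float]] = defaultdict(list)
    for lat, lng, value in markers:
        # Markers whose coordinates fall in the same grid cell share a cluster.
        key = (math.floor(lat / cell_size_deg), math.floor(lng / cell_size_deg))
        cells[key].append(value)
    return {
        key: {"count": len(values), "label": aggregate(values)}
        for key, values in cells.items()
    }


# Made-up sample data: two markers close together, one far away.
markers = [(51.5, -0.1, 10.0), (51.6, -0.2, 20.0), (53.4, -2.2, 30.0)]
clusters = cluster_markers(markers, cell_size_deg=1.0)
for cell, info in clusters.items():
    print(cell, info)
```

Passing `max`, `min` or `sum` as `aggregate` changes the label without touching the clustering, which is the separation the fork is after: the original count-only behaviour is just one possible aggregate.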