Today's New York Times has an article entitled "Guessing the Online Customer's Next Want," the basic point of which is that giving customers good recommendations is hard.
The article is a nice general-audience discussion of how Amazon, Netflix, et al. do their recommendations, what some people are doing to improve them, and why real improvement is hard to get.
The basic technique that everyone uses is collaborative filtering, and it was invented in the early '90s by my friend and business partner Jeremy Bornstein (he holds the original patent), among others. Collaborative filtering essentially takes a pile of data that a person has generated, compares it to piles of data other people have generated, and looks for similarities and differences. When a person has a highly similar data pile to another person, or better yet, to a cluster of other people who have similar data piles, one can infer that the areas of dissimilarity are potential grounds for becoming more similar - i.e. all these people who seem to share your movie/music/book/pet/whatever preferences have bought this thing, but you don't have it yet: maybe you want it too. It works pretty well, with pretty well being a relative thing. Boosting sales by even a few percentage points is well worth it for most internet retailers.
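To make the mechanics concrete, here is a minimal sketch of the compare-the-piles idea in Python. All the data and names here are hypothetical illustrations, not anyone's actual engine: it computes cosine similarity between users' purchase sets, then scores each item a user lacks by summing the similarities of the other users who own it.

```python
from math import sqrt

# Toy purchase history: user -> set of items bought (hypothetical data).
purchases = {
    "alice": {"book_a", "book_b", "album_x"},
    "bob":   {"book_a", "book_b", "album_y"},
    "carol": {"book_a", "album_x", "album_y"},
}

def similarity(a, b):
    """Cosine similarity between two users' purchase sets."""
    overlap = len(purchases[a] & purchases[b])
    return overlap / sqrt(len(purchases[a]) * len(purchases[b]))

def recommend(user):
    """Rank items the user lacks by the summed similarity of their owners."""
    scores = {}
    for other in purchases:
        if other == user:
            continue
        sim = similarity(user, other)
        for item in purchases[other] - purchases[user]:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice"))  # alice's lookalikes both bought album_y
```

Real systems work over millions of users and weight by ratings rather than raw ownership, but the core inference is exactly this: your neighbors in preference space have something you don't, so maybe you want it too.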
Despite collaborative filtering's being pretty good, there's lots and lots of room for improvement, and there has been since the early '90s. While the technique is fundamentally sound, people have been using essentially the same one for 15+ years. Every year a few startups show up with a better recommendation engine, and the major in-house systems keep getting better, but the improvements are small increments. This is mainly because they come from using different, bigger data sets and tweaking well-known algorithms, rather than from doing anything fundamentally new.
The Times article didn't discuss a few things that are happening now that will make recommendations a whole lot better soon. While collaborative filtering won't go away, it will be used in conjunction with other techniques, and the quality of customer recommendations will get way better.
- Social network data portability will soon enable very robust analysis of who someone's explicit "friends" are, and one would expect friends to be better predictors of purchase intent than strangers who merely happen to be similar along certain dimensions.
- Link analysis techniques being applied in consumer marketing applications will make it a lot easier to know who a person’s real friends are, and to implicitly identify them even if they’re not in the expressed social graph.
- Better metadata from product and content pages that use microformats will provide more, and more robust, grist for the mill.
- Consumer interest profiles are getting better, fast: deep-packet-inspection and clickstream-based profiles generated by the likes of Phorm, Adzilla, NebuAd, and Loomia are now adding to the hopper, and next-generation services that do information extraction on this sort of data will create much more meaningful, actionable metadata.
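The first two points above amount to blending the social graph into the scoring. A hedged sketch of what that might look like, with an entirely hypothetical friend graph and an assumed boost factor (any real weighting would have to be tuned against actual purchase data):

```python
# Sketch: weight collaborative-filtering contributions by social-graph
# proximity. All names, data, and FRIEND_BOOST are hypothetical.
purchases = {
    "alice": {"book_a", "album_x"},
    "bob":   {"book_a", "album_y"},
    "carol": {"book_a", "album_z"},
}
friends = {"alice": {"bob"}}  # explicit social graph

FRIEND_BOOST = 2.0  # assumed multiplier for explicit friends; tune on real data

def recommend(user):
    """Score unowned items, counting overlap with friends extra heavily."""
    scores = {}
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(purchases[user] & items)
        boost = FRIEND_BOOST if other in friends.get(user, set()) else 1.0
        for item in items - purchases[user]:
            scores[item] = scores.get(item, 0.0) + overlap * boost
    return max(scores, key=scores.get) if scores else None

print(recommend("alice"))  # bob's friendship outweighs carol's equal overlap
```

The same shape accommodates the other two points as well: implicit friends inferred by link analysis would just populate the graph, and richer microformat metadata would let the overlap computation run on attributes instead of bare item IDs.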