How Computers Know What We Want - Before We Do

Here's an experiment: try thinking of a song not as a song but as a collection of distinct musical attributes. Maybe the song has political lyrics. That would be an attribute. Maybe it has a police siren in it, or a prominent banjo part, or paired vocal harmony, or punk roots. Any one of those would be an attribute. A song can have as many as 400 attributes - those are just a few of the ones filed under p.

This curious idea originated with Tim Westergren, one of the founders of an Internet radio service based in Oakland, Calif., called Pandora. Every time a new song comes out, someone on Pandora's staff - a specially trained musician or musicologist - goes through a list of possible attributes and assigns the song a numerical rating for each one. Analyzing a song takes about 20 minutes. The people at Pandora - no relation to the alien planet - analyze 10,000 songs a month. They've been doing it for 10 years now, and so far they've amassed a database containing detailed profiles of 740,000 different songs. Westergren calls this database the Music Genome Project.

There is a point to all this, apart from settling bar bets about which song has the most prominent banjo part ever. The purpose of the Music Genome Project is to make predictions about what kind of music you're going to like next. Pandora uses the Music Genome Project to power what's known in the business as a recommendation engine: one of those pieces of software that gives you advice about what you might enjoy listening to or watching or reading next, based on what you just listened to or watched or read. Tell Pandora you like Spoon and it'll play you Modest Mouse. Tell it you like Cajun accordion virtuoso Alphonse “Bois Sec” Ardoin and it'll try you out on some Iry LeJeune. Enough people like telling Pandora what they like that the service adds 2.5 million new users a month.
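As a rough illustration of how attribute profiles can drive recommendations, here is a minimal sketch in Python. It is not Pandora's actual method, which is proprietary; the songs, the handful of attribute names and the scores are all invented, and a real Music Genome profile would run to hundreds of hand-scored dimensions.

```python
import math

# Each song is a profile of attribute scores (0 = absent, 1 = very prominent).
# Both the songs and the attributes here are made up for the example.
SONGS = {
    "Song A": {"political lyrics": 0.9, "prominent banjo": 0.0,
               "paired vocal harmony": 0.2, "punk roots": 0.8},
    "Song B": {"political lyrics": 0.1, "prominent banjo": 0.9,
               "paired vocal harmony": 0.7, "punk roots": 0.0},
    "Song C": {"political lyrics": 0.8, "prominent banjo": 0.1,
               "paired vocal harmony": 0.3, "punk roots": 0.6},
}

def cosine(a, b):
    """Cosine similarity between two attribute profiles."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(liked, catalog, top_n=2):
    """Rank the rest of the catalog by similarity to a song the listener liked."""
    scores = {name: cosine(catalog[liked], profile)
              for name, profile in catalog.items() if name != liked}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

print(recommend("Song A", SONGS))   # Song C scores far closer to Song A than Song B does
```

The design choice worth noticing is that nothing here depends on other listeners: the match comes entirely from the music's own profile.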
Over the past decade, recommendation engines have become quietly ubiquitous. At the appropriate moment - generally when you're about to consummate a retail purchase - they appear at your shoulder, whispering suggestively in your ear. Amazon was the pioneer of automated recommendations, but Netflix, Apple, YouTube and TiVo have them too. In the music space alone, Pandora has dozens of competitors. A good recommendation engine is worth a lot of money. According to a report by industry analyst Forrester, one-third of customers who notice recommendations on an e-commerce site wind up buying something based on them.

The trouble with recommendation engines is that they're really hard to build. They look simple on the outside - if you liked X, you'll love Y! - but they're actually doing something fiendishly complex. They're processing astounding quantities of data and doing so with seriously high-level math. That's because they're attempting to second-guess a mysterious, perverse and profoundly human form of behavior: the personal response to a work of art. They're trying to reverse-engineer the soul.

They're also changing the way our culture works. We used to learn about new works of art from friends and critics and video-store clerks - from people, in other words. Now we learn about them from software. There's a new class of tastemakers, and they're not human.

Learning to Love Dolph Lundgren
Pandora makes recommendations the same way people do, more or less: by knowing something about the music it's recommending and something about your musical taste. But that's actually pretty unusual. It's a very labor-intensive approach. Most recommendation engines work backward instead, using information that comes not from the art but from its audience.

It's a technique called collaborative filtering, and it works on the principle that the behavior of a lot of people can be used to make educated guesses about the behavior of a single individual. Here's the idea: if, statistically speaking, most people who liked the first Sex and the City movie also liked Mamma Mia!, then if we know that a particular individual liked Sex and the City, we can make an educated guess that that individual will also like Mamma Mia!
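A minimal sketch of that principle, with invented viewing data: ignore everything about the movies themselves and simply count, among people who liked a given title, what else they liked.

```python
from collections import Counter

# Each viewer is reduced to the set of titles they said they liked.
# The data is invented for the example.
LIKES = [
    {"Sex and the City", "Mamma Mia!", "The Devil Wears Prada"},
    {"Sex and the City", "Mamma Mia!"},
    {"Sex and the City", "On Golden Pond"},
    {"Saw V", "On Golden Pond"},
]

def also_liked(title, likes):
    """Among people who liked `title`, what fraction also liked each other title?"""
    fans = [seen for seen in likes if title in seen]
    counts = Counter(other for seen in fans for other in seen if other != title)
    return {other: n / len(fans) for other, n in counts.items()}

print(also_liked("Sex and the City", LIKES))
# Mamma Mia! tops the list: two of the three Sex and the City fans also liked it,
# so the engine would guess that a fourth fan probably will too.
```

Real engines work from millions of users and star ratings rather than a handful of like-lists, but the underlying bet is the same.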
It sounds simple enough, but the closer you look, the weirder and more complicated it gets. Take Netflix's recommendation engine, which it has dubbed Cinematch. The algorithmic guts of a recommendation engine are usually a fiercely guarded trade secret, but in 2006 Netflix decided it wasn't completely happy with Cinematch, and it took an unusual approach to solving the problem. The company made public a portion of its database of movie ratings - around 100 million of them - and offered a prize of $1 million to anybody who could improve its engine by 10%.

The Netflix competition opened a window onto a world that's usually locked away deep in the bowels of corporate R&D - the world of researchers like Robert Bell and Chris Volinsky of AT&T, whose team went on to win the prize. Collaborative filtering has no knowledge of the art it recommends; it works entirely on the basis of the audience's reaction. So if a large enough group of people claim to have enjoyed, say, both Saw V and On Golden Pond, the software would be forced to infer that those two movies share some common quality that the viewers enjoyed. Crazy? Or crazy genius?

In such a case, the software would have discovered an aesthetic property that we might not even be aware of or have a name for but which in a mathematical sense must be said to exist. Even Bell and Volinsky don't always know what the properties are. “We might be able to describe them, or we might not be able to,” Bell says. “They might be subtleties like action movies that don't have a lot of blood, don't have a lot of profanity but have a strong female lead. Things like that, which you would never think to categorize on your own.” As Volinsky puts it, “A lot of times, we don't come up with explanations that are explainable.”
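One common way such properties emerge - a hedged sketch of the general idea, not Netflix's or the winning team's actual system, which blended many models - is matrix factorization: give every user and every movie a short vector of hidden numbers and adjust them until their dot products reproduce the observed ratings. Nobody names the dimensions; they simply fall out of the data, which is why they can be so hard to put into words.

```python
import random

random.seed(0)

# (user, movie, rating) triples on a 1-to-5 scale, invented for the example.
RATINGS = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 1.0), (2, 2, 5.0), (2, 3, 4.0),
    (3, 2, 4.0), (3, 3, 5.0),
]
N_USERS, N_MOVIES, K = 4, 4, 2          # K hidden "taste" dimensions

# Start every user vector and movie vector near zero and learn them from the data.
user_f = [[random.gauss(0, 0.1) for _ in range(K)] for _ in range(N_USERS)]
movie_f = [[random.gauss(0, 0.1) for _ in range(K)] for _ in range(N_MOVIES)]

def predict(u, m):
    """Predicted rating = dot product of the user's and the movie's hidden vectors."""
    return sum(user_f[u][k] * movie_f[m][k] for k in range(K))

# Stochastic gradient descent on squared error, with light regularization.
learning_rate, reg = 0.05, 0.02
for _ in range(20000):
    u, m, r = random.choice(RATINGS)
    err = r - predict(u, m)
    for k in range(K):
        uf, mf = user_f[u][k], movie_f[m][k]
        user_f[u][k] += learning_rate * (err * mf - reg * uf)
        movie_f[m][k] += learning_rate * (err * uf - reg * mf)

# The learned dimensions were never labeled by anyone, but they can still be
# used to guess a rating the data never contained:
print(round(predict(1, 3), 1))          # how user 1 might feel about movie 3
```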
That makes recommendation engines sound practically psychic, but everyday experience tells us that they're actually pretty fallible. Everybody has felt the outrage that comes when a recommendation engine accuses one of a secret desire to watch Rocky IV, the one with Dolph Lundgren in it. In 2006, Walmart was charged with racism when its recommendation engine paired Planet of the Apes with a documentary about Martin Luther King. But generally speaking, the weak link in a recommendation engine isn't the software; it's us. Collaborative filtering works only as well as the data it has available, and humans produce noisy, low-quality data.

The problem is consistency: we're just not good at expressing our desires in rating form. We rate things differently after a bad day at work than we would if we were on vacation. Some people are naturally stingy with their stars; others are generous. We rate movies differently depending on whether we rate them right after watching them or if we wait a week, and differently again depending on whether we saw a lousy movie or a good movie in that intervening week. We even rate differently depending on whether we rate a whole batch of movies together or one at a time.
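Engines can compensate for some of this. The stingy-versus-generous problem, for instance, is commonly handled by comparing each rating to that viewer's own average rather than taking the stars at face value - a minimal sketch, with invented ratings:

```python
# Invented ratings: one viewer hands out stars freely, the other hoards them.
RATINGS = {
    "generous viewer": {"Movie A": 5, "Movie B": 3, "Movie C": 5},
    "stingy viewer":   {"Movie A": 3, "Movie B": 1, "Movie C": 3},
}

def centered(ratings):
    """Express each rating as a deviation from that viewer's own average."""
    out = {}
    for viewer, movies in ratings.items():
        mean = sum(movies.values()) / len(movies)
        out[viewer] = {title: round(stars - mean, 2) for title, stars in movies.items()}
    return out

print(centered(RATINGS))
# After centering, the two viewers look identical: both put Movie B well below
# their personal baseline and Movies A and C a little above it.
```

The day-to-day mood swings and the batch effects, though, are much harder to model away.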
All this means that there's a ceiling to how accurate collaborative filtering can get. “There's a lot of randomness involved,” Volinsky admits. “There's some intrinsic level of error associated with trying to predict human behavior.”

The Great Choice Epidemic
Recommendation engines are a response to the strange new world of online retail. It's a world characterized by a surplus of something we usually can't get enough of: choice. We're drowning in it. As Sheena Iyengar points out in her book The Art of Choosing, in 1994 there were 500,000 different consumer goods for sale in the U.S. Now Amazon alone offers 24 million. When faced with such an oversupply of choice, our little lizard brains go straight to vapor lock. “We think the profusion of possibilities must make it that much easier to find that perfect gift for a friend's birthday,” Iyengar writes, “only to find ourselves paralyzed in the face of row upon row of potential presents.” We're living through an epidemic of choice. We require an informational prosthesis to navigate it. The recommendation engine is that prosthesis: it winnows the millions of options down to a manageable handful.
But there's a trade-off involved. Recommendation engines introduce a new voice into the cultural conversation, one that speaks to us when we're at our most vulnerable, which is to say at the point of purchase. What is that voice saying? Recommendation engines aren't designed to give us what we want. They're designed to give us what they think we want, based on what we and other people like us have wanted in the past.

Which means they don't surprise us. They don't take us out of our comfort zone. A recommendation engine isn't the spouse who drags you to an art film you wouldn't have been caught dead at but then unexpectedly love. It won't force you to read the 18th century canon. It's no substitute for stumbling onto a great CD just because it has cool cover art. Recommendation engines are the enemy of serendipity and Great Books and the avant-garde. A 19th century recommendation engine would never have said, If you liked Monet, you'll love Van Gogh! Impressionism would have lasted forever.

The risk you run with recommendation engines is that they'll keep you in a rut. They do that because ruts are comfy places - though often they're deeper than they look. “By definition, we keep you in the same musical neighborhood you start in,” says Westergren of the Music Genome Project, “so you could say that's limiting. But even within a neighborhood, there is a ton of room for discovery. Forty-five percent of the people who use Pandora buy more music after they start, and only 1% buy less.” And not being based solely on data from its audience, Pandora isn't as vulnerable to peer pressure as most recommendation engines are. It doesn't follow the crowd.

Pandora is unusual, though. The general effect of recommendation engines on shopping behavior is a hot topic among econometricians, if that's not an oxymoron, but the consensus is this: they introduce us to new things, which is good, but those new things tend to be a lot like the old things, and they tend to be drawn from the shallow pool of things other people have already liked. As a result, they create a blockbuster culture in which the same few runaway hits get recommended over and over again. It's the backlash against the “long tail,” the idea that shopping online is all about near infinite selection and cultural diversity. It has a bad habit of eating its own tail and leaving you back where you started.
But this isn't just about retail. The Web has transformed how we shop. Now it's transforming our social lives too, and recommendation engines are coming along for the ride. Just as Netflix reverse-engineers our response to art, dating sites like Match.com and eHarmony and OKCupid use algorithms to make predictions about that equally ineffable human phenomenon, love; or, failing that, lust. The idea is the same: they break down human behavior into data, then look for patterns in the data that they can use to pair up the humans.
Even if you're not into online dating, you're probably on Facebook, currently the second most visited site on the Web. Facebook gives users the option of switching between a straight feed, which shows all their friends' news in chronological order, and an algorithmically curated selection of the updates Facebook's recommendation engine thinks they'd most like to see. And in the right-hand column, Facebook uses a different set of algorithms to recommend new friends. If you loved Jason, why not try Jordan?!

And as for the first most trafficked site on the Web, if you cock your head only slightly to one side, Google is, effectively, a massive recommendation engine, advising us on what we should read and watch and ultimately know. It used to return the same generic results to everyone, but in December it put a service called Personalized Search into wide release. Personalized Search studies the previous 180 days of your searching behavior and skews its results accordingly, based on its best guess as to what you're looking for and how you look for it.
The principle is almost endlessly generalizable. Anywhere the specter of unconstrained choice confronts us, we're meeting it by outsourcing elements of the selection process to software. Largely unconsciously, we radiate information about ourselves and our personal preferences all day long, and more and more recommendation engines of all shapes and sizes are hoovering up that data and feeding it back to us, reshaping our reality into a form that they fondly hope will be more to our liking - in an endless feedback loop. The effect is to create a customized world for each of us, one that is ever so slightly childproofed, the sharp edges sanded off, and ever so slightly stifling, like recirculated air.

How far will it go? Will we eventually surf a Web that displays only blogs that conform to our political leanings? A social network in which we see only people of our race and religion? Our horizons, cultural and social, would narrow to a cozy, contented, claustrophobic little dot of total personalization.

Let's hope not. People weren't built to play it safe all the time. We were meant to be bored and disappointed and offended once in a while. It's good for us. That's what forces us to evolve. Even if it means watching Rocky IV, with Dolph Lundgren. Who knows? You might even like it.