Posted on Friday, March 12th, 2010 by David Chen
In October 2006, Netflix launched a contest, dubbed the Netflix Prize, that challenged entrants to improve the accuracy of its recommendation algorithm by 10%. The winner(s) would receive $1 million. It was a win-win: Netflix received relatively cheap R&D, while everyday statistics enthusiasts had a shot at a big payday. The competition came down to a buzzer-beating finish, with a group called BellKor Pragmatic Chaos submitting the winning entry just four minutes before the contest was over.
The Netflix Prize was a resounding success. It generated tons of publicity for Netflix, and BellKor Pragmatic Chaos was obviously glad to take home the prize. This past August, Netflix announced it would follow up the Netflix Prize with a sequel.
But the initial contest also attracted the attention of the Federal Trade Commission (FTC) and of certain Netflix users, who were concerned that the anonymized data Netflix provided to contest entrants compromised Netflix users’ privacy. Today, Netflix announced on its company blog that it would be canceling its Netflix Prize sequel, after completing negotiations with the FTC and settling a private lawsuit.
The lawsuit alleged that Netflix gave away private information even though it had anonymized the rental data, which included movie choices and movie ratings. The concern was not hypothetical: in 2007, two researchers from the University of Texas released a paper demonstrating that they could determine a user’s identity if that user had also left movie ratings at another site, such as IMDb. “Simply removing names does not ensure that data will remain anonymous,” one of the researchers said. “And the implications stretch far beyond the world of Netflix.”
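To make the idea concrete, here is a toy sketch of how ratings posted publicly elsewhere could be matched against an "anonymized" record. It is not the researchers' actual method; the function names, the scoring rule, the thresholds, and all of the data are made up for illustration.

```python
# Toy linkage sketch. Everything here (names, data, scoring rule) is a
# hypothetical illustration, not the method from the 2007 paper.

def overlap_score(anon_ratings, public_ratings, tolerance=1):
    """Count movies rated in both records with ratings within `tolerance` stars."""
    score = 0
    for movie, rating in anon_ratings.items():
        if movie in public_ratings and abs(public_ratings[movie] - rating) <= tolerance:
            score += 1
    return score


def best_match(public_ratings, anonymized, min_score=3):
    """Return the anonymized ID that best matches a public profile, or None."""
    scored = sorted(
        ((overlap_score(r, public_ratings), anon_id) for anon_id, r in anonymized.items()),
        reverse=True,
    )
    top_score, top_id = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0
    # Only claim a match when it is strong and clearly ahead of the runner-up.
    if top_score >= min_score and top_score > runner_up:
        return top_id
    return None


# "Anonymized" records: names removed, ratings intact.
anonymized = {
    "user_1041": {"Atanarjuat": 5, "Brazil": 4, "Primer": 5, "Heat": 3},
    "user_2077": {"Heat": 4, "Titanic": 2, "Brazil": 1},
}
# Ratings the same person posted under their real name on another site.
public_profile = {"Atanarjuat": 5, "Brazil": 4, "Primer": 4}

print(best_match(public_profile, anonymized))
```

Even in this tiny example, three overlapping ratings are enough to single out one record, which is why stripping names alone does not make such data anonymous.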
In its announcement today, Netflix wrote:
We have reached an understanding with the FTC and have settled the lawsuit with plaintiffs. The resolution to both matters involves certain parameters for how we use Netflix data in any future research programs. In light of all this, we have decided to not pursue the Netflix Prize sequel that we announced on August 6, 2009. We will continue to explore ways to collaborate with the research community and improve our recommendations system so we can constantly improve the movie recommendations we make for you. So stay tuned.
Releasing anonymized user data has proven ill-advised in the past, even when companies thought it could do no harm. In 2006, the New York Times was able to determine the identity of an individual just by using her anonymized AOL search data, which AOL had released to the general public without its users’ permission. Interestingly, Forbes points out that Netflix could have avoided this issue the first time around by employing “data masking,” which would have allowed the data to retain its usefulness while making it untraceable.
While protecting user privacy is always a good thing, it’s too bad we won’t see a Netflix Prize sequel in the near future; I may never get a good follow-up recommendation for Atanarjuat now…