Wednesday, September 24, 2014

BuzzFeed’s Data Whiz Understands the Limitations of Big Data

According to a recent profile, Dao Nguyen, BuzzFeed’s data scientist in chief, “firmly believes that data should not determine one’s editorial strategy.” Really? The head of data and growth (her job title) doesn’t think data are important?

Au contraire. Nguyen believes data should inform one’s decisions. As the article puts it, “As adept as she is at analyzing numbers ... she is also adept at understanding the limitations of data ... Data only tells its observer what is happening, not why it’s happening” (my emphasis).

Says Nguyen, “I think people would be surprised to know my relationship with data, which is actually one of great skepticism as well as great admiration.”

Addendum (2/5/2015): Netflix’s chief curator, Ted Sarandos, also understands the limitations of data:

“Studios and television networks have long made decisions about what to produce based on the intuitions of a limited number of executives. Television studios have Nielsen ratings, and movie studios have box-office sales, to help guide them. But those are relatively simple metrics, and notoriously unreliable; as the screenwriter William Goldman famously said, ‘Nobody, nobody—not now, not ever—knows the least goddamn thing about what is or isn’t going to work at the box office.’ As with the arrival of sabermetrics in baseball or the rise of pollsters in politics, the potential for the quants to change the industry—to really figure out what people want to watch—is clear ...

“‘It is important to know which data to ignore,’ [Sarandos said] ... ‘It’s probably a 70-30 mix ... Seventy is the data, and 30 is judgment ... But the 30 needs to be on top’ ...

“Of course, there is a big difference between using data in combination with intuition and relying entirely on an algorithm—the decision-making equivalent of Siri finding gas stations near you. I don’t think anyone—Netflix, Mitt Romney—makes big decisions that way. As Chris Kelly, the C.E.O. of Fandor, an indie-film Internet channel told me, ‘It just isn’t true that you can rely on data completely.’ Even Google, the champion of algorithms, employs substantial human adjustments to make its search engines perform just right. (It cares so much about this that Google claims First Amendment protection for its tweaks.) I do not doubt that companies rely more on data every day, but the best human curators still maintain their supremacy.”

Addendum (2/21/2016): More from Nguyen:

“The data never tells you why anything happens. Data will tell you, if you’re very lucky, what happened. It won’t ever tell you why. If you want to understand why, that requires a different set of skills, largely in your brain and in your heart. Why did this story resonate with people?”