There’s been much discussion of late on the accuracy and legitimacy of FiveThirtyEight. I’ve been a fan of FiveThirtyEight since just after it got started, but I find myself having some mixed feelings about the topic.
Here are three articles that are central to many of the arguments being made:
1) David Brooks, who works with FiveThirtyEight founder Nate Silver at the NY Times, writes of the futility of watching and using polls. His implications are twofold: that as an individual it’s a waste of time to track polls, and that it’s impossible to predict what will happen in a presidential election because it’s too complex (and “even experts with fancy computer models are terrible at predicting human behavior”).
2) Dylan Byers at Politico argues that Silver is a biased hack who throws around numbers that don’t make sense.
3) Ezra Klein at the Washington Post rebuts Byers, arguing that the betting markets clearly favor Obama, and that a bettor could use a better system to beat them… but only if a better system existed. Moreover, he makes the point that, unlike Silver, most pundits aren’t accountable, and that their main goal is traffic and/or attention, not accuracy. Hence he gives Silver a tentative endorsement.
I like elements of what Brooks, Klein, and Silver are saying and doing. On the other hand, I think Byers doesn’t understand probability and is completely full of crap.
My feelings can be summed up as follows:
1) As both Klein and Brooks point out, predicting elections is hard. As Klein notes, Silver doesn’t publish his formula (in his shoes, I wouldn’t either), so it’s tough to say definitively how good it is. There are a limited number of historical examples of presidential elections, so Silver must cleverly combine the results of different types of elections (governor, Senator) in a way that effectively multiplies the sample size by a lot. Does this work? Probably to some extent, but since there aren’t many past elections to test his methodology against, it’s tough to know for sure. And the results of this election will neither confirm nor refute his methodology (even though some will claim otherwise). As I’ve written in the past, big data trumps small data, and elections are small data.
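Silver doesn't publish his formula, so the following is not his method — but the basic statistical intuition behind pooling related races is simple: the standard error of a proportion estimate shrinks with the square root of the sample size, so combining many comparable races sharpens the estimate considerably. A toy sketch (the sample sizes and the assumption of independent, comparable races are purely illustrative):

```python
import math

def poll_standard_error(p, n):
    """Standard error of a proportion estimate p from a sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def pooled_standard_error(p, n_per_poll, num_polls):
    """Standard error when pooling num_polls independent, comparable polls
    of equal size: the effective sample size grows linearly."""
    return poll_standard_error(p, n_per_poll * num_polls)

# One 1,000-person poll in a 50/50 race: roughly 1.6 points of error.
single = poll_standard_error(0.5, 1000)

# Pooling 25 comparable races: roughly 0.3 points of error.
pooled = pooled_standard_error(0.5, 1000, 25)
```

The catch, of course, is the independence assumption: governor and Senate races aren't independent draws from the same distribution, which is exactly why it's hard to know how much the pooling really buys.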
2) The world is becoming more quantifiable and accountability-driven. At some point, that means that pundits like Byers are likely to get left behind.
3) Brooks’ comment about computer models doing a poor job of predicting human behavior suggests that he doesn’t really understand the difference between types of models. In some situations behavior can be predicted quite well: the same kind of kids who eat the marshmallow today will eat it tomorrow, and a predictive model can be quite accurate. On the other hand, a model built on implicit assumptions about the surrounding world (who is going to repay their mortgage?) can be a very different story. It’s debatable where on this gradient an election falls.
4) I share some of Brooks’ concern about people (like me!) wasting time reading about polls. Silver brings a much-needed rigor to poll-watching, making it more intelligent. But ultimately, reading FiveThirtyEight is still mostly about “fast news” and addictive news consumption. I can read a few hundred words every day, and all I get out of it is a slightly better understanding of the horse race (there’s a great chance that Ohio is the deciding state!).
As both a data guy and someone with a vested interest in good journalism, I find that a bit of a waste. Nate Silver is a talented and resourceful data guy/journalist, and he’s almost exclusively focused on fast-news, update-me-right-now, horse-race journalism.
Thanks to a few twists of fate, I sit on a panel that awards an investigative journalism prize (the Goldsmith Prize). Being on that panel, I read over a hundred “slow news” investigative stories from around the U.S. Every year, I see more and more data-centered reporting. That happens because additional public data sets become available and journalists (slowly) become more data-savvy. The end result is more good stories like this and this.
These big investigative reports — slow news — can be very valuable to our democracy. But historically, many such reports have been anecdotal and lacking in rigor. That’s changing: the Las Vegas Sun story on health care was wonderful in its depiction of individuals’ tales, but also painted a solid quantitative picture.
Each year, in spite of newsroom cuts, the public seems to get more and more stories combining slow news with data-driven rigor. That’s great for society, but it’s happening more slowly than it could.
Hence my reaction to Silver is twofold. On one hand, I applaud him for bringing statistical rigor to a topic that has too long been the domain of those who can speak loudest. I hope it serves as a model for others working in a variety of fields.
On the other hand, I question whether spending time on fast news like election coverage is the best use of his talent. I hope to see more and more folks like Nate Silver applying themselves to the deeper but slower news stories of our day.