Why You Should Play Games In Inefficient Markets

Team Rankings, a business I started in 2000, takes a quantitative approach to understanding and predicting sports events. Over the years, we’ve built products targeted at a number of different groups within the world of sports enthusiasts.

Two products we’ve built — one to help people beat the Las Vegas line, another to help them win their NCAA Tournament office pools — are both data-driven tools to predict outcomes of sporting events. However, they reflect two very different sides of the world of quantitative decision-making.

The Skilled Opponent

When I was working at PayPal in the early 2000s, I was spending most of my time building predictive models to detect fraud. At home one night, I decided to see if I could apply a similar methodology to sporting events to use on Team Rankings. My system incorporated a large number of inputs (how each team had been playing, what their respective strengths were, how far each had travelled) and used them to build a black-box (i.e., really complicated) predictive model to assess the likelihood of a bunch of events related to a game. Among these: how likely is each team to win, how likely is each team to cover the point spread, and what is the expected final score.

It turns out that this approach works well enough to consistently beat Las Vegas. For the current models — which use essentially the same methodology — here are the results for college basketball since 2008-2009:

For these two bet types (against the spread and over/under), Team Rankings’ picks have been correct 5214 times and incorrect 4452 times. The odds of a dart-throwing monkey or TV commentator, who would expect to be correct 50% of the time, reaching this level of success are less than one in a billion.
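To put a number on that claim, here is a minimal sketch that computes the exact chance of a pure coin-flipper getting at least 5214 of 9666 picks right. It assumes every pick is an independent 50/50 event, which is a simplification; the function name is mine, not part of any Team Rankings code.

```python
from math import comb

def coin_flip_tail(wins: int, total: int) -> float:
    """Exact P(X >= wins) when X ~ Binomial(total, 0.5)."""
    term = comb(total, wins)  # number of ways to get exactly `wins` correct
    favorable = 0
    for k in range(wins, total + 1):
        favorable += term
        # Step from C(total, k) to C(total, k + 1); the division is exact.
        term = term * (total - k) // (k + 1)
    # Every sequence of `total` fair flips is equally likely.
    return favorable / 2 ** total

p_value = coin_flip_tail(5214, 9666)
print(f"Chance of a coin-flipper doing this well: {p_value:.2e}")
print("Less than one in a billion?", p_value < 1e-9)
```

The sum uses exact integers, so nothing underflows before the final division.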

However, while 50-50 constitutes “break-even” from a statistical perspective, the break-even point for a Las Vegas gambler is higher. The house takes an extra cut: in most cases, a gambler bets $110 on a game to win $100. That means that you need to be correct at least 52.4% of the time just to break even.
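The 52.4% figure falls out of simple arithmetic: you break even when the money won on correct picks equals the money lost on incorrect ones. A small sketch (the function name is just for illustration):

```python
def break_even_rate(risk: float, payout: float) -> float:
    """Win rate p at which p * payout == (1 - p) * risk."""
    return risk / (risk + payout)

rate = break_even_rate(risk=110, payout=100)
print(f"Break-even win rate: {rate:.1%}")  # 52.4%
```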

Team Rankings’ results beat that threshold too: 5214 wins and 4452 losses constitute a 54% winning percentage. And that means that someone who gambles strictly based on Team Rankings picks will make a profit over the long term. Bet $100 on each game, and you’ll make an average of $7,000 over the course of a season; bet $1000 per game, and your expected return will be $70,000.
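Here is a rough check of what that record is worth per bet. It assumes every pick was priced at the standard $110-to-win-$100 line, which will not be exactly true of real bets, so treat the output as a ballpark rather than an audited result.

```python
WINS, LOSSES = 5214, 4452
RISK, PAYOUT = 110, 100  # assumption: every bet risks $110 to win $100

total_bets = WINS + LOSSES
win_rate = WINS / total_bets
net_profit = WINS * PAYOUT - LOSSES * RISK   # dollars won minus dollars lost
roi_per_bet = net_profit / (total_bets * RISK)

print(f"Win rate:       {win_rate:.1%}")     # 53.9%
print(f"Net profit:     ${net_profit:,}")    # $31,680
print(f"Return per bet: {roi_per_bet:.1%}")  # 3.0%
```

Under that assumption the edge works out to roughly three cents on every dollar wagered.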

That’s a terrific return, and I’m proud of it.

But still, it raises a question. Team Rankings has a great team (I’m a tiny part), some cool technology, and lots of data, and the best we can do is 54%? That’s barely better than a coin flip! It means an expected return of about 3% on each bet: a good investment, but hardly earth-shattering, and a far cry from the “100% GUARANTEED PICKS!!!” you see on sleazy gambling picks sites.

[Disclosures: Team Rankings' picks are for entertainment purposes only. Of course. And while I'm capable of thinking like a scientist (constantly skeptical), I'm also capable of thinking like a writer (cherry picking to prove my point). These are among the better-performing of our models, but overall our results are likewise very strong and profitable.

At some point, I'll take Anthony from Kaggle up on his offer to run a contest to predict games, so people smarter than I can have a crack at it. If we did that, we might improve the number to 55-56%.]

The Unskilled Opponent

Some of Team Rankings’ recent product innovations tell a very different story.

For many years, we’ve provided odds and analyses related to the NCAA Tournament. Our products have included tools to match up two teams, along with probabilities for each team to make each round of the tournament. So, for instance, you can see the odds of Gonzaga making the Sweet 16, the Elite Eight, the Final Four, etc.

A few years ago, I realized that those features could help people win their NCAA Tournament pools, but only if they used our numbers in the correct context. Say, for instance, that Kansas is the team most likely to win the tournament, with 20% odds, but Kentucky is just behind at 19%. Kansas would be a better pick to win than Kentucky, right?

Not necessarily. Let’s say Kansas is a really popular pick, but Kentucky is not. 50% of people are picking Kansas to win and only 10% are picking Kentucky. If you picked Kansas, you’d have a slightly better chance of being right on the winner, but because you’d be competing head-on with many more people, you’d almost certainly have a smaller chance of winning your pool. In entrepreneur-speak, picking Kansas would be like trying to build a mobile photo sharing app in 2011, a group buying site in 2010, or a Facebook game in 2009: good business or not, you’re setting yourself up for a lot of competition.

So, with some help from Brad, we built a tool that allowed us to answer the question of what picks would maximize someone’s odds of winning their pool. To do that, we combined teams’ win probabilities (which many others do) with data on how many people picked each team (which no one else does). We simulated millions of tournaments, randomizing both the outcome of the games and the picks made by people in your pool.
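A toy version of that simulation shows why the unpopular pick can be the better one. This is a deliberately simplified sketch, not our actual tool: it assumes the pool is decided by the champion pick alone, that everyone who picked the right champion has an equal shot at the prize, and that all other teams can be lumped into one "Field" entry. The pool size and function name are made up for illustration; the Kansas and Kentucky numbers come from the example above.

```python
import random

def pool_win_chance(pick, win_prob, public_share,
                    pool_size=100, trials=20_000, seed=0):
    """Estimate our chance of winning a champion-only pool."""
    rng = random.Random(seed)
    teams = list(win_prob)
    weights = [win_prob[t] for t in teams]
    total = 0.0
    for _ in range(trials):
        # Randomize the tournament outcome.
        champion = rng.choices(teams, weights=weights)[0]
        if champion != pick:
            continue  # our champion lost, so we lose this pool
        # Randomize opponents' picks: count how many share our champion.
        rivals = sum(rng.random() < public_share[pick]
                     for _ in range(pool_size - 1))
        # Assumed tie-break: each correct entrant is equally likely to win.
        total += 1.0 / (rivals + 1)
    return total / trials

win_prob = {"Kansas": 0.20, "Kentucky": 0.19, "Field": 0.61}
public_share = {"Kansas": 0.50, "Kentucky": 0.10, "Field": 0.40}

kansas = pool_win_chance("Kansas", win_prob, public_share)
kentucky = pool_win_chance("Kentucky", win_prob, public_share)
print(f"Pick Kansas:   {kansas:.2%} chance of winning the pool")
print(f"Pick Kentucky: {kentucky:.2%} chance of winning the pool")
```

Even in this crude model, picking Kentucky should come out several times better than picking Kansas, because you share the prize with far fewer people when you are right.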

The end results are pretty astounding. Our top brackets — based on a moderately conservative set of assumptions — had an expected return on investment (ROI) of around 800%. This bracket, for instance, would have had about a 0.9% chance of winning a 1000-person pool, nine times higher than the average participant. By picking as our champion Ohio State — a very strong team not given enough credit by the public — our odds would be much better than if we picked Kentucky, the best team but also one strongly favored by the general public.

Our 10-person bracket looked quite different: with a smaller number of people to beat, our simulations indicated it was a stronger strategy to pick mostly favorites, with Kentucky as the eventual champion. This bracket had a less ridiculous but still quite impressive 186% expected ROI.
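The ROI figures above follow directly from the win probabilities. A quick sketch, assuming a winner-take-all pool in which everyone pays the same entry fee and nobody takes a cut (the function names are mine):

```python
def roi_from_win_chance(win_chance: float, pool_size: int) -> float:
    """Expected ROI when the winner collects pool_size entry fees."""
    return win_chance * pool_size - 1.0

def win_chance_from_roi(roi: float, pool_size: int) -> float:
    """Invert the formula: what win chance implies a given ROI?"""
    return (1.0 + roi) / pool_size

print(f"1000-person pool, 0.9% win chance: {roi_from_win_chance(0.009, 1000):.0%} ROI")
print(f"10-person pool, 186% ROI: {win_chance_from_roi(1.86, 10):.1%} win chance")
```

So a 186% expected ROI in a 10-person pool corresponds to winning a bit more than one pool in four, under those assumptions.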

Kentucky wound up winning the national championship. Most of our users in small pools did quite well and won their pool; our users in large pools did not. We won’t be successful every year, but over time our results have been very strong. And this approach works: I feel comfortable with the assertion that the average yearly return for our strategy will be at least 200%.

Comparing Markets

How do we synthesize all of this and bring it back to the non-sports world?

A bet on a game at a Las Vegas sportsbook and an entry in your colleague’s NCAA tourney pool both constitute a wager on sports. However, the underlying market dynamics could not be much more different.

Las Vegas is more or less efficient: if lots of people bet on one side of a game, the sportsbooks will update the odds. In contrast, your friends and colleagues in the NCAA pool are likely making impulsive decisions that are economically irrational.

Hence there’s one world (Vegas) where 3% returns are celebrated as amazing wizardry, and another (friends’ pools) where you can have expected returns of 800% without anyone really paying attention.

If you’re looking for a financial return, though, there’s a catch. There’s only one NCAA Tournament per year, so your opportunity to make money is limited. In theory, you could enter lots of pools with distinct but complementary undervalued picks, perhaps giving yourself a 50% chance of winning while putting up only 15-20% of the pot. But you’d still be putting all your eggs in one basket.

By contrast, each year there are thousands of regular games on which you can bet against Las Vegas. Adding those together can yield a solid expected ROI over the course of a year. Quantitative hedge funds generally take this against-Vegas approach: they find short-term inefficiencies, and bet on them again and again.

Team Rankings and hedge funds show that betting in Vegas-style, almost-efficient markets, though difficult, can be extremely profitable. From the actor’s perspective, it’s a bunch of bets in established markets with positive ROI. Yet from the world’s perspective, it’s a bunch of minor market efficiency improvements, but a world that hasn’t really improved in any meaningful way. In other words, something that’s more Wall Street than Silicon Valley.

By contrast, the pool of your buddies — while itself not a world-saving problem — represents a far larger and more profound inefficiency. The large-scale decision-making of Las Vegas and Wall Street is close to being economically efficient, but one-off decision making by individuals and businesses is not. Most choices — companies deciding whom to hire or where to put resources, government choices on how to run cities and schools, individual choices on where to invest, and which teams to pick for your NCAA pool — are made haphazardly and could be improved a lot.

It’s tougher to build a business to address these massive inefficiencies: to build something large, you need to find important, quantifiable decisions that have associated data and haven’t already been examined properly. That type of problem is more interesting, has more upside (+800% vs. +3%), and can be much more impactful than its Vegas-style alternative. And it’s why, without hesitation, I choose Silicon Valley over Wall Street.

A Founder’s Constant State of Rejection

When recruiters ping me about open positions at hot companies, I tell them “thanks, but the next company I work for will be (another) one I start myself.”

It’s not clear whether I’m masochistic or just dumb; life was a lot easier before I got started on this whole founder thing.

An Easier Existence

The first seven years of my career were pretty straightforward. I was either an individual contributor or leading a small team inside a larger company. Within a year, I’d figure out a few things I could do to be successful, and I was able to cruise along easily.

PayPal hired 22-year-old me in 2000 to help solve the company’s massive fraud problem. For a few months, I didn’t really know what to do and flailed around a bit. But I soon created a template for predicting fraud, and used it repeatedly to apply a few techniques to solve many fraud problems. I was an individual contributor and making a comfortable salary; though I was working hard enough, my job lacked major challenges and I had little stress.

From there I went to LinkedIn, where I spent two and a half years leading the data analytics team. I faced more stress at LinkedIn than I had at PayPal: I had to hire people, I had to meet regularly with LinkedIn’s executives, and I was a lot closer to the company’s decision-making. Moreover, while at PayPal I had a known problem (detecting fraud) with an unknown solution, at LinkedIn I had an unknown problem (lots of data; what to do with it?) and an unknown solution.

Still, while I was at LinkedIn, my work-related stress was almost nil. I was occasionally exasperated by my colleagues’ decisions, but what could I do? LinkedIn’s successes were nice but hardly life-affirming; its failures made me roll my eyes but not search my soul.

In these larger companies, I found myself in positions where I was almost assured of success: I was skilled and solving problems I knew how to solve. I’d soon learn that life as a founder is completely different.

Founder Changes

When I co-founded the company that became Circle of Moms, I found that my day-to-day responsibilities changed greatly. Instead of working in a cubicle in a huge office, I sat across from my co-founder at my kitchen table. Instead of asking IT to set up a new database for me, I figured out how to do it myself. Instead of asking a marketing person to write copy for the emails I wanted to send to users, I wrote the emails. Instead of being the crazy analytics guy the engineering team would never want writing production code, I coded the whole darned site myself.

And those are the unimportant changes. Here’s the important one:

A founder must continually put himself and his company out on the line for others to judge.

For an asocial geeky dude, that was an enormous shift. At LinkedIn and PayPal, I rarely took big risks and didn’t have to put myself out on the line. As a founder at Circle of Moms, I did it every single day.

When you’re a founder, your company defines you. That means that your company’s daily ups and downs become your personal ups and downs; that’s a big adjustment.

I’m a fairly even-keeled person: when my co-founder would jump up and down with excitement after seeing good feedback on a new feature, I’d describe it as “encouraging”. I maintained a healthy lifestyle over those 4.5 years: I exercised almost every day, I ate a home-cooked dinner with my wife most nights, and usually maintained a good balance between working hard and living the rest of my life. Nevertheless, I’d still leave the office on many a Friday night completely despondent about the week I’d had, worried about the company and its prospects.

Five Ways to Fail

A consumer Internet company must do well in five areas: product metrics, revenue metrics, hiring, team culture/productivity, and fundraising. In the four and a half years I spent as CTO of Circle of Moms, we never had a time when all five were on a great path.

Just after we launched the site, our product metrics were excellent, but thanks to the financial crisis investors weren’t eager to invest in anything. In 2010, our revenue numbers were excellent, but our traffic stats were dipping. In early 2011, our traffic recovered strongly, but we had more trouble selling ad inventory.

Team culture may be the area where founders take success and failure most personally. If I showed up at 7 AM and left at 8 PM, made honest appraisals of company strengths and weaknesses, and took full responsibility for my failures, shouldn’t my colleagues do the same? And if they didn’t, was it a personal rebuke of me?

Good founders feel strongly about establishing the right environment for a happy and productive team; that’s surprisingly hard to do. A challenging but not unusual week might feature one employee taking an extra day off after a vacation, another one calling in sick with an important deadline the next day, and two others playing big-company-style political games against one another.

Those three ordeals were independent of one another and seem small in retrospect. But at the time, I felt like the roof was caving in: our employees were rejecting my leadership and they were getting lazy, political, and unproductive. The end was surely near.

Likewise, hiring is vitally important and requires thick skin. At Circle of Moms, we’d reach out to dozens of top candidates and usually hear nothing in response. I’d spend a full day at Stanford pitching our company to CS undergrads — far more tiring than any day I’d ever spent coding. After several months, we’d finally find one good candidate and make him an offer. When he’d instead choose to work for another startup — whose name was well-known to TechCrunch readers but whose vision we didn’t quite get — it was hard to avoid getting flustered.

Raising capital almost invariably features many rejections from investors, even with companies that become very successful. We experienced periods where our traction was good and fundraising was almost too easy (we turned down money from one VC because we saw he hadn’t even bothered to sign up for our product), but we also failed in several attempts to close a larger venture round. It’s easy to see a lack of fundraising progress as a company (and personal) failure: if you can’t raise a lot of capital, there must be something wrong with you.

As a techie individual contributor in a larger company, I could go to work every day and execute 99% predictably. As a founder, I had to find ways to plead my case over and over (to employees, investors, candidates, advertisers, users) and I got rejected a lot. For an introvert, the amount of pleading and subsequent rejection came as quite a shock.

As a founder, you need to be prepared for this sort of rejection. It should affect you: if it doesn’t, it means you don’t care enough and should be doing something else. But a rejection of your company is a (hopefully) rational move by someone else, and it’s not a reflection on you as a founder or an individual. Don’t take it personally.

Of course, the founder/non-founder divide I describe doesn’t need to be binary: non-founders can and do sometimes work like the founder I describe above. A number of the top people at Circle of Moms took ownership and were truly wrapped up in the company’s success, and that helped us immensely. And those are the best people to have on your team.

Founder or not, taking ownership and repeatedly putting yourself in front of the world to be judged is difficult. But ultimately, it’s a tremendous way to learn, grow, and succeed.

Why Campaigns Are Smart and Policymakers Are Dumb

There are lots of things to hate about political campaigns.

From the never-ending attack ads to the vague but inevitable pleas for “change” on both sides, from the repeated half-true talking points from candidates to the baseless grand claims of pundits, it’s easy to get irritated. And no doubt many Americans — especially those in swing states — look forward to getting on with their lives post-Election Day.

Though they may be surrounded by triviality, presidential campaigns are showing an increasing and admirable level of sophistication in their execution plans. Each election, the campaigns gain a better and better understanding of the electorate, and use that information to improve their decision-making.

How exactly do they do that? First, they ask, how likely is this guy to vote for me? The answer to that informs their tactical decisions. If there’s a 99% chance he votes for the other guy, the best bet is probably to just write him off. If there’s a 99% chance he’ll vote for me, on the other hand, I don’t need to spend much effort persuading him — but I want to do everything I can to make sure he votes. And if there’s a 50% chance he votes for me? In that case, I’m going to try to find out what he likes better about me, and accentuate it.
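That decision rule can be written down in a few lines. This is only a sketch of the logic described above: the 10% and 90% cutoffs are hypothetical numbers I picked for illustration, and a real campaign would tune them (and weigh the cost of each contact) rather than hard-code them.

```python
def campaign_action(p_support: float,
                    write_off_below: float = 0.10,  # hypothetical cutoff
                    turnout_above: float = 0.90) -> str:  # hypothetical cutoff
    """Map a voter's estimated chance of supporting us to a tactic."""
    if not 0.0 <= p_support <= 1.0:
        raise ValueError("p_support must be a probability between 0 and 1")
    if p_support < write_off_below:
        return "write off"   # almost certainly votes for the other guy
    if p_support > turnout_above:
        return "turn out"    # already persuaded; make sure he votes
    return "persuade"        # undecided; find what he likes and stress it

for p in (0.01, 0.50, 0.99):
    print(f"{p:.0%} likely to support us -> {campaign_action(p)}")
```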

This, in a nutshell, is what “big data” — silly buzzword or not — is about: used well, data improve decision-making. The more of it a campaign has, the better and more targeted its decisions can be.

In some ways, it’s exciting that the campaigns are using these techniques. With everyone on the team rowing together toward the same goal, friction is minimal; a well-executed operation can significantly improve a campaign’s chances of success.

Still, these improvements likely don’t help the average citizen. Each campaign gets better at crafting messages and allocating resources, but the political game is still a zero-sum battle. The sides get better at marketing themselves, but not at governing. It’s similar with political contributions: if I give $100 to Obama and you give $100 to Romney, each candidate gets a little more to spend on commercials, and the only party that really wins is the TV network that gets their ad dollars.

But, you ask, couldn’t these same politicians apply these methodologies to the decisions they make when they’re actually in office?

They certainly could, but it’s not happening very much today.

In a campaign, everyone’s moving toward the same clear goal. They want to beat the other guy, and they’re going to pull out all of the stops to get there. That means using data — which obviously help — with decision-making.

The world in which the next president will govern is a very different one. Unlike campaigns, many government outposts aren’t even collecting data, let alone using them wisely.

Both parties are to blame. On the left, for instance, there’s often an aversion to collecting data that might be used to fire ineffective employees. Unions still wield tremendous power in Washington and in states, and their guiding principle is job security (and higher wages) rather than workplace effectiveness. There are too many ineffective teachers in our schools. But because their unions are primarily focused on teacher job security, it’s tremendously difficult to find and replace those who aren’t doing a good job.

On the right, there’s often a gut response against “big government,” even for tasks where government might be able to make more effective decisions than anyone else. President Obama was attacked mercilessly during the healthcare debate for the notion that the government would “ration” healthcare. But rationing is simply another word for resource allocation, something every individual and every business does. Health care already comprises over 15% of GDP; if we want that number to stop growing, we need to understand effectiveness and costs and ration intelligently.

There’s certainly some room for optimism. Inane campaign statements aside, Romney and Obama have both shown themselves to be smart and pragmatic. Romney bucked many of the orthodoxies of his party when he worked on healthcare reform in Massachusetts. Obama has strayed from the traditional Democratic alliance with teachers’ unions, taking steps toward accountable, data-informed education policy.

Still, the collection and use of data by governments leaves much to be desired. Where campaigns are able to move swiftly, policy is often driven less by effectiveness and more by the goal of winning points with the electorate. As a result, there are many bad reasons data aren’t collected: bureaucracy, vague fears about privacy, abstract concerns around “big government”. Without good data, decision-making suffers.

Today is Election Day. We’ve now seen months of data-led sophistication from our political leaders. May the winner bring that same sophistication to the policies of his administration.