Thursday, February 26, 2009

Properly interpreting projections

Every year, right about now, you see two related phenomena:
  1. Fans completely misinterpreting the predictions of objective projection systems.
  2. "Experts" releasing hilariously skewed subjective predictions.
What am I talking about? Specifically this: people, whether interpreting objective projections or making their own subjective projections, fail to discern the difference between saying that no individual team is likely to win 100 games and that the league is likely not to see any team win 100 games; the difference between not projecting a single pitcher to win 20 games and projecting the league not to have any 20-game winners; the difference between projecting that no single hitter is likely to hit 40 home runs and drive in 120 runners and projecting that the league is likely not to see any hitter hit 40 home runs and drive in 120 runners.

Do you think all these things are the same? If you do, you are failing to understand a basic concept in probability. There's no shame in this. Probability is freaking confusing. Unless you're really, really well-versed in it, you're going to screw it up all the time. I know I do. Nonetheless, let's dig a little deeper into this problem.

Let's say I have a game of chance. You roll a 20-sided die. If you roll a natural 20, I will give you twenty dollars. If not, you will owe me one dollar. What are your expected winnings? Well, we would project you to owe me one dollar 95% of the time and win twenty dollars 5% of the time. That means your expected winnings from any single die roll are exactly five cents (0.95 x -1.0 + 0.05 x 20.0). There are no ifs, ands, or buts about it. We would project you to win five cents.
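For anyone who wants to check the arithmetic, here is a minimal sketch of that calculation. The names (`outcomes`, `expected_value`) are just illustrative, and exact fractions are used only to avoid floating-point fuzz.

```python
from fractions import Fraction

# One roll of the game as (payout in dollars, probability).
outcomes = [
    (Fraction(20), Fraction(1, 20)),   # natural 20: you win twenty dollars
    (Fraction(-1), Fraction(19, 20)),  # anything else: you owe one dollar
]

def expected_value(outcomes):
    """Probability-weighted average of the payouts."""
    return sum(payout * prob for payout, prob in outcomes)

ev = expected_value(outcomes)
print(f"Expected winnings per roll: ${float(ev):.2f}")
```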

But what if 1,000 people play my game? Since the dice rolls are independent, we would project each of them to win five cents, just like in the individual game. But does this mean that we expect no one to win twenty dollars? Of course not! In fact, we would expect 50 people to win twenty dollars, we just can't predict which 50 people will win. Thus, each person expects to win five cents, even though roughly 50 people will win twenty dollars. In fact, the odds of not one person winning twenty dollars are less than one in 10,000,000,000,000,000,000,000.
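The same numbers can be sketched in a few lines, assuming (as above) that every roll is independent and fair. The variable names are illustrative.

```python
from fractions import Fraction

players = 1000
p_win = Fraction(1, 20)  # chance that any one player rolls a natural 20

# Expected number of twenty-dollar winners across all players.
expected_winners = players * p_win

# Chance that nobody at all rolls a 20: every player must miss.
p_no_winner = float((1 - p_win) ** players)

print(f"Expected winners: {expected_winners}")
print(f"Chance of zero winners: {p_no_winner:.2e}")
print(f"Smaller than one in 10^22? {p_no_winner < 1e-22}")
```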

So what good is our expectation of a five cent win? Simple: that's the expectation that will give us the least squared error, on average, over an infinite number of games. We can't improve on it because the only thing we haven't accounted for in our model of expectation is random chance. Note that this is different from the most likely outcome for one game. The most likely outcome from one game is that you lose a dollar. Nonetheless, we can't just ignore the (relatively) massive twenty dollar payout, even if it is (relatively) rare. That's how expected value works. It aggregates all possibilities into one number that has properly weighted all outcomes.*
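Here is a quick sketch of that claim, taking "error" to mean the average squared miss (that choice of yardstick is my assumption). It compares three guesses you could write down before every roll.

```python
# One roll of the game as (payout in dollars, probability).
outcomes = [(20.0, 0.05), (-1.0, 0.95)]

def mean_squared_error(guess):
    """Average squared miss if we write down `guess` before every roll."""
    return sum(prob * (payout - guess) ** 2 for payout, prob in outcomes)

expected_value = sum(prob * payout for payout, prob in outcomes)

candidates = {
    "expected value": expected_value,
    "most likely outcome (lose $1)": -1.0,
    "jackpot (win $20)": 20.0,
}
errors = {name: mean_squared_error(guess) for name, guess in candidates.items()}
best = min(errors, key=errors.get)

for name, err in errors.items():
    print(f"{name}: {err:.4f}")
print(f"Smallest average squared miss: {best}")
```

Under a different yardstick, such as the average absolute miss, the ranking can change, which is one reason it helps to say which error you mean.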

The situation applies exactly the same way to baseball predictions. Some pitchers are going to win 20 games (probably), we just don't know which ones. Some teams are going to win 100 games, we just don't know which ones. Some hitters are going to hit 40 home runs and drive in 120 runners, we just don't know which ones.

That's why when fans react negatively to a projection system that predicts no pitcher to win 20 games, they are making a big mistake. And it's also why when you see Joe Expert predicting some team to win 100 games or some pitcher to win 20 games, he's making a big mistake. Yes, these events are probably going to happen, but it's a fool who thinks he can pick exactly which player or team is going to do it.

There are too many variables that go into a baseball season to determine which teams and players will hit the highs and lows. That's why any sane projection gives numbers that fit in a much narrower range than you will end up seeing in the real season. There's nothing wrong with this. There's nothing else you can do. If all that remains is random noise, then it will be impossible to improve on your projection's accuracy.** If you could eliminate it, it wouldn't be random, would it?

If you can, it often helps to look at player and team projections in terms of percentiles instead of mean expectation. This helps you get a sense of the uncertainty in the projection. You can more easily see highs and lows and see a variety of outcomes. However, if all you're looking at is the mean expectation, try to keep in mind my simple dice game. Even though you won't have the shape of the curve involved, you'll still understand that this is the expectation that will give you the least error when the real results finally come in.
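To make the percentile idea concrete, here is a rough sketch for a single team. It treats a 162-game season as independent games with one fixed win probability, which is a big simplification, and the 0.55 win rate is a made-up number for illustration, not a projection of any real team.

```python
from math import comb

GAMES = 162  # length of a regular season

def win_distribution(p_win, games=GAMES):
    """Exact binomial probability of each win total from 0 to `games`."""
    return [
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(games + 1)
    ]

def percentile(dist, q):
    """Smallest win total whose cumulative probability reaches q."""
    cumulative = 0.0
    for wins, prob in enumerate(dist):
        cumulative += prob
        if cumulative >= q:
            return wins
    return len(dist) - 1

dist = win_distribution(0.55)  # hypothetical true-talent win rate
mean_wins = sum(wins * prob for wins, prob in enumerate(dist))
p10, p50, p90 = (percentile(dist, q) for q in (0.10, 0.50, 0.90))
p_100_plus = sum(dist[100:])  # chance of 100 or more wins

print(f"Mean: {mean_wins:.1f} wins")
print(f"10th / 50th / 90th percentile: {p10} / {p50} / {p90} wins")
print(f"Chance of 100+ wins: {p_100_plus:.1%}")
```

The mean is one number; the percentiles show how wide the range around it is, even before you add any uncertainty about the team's true talent.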

* Note that expected value is not necessarily the right variable to use for decision making. Value and utility are not necessarily the same thing. Furthermore, it is not true that the utility of the expected value of a particular choice is the same as the expected utility. For both baseball and my dice game, the two are likely to be close enough to be interchangeable. For more information, read about the St. Petersburg paradox (which is awesome, by the way).

** That's not to say that any projection system out there is truly left with nothing but random noise. There may be opportunities for genuine improvement, but these opportunities can never fully overcome randomness. Furthermore, these improvements must be derived through rigorous statistical and scientific processes, not someone's gut feeling or some arbitrary pattern they've pulled out of thin air. These are not improvements.

2 comments:

D.Cous. said...

I'm not sure I understand (surprise). You seem to be claiming that predicting which specific teams will do well (say, the 100 wins) is a fool's errand, because of random chance, or luck. This may be true, but I think it's less true than you seem to be implying, because not all teams are the same. Isn't it more like a dice game where some of the dice are loaded, and are more likely (though not 100% likely, mind you) to roll twenties?

In that case, assuming that you can know which players have the loaded dice, wouldn't picking them be a totally legitimate exercise? Sure, they might just come up with nothing, and someone with a non-loaded die might still roll a twenty, but it's still more likely to work out that way than not.

Am I making sense here? Certainly, a lot of people are going to think that their team is the one with the loaded die (say, the best combination of players), and so you'll get a lot of bad predictions, but aren't you also likely to get some good ones, by the people who correctly guess the indicators of a team that is likely to succeed?

Also, who in the world has twenty-sided dice?

John Lynch said...

Yes, you're absolutely right. Obviously, some teams are more likely than others to win 100 games. Nonetheless, given the choice between "The Boston Red Sox will win 100 games this year" and "A team that is not the Boston Red Sox will win 100 games this year," you should take the latter almost all the time. It's a rare, special team whose median projection involves winning 100 games.

When an analyst actually picks *the* team that will win 100 games or *the* pitcher that will win 20 games, they are going to be wrong (and deservedly so) more often than not.

Don't miss the forest for the trees here. Obviously, we have more information about baseball than dice. Yes, if you put a gun to my head, I can make an intelligent guess about which team will win 100 games. Nonetheless, even if you sit down and pencil in the most likely team to win 100 games to actually win 100 games and the most likely pitcher to win 20 games to actually win 20 games, you will still be more wrong more often than the guy who doesn't pick any team to win 100 and doesn't pick any pitcher to win 20 but instead puts down a mean expectation.

Even if we load the dice, as you say, the story doesn't change. Even if some dice are five times as likely to roll 20s, we would never put down 20 as our expectation for those players, even though they are far more likely to actually roll 20 than those with unloaded dice. Thus, when comparing each player's expected result, we will still see no one projected to roll 20, even though it is a near certainty that someone will and even though some players are more likely than others to do so. If a dice "fan" were to start complaining that my system is broken because I don't predict anyone to roll a 20 when *obviously* someone has to, he would be missing the point.
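A small sketch of the loaded-dice version, under assumptions of my own choosing: 30 players (loosely, one per team), one of whom holds a die five times as likely to roll a 20, with every roll independent. The names are illustrative.

```python
players = 30              # hypothetical field size
p_fair = 1 / 20           # chance of a 20 on a fair die
p_loaded = 5 * p_fair     # the favorite's die is five times as likely to hit

# Chance the favorite rolls a 20.
p_favorite = p_loaded

# Chance that at least one of the other players rolls a 20.
p_field = 1 - (1 - p_fair) ** (players - 1)

# Chance that anybody at all rolls a 20.
p_anyone = 1 - (1 - p_loaded) * (1 - p_fair) ** (players - 1)

print(f"Favorite rolls a 20:        {p_favorite:.1%}")
print(f"Someone else rolls a 20:    {p_field:.1%}")
print(f"Anyone rolls a 20:          {p_anyone:.1%}")
print(f"Favorite more likely to miss than hit? {p_favorite < 0.5}")
```

Even the player with the loaded die is more likely to miss than to hit, and the rest of the field combined is more likely to produce a 20 than the favorite is. With a bigger field the chance that somebody hits gets closer and closer to certain.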

So, in summary:
* Yes, some teams are better than others.
* The field is almost always more likely to accomplish a particular task than even the most likely participant.
* Thus, a good projection will rarely predict any single team or player to reach an accomplishment that is common from the league's perspective but not from a team's perspective.