Wednesday, November 28, 2007

Is a run scored equal to a run saved? (Cont.) (Again)

I wanted to revisit the question of whether or not a run scored is equal to a run saved at least one more time to share some more thoughts I had on the subject.

First, I want to summarize one more test that I ran using the Pythagorean Theorem of Baseball model of expected performance that I've used in previous posts. I started with a team with 750 runs scored and 750 runs allowed and then gave it 100 runs to add to its runs scored or subtract from its runs allowed in any combination. Then I checked to see which combination would give the highest expected performance.

I had expected beforehand (foolishly, as it turns out) that a split of roughly 50 additional runs scored and 50 additional runs saved would produce the optimal expected winning percentage. As it turns out, you achieve the optimal result by deducting all 100 runs from runs allowed. The difference is not large, only one win over a 162-game season, but it gives credence to the idea that a run saved is more valuable than a run scored, though only marginally so.
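For readers who want to check the arithmetic, here is a minimal sketch of that test, assuming the standard exponent-2 form of the Pythagorean expectation (expected winning percentage = RS^2 / (RS^2 + RA^2)); the function and variable names are mine and purely illustrative:

```python
def pythag_win_pct(rs, ra, exponent=2):
    """Pythagorean Theorem of Baseball: expected winning percentage."""
    return rs ** exponent / (rs ** exponent + ra ** exponent)

def all_allocations(rs=750, ra=750, budget=100, games=162):
    """Try every split of the 100-run budget between extra runs scored
    and runs saved, returning each split with its expected win total."""
    splits = []
    for saved in range(budget + 1):
        scored = budget - saved
        wins = games * pythag_win_pct(rs + scored, ra - saved)
        splits.append((saved, scored, wins))
    return splits

if __name__ == "__main__":
    splits = all_allocations()
    best = max(splits, key=lambda s: s[2])
    print("best split (runs saved, runs scored, expected wins):", best)
    print("all offense :", splits[0])
    print("even split  :", splits[50])
```

The margins between the best and worst allocations come out on the order of a single win, consistent with the result described above.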

This got me thinking about why this was the case. It was only then that it occurred to me that the only way to expect to win 100% of your games is to allow zero runs. No matter how many runs a team scores, if it allows even one run over the course of a season, there is a chance that it will lose a game. This is why our examination of the problem using the PToB values the run saved slightly more than the run scored: it puts your team closer to that perfect scenario.

Another way of looking at it is that increasing how many runs you score acts as inflation in the run economy of baseball: it devalues all other runs. Conversely, allowing fewer runs is deflationary: each run is now worth more. Therefore, an absolute difference of 100 runs is a lot more significant when the overall run totals are lower. It's the difference between Jane making $5,000 more than Dick in a mythical two-person economy that contains only $50,000 and Jane making $5,000 more than Dick in a $10,000 two-person economy. The gap is identical in absolute terms, but it is worth a lot more in the $10,000 universe. Since saving more runs decreases the total number of runs in the baseball universe, you need a larger increase in runs scored to have the same impact on winning as a given decrease in runs allowed.
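To put rough numbers on the analogy (my own illustration): under the Pythagorean model only the ratio of runs scored to runs allowed matters, and the same 100-run edge buys a better ratio at the lower run level, 750/650 ≈ 1.15 versus 850/750 ≈ 1.13. Deflating the run environment makes the same absolute difference count for more.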

This ignores, however, a key aspect of the baseball landscape: you cannot save runs beyond zero runs allowed. On the other hand, you can continue to score runs ad infinitum. In other words, the value of saving runs is offset by the fact that it will quickly become very hard to make further gains in that area. It is possible, in any given baseball game, to pitch far less than perfectly and still achieve the perfect outcome for runs allowed: zero. Shutouts are fantastically more common than perfect games. Thus, even if you continue to improve your pitching, you should reach a point of diminishing returns where even though you are pitching better, it is not reflected in your runs allowed total. With hitting, you can theoretically keep improving it until you reach the perfect offense, one that never makes an out and therefore scores an infinite number of runs.

Now, on to my second observation. (Yes, that's right. The preceding 6,000,000,000 words are only my first point.)

I was going to turn to historical data and run an experiment to see what the impact of scoring and saving additional runs had been historically. Specifically, for each trial I was going to select a real baseball team of ages past and then randomly select a game from that team's season from which to subtract one run from the opponent's run total. I would do this 25 times for each team (allowing the same game to be picked more than once, resulting in further deductions) and then measure what the team's new record would be (ties counting as 0.5 wins and 0.5 losses). I would then run a sufficiently large number of trials and see what the aggregate impact was. Then I would repeat the process for runs scored and measure its impact.
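For the curious, here is roughly what one trial of that experiment would have looked like, a sketch rather than anything I actually ran, assuming the season is given as a list of (runs scored, runs allowed) tuples. It uses the "drop the saved run if the game is already a shutout" rule discussed below:

```python
import random

def runs_saved_trial(games, runs_to_save=25):
    """One trial: pick 25 games at random (with replacement), take one run
    away from the opponent in each picked game, and return the team's new
    win total with ties worth half a win.  Runs allowed never go below zero,
    so a pick that lands on a game already at zero is effectively wasted."""
    adjusted = [[rs, ra] for rs, ra in games]
    for _ in range(runs_to_save):
        g = random.randrange(len(adjusted))
        adjusted[g][1] = max(0, adjusted[g][1] - 1)
    return sum(1.0 if rs > ra else 0.5 if rs == ra else 0.0
               for rs, ra in adjusted)

def average_over_trials(games, trials=100_000):
    """Aggregate impact: the mean new win total over many random trials."""
    return sum(runs_saved_trial(games) for _ in range(trials)) / trials
```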

Then another thought occurred to me that saved me a lot of useless work. What if, instead of picking games at random, I enumerated all possible combinations of 25 games (again, with multiplicity) for each trial?* In that case, each trial in the runs allowed test would have a corresponding trial in the runs scored test with the exact same 25 games picked (and vice versa). And of course, the impact of deducting a run from an opponent's total in any given game is exactly the same as adding a run to your own. Therefore, if we enumerate every possible selection of games, the results for the runs saved test would be exactly the same as the results for the runs scored test! Right?

Sort of. There's only one problem with the above logic. We haven't decided what happens when one of the games selected for the runs allowed test is already a shutout victory. Do we skip that game, so the team simply loses one of its runs saved? Or do we pick another game to preserve the runs saved total? There are good arguments both ways, but I think only two observations are important for this exercise.

First, by dropping the run saved instead of searching for another game, we preserve the exact one-to-one relationship with the runs scored test. This causes the runs scored and runs allowed tests to have exactly the same results, but we are no longer dealing with identical totals of runs scored and runs saved. Second, if we pick another valid game from which to save a run, we preserve the runs scored and runs saved totals, but the one-to-one relationship is destroyed and the runs saved test necessarily finishes with a higher expected win total. This must be true because the runs scored test will apply some of its runs to games (shutout wins) that can never yield additional wins, while the runs saved test will instead apply those runs to games that might still be won, sometimes increasing the win total.
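If it helps to see the two rules side by side, the only change for the second rule is in how a game gets picked. A sketch, using the same hypothetical (runs scored, runs allowed) format as the trial code above:

```python
import random

def pick_game_preserving_saved_runs(adjusted):
    """Second rule: if the chosen game is already a shutout (zero runs
    allowed), pick again, so the full 25 runs saved are always applied.
    Assumes at least one game still has an opponent run left to take away."""
    while True:
        g = random.randrange(len(adjusted))
        if adjusted[g][1] > 0:
            return g
```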

This brings us back to the initial point. Because of the inflationary/deflationary effect of adding and removing runs from the baseball run economy, saving a given number of runs is marginally more valuable than scoring the same number of additional runs, because saved runs must go to games you have some chance of losing, while the additional runs scored might occur in a game you already have no chance of losing.

So what impact should this have on the question of how much of "the game" is pitching and how much is hitting (and fielding and base running)?

First, I again note that the difference between a run scored and a run saved is not large, especially when we aren't at the extremes of the two ranges.

Second, when one sets out to acquire players, one does not actually acquire a given decrease in runs allowed or increase in runs scored. Rather, the players themselves generally contribute only to the individual components of run scoring and prevention. A player doesn't simply score a run (other than by hitting a home run). Rather, he hits singles, doubles, triples, and home runs, draws walks, and steals bases. A pitcher doesn't simply save runs. Rather, he throws strikes, induces ground balls, and performs other tasks that are simply components of run prevention.

This impacts our discussion because, as we have noted, there is a distinct lower bound on how many runs you can allow. If I have a rotation of pitchers who throw 10-hit shutouts every time out and I replace them with a rotation of pitchers who throw 81-pitch, 27-strikeout perfect games every time out, I will win exactly zero more games despite the fact that I have drastically improved my pitching. In the baseball universe, the small edge that run prevention has over run acquisition is muted by the fact that it is far easier to hit the point of diminishing returns with pitching than it is with hitting. With hitting, you never know: that 18-run outburst just might win you a game 18-16. However, the perfect game can never improve upon the results of a seven-walk, four-hit shutout.

So, in the end, I stand by my original line of thinking. Even though we have demonstrated that a run saved is marginally more valuable than a run scored, this difference is muted by the diminishing returns at the extremes for run prevention. Furthermore, this difference is not enough to push pitching over 50% of "the game," as Hank Steinbrenner would have us believe.



* Math note: keep in mind that the reason we do the whole "let's pick a bunch of games at random" thing is that it allows us to approximate the result we would get if we enumerated every possibility. Enumerating all the possibilities takes a prohibitively large amount of time, but doing 1,000,000,000 random samples of those possibilities doesn't take very long at all on today's computers and should be a sufficient approximation. However, when looking at the problem theoretically, we can still consider the set of all possible combinations and avoid introducing the approximation where it isn't needed.
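To put a rough number on "prohibitively large": choosing 25 games from a 162-game season with repetition allowed gives C(162 + 25 - 1, 25) = C(186, 25) possible selections, which is on the order of 10^30, hopelessly beyond direct enumeration.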

4 comments:

D.Cous. said...

Wow, John. At this point all I can say is that I hope Mr. Steinbrenner's original comments are important enough to merit the amount and quality of analysis you've put into them.

If I understand correctly, you're saying that the marginal run saved is (almost by definition) slightly better (in terms of game outcomes) than the marginal run scored. Given this conclusion, would you say that there is an inefficiency in the market (i.e. that either pitchers or hitters are under- or overpaid relative to the other)? If not that (and I suspect this to be the case), wouldn't the different prices fetched by these two skills be far more dependent on the relative scarcity of the individuals who possess them?

Suppose, as you are suggesting, that pitching (saving runs) is slightly more valuable to game outcomes than hitting. However, suppose that ANYONE can pitch, and very, very few people can effectively hit. Hitters would be paid far more, because the next best alternative for the hiring team in terms of pitching would be readily available, i.e. cheap.

Am I babbling? I shall stop.

Unknown said...

First, I don't think that the difference between the two is large enough to create an inefficiency. As you suspect, the relative scarcity of each is far more important. Furthermore, there often seems to be an institutional bias (as you see with Mr. Steinbrenner) that overvalues pitching relative to hitting despite its marginal advantage. The inefficiencies in the market are certainly much more subtle than simply overvaluing pitching relative to hitting or vice versa.

jjhall08 said...

John,

First, I have some comments regarding the PToB. A while back I wanted to quantify the effect of inconsistency on win totals. My assumption was that the streaky home run hitters on the White Sox are costing them wins. So, I set about creating an analytical model based on the assumption of independence between scoring and preventing runs (i.e. your runs scored don't affect your runs allowed). I came up with a method that involves summing over the probability distributions for scoring and allowing a given number of runs. Basically, if you randomly combined the runs scored total from one random game with the runs allowed total from another random game, this would give you a winning percentage. When I compared the coefficient of determination for this method with the PToB, my method was slightly better. Later, I found a paper (I don't have the reference at the moment) which gave a more detailed mathematical proof of the same concept, but went further to apply curve fits to the runs scored and allowed distributions. The author showed that, upon choosing a certain curve fit, the result was equivalent to the PToB. However, this assumes that the only variable needed to determine the runs scored distribution is the average number of runs scored (deviation from the mean is not accounted for). Nevertheless, the PToB is nearly equivalent to the analytical method. I did find it interesting that the error for both methods has been getting worse over the past four years.
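[For anyone who wants to play with the independence method Jeremiah describes, a minimal sketch, my own reconstruction of the idea rather than his actual code, might look like this, taking a season's per-game runs scored and runs allowed totals as input:

```python
from collections import Counter

def independence_win_pct(runs_scored_by_game, runs_allowed_by_game):
    """Winning percentage under the independence assumption: pair every
    observed runs-scored total with every observed runs-allowed total,
    weighted by how often each occurred, counting ties as half a win."""
    rs_counts = Counter(runs_scored_by_game)
    ra_counts = Counter(runs_allowed_by_game)
    total_pairs = len(runs_scored_by_game) * len(runs_allowed_by_game)
    wins = 0.0
    for rs, c_rs in rs_counts.items():
        for ra, c_ra in ra_counts.items():
            if rs > ra:
                wins += c_rs * c_ra
            elif rs == ra:
                wins += 0.5 * c_rs * c_ra
    return wins / total_pairs
```
]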

Second, I want to point out that it is possible to look at the effect of creating more runs or saving runs by evaluating the partial derivatives of winning percentage (calculated from the PToB) with respect to runs scored and runs allowed. Furthermore, if you parameterize the PToB based on runs allowed so that x = RS/RA, or RS = x*RA, you can find the change in expected winning percentage when the ratio of runs scored to runs allowed (x) is varied. It turns out that this derivative depends on x but not on RS or RA individually. Thus, your earlier observation that it is the ratio that matters is correct. If you start with a 750:750 team and change the run differential to 100, the team at 750:650 would be better than the team at 850:750. However, if you're allowing 750 runs in a season, I don't think you have to worry about being unable to reduce your runs allowed because you're throwing too many shutouts.
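[Spelling out the parameterization Jeremiah mentions, my own working for readers following along:

```latex
W = \frac{RS^2}{RS^2 + RA^2} = \frac{x^2}{1 + x^2}, \quad x = \frac{RS}{RA},
\qquad \frac{dW}{dx} = \frac{2x}{(1 + x^2)^2}
```

so the sensitivity of expected winning percentage to the scored-to-allowed ratio does indeed depend only on x, as he says.]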

Thanks,

Jeremiah

Unknown said...

Jeremiah,

It's definitely true that a team is never likely to actually reach that point of diminishing returns on saving runs. Furthermore, I think even the most hardcore "pitching and defense" advocate in baseball would recognize when he had enough pitching to focus on hitting, long before he reached that point.

In other words, I think that the difference between scoring an extra run and saving an extra run is dwarfed by the other factors that weigh into acquiring baseball players, in particular the availability of different players. One almost never has a choice between a player who would save ten additional runs and a player who would add ten additional runs. The choice is always among a range of players with a range of performance potential. It's probably best to simply acquire the largest combined total of runs saved and runs scored without much preference for one over the other.

Thanks for the comment. More food for thought is always appreciated.