When we last addressed this topic, I used a model of expected winning percentage, built from runs scored and runs allowed, to present evidence that preventing runs is, in general, no more or less important than scoring runs. However, I noted that if the model could not be trusted, then neither could our conclusions. To that end, I will now present evidence that this model, Bill James' so-called Pythagorean Theorem of Baseball (PToB), does indeed provide a sufficiently accurate estimate of winning percentage, one that is not biased with respect to runs scored or runs allowed.
For clarity, I do not assert that the work here is groundbreaking or particularly original. Undoubtedly, someone has already performed the task of verifying the good old PToB. Nonetheless, for those of you who have not seen this before, this should give you plenty of food for thought.
First, let's examine just how accurate the model is. To do this, I calculated the expected winning percentage of each Major League Baseball team from 1900 through 2006. I then compared each team's expected win total with its actual record to find the difference, in wins, between the two. This resulted in a total of 2160 team-seasons for analysis. Here are the results, in histogram form:
Here we see how often each deviation from expected win total occurred. Each bar counts the team-seasons whose deviation from expected win total was within 0.5 wins of the value the bar represents. This effectively puts each team-season into a bin and then counts the number of team-seasons in each bin. In this case, we see that the +1 bin contains more than 200 occurrences, indicating that from 1900 through 2006 more than 200 teams finished with 0.5 to 1.5 more wins than their expected win total.
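For readers who want to try this at home, here is a minimal sketch of the calculation and the binning in Python. It assumes the classic form of the PToB, runs scored squared divided by the sum of runs scored squared and runs allowed squared; the exponent of 2 is an assumption, since other exponents are sometimes used. The function names and the three sample team-seasons are made up purely for illustration and are not taken from the data set.

```python
import math
from collections import Counter


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage. exponent=2 is an assumption (classic form)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def win_deviation(wins, losses, runs_scored, runs_allowed, exponent=2.0):
    """Actual wins minus expected wins over the same number of decisions."""
    games = wins + losses
    expected_wins = games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)
    return wins - expected_wins


def bin_deviations(deviations):
    """Count deviations per integer bin; bin b covers [b - 0.5, b + 0.5)."""
    return Counter(math.floor(d + 0.5) for d in deviations)


# Hypothetical team-seasons: (wins, losses, runs scored, runs allowed).
sample = [(90, 72, 800, 700), (81, 81, 700, 700), (70, 92, 650, 720)]
deviations = [win_deviation(*team) for team in sample]
for bin_center, count in sorted(bin_deviations(deviations).items()):
    print(f"{bin_center:+d}: {count}")
```

A positive deviation here means the team won more games than the model expected, which matches the sign convention of the +1 bin described above.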
The important thing to take away from this is the shape of the distribution: the data are approximately normally distributed about zero. This indicates that the PToB distributes its error evenly on either side of the actual win total for a given team-season. This is a reassuring result because it means that the PToB does not appear to be inherently biased towards over- or under-estimating win totals.
In fact, within this sample, the mean deviation from expected win total was -0.0359 wins, with a standard deviation of 4.04 wins. Given the roughly normal shape of the distribution, this means that about 68% of team-seasons have actual win totals within roughly 4 wins of their expected win total.
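The 68% figure is the usual rule of thumb for data within one standard deviation of the mean under a normal distribution. A short Python sketch of how one might check both the summary statistics and that rule is below. The function names are hypothetical, the sample deviations are invented, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
import math
import statistics


def summarize(deviations):
    """Return (mean, population std dev, share of values within one std dev)."""
    mean = statistics.fmean(deviations)
    spread = statistics.pstdev(deviations)  # assumption: population form
    within = sum(1 for d in deviations if abs(d - mean) <= spread)
    return mean, spread, within / len(deviations)


def normal_share_within(k):
    """Share of a normal distribution lying within k standard deviations."""
    return math.erf(k / math.sqrt(2))


# Invented deviations (actual wins minus expected wins), for illustration only.
sample = [-4.0, -2.0, 0.0, 2.0, 4.0]
mean, spread, share = summarize(sample)
print(f"mean={mean:.2f} sd={spread:.2f} within one sd={share:.0%}")
print(f"normal theory: {normal_share_within(1):.1%}")
```

If the observed share within one standard deviation sits close to the normal-theory value, that is one more sign that the bell-shaped reading of the histogram is reasonable.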
Now we know the extent to which we can trust the PToB model. However, we need to go a bit further than that to place confidence in our previous conclusions. Specifically, we need to demonstrate that the PToB is not biased towards run scoring or run prevention. For example, if the model consistently over-estimated the win total for teams with high runs scored totals and under-estimated the win total for teams with low runs allowed totals, then this would indicate that it was not properly valuing run scoring and run prevention relative to each other.
To put it another way, if teams that allow fewer runs than other teams consistently beat their expected win total, then we would have to ask ourselves why this was the case. We would be forced to conclude that run prevention was not being properly valued in the PToB; obviously we would need to place more emphasis on run-prevention to correct for the constant under-estimation of win totals for teams that allow few runs. On the other hand, if we cannot find these patterns, then this is an excellent indication that the PToB does indeed value run scoring and run prevention correctly relative to each other. This in turn would make it an excellent tool for answering our original question: is a run scored equal to a run saved?
Let's look at some more data:
Here we see a scatter plot of runs scored versus deviation from expected win total. See a pattern? I sure don't. This is pretty much a textbook example of two data sets that are not correlated: all we have is a giant blob of points with no apparent relationship. Indeed, by doing a regression on the data, we find that a team's runs scored total can explain only 0.15% (r-squared of 0.0015) of a team's deviation from its expected win total. To say that this is not in any way significant would be an understatement.
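For anyone curious how an r-squared like this comes about, here is a minimal Python sketch for a simple one-variable linear regression, where r-squared equals the squared correlation between the two series. The function name and the sample numbers are hypothetical, and this is not necessarily the tool or procedure used to produce the figures quoted here.

```python
import statistics


def r_squared(xs, ys):
    """Share of variance in ys explained by a straight-line fit on xs.

    For a one-variable least-squares fit with an intercept, this equals
    the squared Pearson correlation. Raises ZeroDivisionError if either
    series is constant.
    """
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Invented values for illustration: runs scored and win deviations.
runs_scored = [650, 700, 720, 800, 850]
win_deviations = [1.2, -3.1, 0.4, 2.5, -1.0]
print(f"r-squared: {r_squared(runs_scored, win_deviations):.4f}")
```

An r-squared near 0 says that knowing one series tells you almost nothing about the other, while a value near 1 means the points fall close to a straight line.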
Let's do the same for runs allowed:
Again, we see a formless blob. Regression results are also similar: runs allowed explain only 0.39% (r-squared of 0.0039) of a team's deviation from its expected win total.
One last test: let's see if the ratio of runs scored to runs allowed (RS/RA) shows any significant trend. If it did, we could theorize that the PToB was biased towards teams with significant gaps between runs scored and runs allowed.
Same result: another formless blob. RS/RA accounts for only 0.56% (r-squared of 0.0056) of a team's deviation from its expected win total.
So what can we make of all this? Essentially, the Pythagorean Theorem of Baseball model does not show a discernible bias with respect to teams' runs scored and runs allowed totals. In fact, a team's skill at preventing or scoring runs tells us next to nothing about how it will deviate from its expected win total. This is strong evidence in support of the conclusions that we drew from analyzing the effect of varying runs scored and runs allowed on expected winning percentage. Since deviations from expected win total are centered around zero and show no bias towards runs scored or runs allowed, the expected winning percentages that the PToB provides are a good way to measure the effects of run scoring and run prevention on real-life win totals.