Tuesday, October 16, 2007

The McCarver Strikes Back

Today, during the Indians-Red Sox game, which I was watching live, McCarver once again brought up the not-so-surprising not-so-revelation that your chances of a multi-run inning are higher after a lead off home run than after a lead off walk. Why does this surprise Tim McCarver?

Here's how it works. In the game state with no one on base and no one out (hereafter abbreviated 0-000), you have a probability of scoring exactly zero runs from then on (P0), a probability of scoring exactly one run from then on (P1), and the probability of scoring more than one run from then on (P2+ = 1 - P0 - P1). Conversely, the probability of no multi-run inning is P0 + P1 = 1 - P2+.

After a lead off home run, you return to the 0-000 game state. Now, though, you need to score just one more run for there to be a multi-run inning. Therefore, the probability of no multi-run inning is now just P0. The probability of a multi-run inning has become P1 + P2+.

After a lead off walk, you enter the game state 0-100 (man on first, no one out). You still need to score two more runs. As before, you have a probability of scoring zero runs P0', one run P1', and two or more runs P2+'.

In order for the lead off walk to be more valuable, P0 would have to exceed P0' + P1'.

Now, P0 is roughly 0.72. P0' is roughly 0.58. P1' is roughly 0.25. Therefore, P0' + P1' is equal to roughly 0.83. Therefore, the probability of scoring one run or more from state 0-000 is 0.28. The probability of scoring two or more runs from state 0-100 is 0.17. (All of these numbers are based on Keith Woolner's "An Analytical Framework for Win Expectancy" from Baseball Prospectus 2005.)

The difference is roughly 11 percentage points (0.28 versus 0.17). Most (all?) of this will be accounted for by the fact that the man on first can be doubled off and the man who hit the lead off home run can't.
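The arithmetic above can be checked in a few lines. This is only a sketch using the rounded figures quoted in this post; the variable names are made up for illustration, and nothing here is taken directly from Woolner's tables.

```python
# Rounded values quoted in the post (attributed there to Woolner's article).
p0_empty = 0.72   # P0:  zero runs from state 0-000
p0_first = 0.58   # P0': zero runs from state 0-100
p1_first = 0.25   # P1': exactly one run from state 0-100

# After a lead off home run, one more run makes it a multi-run inning.
multi_after_hr = 1 - p0_empty

# After a lead off walk, two more runs are still needed.
multi_after_bb = 1 - (p0_first + p1_first)

gap = multi_after_hr - multi_after_bb

print(f"after HR: {multi_after_hr:.2f}")   # after HR: 0.28
print(f"after BB: {multi_after_bb:.2f}")   # after BB: 0.17
print(f"gap:      {gap:.2f}")              # gap:      0.11

# The walk is only more valuable if P0 exceeds P0' + P1'.
print("walk better?", p0_empty > p0_first + p1_first)   # walk better? False
```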

So that's the math. But do you really need it? A home run is one guaranteed whole run that no one can take away. The lead off walk increases your odds of scoring exactly one run by roughly 14%. The lead off home run increases those odds by ONE HUNDRED FREAKING PERCENT.

**EDIT** As pointed out in comment number one, referencing the probability of scoring one run is misleading, as scoring one run and scoring zero runs both count for nothing for the purposes of counting multi-run innings. The ninth-inning analogy is apt: it's the second runner scoring that is important and neither the lead off walk nor the lead off home run will have a great effect on what that second runner does. What is important is that the lead off home run eliminates the probability of the double play and the lead off walk does not. In other words, it is the out that is important, not the run. Point well taken. **END EDIT**

Tim, do us a favor and stop bringing this up as if it's surprising. You will sound a whole lot more intelligent and I will be a whole lot less aggravated.

7 comments:

google said...

"A home run is one guaranteed whole run that no one can take away. The lead off walk increases your odds of scoring exactly one run by roughly 14%. The lead off home run increases those odds by ONE HUNDRED FREAKING PERCENT."

Except that a one-run inning is no better than a zero-run inning for the purposes of having a multi-run inning. You got it right the first time: Anybody getting on base, regardless of which base, has roughly the same chance of launching a multirun inning (DPs aside).

(Think of it in terms of a team down two runs going into the 9th inning - since you have to score two, it doesn't matter what the leadoff batter does, so long as he doesn't make an out - another batter's going to have to score anyway, and it's hard for him to do that without the runner scoring as well.)

None of which is to defend McCarver, who seemed to be saying he thought a walk would lead to more multirun innings, which is inane.

die Amerikanerin said...

My train of thought:

lead-off home run = half way to goal of multiple home-run inning.

lead-off walk = possibly half way to goal of multiple home-run innings.

So... yes, I have to agree. What is Tim McCarver finding so surprising here?

John Lynch said...

"Except that a one-run inning is no better than a zero-run inning for the purposes of having a multi-run inning. You got it right the first time: Anybody getting on base, regardless of which base, has roughly the same chance of launching a multirun inning (DPs aside)."

All true. You can't take away the DP possibility though, which is the thrust of that first sentence in that paragraph. No one can take the runner who scored on the lead off HR away. Not so with the lead off walk. I grant that the probability of scoring one run is irrelevant, since it's actually the probability of erasing that runner via an out that's important. Mea culpa.

I'm mostly amazed, as you are, that McCarver thinks that walks would lead to more multi-run innings than home runs, which doesn't make any sense at all.

D.Cous. said...

Imagine an intersection of two streets, where for one reason or another, traffic accidents occur with relative frequency.

How is the probability of having two or more INDEPENDENT accidents in the course of the day affected by each of these two possible events:

1. The first car through the intersection that day runs a red light.

2. The first car through the intersection that day crashes.

The only way you would expect P(2 or more accidents) to be higher given the first event is if you are superstitious, and believe that running a red light sheds bad Karma onto the intersection for the day.

I realize that this is not a perfect analogy, because a player who walks stays in play, while a car passing through an intersection does not.

However, unlike TM, I would expect P(2 runs) < P(1 run) at any given state in an inning, even with one runner on base. Given that that is true (which it almost must be), a lead-off homer, which causes no outs and therefore does not effectively change the state of the inning, brings about a circumstance where P(2 runs) = P(1 add'l run) = P(initial run).

John Lynch said...

I think the problem with the car analogy is that the baseball events are actually quite dependent. The walk is more like running a red light and then stopping in the middle of the intersection. There's a great chance that you will still cause an accident.

Look at it this way: what if base runners could not ever be eliminated? In this bizarro baseball, there actually isn't a difference between the walk and the home run for the purpose of multi-run innings. For starters, once you reach base safely, you can steal every other base, including home, and never be called out. So if you walked, you might as well have homered.

Even if you impose a restriction like only allowing baserunners to advance as many bases as they are forced by the current batter, there still is no difference with respect to multi-run innings. Once a batter reaches base, he will always score if a runner behind him scores. Therefore, the only important event is what happens to the runner behind the initial runner. If he scores, it's a multi-run inning. If he does not, there is no multi-run inning. Therefore, the only difference between the walk and the home run with respect to multi-run innings is that the runner may be eliminated.

It should be noted that, even in real baseball, it actually would be possible for the walk to be more valuable than the home run with respect to multi-run innings, IF walks implied more about the state of the game. For example, suppose that it took eight balls to walk and only two strikes to strike out, but that batters hit an average of 120 home runs a year. In this case, a walk might indicate MORE than just a change in base-out state. It might indicate that a pitcher is totally terrible or has lost his command. In this case, the walk might lead to more multi-run innings because it would indicate more about the game state than the home run. The game state would move from <0-000, Probably Average Pitcher> to <0-000, Probably Crappy Pitcher>. In other words, it would matter how we got to our current state, not just what that state was.

Our current model of baseball assumes that each state is conditionally independent of past states (see Markov property). It turns out that this is an accurate model for baseball on a large scale. However, if that assumption were not true, then the walk could assume larger significance.

However, in the current configuration of baseball walks are too common, especially relative to home runs, to indicate this type of thing reliably.

Jack Lynch said...

OK - we've been talking about probabilities. Now can we find out historically, in other words, what has really happened in both cases? What if the runner at first causes pitchers to be distracted and therefore does lead to more multi-run innings?

John Lynch said...

"Now can we find out historically, in other words, what has really happened in both cases?"

Others have done this. I took this data from a post on FJM:

2000:
1139 Leadoff HR => 294 2+ run innings, 25.81%
2705 Leadoff BB => 622 2+ run innings, 22.99%

2001:
1026 Leadoff HR => 247 2+ run innings, 24.07%
2238 Leadoff BB => 473 2+ run innings, 21.13%

2002:
1015 Leadoff HR => 229 2+ run innings, 22.56%
2296 Leadoff BB => 508 2+ run innings, 22.12%

2003:
1051 Leadoff HR => 252 2+ run innings, 23.97%
2261 Leadoff BB => 479 2+ run innings, 21.18%

2004:
1071 Leadoff HR => 246 2+ run innings, 22.96%
2321 Leadoff BB => 443 2+ run innings, 19.08%

2005:
979 Leadoff HR => 210 2+ run innings, 21.45%
2135 Leadoff BB => 462 2+ run innings, 21.63%

2006:
1069 Leadoff HR => 260 2+ run innings, 24.32%
2154 Leadoff BB => 490 2+ run innings, 22.75%


The historical probabilities are closer than our back of the envelope calculation, but still significantly out of line with McCarver's expectations.

Again, it's virtually impossible, outside of the bizarro scenario that I drew up earlier, for the walk to yield more multi-run innings. The best it can do is tie, which it still does not do.

Note that, as with all matters of probability, walks will sometimes lead to more multi-run innings in a given season (as in 2005). But because the statistical evidence in the aggregate is overwhelmingly against the lead-off walk, this is almost assuredly simple statistical noise. We would never expect this to be the case, even if it sometimes will be.
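As a rough check on the noise claim, the seven seasons above can be pooled. The sketch below uses only the counts quoted from the FJM post. The two-proportion z statistic is an added illustration, not part of the quoted data, and it assumes every inning is an independent trial, which is a simplification.

```python
from math import sqrt

# season: (leadoff HR, 2+ run innings after HR, leadoff BB, 2+ run innings after BB)
seasons = {
    2000: (1139, 294, 2705, 622),
    2001: (1026, 247, 2238, 473),
    2002: (1015, 229, 2296, 508),
    2003: (1051, 252, 2261, 479),
    2004: (1071, 246, 2321, 443),
    2005: (979, 210, 2135, 462),
    2006: (1069, 260, 2154, 490),
}

# Pool the counts across all seasons.
hr_n = sum(s[0] for s in seasons.values())
hr_multi = sum(s[1] for s in seasons.values())
bb_n = sum(s[2] for s in seasons.values())
bb_multi = sum(s[3] for s in seasons.values())

hr_rate = hr_multi / hr_n
bb_rate = bb_multi / bb_n

# Seasons in which the lead off walk produced the higher multi-run rate.
walk_ahead = [year for year, (hn, hm, bn, bm) in seasons.items()
              if bm / bn > hm / hn]

# Two-proportion z statistic (illustration only; assumes independent innings).
pooled = (hr_multi + bb_multi) / (hr_n + bb_n)
std_err = sqrt(pooled * (1 - pooled) * (1 / hr_n + 1 / bb_n))
z = (hr_rate - bb_rate) / std_err

print(f"HR: {hr_multi}/{hr_n} = {hr_rate:.2%}")
print(f"BB: {bb_multi}/{bb_n} = {bb_rate:.2%}")
print(f"walk ahead in: {walk_ahead}, z = {z:.2f}")
```

Pooled, the home run leads by about two percentage points (roughly 23.6% to 21.6%), 2005 is the only season in which the walk comes out ahead, and the z statistic lands above 3, which is consistent with treating 2005 as noise.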