Sunday, October 4, 2009

Sportscaster laziness

I think what mostly drives me crazy about sportscasters isn't so much that they do not see the baseball world like I do. There's room for healthy disagreement there. No, I think the thing that drives me crazy is that they are too often lazy in their analysis. Willing to fall back on clichés, they often slip into a sort of absurd autopilot in which certain events automatically trigger certain comments regardless of their correctness.

Here's the example that set this off. During the top of the ninth inning of today's game pitting the Tigers against the White Sox, the Tigers had a double play opportunity with a runner on first and no one out while protecting a two-run lead. Mark Kotsay hit a groundball to Adam Everett. Everett went to second for the force, but Placido Polanco's relay to first was slightly offline, and Miguel Cabrera came off first to field it, so the runner was safe. Rod Allen (I know, I know; fish in a barrel) immediately praised Cabrera for his decision to leave the bag and ensure that the runner didn't move into scoring position.

This is autopilot extraordinaire. It's the ninth inning with a two-run lead. That run does not matter. Any baseball fan worth his salt knows this. The only things that matter are outs. The only advantage of keeping the runner on first by coming off the bag and forfeiting any shot at a double play is that you keep the double play in order for the next batter. Note further that even if you stretch and don't get the double play, it's not assured that the ball will get by you. You still may knock it down and prevent the runner from advancing. Thus, to come off the bag in that situation, you have to believe that the odds of missing the double play AND having the ball get by you are so high that you're better off counting on the next batter to hit into a double play.

This is highly unlikely. In fact, I'd be willing to bet that for any relay throw that is not truly atrocious (which again, Polanco's was not), the better play is to stretch for the throw. Furthermore, Fernando Rodney, the current pitcher, is not an extreme groundball pitcher (he's roughly neutral for his career). Alex Rios, the next batter, is not a groundball hitter (about three groundballs for every four flyballs in his career).
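For the curious, here's a rough sketch of that comparison in Python. Everything in it is my own toy setup, not real data: the function names are made up, the probabilities are placeholders, and I'm assuming that once the runner reaches second the double play is simply off the table for the next batter. It only asks which choice gives you the better chance of turning two, either on this play or on the next batter.

```python
def double_play_chance_if_stretch(p_dp_now, p_ball_gets_by, p_next_gidp):
    """Chance of turning two on this play or the next batter if the first baseman stretches.

    p_dp_now:       assumed chance the stretch completes the double play
    p_ball_gets_by: assumed chance the ball gets away, given the stretch fails
    p_next_gidp:    assumed chance the next batter grounds into a double play
    """
    p_miss = 1.0 - p_dp_now
    # If the stretch fails but the ball is knocked down, the runner stays on
    # first and the double play is still in order for the next batter.
    p_miss_but_contained = p_miss * (1.0 - p_ball_gets_by)
    return p_dp_now + p_miss_but_contained * p_next_gidp


def double_play_chance_if_come_off(p_next_gidp):
    """Coming off the bag gives up this double play and relies on the next batter."""
    return p_next_gidp


def should_stretch(p_dp_now, p_ball_gets_by, p_next_gidp):
    """True when stretching gives the better chance of a double play."""
    stretch = double_play_chance_if_stretch(p_dp_now, p_ball_gets_by, p_next_gidp)
    come_off = double_play_chance_if_come_off(p_next_gidp)
    return stretch > come_off


# Placeholder inputs, not measured values.
print(should_stretch(p_dp_now=0.3, p_ball_gets_by=0.5, p_next_gidp=0.12))
```

Play with the inputs and you'll find that coming off the bag only wins out when the stretch has almost no chance of working, which is the whole point.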

Sportscasters should know this. They should be able to make this point. They should be able to talk about the advantages and disadvantages of each decision. They should not default to meaningless and unhelpful platitudes. Rod had a great chance to talk about the little details of baseball and he blew it because he, like so many other sportscasters, is just up there spewing trite baseball clichés and cashing a paycheck.

I should note that the fact that Rios actually hit into a double play does not change the analysis. This result was far from certain and could not have been known ex ante.

Thursday, October 1, 2009

Payroll and Playoffs

Rob Neyer discusses the implications of the fact that six of the nine top teams in payroll are making the playoffs this year:
Well, you can't really compete with the Red Sox. Not unless you're the Yankees, anyway. But there are still plenty of teams wasting plenty of money. The Twins probably aren't going to the playoffs, but they certainly could have. The Rays are stuck in the wrong division. And more to the point, we just can't read too much into one possibly anomalous season.
The Forbes article to which he links also notes: "In 2006, three postseason clubs (Tigers, Twins and Padres) ranked 14th or lower."

Here's the thing. The Tigers are now one of those top payroll clubs making the playoffs. Payroll is not static. Should we hold it against the Tigers that they've turned themselves into a top spending club in just a few years? That's a good thing. We want teams that spend to have greater success. It encourages owners to keep investing in their product on the field. The Tigers are a great example of what we should be encouraging in baseball.

Looking at payroll is misleading because teams let their payroll rise and fall over time depending on how they perceive their chances of success. Teams correctly recognize that they should spend more money when the marginal value of a win is the highest. Generally this occurs when a team is in the 88-92 win range and additional wins will drastically increase the likelihood of making the postseason. Teams can and do let their payroll spike when a playoff berth is in range.
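To make the marginal-win idea concrete, here's a toy sketch in Python. The S-shaped curve, its midpoint of 90 wins, and its steepness are all assumptions I picked for illustration, not estimates from real standings, and the function names are hypothetical. The only thing it's meant to show is the shape: one extra win buys far more playoff probability near the bubble than it does for a bad team or a runaway division winner.

```python
import math


def playoff_probability(wins, midpoint=90.0, steepness=0.5):
    """Assumed logistic curve mapping season wins to a chance of making the playoffs."""
    return 1.0 / (1.0 + math.exp(-steepness * (wins - midpoint)))


def marginal_playoff_gain(wins, midpoint=90.0, steepness=0.5):
    """Extra playoff probability gained by adding one more win at a given win total."""
    return (playoff_probability(wins + 1, midpoint, steepness)
            - playoff_probability(wins, midpoint, steepness))


# Compare the value of one more win for a bad team, a bubble team, and a juggernaut.
for wins in (70, 89, 105):
    print(wins, round(marginal_playoff_gain(wins), 4))
```

Under these made-up parameters, the gain from one more win peaks right around the midpoint, which is why a payroll spike makes the most sense for a team on the bubble.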

Thus, focusing on payroll is missing the point. No, the focus should be on whether or not all teams have roughly equal opportunity to support a large payroll. What Major League Baseball needs is a revenue sharing system based on teams' revenue potential, not their actual revenue. This correctly lines up the incentives in the system and ensures that all teams have an opportunity to let their payrolls rise when they get the chance to make the playoffs.

Higher paid teams are always going to be (and should be) more represented in the postseason. Instead of trying to fight that trend, MLB should be trying to find better ways to balance access to revenue.

Tuesday, September 29, 2009

Is the wins statistic useless?

Over at Rays Index, they make the case that wins, as a statistic, are not useless (hat tip to Rob Neyer).
The problem with Wins as an evaluator of starting pitchers is not that it is bad statistic. It is simply a matter of sample size. In a single game, a win or no win is not a good indicator. Why? Small sample size (n=1). However, ERA, for example, is a per inning stat. So in a single game, a pitcher’s ERA will have 5-9 data points (n>>1). Over the course of a full season, stats like ERA+, FIP and tRA have a sample size of 150-220 for each pitcher.
And later on (emphasis all mine):
In fact, in the absence of other stats, Wins is a very good, if not great, indicator of a pitcher’s value. So next time you hear somebody say Wins is a crappy way to evaluate a pitcher, throw a drink in their face and then make them read this post.
To me, this is a lot like saying that in the absence of anesthetic, a piece of wood to bite down on is a good pain management tool. Yeah, I guess that's sort of true, but it's also a completely useless observation in the modern world where anesthetic is always an option. Sort of like how, given the plethora of available information, wins are... completely useless. You would and should never prefer them when you have access to other, better statistics. Opting for wins to evaluate a pitcher is like opting for the piece of wood when your leg is being amputated. In the modern world, it's never defensible.

Also, the problem with wins is emphatically not small sample size. Even if pitchers played 1,000,000,000 games every season, wins would still be worth shit because pitchers play with the same offense and bullpen day in and day out. Those pitchers with better offenses and better bullpens will get more wins than those without and there's nothing that a large sample size can do about it. Indeed, larger sample sizes will make clear exactly how large this bias is. Wins are bad because they can do nothing to correct this bias.
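If you want to see this for yourself, here's a quick simulation sketch in Python. It's a toy model built on my own assumptions: runs scored and allowed are drawn from Poisson distributions, a tie counts as no win, the bullpen is ignored entirely, and the run averages are made up. Both pitchers allow runs at exactly the same rate; the only difference is the offense behind them.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson-distributed count using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_win_rate(runs_allowed_mean, run_support_mean, games, seed):
    """Fraction of simulated games in which the pitcher's team outscores the opponent."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(games):
        allowed = poisson_draw(rng, runs_allowed_mean)
        scored = poisson_draw(rng, run_support_mean)
        if scored > allowed:  # simplification: a tie is treated as no win
            wins += 1
    return wins / games


# Two equally good pitchers (same runs allowed), different run support.
for games in (30, 20000):
    good_offense = simulate_win_rate(4.0, 5.5, games, seed=1)
    poor_offense = simulate_win_rate(4.0, 3.5, games, seed=2)
    print(games, round(good_offense, 3), round(poor_offense, 3))
```

Crank up the number of games as high as you like. The noise shrinks, but the gap between the two identical pitchers does not go away.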

Do not use wins. That is all.

**EDIT** J.C. Bradbury gets in on the action here. Worth reading.

Thursday, September 24, 2009

Let the games begin!

Which is to say, now that the Yanks are in the postseason, let's all start dissecting Alex Rodriguez like he was a stale frog in high school biology! Fangraphs' R. J. Anderson gets the party started with this fun piece. I can just smell the formaldehyde!

Monday, September 21, 2009

Sabermetric groupthink?

This post by J.C. Bradbury collects some comments on the notion of sabermetric groupthink: the idea that those of a sabermetric bent, like myself, tend to unjustly treat those who don't adhere strictly to sabermetric orthodoxy as morons. I'll leave aside the question of whether a particular sabermetric orthodoxy exists. Rob Neyer addresses that question well here. I want to make a more general point.

The overarching sabermetric philosophy is not related to baseball. The core sabermetric principle, as I understand it, is that baseball must be analyzed as a science. That's it. If your analysis of baseball is scientific, it is sabermetric. Of course, the "gotcha" here is that science is empirical. It is very, very hard to have science that spurns numbers of some kind or another because numbers are the language of empirics. Thus, sabermetrics tends to focus on numbers, on the quantitative over the qualitative.

Sabermetrics does not reject qualitative analysis. There is certainly a role for scouting and experience in baseball. No sabermetrician worth his salt disputes this. However, the use of qualitative analysis cannot be an excuse to flout systematic application of scientific principles. Qualitative analysis still must be backed up by empirical research. It must have a sound empirical basis and it must be vetted empirically.

How might a team do this? A good start is trying to systematically quantify how good your scouts are. Which scouts provide the best reports? How much information do these reports provide beyond what is available statistically? Can scouting data be incorporated into a useful model of player performance?

The point is that no matter what analysis you are undertaking, it must be systematic. You must know, in advance, how new data will inform your thinking. Too often this is not the case. Too often numbers are used ex post facto to provide faux-intellectual cover for unsystematic decisions. Too often numbers are used to confirm preexisting biases of those using them. Too often numbers are ignored when they provide evidence that runs counter to a cherished belief. Too often people use scouting and experience as escape hatches to avoid having to deal with the rigors of systematic, scientific analysis.

This is what causes sabermetricians to go crazy. It's not that we can't deal with scouts. It's not that we don't like baseball stories and anecdotes. It's not that we think men with experience have nothing to offer. Far from it. No, the problem is that we cannot stand the unsystematic, unscientific analysis that those in highly visible positions often engage in. It's lazy and worse: it's absolutely wrong. It must be shunned wherever it is found.

Let me close with a quotation from Malcolm Gladwell from this interview with Bill Simmons:
That's why I'm such a fan of the "Moneyball" generation of baseball GMs: It's not so much that their analytical tools are brilliant ways of predicting baseball success (and I have my doubts, sometimes), it's simply that they have an analytical tool. And when it comes to personnel evaluation, any tool is better than no tool...
Bingo. The merits of any particular tool, whether it be batting average, on base percentage, VORP, or scouting reports, are always up for debate. The important thing is that you have a tool and that you apply it systematically.

**EDIT** Here is a link to the Ken Rosenthal article that started it all. I like Ken. He does good work. Unfortunately, this article is an example of exactly what I'm talking about above. Ken throws out a bunch of numbers and throws in some other observations for good measure. And the result is... what exactly? How does he propose to use all this information to come up with a decision? Ken doesn't say.

Let me highlight this extended quote:
The first criterion for the award is "actual value of a player to his team, that is strength of offense and defense." Twenty-four of Mauer's 114 starts this season — more than one-fifth — have been at designated hitter, a position that requires no defense. Mauer also trails other candidates in the second criterion, number of games played.

When Mauer first stepped onto the field on May 1, the Twins already were 22 games into their season. Mauer obviously cannot be faulted for needing to recover from offseason kidney surgery, but two other MVP contenders — Tigers first baseman Miguel Cabrera and Jeter — have appeared in 141 and 139 games, respectively. Mauer has appeared in 120.

Am I nitpicking? Perhaps. But Mauer's absence in April, combined with his time at DH, raises the possibility another candidate may — repeat, may — be worthier. It certainly creates the opportunity for debate, which is my entire point.
Gee, if only we had a systematic way to weigh all these factors (playing time, quality of performance, positional adjustments, etc.) to come up with an answer to our question! Oh, shit, we do! We have tons of them, and they all originate in sabermetrics.

So is there still room for debate? Of course there is! None of these systems are complete. They all have weak spots. Some are better than others. We can debate the merits of any particular system until the cows come home. The point is that you can't just throw out a bunch of disjointed pieces of information and then pull an answer out of your ass, not if you want to claim any sort of validity to your answer. You must be able to establish ex ante how one can determine who the best player is and then you must let the results of that process, that system, provide you with the best answer.

A fact

There is no good reason to deny Joe Mauer the MVP.

Thursday, September 3, 2009

More MVP talk

My mother (of all people) points me to this article by Allen Barra in the Wall Street Journal echoing my thoughts on Derek Jeter's MVP candidacy. A few thoughts:
  • The tone of the article is pretty funny. It essentially acknowledges that Joe Mauer has been better and should win, but says, "Hey, Derek's been great for a while and has been robbed a couple times. Let's give him this year's award anyway, as a kind of lifetime achievement award." I can't get behind that reasoning, but at least it's honest.
  • Naturally, the article falls back on intangibles to make Derek's case. This leads to one of my new favorite baseball quotes:
    "Some people will argue that intangibles don't exist, but in the ninth inning of close games everybody believes in them." - Marty Appel
    It's not quite as pithy as "There are no atheists in foxholes," but the sentiment is the same, and likely equally true.
  • The article strangely does not note the strongest part of Derek's case: the fact that he plays shortstop and none of the other contenders do. You don't need intangibles to close the gap between Derek and a first baseman with better offensive numbers. You just have to understand the massive, massive value of playing a tougher position.
  • Derek's longevity really is incredible. More than anything, this is what will get him in the Hall of Fame one day.
Thanks for the link, Mom!