Basebology (The Study of Baseball): Is the wins statistic useless?

Tuesday, September 29, 2009

Is the wins statistic useless?

Over at Rays Index, they make the case that wins, as a statistic, are not useless (hat tip to Rob Neyer).

The problem with Wins as an evaluator of starting pitchers is not that it is bad statistic. It is simply a matter of sample size. In a single game, a win or no win is not a good indicator. Why? Small sample size (n=1). However, ERA, for example, is a per inning stat. So in a single game, a pitcher’s ERA will have 5-9 data points (n>>1). Over the course of a full season, stats like ERA+, FIP and tRA have a sample size of 150-220 for each pitcher.

And later on (emphasis all mine):

In fact, in the absence of other stats, Wins is a very good, if not great, indicator of a pitcher’s value. So next time you hear somebody say Wins is a crappy way to evaluate a pitcher, throw a drink in their face and then make them read this post.

To me, this is a lot like saying that in the absence of anesthetic, a piece of wood to bite down on is a good pain management tool. Yeah, I guess that's sort of true, but it's also a completely useless observation in the modern world where anesthetic is always an option. Sort of like how, given the plethora of available information, wins are... completely useless. You would and should never prefer them when you have access to other, better statistics. Opting for wins to evaluate a pitcher is like opting for the piece of wood when your leg is being amputated. In the modern world, it's never defensible.

Also, the problem with wins is emphatically not small sample size. Even if pitchers played 1,000,000,000 games every season, wins would still be worth shit because pitchers play with the same offense and bullpen day in and day out. Those pitchers with better offenses and better bullpens will get more wins than those without and there's nothing that a large sample size can do about it. Indeed, larger sample sizes will make clear exactly how large this bias is. Wins are bad because they can do nothing to correct this bias.
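To make the bias argument concrete, here is a rough Python sketch. It gives two hypothetical pitchers the exact same runs-allowed rate but different run support, then simulates more and more games. Everything in it is an assumption for illustration: the run totals are made up, the win probability comes from Bill James's Pythagorean expectation with an exponent of 2, and every team win is credited to the starter, which ignores bullpens and no-decisions entirely.

```python
import random


def win_probability(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expectation: estimated chance of winning a game."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def simulated_win_pct(runs_scored, runs_allowed, games, rng):
    """Simulate `games` independent decisions and return the win fraction."""
    p = win_probability(runs_scored, runs_allowed)
    wins = sum(rng.random() < p for _ in range(games))
    return wins / games


# Hypothetical numbers: both pitchers allow 4.0 runs per game,
# but one gets 5.5 runs of support and the other only 3.5.
RUNS_ALLOWED = 4.0
rng = random.Random(2009)  # fixed seed so the run is repeatable

for games in (30, 1_000, 100_000):
    strong = simulated_win_pct(5.5, RUNS_ALLOWED, games, rng)
    weak = simulated_win_pct(3.5, RUNS_ALLOWED, games, rng)
    print(f"{games:>7} games: good offense {strong:.3f}, "
          f"bad offense {weak:.3f}, gap {strong - weak:+.3f}")
```

Under these assumptions the gap in winning percentage does not shrink toward zero as the number of games grows. It settles near the difference between the two underlying win probabilities, even though the two pitchers are identical by construction. More games only measure the bias more precisely.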

Do not use wins. That is all.

**EDIT** J.C. Bradbury gets in on the action here. Worth reading.

5 comments:

Robert Lynch said...: Yes.

Very yes.; September 29, 2009 at 11:19 AM
Lisa said...: Does FIP mean full innings pitched?
What about tRA? (true run average? Maybe the pitcher gots some runs he didn't earn...?) Just trying to have a more complete baseball education.; October 5, 2009 at 1:32 PM
John Lynch said...: Honestly, I couldn't even tell you what tRA is off the top of my head. FIP is Fielding Independent Pitching, a measure of pitcher effectiveness that purports to remove many of the effects of fielding on ERA.

Knowing what those two stats are isn't really germane to the discussion though. The point is that almost anything is better than wins.; October 5, 2009 at 2:29 PM
The Professor said...: So. You dont think better pitchers win more games in the long run? Because that is all that I am saying in the post. Are there exceptions? Sure. Some smokers live to be 100 years old. That doesn't mean smoking doesn't shorten a person's expected lifespan.

And you must have missed the editor's note about the statement "in the absence of other stats." Try not to ignore the facts of the post and get caught up in the wording. All I was saying there was that OVER TIME we don't need other stats to interpret what a pitcher's win total means. That the win total alone OVER TIME can give us a *sense* of how good a pitcher is.

Unless of course you actually believe that better pitchers don't win more games...OVER TIME; October 9, 2009 at 9:52 PM
John Lynch said...: The Professor,

Thanks for commenting. I'm always surprised when people find my little corner of the Internet, especially when it happens to take the form of honest to goodness interaction with the people to whom I link.

I have a variety of issues with your original post and your comment here.

1. Certainly it is true that over time better pitchers will tend to win more games on average than worse pitchers. I don't dispute this. We would be in agreement that in the absence of many other options, wins would become a good indicator of pitching ability.

2. However, just because wins are better than nothing doesn't mean they are good in general. Since we never find ourselves without other superior information in the modern world, wins are rendered useless relative to other statistics, even over the long run.

3. Thus, I must question why one would write an article whose point is either that wins are objectively a good statistic or that wins are a good statistic under a set of circumstances that will never arise. I believe the first point to be entirely wrong and I believe the second point to be true but unhelpful: if someone understands your point, they will have learned nothing that will further their ability to analyse baseball, given modern access to information; if someone misunderstands your point, they will erroneously believe that the wins statistic carries more weight than it does. This is the last thing discussions of pitching need.

4. You closed with the line: "So next time you hear somebody say Wins is a crappy way to evaluate a pitcher, throw a drink in their face and then make them read this post." Would you throw a drink in the face of someone who said that a horse and carriage was a crappy way to commute to work? Or that an abacus was a crappy way to perform complicated math problems? Or that a piece of wood to bite down on was a crappy substitute for anesthetic? Of course not! Yet at one time these were all great options, and they are still better than nothing. People could rightly call them crappy because, in the modern context, they are crappy. Wins are the same way.

5. Finally, even in the long run wins are subject to many biases that do not necessarily even out. I address this in my final paragraph. The classic example of this effect is, of course, Bert Blyleven. His .534 winning percentage over nearly 5,000 innings pitched is a pretty poor indicator of his true value. Even over the course of many, many, many games, there are still better ways to evaluate pitchers. So why settle for a form of measurement that fails to provide an accurate assessment of pitching ability when the modern world has rendered the need to do so completely obsolete? This, I believe, is the source of my (and others') confusion with your piece.

Finally, if we can agree that:
1. better pitchers tend to accrue more wins over time than worse pitchers, and
2. in the modern world, there are always better measurements of pitcher performance than wins, even over time,
then I do not think we ultimately disagree.

Again, I thank you for taking the time to comment here. If I still have not understood the thrust of your post, I am more than willing to be further informed.; October 9, 2009 at 11:09 PM


Key Stats

ARP
Adjusted Runs Prevented

ARP measures the number of runs that a relief pitcher prevented from scoring beyond what an average relief pitcher would have prevented. ARP is adjusted for the situation in which the pitcher was used.

ISO
Isolated Power

ISO is the ratio of extra bases that a player has accumulated to the number of at bats he has received. ISO is essentially a player's SLG minus his batting average. This has the effect of giving a player credit only for extra base hits. ISO is not a useful measure of player value on its own, but is a very effective measure of a player's extra base ability.
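As a quick check on the "SLG minus batting average" relationship, here is a minimal Python sketch. The stat line is invented and the function names are my own, not taken from any stats package.

```python
def batting_average(singles, doubles, triples, home_runs, at_bats):
    """Hits divided by at bats."""
    return (singles + doubles + triples + home_runs) / at_bats


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def isolated_power(doubles, triples, home_runs, at_bats):
    """Extra bases (bases beyond first on each hit) divided by at bats."""
    extra_bases = doubles + 2 * triples + 3 * home_runs
    return extra_bases / at_bats


# Hypothetical season: 100 singles, 30 doubles, 5 triples, 20 homers in 500 AB.
line = dict(singles=100, doubles=30, triples=5, home_runs=20, at_bats=500)

iso_direct = isolated_power(30, 5, 20, 500)
iso_from_slg = slugging(**line) - batting_average(**line)
print(f"ISO computed directly: {iso_direct:.3f}")
print(f"SLG minus AVG:         {iso_from_slg:.3f}")
```

Both routes give the same number because subtracting batting average removes exactly one base per hit, leaving only the extra bases.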

OBP
On Base Percentage

OBP is the ratio of the number of times a player reached base safely to the number of opportunities he had to reach base. It effectively measures a player's skill at not making outs. Since outs are a team's most precious commodity, OBP measures perhaps the most valuable and fundamental skill a player can have.
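For the curious, here is a small Python sketch of the standard OBP formula, which counts hits, walks and hit-by-pitches as times on base, and at bats, walks, hit-by-pitches and sacrifice flies as opportunities. The stat line is hypothetical and the function name is my own.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base divided by opportunities to reach base."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / opportunities


# Hypothetical season: 155 hits, 60 walks, 5 HBP, 500 at bats, 5 sac flies.
obp = on_base_percentage(hits=155, walks=60, hit_by_pitch=5,
                         at_bats=500, sac_flies=5)
print(f"OBP: {obp:.3f}")
```

Note that walks and hit-by-pitches appear in both the numerator and the denominator, which is why OBP uses a different denominator than batting average or SLG.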

OPS
On Base Plus Slugging Percentage

OPS is a crude metric that simply sums a player's on base and slugging percentages. It is probably the most popular non-traditional measure of overall batting performance due to its simplicity. However, it has drawn criticism from performance analysts for its inaccuracy relative to other advanced metrics and because it works by adding two numbers with different denominators together to produce a conceptually meaningless quantity. It is best used as a quick and dirty estimator of batting prowess.
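The "different denominators" complaint is easy to see in a short Python sketch. The stat line below is made up, and the helper names are my own rather than anything from a real library.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Return (OBP, its denominator)."""
    denominator = at_bats + walks + hit_by_pitch + sac_flies
    return (hits + walks + hit_by_pitch) / denominator, denominator


def slugging_percentage(total_bases, at_bats):
    """Return (SLG, its denominator)."""
    return total_bases / at_bats, at_bats


# Hypothetical season: 155 hits (255 total bases), 60 BB, 5 HBP, 500 AB, 5 SF.
obp, obp_denominator = on_base_percentage(155, 60, 5, 500, 5)
slg, slg_denominator = slugging_percentage(255, 500)
ops = obp + slg

print(f"OBP = {obp:.3f} (per {obp_denominator} opportunities)")
print(f"SLG = {slg:.3f} (per {slg_denominator} at bats)")
print(f"OPS = {ops:.3f} (per... nothing in particular)")
```

The sum is handy as a single number, but it is not a rate of anything: one term is measured per opportunity to reach base and the other per at bat.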

SLG
Slugging Percentage

SLG is the ratio of total bases that a player has accumulated to the number of at bats he has received. It is essentially a weighted batting average that gives a player more credit for extra base hits.

UZR
Ultimate Zone Rating

UZR is a defensive metric that uses play-by-play data to determine how good a player's defense is. On Fangraphs, it is denominated in runs saved above average.

VORP
Value Over Replacement Player

VORP measures the number of runs that a player contributed above what a "replacement player" at the same position would produce. VORP considers only offensive contributions.

WARP
Wins Above Replacement Player

WARP measures the number of wins that a player contributed above what a "replacement player" at the same position would produce. WARP considers both offensive and defensive contributions.

WXRL
Win Expectancy added above Replacement adjusted for Lineup

WXRL measures the number of wins that a relief pitcher contributed above what a "replacement player" would produce. WXRL differs from WARP because it is adjusted for both the game situation in which the pitcher was used and the hitters that the pitcher faced.
