Saturday, February 28, 2009

Is VORP dead?

I've said a few times that one of the things I would love from folks who hate VORP is an honest critique of it from a scientific and analytic perspective. Naturally, those who hate VORP have not done this, because to do so would legitimize the real thing that they hate: the use of science to describe something that they believe transcends such vulgar quantifications. This is nonsense, and we don't need to get into it again.

Of course, one of the hallmarks of science is that the old is always being discarded in favor of the new. Indeed, VORP has come under some criticism recently, but not the banal criticism you've come to expect from the neanderthal luddites that have a stranglehold on print, radio, and television. Rather, it's been the subject of intelligent criticism from people who understand not just how VORP works, but how science works.

The bottom line is that VORP is flawed. You can read a good rundown of its flaws here. Essentially, VORP incorrectly calculates positional adjustments and underestimates the impact of walks (ironically, it has been noted) and doubles. These are fixable problems, of course, and it may be that VORP will be adjusted in the future to account for them. Baseball Prospectus has already started to overhaul WARP to adjust for some of its deficiencies.

In any case, using VORP is still miles better than falling back on the classical trifecta of batting average, home runs, and RsBI (for you, Cous). However, just because it was quick to arrive on the analytical scene or because it's promoted by the biggest name in sabermetrics doesn't mean that it deserves to stick around.

I haven't been hitting the stats too hard recently on this site. If and when I get back to it, don't be surprised to see VORP supplanted by a statistic with a better run estimator and positional adjustments. There's no reason to become attached to a particular statistic when we have the tools to progress past it.

Friday, February 27, 2009

Tim Raines: Master Thief

Poaching from Tom Tango again:
Tim Raines, in high-leverage situations, has reached 1B or 2B 755 times (1B+2B+BB+HB+ROE), and has 260 SB, 49 CS.  That’s 0.34 successful SB per opportunity.

Tim Raines, in low-leverage situations, has reached 1B or 2B 1312 times, and has 134 SB and 15 CS.  That’s .10 SB per opp.

Do you see that?  He stole a base 10% of the time in low-leverage situations, but 34% of the time in high-leverage situations.

So, to all those BBWAA writers who said that he didn’t steal enough… yeah, he didn’t pad his SB totals when the game situation didn’t matter. 
Would you like a player who successfully steals a base 34% of the time when it really matters, while only making an out 6% of the time? Yes?
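The rates in the quote are easy to check. Here is a minimal sketch that recomputes them from the counts given above; the dictionary layout and the `rates` helper are just illustrative names, not anything from Tango's post.

```python
# Counts as quoted above: times on 1B or 2B, stolen bases, caught stealing.
high = {"opportunities": 755, "sb": 260, "cs": 49}
low = {"opportunities": 1312, "sb": 134, "cs": 15}

def rates(split):
    """Return (SB per opportunity, CS per opportunity) for one leverage split."""
    opp = split["opportunities"]
    return split["sb"] / opp, split["cs"] / opp

high_sb, high_cs = rates(high)
low_sb, low_cs = rates(low)

print(f"high leverage: {high_sb:.3f} SB/opp, {high_cs:.3f} CS/opp")
print(f"low leverage:  {low_sb:.3f} SB/opp, {low_cs:.3f} CS/opp")
```

Rounded to two places, this gives the 0.34 and 0.10 steal rates from the quote, and the 6% caught-stealing figure for high-leverage spots.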

Why is this man not in the Hall of Fame?

Thursday, February 26, 2009

Properly interpreting projections

Every year, right about now, you see two related phenomena:
  1. Fans completely misinterpreting the predictions of objective projection systems.
  2. "Experts" releasing hilariously skewed subjective predictions.
What am I talking about? Specifically this: people, whether interpreting objective projections or making their own subjective projections, fail to discern the difference between saying that no individual team is likely to win 100 games and that the league is likely not to see any team win 100 games; the difference between not projecting a single pitcher to win 20 games and projecting the league not to have any 20-game winners; the difference between projecting that no single hitter is likely to hit 40 home runs and drive in 120 runners and projecting that the league is likely not to see any hitter hit 40 home runs and drive in 120 runners.

Do you think all these things are the same? If you do, you are failing to understand a basic concept in probability. There's no shame in this. Probability is freaking confusing. Unless you're really, really well-versed in it, you're going to screw it up all the time. I know I do. Nonetheless, let's dig a little deeper into this problem.

Let's say I have a game of chance. You roll a 20-sided die. If you roll a natural 20, I will give you twenty dollars. If not, you will owe me one dollar. What are your expected winnings? Well, we would project you to owe me one dollar 95% of the time and win twenty dollars 5% of the time. That means your expected winnings from any single die roll are exactly five cents (0.95 x -1.0 + 0.05 x 20.0). There are no ifs, ands, or buts about it. We would project you to win five cents.
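If you'd rather see the arithmetic run than take my word for it, here is a small sketch of the calculation. It uses exact fractions so there's no rounding to argue about; the variable names are mine.

```python
from fractions import Fraction

# One roll of a fair 20-sided die: a natural 20 pays $20, anything else costs $1.
p_win = Fraction(1, 20)
outcomes = [(p_win, 20), (1 - p_win, -1)]

# Expected value: each payoff weighted by its probability.
expected = sum(p * payoff for p, payoff in outcomes)

print(expected, "dollars per roll, i.e.", float(expected))
```

The result is 1/20 of a dollar, which is the five cents from the paragraph above.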

But what if 1,000 people play my game? Since the dice rolls are independent, we would project each of them to win five cents, just like in the individual game. But does this mean that we expect no one to win twenty dollars? Of course not! In fact, we would expect 50 people to win twenty dollars; we just can't predict which 50 people will win. Thus, each person expects to win five cents, even though roughly 50 people will win twenty dollars. In fact, the odds of not one person winning twenty dollars are less than one in 10,000,000,000,000,000,000,000.
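Both numbers in that paragraph fall out of two lines of arithmetic. This is a quick sketch assuming, as the post does, that the 1,000 rolls are independent.

```python
players = 1000
p_win = 0.05  # chance of a natural 20 on one roll

# Expected number of twenty-dollar winners among all players.
expected_winners = players * p_win

# Probability that every single player misses, given independent rolls.
p_nobody_wins = (1 - p_win) ** players

print(f"expected winners: {expected_winners:.0f}")
print(f"chance nobody wins: {p_nobody_wins:.2e}")
print("less than one in 10^22:", p_nobody_wins < 1e-22)
```

The chance that nobody wins comes out on the order of 10^-23, so "less than one in 10,000,000,000,000,000,000,000" holds up.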

So what good is our expectation of a five-cent win? Simple: that's the expectation that will give us the least error, measured as average squared error, over an infinite number of games. We can't improve on it because the only thing we haven't accounted for in our model of expectation is random chance. Note that this is different from the most likely outcome for one game. The most likely outcome from one game is that you lose a dollar. Nonetheless, we can't just ignore the (relatively) massive twenty dollar payout, even if it is (relatively) rare. That's how expected value works. It aggregates all possibilities into one number that has properly weighted all outcomes.*
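To make "least error" concrete, here is a sketch that assumes error means average squared error (my assumption; other error measures favor other guesses). It compares predicting the expected value against predicting the most likely outcome, then scans a grid of guesses to see which one does best.

```python
# The dice game: (probability, payoff in dollars).
outcomes = [(0.05, 20.0), (0.95, -1.0)]

def mean_squared_error(guess):
    """Long-run average squared miss if we predict `guess` for every roll."""
    return sum(p * (payoff - guess) ** 2 for p, payoff in outcomes)

error_at_mean = mean_squared_error(0.05)   # predict the expected value
error_at_mode = mean_squared_error(-1.0)   # predict the most likely outcome

# Scan guesses from -1.00 to 20.00 in one-cent steps and keep the best one.
guesses = [round(-1 + 0.01 * i, 2) for i in range(2101)]
best_guess = min(guesses, key=mean_squared_error)

print(f"error predicting +$0.05: {error_at_mean:.4f}")
print(f"error predicting -$1.00: {error_at_mode:.4f}")
print(f"best guess on the grid:  {best_guess:.2f}")
```

Predicting "you lose a dollar" is right 95% of the time, but under squared error it still scores worse than predicting five cents, because the rare misses are so large.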

The situation applies exactly the same way to baseball predictions. Some pitchers are going to win 20 games (probably); we just don't know which ones. Some teams are going to win 100 games; we just don't know which ones. Some hitters are going to hit 40 home runs and drive in 120 runners; we just don't know which ones.

That's why when fans react negatively to a projection system that predicts no pitcher to win 20 games, they are making a big mistake. And it's also why when you see Joe Expert predicting some team to win 100 games or some pitcher to win 20 games, he's making a big mistake. Yes, these events are probably going to happen, but it's a fool who thinks he can pick exactly which player or team is going to do it.

There are too many variables that go into a baseball season to determine which teams and players will hit the highs and lows. That's why any sane projection gives numbers that fit in a much narrower range than you will end up seeing in the real season. There's nothing wrong with this. There's nothing else you can do. If all that remains is random noise, then it will be impossible to improve on your projection's accuracy.** If you could eliminate it, it wouldn't be random, would it?
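Here is a toy simulation of why projected win totals bunch together more tightly than real standings do. Everything in it is an assumption for illustration: a 30-team league, made-up true-talent winning percentages from .420 to .580, and each game treated as an independent coin flip (real teams play each other, so this is a simplification).

```python
import random
import statistics

GAMES = 162
# Hypothetical true-talent winning percentages, evenly spread from .420 to .580.
talents = [0.42 + 0.16 * i / 29 for i in range(30)]
# A "perfect" projection system would project each team at its true talent.
projected_wins = [GAMES * p for p in talents]

def simulate_season(rng):
    """Each team plays 162 independent games at its true-talent win rate."""
    return [sum(rng.random() < p for _ in range(GAMES)) for p in talents]

rng = random.Random(2009)
seasons = [simulate_season(rng) for _ in range(300)]

projected_sd = statistics.pstdev(projected_wins)
average_actual_sd = statistics.mean(statistics.pstdev(s) for s in seasons)
average_best_record = statistics.mean(max(s) for s in seasons)

print(f"spread of projections (SD): {projected_sd:.1f} wins")
print(f"average spread of results:  {average_actual_sd:.1f} wins")
print(f"best projection: {max(projected_wins):.1f}, "
      f"average best actual record: {average_best_record:.1f}")
```

Even with projections that know every team's true talent exactly, the simulated standings spread out wider than the projections, and the league's best record typically beats the best projection. Luck alone does that.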

If you can, it often helps to look at player and team projections in terms of percentiles instead of mean expectation. This helps you get a sense of the uncertainty in the projection. You can more easily see highs and lows and see a variety of outcomes. However, if all you're looking at is the mean expectation, try to keep in mind my simple dice game. Even though you won't have the shape of the curve involved, you'll still understand that this is the expectation that will give you the least error when the real results finally come in.
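As a rough illustration of what a percentile view adds, here is a sketch for a single hypothetical team. I'm assuming a true-talent .550 club and treating its 162 games as independent, which makes the win total binomial; real projection systems are more sophisticated than this.

```python
from math import comb

GAMES = 162
TALENT = 0.55  # hypothetical true-talent winning percentage

def win_distribution(games, p):
    """Binomial probability of each possible win total, 0 through `games`."""
    return [comb(games, k) * p**k * (1 - p) ** (games - k) for k in range(games + 1)]

def percentile(pmf, q):
    """Smallest win total whose cumulative probability reaches q."""
    total = 0.0
    for wins, prob in enumerate(pmf):
        total += prob
        if total >= q:
            return wins
    return len(pmf) - 1

pmf = win_distribution(GAMES, TALENT)
mean_wins = GAMES * TALENT
p10, p50, p90 = (percentile(pmf, q) for q in (0.10, 0.50, 0.90))

print(f"mean projection: {mean_wins:.1f} wins")
print(f"10th / 50th / 90th percentile: {p10} / {p50} / {p90} wins")
```

The mean is a single number near 89 wins, but the 10th and 90th percentiles sit many wins apart, which is the uncertainty the mean alone hides.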

* Note that expected value is not necessarily the right variable to use for decision making. Value and utility are not necessarily the same thing. Furthermore, it is not true that the utility of the expected value of a particular choice is the same as the expected utility. For both baseball and my dice game, the two are likely to be close enough to be interchangeable. For more information, read about the St. Petersburg paradox (which is awesome, by the way).
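A small sketch of that last distinction, under assumptions that are entirely mine: a player who starts with a $100 bankroll and whose utility is the natural log of their wealth (a common textbook choice, not something the dice game requires).

```python
import math

bankroll = 100.0  # hypothetical starting wealth in dollars
outcomes = [(0.05, 20.0), (0.95, -1.0)]  # (probability, payoff)

def utility(wealth):
    """Log utility: each extra dollar matters a little less than the last."""
    return math.log(wealth)

expected_value = sum(p * payoff for p, payoff in outcomes)

# Utility of the expected value vs. the expected utility of the gamble.
utility_of_expected_value = utility(bankroll + expected_value)
expected_utility = sum(p * utility(bankroll + payoff) for p, payoff in outcomes)
gap = utility_of_expected_value - expected_utility

print(f"utility of expected value: {utility_of_expected_value:.6f}")
print(f"expected utility:          {expected_utility:.6f}")
print(f"gap:                       {gap:.6f}")
```

The two numbers differ, but only slightly, which is the "close enough to be interchangeable" case. With stakes that are large relative to the bankroll, the gap grows.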

** That's not to say that any projection system out there is truly left with nothing but random noise. There may be opportunities for genuine improvement, but these opportunities can never fully overcome randomness. Furthermore, these improvements must be derived through rigorous statistical and scientific processes, not someone's gut feeling or some arbitrary pattern they've pulled out of thin air. Those are not improvements.

Andre Dawson and why one must consider the whole picture

Tom Tango has a great post up here on Andre Dawson and his qualifications for the Hall of Fame.
The point is that the low OBP for Dawson can’t be looked at in isolation, as something he has to overcome.  It’s one piece of the puzzle, that’s all.
This is spot on, and it's something that I've been dwelling on over the past few months (and not just with respect to baseball).

We have a tendency as humans, excelling as we do at pattern recognition, to prefer discrete classifications to continuous valuations. This is a mistake most of the time. There isn't some hard and fast line at which an on-base percentage becomes acceptable for a Hall of Famer. Any low on-base percentage can be made up for by sufficiently greater production in another area.

It's so easy to get caught engaging in this kind of analysis. We think "Player X can't possibly be productive! He only gets on base 32% of the time and he's a singles hitter!" We may be right: Player X may not be productive. However, if Player X is also the best defensive shortstop in the league, there's a great chance that he is actually a highly valuable player.

At the same time, simply being the best defensive shortstop in the league doesn't automatically make Player X valuable. If his offense is poor enough, it doesn't matter that he's better at defense than his peers. In order to perform a proper valuation of Player X we need to quantify every single factor that we can and then see where our evaluation fits within the continuum of baseball players.

Now this isn't an excuse to fudge the numbers. Far from it. Just because we need the whole picture to make correct valuations doesn't mean we can suddenly start assigning arbitrary value to things like "intangibles" or "hustle" or "grit" or "heart." We still need to place analysis within the proper scientific framework.

We need to realize that things rarely fall into discrete buckets. When we bucketize continuous data, we leave ourselves open to error. That is the lesson.
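To show how a hard cutoff can mislead, here is a toy comparison. The players, the .330 on-base cutoff, and the runs-above-average figures are all made up for illustration; the only point is that a yes/no bucket on one stat can disagree with a valuation that adds everything up.

```python
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    obp: float
    batting_runs: float   # runs above average at the plate (made-up numbers)
    fielding_runs: float  # runs above average in the field (made-up numbers)

    @property
    def total_runs(self):
        """Continuous valuation: add up every component we can measure."""
        return self.batting_runs + self.fielding_runs

def passes_obp_cutoff(player, cutoff=0.330):
    """Discrete bucket: 'acceptable' only if OBP clears a hard line."""
    return player.obp >= cutoff

players = [
    Player("Player X", 0.320, -8.0, 20.0),   # weak bat, elite glove
    Player("Player Y", 0.345, 5.0, -15.0),   # decent bat, poor glove
]

for p in players:
    print(f"{p.name}: clears cutoff={passes_obp_cutoff(p)}, "
          f"total={p.total_runs:+.1f} runs")
```

In this made-up example the bucket rule rejects the more valuable player and accepts the less valuable one.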

Tuesday, February 17, 2009

The most wonderful time of the year

That's right! Spring is here! What's that you say? Haven't seen a robin hopping around in your yard yet? Never fear, fellow spring lover! An even more sure sign of the end of winter is upon us. Yes, baseball players have reported to camp, and virtually all of them are in the best shape of their lives!

Julio Lugo
Julio Lugo, who came in about 5 pounds heavier (seemingly all muscle) than he did last season, said he would have played in the World Series last fall had the Sox gotten that far. He was healthy by that point, and now is ready to begin the 2009 season, completely healed from the quadriceps injury that cut short his 2008 campaign. He's also, he said, in the best shape of his life.
Ryan Ludwick
Ludwick soon expressed remorse for what he said, although he added that he was in the best shape of his life after an active winter of weight training.
Juan Pierre
"That stuff is out of my hands," said Pierre. "I learned last year I can only control what I can control. I came in the best shape of my career and I'm ready to go. After last year, they can't throw too much stuff at me I won't be ready for. Everybody knows I want to play every day, but it's not in my hands."
Dustin Pedroia
Pedroia said he gave up ice cream over the winter as part of his training program and is in the best shape of his career. He said he's ready to help the team get back to the playoffs after losing to the Tampa Bay Rays in Game 7 of the American League Championship Series last October.
Ronnie Belliard
Bowden said Belliard is in the best shape of his career since he joined Washington two years ago. Belliard played in 96 games last year and hit .287 with 11 home runs and 46 RBIs. [John: WTF? Best shape of career since two years ago? What does that even mean?]
Brad Penny
Penny, who is younger than A.J. Burnett and in 2007 was third in the NL Cy Young Award balloting, is in the best shape of his career. He says his shoulder was so weak last year, "I never should have tried to pitch." But after a winter of dogged conditioning and buying into the Red Sox program, Penny says he is throwing "as well as I ever have" and pitching coach John Farrell now believes Penny will open the season in the rotation.
Freddie Sanchez
"I feel like I'm in the best shape of my career," Sanchez said. "This is the strongest I've ever been. There have been no setbacks in my throwing program. My shoulder is feeling good. It feels strong and I want to see where it all goes this season."
Ryan Howard
Howard's friends have been buzzing all winter about how committed he was to improving his defense and conditioning. And while it's a little early to assess his leather craftwork, there's no doubt he's in the best shape of his career, after dropping 20 pounds in the offseason.
Chris Lubanski
"I know I can run," he said. "I know I haven't run as much as I should have in the past. I've focused this offseason to get into the best shape of my life. That is one thing this year I'm going to show is my speed cause I know it is still there. I know I still have it."
Alex Gordon
He’s been a disappointment his first two years in the majors, but reports are he’s in the best shape of his life, and working with former Royals 3B Kevin Seitzer can’t hurt. 
Carlos Pena
"When I tell you I was heartbroken, I was heartbroken, man. I was like, 'You've got to be kidding me.' I waited for [the World Baseball Classic] two years, I couldn't be a part of this last time. Now I'm in it, and I can't be a part of it. But at the same time, I'm making sure I take advantage of this time to heal. And I think it will all work out for the best. So I haven't spent too much time dwelling on the fact I won't be a part of it. Instead, I'm like, 'All right, I've got a lot of time to get into the best shape of my life.'"
What's amazing about this list is that it covers the entire spectrum of MLB players. For Pete's sake, last year's AL MVP says that this year he's in the best shape of his life. I have bad news, Dustin: you aren't likely to improve on last year.

Again, I have no doubt that the players believe this and that for some of them it's actually true. What kills me is that writers bother to use it, and unironically at that. We get it. You worked hard during the offseason. Now let's talk about something interesting!

Also, here is a special bonus "best shape" article:
[Padres relief pitcher Heath] Bell said Monday that his workouts with the Nintendo Wii Fit were a big reason why he dropped about 25 pounds during the offseason and why he considers himself in "tip-top" shape as he prepares for his role as closer now that icon Trevor Hoffman has moved to Milwaukee.

As for the Nintendo Wii, Bell often played that with his wife and children at their home in Florida. It wasn't until the Bells purchased the Wii Fit game and board that he started doing the exercises that helped him get in what he considers the best shape he's been in since joining the Padres in 2007. 
Woohoo! Happy baseball to all!

PS: Thanks, Google News!

Thursday, February 12, 2009

Homer Simpson teaches us a lesson about steroids

Homer: Not a bear in sight. The Bear Patrol must be working like a charm.
Lisa: That's specious reasoning, Dad.
Homer: Thank you, dear.
Lisa: By your logic I could claim that this rock keeps tigers away.
Homer: Oh, how does it work?
Lisa: It doesn't work.
Homer: Uh-huh.
Lisa: It's just a stupid rock.
Homer: Uh-huh.
Lisa: But I don't see any tigers around, do you?
[Homer thinks of this, then pulls out some money]
Homer: Lisa, I want to buy your rock.

The Epistemological Steroid Problem

Over at ESPN, Buster Olney advocates for someone to stand up and tell the truth:
Could someone stand up and offer an unvarnished truth? Could someone please be fully credible and open and offer a complete version? Or are we going to see, day after day after day, these carefully crafted apologies, designed to tackle a public relations problem but really having nothing to do with honesty.
I couldn't agree more. Let's have the truth, the whole truth, and nothing but the truth. But therein lies the rub: how do we know that's not already what we got? The problem with all the handwringing about athletes not telling the truth is that it presumes that they have not already told us the truth. And why do we believe that they have not told us the truth? Because it does not fit what we believe the truth to be.

This is an impossible problem. Any athlete who comes out and provides a story that does not conform perfectly to what we already believe the truth to be is simply presumed to be lying. We are making assumptions that can't possibly be assumed and claiming to know things that can't possibly be known when we take this position.

I grant that athletes do not have a great track record here. Furthermore, all the incentives are lined up to tell as little of the truth as possible. I'm not saying you have to believe a baseball player who tells you that he bought HGH but didn't use it. What I'm saying is that you can't simply presume that what they're saying is false simply because it doesn't fit what you think the story is.

Monday, February 9, 2009

A-Rod speaks

You can read the details here.

Alex admitted to taking PEDs between 2001 and 2003. He says he did it to cope with the pressure of being the highest paid athlete on the planet. He says he has been clean since then.

I want to believe Alex. I want to believe that he's sorry, that he only used for a couple years, that he hasn't used since 2003. It's important that you, the reader, know that because it can't help but color what I have to say. He's the best player on my favorite team. There's no way for me to be unbiased about this.

Much of what Alex says rings true to me. He admits to a much longer period of use than he "had" to. His reasons for taking them seem plausible. He seemed genuinely sorry, inasmuch as I can tell someone seemed genuine.

If Alex is telling the truth, and I mean the whole truth, not some Andy Pettitte truth where your story keeps evolving as more evidence is released against you, then I think he's done the right thing. I think people will be more receptive to this than they will be to a Roger Clemens-style assault.

What if Alex isn't telling the truth? I don't want to think that's the case, but the skeptic in me admits that it is necessary. If you want to get full-blown cynical about it, athletes have a poor record of truth telling when it comes to personal failings. Alex's story could all be a very well-conceived attempt to hit the "sweet spot" between admitting so little usage that people think you're lying and admitting so much usage that people don't care that you are sorry. Are we really supposed to believe that all of his Yankee years, in which he has the most personal investment now and in the future, and all of his MVP years, which are critical to his legacy, are clean? That only a few years were not clean? That seems awfully convenient.

*sigh*

I can't be that cynical. Like I said, I want to believe A-Rod. It is my hope that he is telling the truth and that we can all move on from this. For now, I'm gonna take him at his word.

More A-Rod

If you're looking for a thoughtful take on the A-Rod situation (and profanity-free!), try:
  • King Kaufman
    So maybe the records from that time -- which for all we know is still going on, since the cheaters are always ahead of the testers -- are tainted. But they aren't any more tainted than the records from the time when baseball was segregated.
  • Rob Neyer
    I hope Alex Rodriguez didn't cheat. If we do find out that he cheated, I will wish that he hadn't. But whatever happens, I'm not going to change my opinion that he's a great baseball player. Like many of the greatest players, he'll do whatever it takes to be the best player he can be. For a stretch of five or 10 years -- and yes, perhaps even today still -- being the best player could have meant cheating. Maybe the cheaters were wrong; that's the direction in which I lean, probably because I've got a streak of the moralist in me. But I will not sit idly while great athletes looking for an edge -- not all that different from the many generations before them -- are demonized by the high priests of baseball opinion. I will not.
If instead you want some hysterical, over-the-top reactions, try:
  • Jayson Stark
    In baseball, we love our numbers. And we love our heroes. And that brings us to Alex Rodriguez, a man who has committed a crime he doesn't even understand:

    A crime against the once-proud history of his sport.

    A-Rod didn't commit that crime alone, of course. In many ways, he is just the latest, greatest face of a mass conspiracy that has now succeeded in obliterating the quality that used to separate baseball from the rest of the sporting jungle.

    Once, the numbers of baseball used to mean something special and magical. And the men who compiled those numbers were transcendent figures in American life.

    But not now. Not anymore.
  • Peter Abraham
    At this point, anybody who played the game in the last 15 years is guilty until proven innocent. Nobody gets a pass. Rodriguez is the most physically talented player in decades. If he decided he had to cheat, everybody else has to be a suspect. Don’t forget, there are still 103 names out there just waiting to be leaked.

    Mike Mussina went from being bounced out of the rotation to his first 20-win season. Suspect. Mariano Rivera never seems to take a step back. Suspect. Derek Jeter plays every day. Suspect. Joba Chamberlain sure throws hard. Suspect. Two years ago you would bet your house on those guys being clean. Would you bet $20 now? You can’t be sure about any player, not even the supposed good guys. If you are, you’re hopelessly naive.
Mr. Abraham's opinion and Mr. Stark's opinion are exactly why I can't stand the steroid saga in baseball. It's not that I condone steroid use, or that I don't view steroids as cheating, or that I think we shouldn't care. I just don't see why steroid use is a bigger problem than segregation, game fixing, gambling, bat-corking, ball-scuffing, or any of the other ethical problems that baseball has faced over the years.

Baseball players aren't heroes. They never have been. That's not to say that some aren't admirable. I'm sure there are good men playing baseball. But on balance, baseball players are actually just like you and me and the rest of humanity.

If you can say that you've never cheated on your taxes, never illegally downloaded a piece of software, a track of music or a movie, never used any sort of recreational drug, never lied to a boss, coworker, or business partner, never cheated on your spouse, and never driven drunk, then you can go ahead and get up on your soapbox to decry baseball players for failing to be saints.

The rest of us should probably remember that it is in human nature to cheat when it will benefit us. That doesn't make it right. That doesn't excuse anyone for cheating, but it should give us pause when we set out to vilify a select group of people simply because their job is higher profile than ours.

Saturday, February 7, 2009

Shoot me

Please.

Seriously, I wish I could say that I didn't care about this. I do, though probably not for the reasons most care.

I just want this whole steroid hysteria to go away. We get it. Some players did steroids. Some did not.

So fucking what?

Seriously.

So fucking what?

Athletes do whatever they can to win. They always have. Steroids are just another manifestation of that. That doesn't mean we have to turn the whole goddamn situation into a media shitstorm every time someone is "outed." That doesn't mean we have to go around leaking supposedly anonymous tests that are legally sealed. That doesn't mean that a whole era of baseball is "tainted" or invalidated or whatever. For crying out loud, it's still not even a matter of fact that steroids even assist baseball performance.

We've covered this before, so I don't want to turn this into a 1,000,000,000 word post. Every era has different quirks and idiosyncrasies that make direct comparisons across eras impossible. Steroids are just another thing that may or may not need adjusting for. They are, in this respect, no different from large ballparks, loosely-wound baseballs, poor playing surfaces, shoddy equipment, heavier bats, better nutrition, better training, better medicine, or any other factor that influences baseball performance.

The only reason that steroids are vilified the way they are is because it gives media pricks the chance to stand up and ram their sanctimonious, moralizing, narrow, shallow, uninformed opinions down our throats.

Yes, let's try to keep steroids out of baseball. I think that's fine. But we also need to move on and acknowledge that what happened happened. Please, spare me the histrionics.

I just want to enjoy baseball again. I want us all to be able to enjoy baseball again.

Won't somebody please think of the children?

Tuesday, February 3, 2009

If a baseball player trains in the woods and nobody interviews him, is he still in the best shape of his life?

Who knows?

Gonzalez has lost 15 pounds. He’s a fit 6 feet 2, 200 pounds and in the best shape of his life, he says. He had to have surgery in October to remove a fatty tumor from his back called a lipoma, but it was benign, and he recovered in three weeks.
Kasey Kiker:
“I’m definitely in the best shape of my life,” Kiker said. “Everybody tries to point the finger at, ‘You’re on the disabled list, and you got hurt because you’re not in shape.’ It’s fair, to a point. I’ve worked as hard as I could.”
Dunn is not Ramirez, mind you. But he's 7 1/2 years younger, said to be in the best shape of his career and the only major leaguer to hit 40 or more homers in each of the past five seasons.
Russell Martin:
Russell Martin, the hard-nosed catcher for the Los Angeles Dodgers, has taken up yoga.

That’s right, the native of Chelsea, Que., best known for his dogged blue-collar work ethic and play-through-all-pain intensity, has made yoga an integral part of his off-season workout regimen for the first time this winter.

“I’ve never felt better,” he said.