Saturday, November 10, 2007

My frustration with "scouting"

A few years ago, Baseball Prospectus summed up the supposed dichotomy between scouting and objective statistical analysis with the phrase "beer and tacos." If someone asks you whether you would like beer or tacos, the correct answer is not one or the other, but both. They serve completely different purposes. Similarly, people often try to pit scouting and statistical analysis against each other as if they were opposite ends of the same spectrum. In reality, they are two totally different tools with the same aim: better decision making through better information.

Let me be perfectly clear about this: good scouting is absolutely essential to a well-run baseball team. Scouting provides data that raw statistics struggle to uncover. Scouts are very important and I have no quarrel with them.

Unfortunately, a disturbing trend has developed in the mainstream presentation of scouting data. Too often, analysts who are supposed to be providing the public with a scout's view of players have instead become nothing more than poor statistical analysts, justifying their use of small sample sizes with the vague notion that they are scouting. This type of analysis is not only completely useless, adding nothing to the discussion, but also damaging to scouting as a whole. Scouting should not become a tool for giving undue weight to a small sample of performances. It is supposed to supersede the small sample by providing data that cannot be gleaned from statistics alone.

I suppose an example or two is in order. Kevin Goldstein is a writer for Baseball Prospectus. In fact, he's supposed to be their scouting guru. He was brought to BPro in the hope of expanding their coverage beyond just statistical analysis. I like Kevin's columns and I read almost everything he writes. He writes a column every Monday that offers a small blurb on ten different prospects of note. Here is an example from his latest:

RHP Jake Arrieta, Phoenix Desert Dogs (Orioles)

Arrieta is becoming an offseason Ten Pack regular, as the Orioles keep pitching him an inning at a time, and he keeps putting up zeroes. At this point it’s gone from “nice start” to “downright impressive,” as Arrieta had his best outing yet on Saturday, striking out all three batters he faced. So far, the Orioles fifth-round pick who got first-round money has put together 12 scoreless innings over 10 appearances, while allowing just six hits and striking out 13. It’s a little too early to call him a steal, and his disappointing final college season is still in the back of people’s minds, but his timetable is on the verge of getting accelerated.

It's nice to know what Arrieta is up to, but if you're looking for any useful information here, you will be sorely disappointed. There isn't any. There is not one shred of scouting data in this blurb. The only information presented to the reader is some statistical data from a sample size so absurdly small that it is totally, utterly meaningless. Arrieta might be a good prospect, but this paragraph gives absolutely no reason to suppose that he is. If Goldstein knows what makes Arrieta great, he hasn't included that information here, and that defeats the purpose of his presence on the BPro staff.

A small sample size is a small sample size. Unless you present compelling evidence above and beyond the data itself that it should be given significance, you cannot glean any useful information from a small sample. Let's look at another example:

RHP Daniel Bard, Honolulu Sharks (Red Sox)

Friday’s Boston prospect rankings, like any prospect list, generated a lot of email. Most of it concerned guys who didn’t make it, like Brandon Moss or Craig Hansen, but nobody asked about Daniel Bard. Twelve months ago, that wouldn’t have been the case, because last year at this time, Bard was a highly regarded first-round pick who could touch 100 mph, although he had some issues when it came to command and secondary stuff. This year, the wheels fell off. Beginning the year at High-A and then spending the majority of the year at Low-A after a demotion, Bard finished the year with a 7.08 ERA and 78 walks in 75 innings. Using the Hawaii Winter League as an opportunity to find the magic once again, the good news is that Bard has a 0.69 ERA in 13 innings while allowing just seven hits. The bad news is that he’s walked 11. It doesn’t matter how hard you throw if you have no idea where it is going.

This paragraph is only slightly better. We hear about why Bard was highly regarded, but then Goldstein uses another small sample size to explain that Bard is in serious trouble as a prospect. There's only one problem: he gives not one single ounce of scouting evidence to suggest that Bard is struggling. Again, it doesn't matter if you say you are a scout: a small sample size is a small sample size, and by definition it adds absolutely nothing to the discussion unless it can be supported by extrastatistical evidence. That is what scouting is supposed to do. Goldstein and his scouting sources may know why Bard is struggling, but until that information is presented, Goldstein's paragraph amounts to little more than saying, "Bard is in trouble. Trust me, I know because I talk to scouts." How useful is that?

Scouting analysis can be done. Let's rewrite the Bard paragraph with some fictitious, though plausible, scouting analysis. The italicized part indicates my rewrite.

RHP Daniel Bard, Honolulu Sharks (Red Sox)

Friday’s Boston prospect rankings, like any prospect list, generated a lot of email. Most of it concerned guys who didn’t make it, like Brandon Moss or Craig Hansen, but nobody asked about Daniel Bard. Twelve months ago, that wouldn’t have been the case, because last year at this time, Bard was a highly regarded first-round pick who could touch 100 mph, although he had some issues when it came to command and secondary stuff. This year, the wheels fell off. Beginning the year at High-A and then spending the majority of the year at Low-A after a demotion, Bard's mechanics deteriorated. He began shortening his stride, causing a drop in his velocity. To compensate for this, he began to "muscle up" when throwing the ball, exerting greater effort with his upper body. His left shoulder was no longer positioned properly when the ball was released, causing him to lose any semblance of command or control. It doesn’t matter how hard you throw if you have no idea where it is going.

Now, I can't really write like a scout, but that is essentially what scouting information should look like. We aren't relying on any statistics at all. There is no small sample size.

Scouting analysis can work, but it must be disciplined. The moment it deteriorates into a parade of small sample sizes, it loses all of its value. I sincerely hope that Mr. Goldstein recognizes this so that we can reap the full benefit of his scouting connections.

There is hope. The one absolutely essential scouting columnist on the Internet is The Hardball Times' Carlos Gomez. Gomez provides a true scout's view of players, including some excellent takes on Joba Chamberlain versus Phil Hughes and Ian Kennedy versus Clay Buchholz. Gomez's analysis breaks down each player physically, analyzing how they do what they do and how what they do leads to results. Assuming that Mr. Gomez isn't just talking out of his ass, every writer who wants to write about scouting should aspire to his level of work. If that happens, we'll finally start reaping the benefits of beer and tacos.

**EDIT** Fixed minor spelling mistake.

6 comments:

Repoz said...

John, it looks like there will be no more enjoying Carlos Gomez's (ChadBradfordWannabe, over here at Baseball Primer) work, as the Diamondbacks have hired him on as a scout.

http://www.azcentral.com/members/Blog/NickPiecoro/10440

http://www.baseballthinkfactory.org/files/newsstand/discussion/chadbradfordwannabe_hire_by_d_backs_as_major_league_scout/

Chadbradfordwannabe said...

Thank you for your kind words and support. As you may have found out already, the Dbacks have hired me to scout for them.

I do truly appreciate your comments though.

Sincerely,
Carlos

John Lynch said...

Gosh darn it. I kind of figured this was going to happen. In fact, as I was writing it, I thought that perhaps I should check to see if Carlos hadn't been hired away.

*sigh*

Here's hoping someone else can carry on in your stead.

John Lynch said...

Woah. I had no idea that it happened the same day I wrote my post. What terrible timing! Oh well. Congratulations, Carlos!

D.Cous. said...

Scouting is VERY important... in baseball.

A good post. I do wish that you could've used an actual example of good scouting writing in the post (links notwithstanding), rather than your own approximation of it. I'm uninformed enough about the subject to be unable to tell the difference.

John Lynch said...

I do wish that you could've used an actual example of good scouting writing in the post (links notwithstanding), rather than your own approximation of it.

I wanted to demonstrate an approximation in the context of one of the already quoted articles. However, I aim to please, so here's a quote from the Joba Chamberlain/Phil Hughes comparison that Carlos Gomez wrote. His articles are just loaded with this type of analysis.

Note how well Hughes buries that right shoulder and gives his arm a nice, long arc that decelerates the arm more smoothly. Joba throws really hard, so it is a bit worrisome that he finishes a bit on the abrupt side.

It's an actual breakdown of the players' pitching process, exactly the kind of information that stats do not yet quantify well.