Basebology (The Study of Baseball): Measuring True Talent

Wednesday, May 23, 2007

Measuring True Talent

We've talked in previous posts about the need to have an accurate player model for objective player analysis. Furthermore, we've established that this player model should reflect a player's "true talent level"; that is, it should reflect a player's level of proficiency at a skill under average conditions.

The next step is to measure a player's true talent level. For example, if we are trying to study the effects of out-making among different players in baseball, we need a model of how players make outs. This model needs to accurately reflect the differences between players at this skill. Those players who don't make many outs should make few outs in our model. Those players who make many outs should make many outs in our model.

I know that all of this sounds trivial. In fact, you've probably already leaped many steps ahead. The reason I'm being so particular about these points isn't that they're hard to understand; it's to drive home the nature of the process. This is crucial for when we begin to model more complex behavior.

For something simple like out-making, our model can simply provide us with the probability that an out will be made given an opportunity to make an out. This sort of binary outcome (out, not out) is a well-explored problem and has many convenient statistical properties.
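To make the idea concrete, here is a rough sketch of what such a model could look like in Python. It assumes every opportunity is independent and carries the same out probability, which is a simplification. The function name `simulate_outs` and the 600-opportunity season are made up for illustration.

```python
import random


def simulate_outs(p_out, opportunities, seed=0):
    """Count outs over a number of independent opportunities.

    Assumption: each opportunity is an out with the same probability
    p_out, independent of every other opportunity.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    return sum(1 for _ in range(opportunities) if rng.random() < p_out)


# A hypothetical player with a 60% out-making talent over 600 opportunities.
outs = simulate_outs(p_out=0.60, opportunities=600)
print(outs, outs / 600)
```

The observed rate will land near 60% without usually hitting it exactly, which is the gap between true talent and results that the rest of this post is about.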

So how do we assign out-making probabilities to each player in our model? How do I know what Derek Jeter's true out-making talent level is?

The answer: I don't, nor can I ever know for certain.

This is a very unsatisfying answer, but like most things in science, we can cheat. We can estimate Derek's true talent level if we make a few assumptions.

First, we can assume that if Derek Jeter's true talent level at out-making is 60% (that is, he makes outs 60% of the time he's given the opportunity to), then in real life he will make outs around 60% of the time, given average conditions. Secondly, we can assume that even though he never sees exactly "average conditions", he sees enough of a variety of conditions that they tend towards average in the aggregate.

These two points imply two other very important points. First, because we don't have an infinite number of trials on which to examine a player, our estimate can only be expected to land in the right neighborhood. There is probably going to be some error. Secondly, the more trials we have, the more likely it is that the overall conditions will tend towards average. This is not guaranteed and needs to be verified when doing research. This may lead to adjustments in the way data is analyzed. For example, a player who played half his games in the pre-humidor Coors Field cannot be said to have played under "average conditions" no matter how many games he plays.
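One way to put a rough number on that "some error" is the standard error of a proportion. The sketch below again assumes independent opportunities with a fixed out probability (a binomial model), and the function name `standard_error` is just for illustration. It only covers random sampling error; it says nothing about systematic problems like the Coors Field example.

```python
import math


def standard_error(p_out, opportunities):
    """Standard error of an observed out rate under a binomial model.

    Assumption: opportunities are independent and share the same p_out.
    """
    return math.sqrt(p_out * (1.0 - p_out) / opportunities)


# The typical error shrinks as the number of opportunities grows.
for n in (100, 1000, 10000):
    print(n, round(standard_error(0.60, n), 4))
```

With 100 opportunities the typical miss is around five percentage points; with 10,000 it is around half a point.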

So after all this, what are we going to do? We are going to estimate a player's true talent level by looking at all the relevant data points in that player's career and assuming that they are an accurate reflection of the player's true talent. The more points we have, the more confident we can be in our estimate. So for Derek Jeter, we estimate that given an opportunity to make an out, he will do so 61% of the time (his career OBP is 0.390, and 1 - 0.390 = 0.610). We can apply this process similarly for other true talent levels, like home run rate or stolen base success rate.
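In code, this crude estimate is nothing more than a ratio. A minimal sketch, where the helper names and the stolen base counts are hypothetical and only the 0.390 OBP comes from the text above:

```python
def estimate_rate(successes, opportunities):
    """Estimate a true talent rate as observed successes per opportunity."""
    if opportunities <= 0:
        raise ValueError("need at least one opportunity")
    return successes / opportunities


def out_rate_from_obp(obp):
    """Out-making rate implied by an on base percentage."""
    return 1.0 - obp


# Career OBP of 0.390 implies an estimated out rate of about 61%.
print(round(out_rate_from_obp(0.390), 3))

# The same ratio works for other skills; these counts are made up.
print(estimate_rate(successes=40, opportunities=50))  # stolen base success rate
```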

This is a crude first attempt at a model, but it does illustrate the process and leaves us with a couple of places to go. First, how can we quantify our confidence in our estimate? Secondly, how can we make adjustments to the data points to improve the model? These aren't easy questions. For now, the important point is that even this simple model provides us with a starting place for objective analysis.

As a final point, it is important to note that we can be confident in this model because if it were wrong the results would have to show up in the real world. If Derek Jeter's true talent level at out-making is actually 70%, it is massively, apocalyptically unlikely that he would actually only make outs 61% of the time over the course of his career. Now, perhaps his true talent level is actually 60.75% or 61.3%. This could be within our range of acceptable error. However, we are not likely to be significantly wrong. This point is important because when we begin turning our model into tools that can be used to make decisions, conclusions that we don't like can't be hand-waved away by saying that the underlying model must be significantly wrong in this case. It might be, but only a fool is going to bet against it.
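To get a feel for how unlikely "massively, apocalyptically unlikely" is, here is a back-of-the-envelope sketch using the normal approximation to the binomial. The 8,000 opportunities is a round number assumed for illustration, not Jeter's actual career total, and the function names are hypothetical.

```python
import math


def z_score(observed_rate, true_rate, opportunities):
    """How many standard errors the observed rate sits from the assumed true rate."""
    se = math.sqrt(true_rate * (1.0 - true_rate) / opportunities)
    return (observed_rate - true_rate) / se


def lower_tail_probability(z):
    """P(Z <= z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


OPPORTUNITIES = 8000  # assumed round number, not a real career total

for true_rate in (0.70, 0.6075):
    z = z_score(0.61, true_rate, OPPORTUNITIES)
    print(true_rate, round(z, 2), lower_tail_probability(z))
```

Under these assumptions, a true 70% out-maker showing a 61% rate would be more than 17 standard errors below expectation, while a true 60.75% out-maker showing 61% is less than one standard error off.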

**EDIT** Fixed some typos and added some language to clarify some sentences.

2 comments:

D.Cous. said...: "...a player who played half his games in the pre-humidor Coors Field cannot be said to have played under 'average conditions' no matter how many games he plays."

What difference does the quality of cigar storage facilities at Coors Field make on playing conditions?

Primadonna baseball player: "This cigar is dried out! I can't work under these conditions!"

Team owner: "Fine, we'll get you boys a humidor. How many damn concessions must I make?"

May 23, 2007 at 1:47 PM

Anonymous said...: The humidor in this case is used by Coors Field personnel to keep the balls from drying out in the dry mountain air. This is thought to be the reason for the trend toward normalcy at Coors. From its inception until the introduction of the humidor, Coors was an amazingly wacky park for offense. The distinction of best hitters' park in baseball probably now belongs to whatever they're calling the Ballpark in Arlington these days.

May 23, 2007 at 7:37 PM

Key Stats

ARP
Adjusted Runs Prevented

ARP measures the number of runs that a relief pitcher prevented from scoring beyond what an average relief pitcher would have prevented. ARP is adjusted for the situation in which the pitcher was used.

ISO
Isolated Power

ISO is the ratio of extra bases that a player has accumulated to the number of at bats he has received. ISO is essentially a player's SLG minus his batting average. This has the effect of giving a player credit only for extra base hits. ISO is not a useful measure of player value on its own, but is a very effective measure of a player's extra base ability.

OBP
On Base Percentage

OBP is the ratio of the number of times a player reached base safely to the number of opportunities he had to reach base. It effectively measures a player's skill at not making outs. Since outs are a team's most precious commodity, OBP measures perhaps the most valuable and fundamental skill a player can have.

OPS
On Base Plus Slugging Percentage

OPS is a crude metric that simply sums a player's on base and slugging percentages. It is probably the most popular non-traditional measure of overall batting performance due to its simplicity. However, it has drawn criticism from performance analysts for its inaccuracy relative to other advanced metrics and because it works by adding two numbers with different denominators together to produce a conceptually meaningless quantity. It is best used as a quick and dirty estimator of batting prowess.

SLG
Slugging Percentage

SLG is the ratio of total bases that a player has accumulated to the number of at bats he has received. It is essentially a weighted batting average that gives a player more credit for extra base hits.

UZR
Ultimate Zone Rating

UZR is a defensive metric that uses play-by-play data to determine how good a player's defense is. On FanGraphs, it is denominated in runs saved above average.

VORP
Value Over Replacement Player

VORP measures the number of runs that a player contributed above what a "replacement player" at the same position would produce. VORP considers only offensive contributions.

WARP
Wins Above Replacement Player

WARP measures the number of wins that a player contributed above what a "replacement player" at the same position would produce. WARP considers both offensive and defensive contributions.

WXRL
Win Expectancy added above Replacement adjusted for Lineup

WXRL measures the number of wins that a relief pitcher contributed above what a "replacement player" would produce. WXRL differs from WARP because it is adjusted for both the game situation in which the pitcher was used and the hitters that the pitcher faced.
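The simple rate stats in this list (SLG, ISO, OBP and OPS) can be computed straight from counting stats. The sketch below uses the standard formulas; the function names and the season line are hypothetical and do not belong to any real player.

```python
def batting_average(hits, at_bats):
    """Hits per at bat."""
    return hits / at_bats


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base per opportunity to reach base."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return reached / chances


# Made-up season line for illustration only.
singles, doubles, triples, home_runs = 100, 30, 5, 15
at_bats, walks, hit_by_pitch, sac_flies = 500, 50, 5, 5
hits = singles + doubles + triples + home_runs

avg = batting_average(hits, at_bats)
slg = slugging(singles, doubles, triples, home_runs, at_bats)
iso = slg - avg  # extra bases per at bat
obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
ops = obp + slg  # the two terms have different denominators

print(round(avg, 3), round(slg, 3), round(iso, 3), round(obp, 3), round(ops, 3))
```

Note that ISO comes out the same whether it is computed as SLG minus batting average or as extra bases divided by at bats.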
