Saturday, May 02, 2015

Introducing DRA: The Latest Sabermetric Rage

Most readers of this blog are aware of the limitations of ERA or Run Average (RA) in evaluating pitcher performance.  Two of the biggest issues are:
  • RA gives pitchers full credit/blame for results of batted balls in play despite the fact that they share that responsibility with fielders.  For example, a pitcher with a strong defense behind him will tend to give up fewer hits (and thus fewer runs) than if he has a poor defense behind him.
  • RA gives pitchers full responsibility for sequencing or timing of events, that is, it assumes that they can control when they give up hits and walks. For example, if a pitcher pitches extraordinarily well with runners in scoring position in a given year, he will have a lower ERA than if he had a typical year in those situations. Additionally, a pitcher who tends to bunch base runners together in single innings will have a higher ERA than if he had a typical year distributing base runners more evenly.
In reality, pitchers have limited control over both the number of batted balls that drop for hits and the sequencing of events.  Thus, Defense Independent Pitching Statistics (DIPS) such as FIP, xFIP, tERA and SIERA have been developed to remove some of the noise of RA.  DIPS are based on things that pitchers do control for the most part: walks, hit batsmen, strikeouts, home runs and types of batted balls (ground balls, fly balls, line drives, pop flies).
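To make the DIPS idea concrete, here is a minimal Python sketch of FIP, the best known of these metrics.  The event weights are the standard published ones, but the league constant (3.10 below) is reset every season so that league FIP matches league ERA, so treat that value and the sample stat line as assumptions for illustration only.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    Uses only outcomes the pitcher largely controls: home runs, walks,
    hit batsmen and strikeouts. `constant` is an assumed league constant;
    the real value changes a little every season. `ip` is innings pitched
    as a true decimal (one third of an inning is 0.333..., not 0.1).
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line, not a real pitcher.
example = fip(hr=3, bb=8, hbp=1, k=30, ip=32.0)
print(round(example, 2))
```

Note that hits on balls in play never enter the calculation, which is exactly the part of RA that DIPS metrics set aside.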

Because they are based on things that pitchers essentially control, the DIPS metrics are said to be better measures of true talent than RA.  As a result, they are also better than RA at predicting future performance. However, they only measure a portion of a pitcher's talent and should be used as complements to RA rather than as replacements.  

It is not known exactly how much control pitchers have over the results of balls in play, but recent research tells us that some pitchers are better than others at preventing hits on balls in play.  For example, Mike Fast, formerly of Baseball Prospectus and now an MLB sabermetrician, used Sportvision's HITf/x data to show how pitchers varied in the speed of balls off the bat.

So, rather than making the big leap from RA to FIP, it seems to be a good idea to first meet halfway.  Instead of removing hit prevention and sequencing in one step, it might be better to remove one factor at a time.  Bill James did that with his Component ERA (ERC).  Applying the runs created methodology to pitchers, he determined what a pitcher's ERA should have been based on walks, hit batsmen, strikeouts, homers AND hits allowed.

Additionally, the Base Runs measure was created by David Smyth in the early 1990s.  It is based on the idea that we can estimate team runs scored if we know the number of base runners, total bases, home runs and the typical score rate (the score rate is the percentage of base runners that score on average).  Base Runs also works well for individual pitchers.  The complete formula can be found here.
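As a rough illustration, the sketch below implements one commonly cited version of Base Runs in Python.  Several variants exist with slightly different coefficients, so the numbers here are an assumption and may not match the linked formula exactly; the stat line in the example is made up.

```python
def base_runs(ab, h, tb, hr, bb):
    """Estimate runs allowed with one common form of Base Runs.

    a: base runners (home runs excluded, since they always score)
    b: advancement factor built from total bases, hits, homers and walks
    c: outs, approximated as at-bats minus hits
    d: home runs
    The score rate is b / (b + c), the share of base runners who come home.
    """
    a = h + bb - hr
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02
    c = ab - h
    d = hr
    if b + c == 0:
        return float(d)
    return a * b / (b + c) + d


# Hypothetical line: 120 at-bats, 30 hits, 45 total bases, 3 homers, 10 walks.
estimate = base_runs(ab=120, h=30, tb=45, hr=3, bb=10)
print(round(estimate, 1))
```

The structure mirrors the description above: base runners times score rate, plus home runs.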

There is now a new pitching metric that goes beyond any of the above measures.  This new measure, Deserved Run Average (DRA), was developed by Baseball Prospectus researcher Jonathan Judge with help from Harry Pavlidis and Dan Turkenkopf.  The statistic is introduced in an overview article and in a more complex article explaining all the math.

The first article explains that the calculation of DRA starts by assigning weights to each batting event (similar to the wOBA statistic for batters) according to how much, on average, they contribute to runs scored.  For example, a home run adds 1.4 runs on average and a double play costs 0.75 runs.  That is similar to what ERC and Base Runs do.
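The weighting step can be sketched in a few lines of Python.  Only the home run (1.4) and double play (0.75) values come from the article; the other run values and the sample sequence of events are placeholders assumed purely for illustration, not the weights DRA actually uses.

```python
# Run values per event. The home run and double play values come from the
# article; every other number is a rough placeholder assumed for illustration.
RUN_VALUES = {
    "home_run": 1.40,
    "double_play": -0.75,
    "single": 0.45,      # assumed
    "walk": 0.30,        # assumed
    "strikeout": -0.27,  # assumed
}


def runs_from_events(events):
    """Sum the average run value of each event a pitcher allowed."""
    return sum(RUN_VALUES[event] for event in events)


# Hypothetical inning: single, walk, double play, home run, strikeout.
inning = ["single", "walk", "double_play", "home_run", "strikeout"]
total = runs_from_events(inning)
print(round(total, 2))
```

Because each event gets its average value, the total ignores when the events happened, which is how this approach removes sequencing from the picture.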

The next steps are what separate DRA from its predecessors.  It adjusts for all kinds of context such as:
  • ballpark
  • whether pitcher is pitching at home or on the road
  • identity of opposing batter and handedness of batter
  • identity of catcher and how proficient he is at framing pitches
  • identity of umpire and how often he calls strikes versus balls
  • runners on base and number of outs before each plate appearance
  • run differential before each plate appearance
  • quality of defense behind pitcher
  • whether pitcher is starting or relieving
  • game time temperature
  • quality of base runners
  • ability of pitcher to control running game
  • responsibility of pitcher for wild pitches and passed balls

That's a lot of variables!

The result is a number that looks like an ERA but attempts to isolate the runs for which a pitcher is truly responsible.  For example, Tigers starter Anibal Sanchez has an ERA of 5.46, but other statistics indicate that he has not been so bad and is being charged with runs for which he is not responsible.  FIP has him at 4.11 and DRA makes him look even better at 3.93.  So, although he has been charged with more than five earned runs per nine innings, DRA says that he has been responsible for slightly fewer than four per nine innings.
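Since ERA, FIP and DRA are all on a per-nine-innings scale, the gap between them can be turned into a run total.  The Python sketch below does that for the Sanchez numbers; the 30-inning workload is a hypothetical round figure, not his actual innings pitched.

```python
def implied_runs(rate_per_nine, innings):
    """Convert a per-nine-innings rate (ERA, FIP or DRA) into total runs."""
    return rate_per_nine * innings / 9.0


INNINGS = 30.0  # hypothetical workload, not Sanchez's actual innings

charged = implied_runs(5.46, INNINGS)   # ERA from the article
deserved = implied_runs(3.93, INNINGS)  # DRA from the article
print(round(charged - deserved, 1))
```

Over that assumed workload, the difference between his ERA and his DRA works out to roughly five runs.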

One question you might have is whether the calculation of DRA is mathematically sound.  As a statistical programmer with experience in mixed models, the foundation of DRA, I think the method looks fine as far as I can tell.  Researchers such as Tom Tango and Brian Mills have reviewed it more thoroughly and seem to approve.  If you are mathematically inclined and want to see for yourself, you can read the in-depth article.
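For readers unfamiliar with mixed models, the key idea is that effects for things like catchers, umpires or parks are treated as random draws and are shrunk toward zero when there is little data.  The Python sketch below shows that shrinkage for a single random intercept with known variances.  It is a toy illustration of the general technique, not the actual DRA model, and the variance values and sample data are assumptions.

```python
def shrunken_effect(values, grand_mean, var_between, var_within):
    """Predict one group's random intercept (the BLUP) with known variances.

    values: observations for a single group (say, run values with one catcher)
    grand_mean: average over all groups
    var_between: assumed variance of true group effects
    var_within: assumed variance of individual observations around a group
    """
    n = len(values)
    if n == 0:
        return 0.0  # no data: fall back to the average group
    raw = sum(values) / n - grand_mean
    weight = n * var_between / (n * var_between + var_within)
    return weight * raw


# Same raw difference (0.2 runs), very different sample sizes.
small_sample = shrunken_effect([0.2] * 4, 0.0, var_between=0.01, var_within=0.09)
large_sample = shrunken_effect([0.2] * 100, 0.0, var_between=0.01, var_within=0.09)
print(round(small_sample, 3), round(large_sample, 3))
```

With four observations the estimate is pulled most of the way back to zero, while with a hundred it stays close to the raw difference.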

Do we need another pitching statistic?  I think we do.  ERA is not adequate in the short term for reasons discussed above.  FIP and other DIPS variations get closer to measuring pitcher talent but leave questions unanswered, such as what to do about batted balls in play.  Component ERA and Base Runs address batted balls but give all the responsibility to the pitcher, which is not correct.

Does DRA need to be so complicated?  That is what I am not sure about.  I like that they control for quality of batters faced and defense behind the pitcher, but do they need to control for game-time temperature?  There is not much written so far about how much each component contributes to a pitcher's value.  There are an awful lot of variables and I get a little uncomfortable when we jump from simple to complex so quickly.  These are questions rather than critiques, as I don't know the answers and look forward to further research and discussion.

The question most of you are probably interested in is how DRA evaluates Tigers pitchers.  The answers are shown in Table 1 below.

Table 1: DRA for Tigers Starters
Pitcher     ERA    FIP    DRA
Simon       3.13   3.36   3.25
Lobstein    3.91   3.32   3.36
Price       3.48   2.82   3.40
Greene      4.60   3.52   3.53
Sanchez     5.46   4.11   3.93
Data Source: Baseball Prospectus

Right-hander Alfredo Simon leads Tigers starters and is tenth in the American League with a 3.25 DRA.  This is only a little higher than his ERA (3.13) and slightly lower than his FIP (3.36).  So, he looks good by any measure so far.  Like Sanchez in the example above, the other Tigers newcomer, Shane Greene, fares much better on DRA (3.53) than on ERA (4.60).
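For a quick look at who gains or loses the most when moving from ERA to DRA, here is a short Python sketch that computes the difference for each starter.  The numbers are copied from Table 1; the function name and layout are just one way to do it.

```python
# (pitcher, ERA, FIP, DRA) copied from Table 1.
TABLE_1 = [
    ("Simon", 3.13, 3.36, 3.25),
    ("Lobstein", 3.91, 3.32, 3.36),
    ("Price", 3.48, 2.82, 3.40),
    ("Greene", 4.60, 3.52, 3.53),
    ("Sanchez", 5.46, 4.11, 3.93),
]


def era_minus_dra(rows):
    """Return {pitcher: ERA - DRA}, rounded to two decimals."""
    return {name: round(era - dra, 2) for name, era, _fip, dra in rows}


gaps = era_minus_dra(TABLE_1)
# Largest positive gap first: these pitchers look better by DRA than by ERA.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<9}{gap:+.2f}")
```

A positive gap means DRA is kinder to the pitcher than ERA.  Simon is the only starter with a negative gap, and his is small.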

So, we've got a brand new pitching statistic with a lot of potential.  It's probably somewhat better than other available numbers.  As with any new measure though, we should not view it as the end-all of pitching statistics, but rather as another number to view while evaluating pitchers.  Pitching evaluation has traditionally been quite muddy though, so a fresh new approach is very welcome.

2 comments:

  1. B.J. Rassam, May 06, 2015

    Interesting take on the stats, yet in almost any which way you cut the stats, the best pitchers seem to always rise to the top of the tier in stats analysis.

  2. Indeed, it's sort of a system of checks-and-balances that new stats correlate to some extent with other stats since any statistical measure inherently tries to present information that describes the quality of a player.

    If you cross-reference those different stat measures you will see that while they generally produce similar rankings, there is still some variance in the order. Even a 10% difference between one stat and another can produce a significant discrepancy. For example, imagine a player with a .300 BA. If another stat makes that player look anywhere from 10% worse to 10% better than a .300 BA suggests, the result is a potential range of .270 to .330, which is a sizeable spread.

    Take this chart, for example. Price has close to the full amount of that 60-point range as an advantage over Simon on FIP (2.82 versus 3.36), yet DRA makes up for all of that and then another 15 points to put Simon ahead.

    The key to statistical analysis is to look at each stat individually and to assess it precisely for what it is worth with respect to the story it tells. It really is an art form to isolate the worth of each independent stat while staying in tune with what the stat represents. You could take that one step further and weight each stat relative to the others when coming to an overall evaluation. A person who does that well is going to have an advantage in evaluating players based on numerical analysis.
