In a recent post, I used the RE24 statistic to measure batting contribution including situational hitting. The statistic is appealing because it gives batters more credit for hits with runners on base than for hits with the bases empty. This concept can also be applied to pitchers, relievers in particular.
Statistical evaluation of relievers is difficult for a couple of reasons:
- They pitch so few innings that their statistics can be influenced heavily by a couple of really bad outings.
- Their actual value depends on game situations more than that of any other type of player (this is the problem addressed here).
Using ERA to evaluate relievers is problematic because relievers often enter with runners on base and allow runs that are charged to other pitchers. So, a pitcher could have a low ERA without actually being that effective. FIP, which is based on walks, strikeouts and home runs allowed rather than runs allowed, is better, but it still does not consider the game situations in which a reliever pitched.
The RE24 metric estimates the number of runs a pitcher saved or cost his team based on the singles, doubles and all other events he allowed, including outs. It also considers the situations in which these events happened. For example, if Tigers southpaw Tom Gorzelanny enters a game with two men on base and nobody out and retires the side, he will get more credit than if he comes in with the bases empty. Gorzelanny gets more points in the first scenario because there was greater potential for run scoring. Thus, Gorzelanny saves the Tigers more runs if he frequently pitches well with runners on base than if he always starts an appearance with the bases empty.
The bullpen RE24 for all American League teams is shown in Table 1 below. The Tigers have an RE24 of 17.9, which means that their bullpen has saved them an estimated 18 runs compared to an average staff with the same number of outs. The interpretation is a little misleading because the average also includes starters. However, all bullpens are compared to that same average, so the ranks are telling, and only the Royals (35.9) and Astros (23.6) have been better than the Tigers.
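To make the Gorzelanny example concrete, here is a minimal Python sketch of how credit for one relief appearance could be tallied. The run expectancy numbers are placeholder assumptions chosen for illustration, not values taken from FanGraphs, and `pitcher_re24` is a hypothetical helper name.

```python
def pitcher_re24(re_start: float, re_end: float, runs_scored: int) -> float:
    """Runs saved (+) or cost (-) by the pitcher between two base/out states."""
    # The pitcher is credited with the drop in expected runs,
    # minus any runs that actually scored while he was pitching.
    return re_start - re_end - runs_scored


# Hypothetical run expectancy values (assumptions for illustration only).
RE_TWO_ON_NO_OUT = 1.45  # assumed: two runners on, nobody out
RE_EMPTY_NO_OUT = 0.48   # assumed: bases empty, nobody out
RE_INNING_OVER = 0.0     # three outs: no more runs can score this inning

# The reliever retires the side without allowing a run in both scenarios.
with_runners = pitcher_re24(RE_TWO_ON_NO_OUT, RE_INNING_OVER, runs_scored=0)
bases_empty = pitcher_re24(RE_EMPTY_NO_OUT, RE_INNING_OVER, runs_scored=0)

print(f"two on: {with_runners:.2f}, bases empty: {bases_empty:.2f}")
```

Whatever the exact table values are, the reliever who escapes the two-on jam gets the larger credit because he started from a state with more expected runs.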
Table 1: AL Team RE24, May 24, 2015
Team | RE24
Royals | 35.9
Astros | 23.6
Tigers | 17.9
Rays | 7.3
White Sox | 4.2
Angels | 3.0
Twins | 3.0
Blue Jays | -0.5
Yankees | -1.0
Red Sox | -1.1
Orioles | -4.0
Indians | -7.7
Rangers | -8.6
Mariners | -11.6
Athletics | -22.5
Data source: FanGraphs.com
The American League RE24 leaders among relievers are shown in Table 2 below. Astros right-hander Will Harris heads the list at 11.1, followed by Roberto Osuna of the Blue Jays (9.9). The Tigers have two pitchers in the top 15: closer Joakim Soria (6.8) and middle man Alex Wilson (6.0).
Table 2: AL RE24 Leaders, May 24, 2015
Name | Team | RE24
Will Harris | Astros | 11.1
Roberto Osuna | Blue Jays | 9.9
Wade Davis | Royals | 8.8
Dellin Betances | Yankees | 7.4
Brandon Gomes | Rays | 6.9
Joakim Soria | Tigers | 6.8
Glen Perkins | Twins | 6.6
Andrew Miller | Yankees | 6.3
David Robertson | White Sox | 6.2
Alex Wilson | Tigers | 6.0
A.J. Ramos | Marlins | 6.0
Shawn Tolleson | Rangers | 5.9
Brad Boxberger | Rays | 5.8
Zach Duke | White Sox | 5.7
Data source: FanGraphs.com
Table 3 shows the performance of the rest of the Tigers in terms of RE24, including Gorzelanny (3.5) and left-hander Blaine Hardy (2.4). The only current Tigers relievers below zero are Joba Chamberlain (-0.2) and Al Alburquerque (-0.7), and they are nowhere near the bottom of the league.
Table 3: Tigers RE24, May 24, 2015
Player | RE24
Joakim Soria | 6.8
Alex Wilson | 6.0
Tom Gorzelanny | 3.5
Blaine Hardy | 2.4
Angel Nesbitt | 0.7
Joba Chamberlain | -0.2
Al Alburquerque | -0.7
Data source: FanGraphs.com
Can the Tigers bullpen keep up its surprising third-place ranking? Without much dominant stuff on the staff, it will not be easy. Their 3.65 FIP is good enough for fifth in the league, but their strikeout rate of 6.9 per nine innings is second worst, ahead of only the Twins. The fact that the starting staff leads the league with 6.2 innings per start has helped limit the exposure of what was supposed to be a leaky pen. With Anibal Sanchez struggling and Kyle Lobstein now on the disabled list with a sore shoulder, it is questionable how long that will last.
Still, it's hard not to be happy with the bullpen performance so far. This was a group which many predicted would be close to the bottom of the league. They will likely need reinforcements soon, and hopefully rehabbing flamethrower Bruce Rondon will be one of them. They might not need as much help as originally thought, though.
Many Tigers fans have been complaining that the Tigers are leaving way too many men on base and not scoring nearly as many runs as they should. That is something fans talk about every year and it's usually not true. This year, however, the masses seem to be correct. The Tigers currently lead the American League with a .774 OPS but are only sixth with 4.4 runs per game. So, something seems amiss both by observation and by the numbers.
Entering today's action, the Tigers lead the American League with 216 Runs Created. Simply stated, this means that a typical team with a .280/.346/.428 batting line a quarter of the way through the season would be expected to have 216 runs scored. The Tigers have only scored 190 runs which is 26 (or a whopping 12%) short of where they should be. Table 1 below shows that no team in the league has a bigger negative differential between Runs and Runs Created (RC).
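The shortfall arithmetic above is easy to check. The following is a small Python sketch using only the two Tigers totals quoted in the paragraph; `run_shortfall` is a hypothetical helper name, not something from FanGraphs.

```python
def run_shortfall(runs: int, runs_created: int) -> tuple[int, float]:
    """Return (Runs - RC, shortfall as a percentage of RC)."""
    diff = runs - runs_created
    # A negative diff means the team scored fewer runs than RC predicts.
    pct_short = -diff / runs_created * 100
    return diff, pct_short


# Tigers totals from the text: 190 actual runs, 216 Runs Created.
diff, pct_short = run_shortfall(runs=190, runs_created=216)
print(diff, round(pct_short))  # -26 12
```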
Table 1: Differences In Runs and Runs Created for AL Teams, May 22, 2015
Team | Runs | RC | Diff.
Detroit | 190 | 216 | -26
Toronto | 226 | 209 | 17
Kansas City | 207 | 205 | 2
Oakland | 187 | 191 | -4
New York | 188 | 190 | 1
Cleveland | 179 | 188 | -9
Houston | 189 | 178 | 11
Texas | 176 | 178 | -2
Tampa Bay | 170 | 176 | -6
Baltimore | 180 | 173 | 7
Boston | 162 | 169 | -7
Seattle | 156 | 165 | -9
Minnesota | 185 | 159 | 26
Los Angeles | 163 | 143 | 20
Chicago | 143 | 143 | 2
Data source: FanGraphs.com
A quick look at basic situational statistics begins to explain what is happening. The Tigers lead the league with a .289 batting average with the bases empty, but are sixth in the AL, batting .270, with runners on base. As Neil Weinberg of TigsTown points out, they are particularly bad in one specific situation, a runner at first base only:
...the Tigers are horrible with men on first base only. You’re probably thinking that’s a weird thing, and you’re right. The Tigers have a .704 OPS in 273 PA with a man on first only. In every other situation (1258 PA), they have a .780 OPS.
Another way to look at situational hitting is with the RE24 statistic. Statistics like on-base percentage, slugging average and OPS don't address situational hitting. Traditional fans like to use Runs Batted In, but that is a team-dependent statistic. A player has more or less opportunity to drive in runs depending on who is batting in front of him.
Other fans point to batting average with runners in scoring position, but that is based on a limited number of plate appearances. It also doesn't consider the number of outs, the specific base runners (e.g. bases loaded versus second base only) or the type of hit (single, double, triple or home run). It also ignores a player's performance when no runners are in scoring position.
What we want is a statistic which gives a player credit for everything he does, including situational hitting. Batting Runs Above Average by the 24 Base/Out States (RE24), found at FanGraphs, does just that. In the past, I have discussed just plain Batting Runs (see the bottom section of the linked article). Batting Runs (RAA) is an estimate of how many runs a player contributed to his team beyond what an average hitter would have contributed in his place. RE24 is similar to RAA except that it uses base/out states in the calculation. An example of a base/out state is "runners at first and third and one out". There are 24 possible base/out states, and RE24 takes all of them into consideration.
In the calculation of Batting Runs, a double with the bases loaded and two outs counts the same (0.770 runs) as a double with the bases empty and no outs. On the other hand, RE24 counts the bases loaded double more than the bases empty double (2.544 versus 0.632) because it does more to increase the expected runs scored in the inning.
RE24 for one at bat is the difference between run expectancy at the beginning and end of a play. For example, suppose JD Martinez bats with a runner on first and one out. In that situation, we would expect 0.556 runs to score by the end of the inning. Assume that Martinez then doubles, putting runners on second and third with one out. In that situation, we would expect 1.447 runs to score by the end of the inning. Therefore, Martinez's double is worth 0.891 runs.
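The Martinez example can be written out as a short Python sketch. Only the two run expectancy values quoted above are filled in; the state labels, the `RUN_EXPECTANCY` table layout and the `play_re24` name are my own assumptions for illustration. Runs that score on the play itself are added in, which is the usual RE24 convention even though no run scores in this example.

```python
# Run expectancy keyed by (runners, outs). Only the two states quoted
# in the text are included; a full table would have all 24 states.
RUN_EXPECTANCY = {
    ("1--", 1): 0.556,  # runner on first, one out
    ("-23", 1): 1.447,  # runners on second and third, one out
}


def play_re24(start, end, runs_scored=0):
    """Change in expected runs for one play, plus runs scored on the play."""
    return RUN_EXPECTANCY[end] - RUN_EXPECTANCY[start] + runs_scored


# Martinez doubles with a runner on first and one out; nobody scores.
double_value = play_re24(("1--", 1), ("-23", 1))
print(round(double_value, 3))  # 0.891
```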
Summing RE24 over all of a batter's plate appearances yields his season total RE24. For example, Martinez has an RE24 of -2.8 this year. So, by this measure, he has contributed about 3 runs fewer than an average batter would have been expected to contribute given the same opportunities. This is quite a bit lower than his 6.4 Batting Runs, which means that Martinez has not been very good in situations with high run expectancy. We can estimate that he has contributed 9 fewer runs than RAA indicates.
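The gap between the two measures is a simple subtraction. Here is a sketch using the Martinez figures from the paragraph above; `situational_gap` is a hypothetical name for the RE24 minus RAA differential.

```python
def situational_gap(re24: float, raa: float) -> float:
    """RE24 minus Batting Runs (RAA); negative suggests poor situational hitting."""
    return re24 - raa


# J.D. Martinez, from the text: RE24 of -2.8 and 6.4 Batting Runs.
gap = situational_gap(re24=-2.8, raa=6.4)
print(round(gap, 1))  # -9.2
```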
Table 2 below shows the differential between RE24 and RAA for all Tigers regulars. Most of the Tigers have negative differentials indicating their situational hitting has been poor. The worst offenders have been Martinez and Nick Castellanos (-7). The only Tiger with a positive differential so far is catcher Alex Avila at +2.
Table 2: Differences in Batting Runs (RAA) and RE24 for Tigers Regulars, May 22, 2015
Player | RAA | RE24 | RE24-RAA
Cabrera | 19 | 15 | -4
J. Martinez | 6 | -3 | -9
Gose | 5 | 6 | -1
Cespedes | 5 | 5 | 0
Kinsler | 4 | 4 | 0
Iglesias | 4 | 1 | -3
Davis | 3 | 2 | -1
McCann | 2 | -2 | -4
Avila | 0 | 2 | 2
Castellanos | -2 | -9 | -7
V. Martinez | -7 | -8 | -2
Data source: FanGraphs.com
So, it's true that the Tigers are not scoring as many runs as they should. This is something that should even out over the course of the season, though, especially since much of the problem lies in one random situation (runner on first only), as Weinberg found. If they remain at the top of the league in OPS and wOBA, you can expect them to start scoring more as the season progresses.
When the Tigers lost starters Max Scherzer to free agency and Rick Porcello to a trade, there was a lot of concern about the Tigers' staff in 2015. Concern turned to angst when another starter, Justin Verlander, went down with a triceps strain in spring training. They would now have to rely on Shane Greene, Kyle Lobstein and Alfredo Simon to keep opposing offenses in check. That, combined with a very questionable bullpen, made preventing runs appear to be a daunting task for the Tigers. But here we are almost a quarter of the way through the season, and the Tigers have allowed only 4.1 runs per game, which is actually down from 4.3 last year. How is this happening?
As it turns out, Greene, Lobstein and Simon have held their own, and the bullpen, led by new closer Joakim Soria, has not been bad. However, right-handed starter Anibal Sanchez has struggled, and Price has not quite replaced Scherzer's performance of the last two years. Various measures of pitcher contribution to run prevention show that the Tigers staff has not been as good this year. The standard FIP statistic has increased from 3.60 in 2014 to 3.76 in 2015. Other measures make the staff look even worse, with xFIP rising from 3.76 to 4.11 and SIERA from 3.71 to 4.15.
So, while the staff has not been awful, it is not the reason for the improved run prevention. That leaves the defense, and the gains there have been remarkable. The upgrades have been clear to anyone who follows the team closely. A now healthy Jose Iglesias has been a magician at shortstop, and the outfield defense has improved markedly with the additions of left fielder Yoenis Cespedes and center fielder Anthony Gose and the subtraction of right fielder Torii Hunter. Even Nick Castellanos has gotten better at third base.
Last year, defense cost the Tigers 65 runs according to Defensive Runs Saved (DRS), and I pointed out before the season that the Tigers could gain roughly six wins with only an average defense. Thus far, the Tigers have been even better than average, with fielders saving them 15 runs by DRS. If they continue to save runs at the same pace, they would be up to +60 by season's end, which would be a 12-game improvement over last year! More conservatively, if they stay at +15, that would be an eight-win improvement over last year. Either way, that is a lot of wins for a defensive unit.
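The win estimates above follow from converting runs to wins. The sketch below assumes the common rule of thumb of roughly 10 runs per win, which the post does not state explicitly, and treats the season as one quarter complete; `wins_gained` and the constant names are hypothetical.

```python
RUNS_PER_WIN = 10.0  # assumption: rule-of-thumb conversion, not from the post


def wins_gained(drs_now: float, drs_before: float,
                runs_per_win: float = RUNS_PER_WIN) -> float:
    """Approximate wins gained from a change in Defensive Runs Saved."""
    return (drs_now - drs_before) / runs_per_win


DRS_2014 = -65          # from the text: defense cost 65 runs last year
DRS_SO_FAR = 15         # from the text: +15 so far in 2015
SEASON_FRACTION = 0.25  # assumption: about a quarter of the season played

projected_drs = DRS_SO_FAR / SEASON_FRACTION  # +60 at the current pace

average_defense = wins_gained(0, DRS_2014)          # 6.5, "roughly six wins"
current_pace = wins_gained(projected_drs, DRS_2014)  # 12.5, about 12 wins
stay_at_15 = wins_gained(DRS_SO_FAR, DRS_2014)       # 8.0, eight wins
print(average_defense, current_pace, stay_at_15)
```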
After years of watching fielders stumble and fumble around Comerica Park, the Tigers finally have a defensive team that is fun to watch. The improvement is obvious both to the eyes and to the calculator.
Data from the post were abstracted from FanGraphs.com.
Most readers of this blog are aware of the limitations of ERA or Run Average (RA) in evaluating pitcher performance. Two of the biggest issues are:
- RA gives pitchers full credit/blame for results of batted balls in play despite the fact that they share that responsibility with fielders. For example, a pitcher with a strong defense behind him will tend to give up fewer hits (and thus fewer runs) than if he has a poor defense behind him.
- RA gives pitchers full responsibility for sequencing or timing of events, that is, it assumes that they can control when they give up hits and walks. For example, if a pitcher pitches extraordinarily well with runners in scoring position in a given year, he will have a lower ERA than if he had a typical year in those situations. Additionally, a pitcher who tends to bunch base runners together in single innings will have a higher ERA than if he had a typical year distributing base runners more evenly.
In reality, pitchers have limited control over both the number of batted balls that drop for hits and sequencing of events. Thus, Defense Independent Pitching Statistics (DIPS) such as FIP, xFIP, tERA and SIERA have been developed to remove some of the noise of RA. DIPS are based on things that pitchers do control for the most part: walks, hit batsmen, strikeouts, home runs and types of batted balls (ground balls, fly balls, line drives, pop flies).
Because they are based on things that pitchers essentially control, the DIPS metrics are said to be better measures of true talent than RA. As a result, they are also better than RA at predicting future performance. However, they only measure a portion of a pitcher's talent and should be used as complements to RA rather than as replacements.
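As a concrete instance of a DIPS metric, FIP can be sketched in a few lines of Python. The weights are the standard published FIP weights; the additive constant varies by league and season, so the default of 3.10 here is only an assumed placeholder, and the sample stat line is hypothetical.

```python
def fip(hr: int, bb: int, hbp: int, k: int, ip: float,
        constant: float = 3.10) -> float:
    """Fielding Independent Pitching from home runs, walks, HBP and strikeouts."""
    # The constant puts FIP on the ERA scale; 3.10 is an assumed placeholder.
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line for illustration only.
sample_fip = fip(hr=5, bb=15, hbp=2, k=50, ip=60.0)
print(round(sample_fip, 2))
```

Note that hits on balls in play and runs allowed never enter the calculation, which is exactly why FIP is independent of the defense and of sequencing.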
It is not known exactly how much control pitchers have over the results of balls in play, but recent research tells us that some pitchers are better than others at preventing hits on balls in play. For example, Mike Fast, formerly of Baseball Prospectus and now an MLB sabermetrician, used Sportvision's HITf/x data to show how pitchers varied in the speed of balls off the bat.
So, rather than making the big leap from RA to FIP, it seems to be a good idea to first meet halfway. Instead of removing hit prevention and sequencing in one step, it might be better to remove one factor at a time. Bill James did that with his Component ERA (ERC). Applying the runs created methodology to pitchers, he determined what a pitcher's ERA should have been based on walks, hit batsmen, strikeouts, homers AND hits allowed.
Additionally, the Base Runs measure was created by David Smyth in the early 1990s. It is based on the idea that we can estimate team runs scored if we know the number of base runners, total bases, home runs and the typical score rate (the score rate is the percentage of base runners that score on average). Base Runs also works well for individual pitchers. The complete formula can be found here.
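For readers who want to see the structure, here is a Python sketch of one commonly cited simple version of Base Runs. The coefficients in the advancement factor are an assumption on my part (several variants exist) and are not taken from the formula linked in this post; the sample team line is hypothetical.

```python
def base_runs(ab: int, h: int, tb: int, hr: int, bb: int) -> float:
    """Estimate runs as base runners times score rate, plus home runs."""
    a = h + bb - hr  # base runners, excluding home runs
    # Advancement factor; coefficients are from one common simple variant.
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02
    c = ab - h       # outs made at the plate
    d = hr           # a home run always scores the batter
    score_rate = b / (b + c)
    return a * score_rate + d


# Hypothetical team batting line (or, for a pitcher, totals allowed).
estimate = base_runs(ab=5500, h=1400, tb=2200, hr=150, bb=500)
print(round(estimate))
```

A useful sanity check on the design is that a lineup that does nothing but homer scores exactly one run per home run, since there are no other base runners to drive in.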
There is now a new pitching metric that goes beyond any of the above measures. This new measure Deserved Run Average or DRA was developed by Baseball Prospectus Researcher Jonathan Judge with help from Harry Pavlidis and Dan Turkenkopf. The statistic is introduced in an overview article and in a more complex article explaining all the math.
The first article explains that the calculation of DRA starts by assigning weights to each batting event (similar to the wOBA statistic for batters) according to how much, on average, they contribute to runs scored. For example, a home run adds 1.4 runs on average and a double play costs 0.75 runs. That is similar to what ERC and Base Runs do.
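The weighting step can be sketched as a simple weighted sum. Only the two run values quoted above are included, the event counts are hypothetical, and none of the later context adjustments that define DRA are modeled here.

```python
# Average run values per event; only the two quoted in the text are listed.
EVENT_RUN_VALUES = {
    "home_run": 1.4,       # a home run adds 1.4 runs on average
    "double_play": -0.75,  # a double play costs 0.75 runs
}


def total_run_value(event_counts: dict, run_values: dict = EVENT_RUN_VALUES) -> float:
    """Sum of (run value x count) over the events a pitcher allowed."""
    return sum(run_values[event] * count for event, count in event_counts.items())


# Hypothetical pitcher line: 2 home runs allowed, 4 double plays induced.
total = total_run_value({"home_run": 2, "double_play": 4})
print(round(total, 2))
```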
The next steps are what separate DRA from its predecessors. It adjusts for all kinds of context such as:
- ballpark
- whether pitcher is pitching on home or road
- identity of opposing batter and handedness of batter
- identity of catcher and how proficient he is at framing pitches
- identity of umpire and how often he calls strikes versus balls
- runners on base and number of outs before each plate appearance
- run differential before each plate appearance
- quality of defense behind pitcher
- whether pitcher is starting or relieving
- game time temperature
- quality of base runners
- ability of pitcher to control running game
- responsibility of pitcher for wild pitches and passed balls
That's a lot of variables!
The result is a number that looks like an ERA but it attempts to isolate the runs for which a pitcher is truly responsible. For example, Tigers starter Anibal Sanchez has an ERA of 5.46, but other statistics indicate that he has not been so bad and is being charged runs for which he is not responsible. FIP has him at 4.11 and DRA makes him look even better at 3.93. So, while over five runs per nine innings have scored while Sanchez has been in the game, DRA is saying that he has been responsible for only slightly less than four per nine innings.
One question you might have is whether the calculation of DRA is mathematically sound. As a statistical programmer with experience in mixed models, the foundation of DRA, I think the method looks fine as far as I can tell. Researchers such as Tom Tango and Brian Mills have reviewed it more thoroughly and seem to approve. If you are mathematically inclined and want to see for yourself, you can read the in-depth article.
Do we need another pitching statistic? I think we do. ERA is not adequate in the short term for reasons discussed above. FIP and other DIPS variations get closer to measuring pitcher talent but leave questions unanswered, such as what to do about batted balls in play. Component ERA and Base Runs address batted balls but give all the responsibility to the pitcher, which is not correct.
Does DRA need to be so complicated? That is what I am not sure about. I like that they control for quality of batters faced and defense behind the pitcher, but do they need to control for game-time temperature? There is not much written so far about how much each component contributes to a pitcher's value. There are an awful lot of variables, and I get a little uncomfortable when we jump from simple to complex so quickly. These are questions rather than critiques, as I don't know the answers and look forward to further research and discussion.
The question most of you are probably interested in is how does DRA evaluate Tigers pitchers? The answers are shown in Table 1 below.
Table 1: DRA for Tigers Starters
Pitcher | ERA | FIP | DRA
Simon | 3.13 | 3.36 | 3.25
Lobstein | 3.91 | 3.32 | 3.36
Price | 3.48 | 2.82 | 3.40
Greene | 4.60 | 3.52 | 3.53
Sanchez | 5.46 | 4.11 | 3.93
Data Source: Baseball Prospectus
Right-hander Alfredo Simon leads Tigers starters and is tenth in the American League with a 3.25 DRA. This is only a little higher than his ERA (3.13) and slightly lower than his FIP (3.36). So, he looks good by any measure so far. Like Sanchez in the example above, the other Tigers newcomer, Shane Greene, fares much better on DRA (3.53) than on ERA (4.60).
So, we've got a brand new pitching statistic with a lot of potential. It's probably somewhat better than other available numbers. As with any new measure, though, we should not view it as the end-all of pitching statistics, but rather as another number to view while evaluating pitchers. Pitching evaluation has traditionally been quite muddy, though, so a fresh new approach is very welcome.