Saturday, August 31, 2013

Scherzer Trails Darvish in Raw Run Prevention

It looks like it could be a somewhat contentious Cy Young race in the American League this year, with traditionalists pointing to Tigers ace Max Scherzer's incredible 19-1 record and MLB.com's Brian Kenny campaigning to eliminate pitcher W-L record altogether.  Most readers of this blog are not going to put much, if any, weight on W-L record in evaluating pitchers, but what is the best statistic to use?  There is no easy answer, of course, and it's best to look at a variety of numbers.  I will discuss some of my favorites in the coming weeks.

Before getting into more complex stuff like FIP and BABIP and team defense, I like to look at simple run prevention: How many runs did the pitcher allow in how many innings?  We know that not every run allowed is the pitcher's responsibility, but it's a good place to start.

Scherzer has allowed 62 runs (all runs, not just earned runs) in 183 1/3 innings (IP) for a Run Average (RA) of 3.05.  Table 1 below shows that Scherzer is fourth in the league in RA behind Rangers right-hander Yu Darvish (2.84), teammate Anibal Sanchez (2.87) and Bartolo Colon of the Athletics (3.00).

Table 1: AL RA Leaders

| Pitcher | Team | IP | R | RA |
|---|---|---|---|---|
| Yu Darvish | Rangers | 174.2 | 55 | 2.84 |
| Anibal Sanchez | Tigers | 144.2 | 46 | 2.87 |
| Bartolo Colon | Athletics | 159.1 | 53 | 3.00 |
| Max Scherzer | Tigers | 183.1 | 62 | 3.05 |
| Hiroki Kuroda | Yankees | 171.1 | 60 | 3.16 |
| Felix Hernandez | Mariners | 187.2 | 67 | 3.22 |
| Hisashi Iwakuma | Mariners | 184.0 | 66 | 3.23 |
| James Shields | Royals | 189.0 | 68 | 3.24 |
| Derek Holland | Rangers | 180.0 | 66 | 3.30 |
| Chris Sale | White Sox | 180.1 | 67 | 3.35 |
| Jered Weaver | Angels | 128.1 | 49 | 3.44 |
| Matt Moore | Rays | 121.1 | 47 | 3.49 |
| Justin Masterson | Indians | 188.1 | 74 | 3.54 |
| John Lackey | Red Sox | 155.0 | 61 | 3.54 |
| Ervin Santana | Royals | 180.2 | 72 | 3.60 |
 Data source: FanGraphs.com
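
As a rough sketch of the RA arithmetic above, the following Python converts the usual innings notation (where ".1" and ".2" mean one-third and two-thirds of an inning) and computes runs allowed per nine innings. The helper names `innings_to_float` and `run_average` are my own, not anything from FanGraphs, and the second decimal can differ slightly from Table 1 depending on how partial innings are handled.

```python
def innings_to_float(ip_notation: str) -> float:
    """Convert innings notation such as "183.1" (183 1/3) to true innings."""
    whole, _, thirds = ip_notation.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def run_average(runs: int, ip_notation: str) -> float:
    """Runs allowed (earned or not) per nine innings."""
    return 9 * runs / innings_to_float(ip_notation)


# Inputs taken from Table 1.
scherzer_ra = run_average(62, "183.1")
darvish_ra = run_average(55, "174.2")

print(f"Scherzer RA: {scherzer_ra:.2f}")
print(f"Darvish RA:  {darvish_ra:.2f}")
```

Keeping the innings conversion in its own function makes it easy to see how much the treatment of partial innings moves the result.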

The first thing you may have noticed is that Scherzer has more innings pitched than all the pitchers ahead of him and he needs to get credit for that.  In order to give pitchers credit for quantity of innings pitched as well as quality, Pete Palmer introduced the Pitching Runs (PR) statistic in 1984.  Pitching Runs tells us the number of runs saved or lost by a pitcher compared to league average.  It is based on a pitcher's IP, runs (R) and league RA.  Palmer actually used earned runs, but I prefer runs. 

The American League RA is 4.33, which is about .48 runs per inning.  So, you would expect the average pitcher to have allowed 88.2 runs in 183 1/3 innings.  Scherzer has allowed 62, so he has allowed 88.2 - 62 = 26.2 fewer runs than an average pitcher in the same innings; that is, he has 26.2 Pitching Runs.  The complete formula is:

PR = IP * (Lg RA / 9) - R  

or (if you prefer earned runs):

PR = IP * (Lg ERA / 9) - ER
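
The formula translates directly into a few lines of Python. This is only a sketch: the function name `pitching_runs` and the constant `LEAGUE_RA` are my own labels, and the league RA of 4.33 is the figure quoted above.

```python
LEAGUE_RA = 4.33  # AL runs allowed per nine innings, as quoted in the post


def pitching_runs(innings: float, runs: int, league_ra: float = LEAGUE_RA) -> float:
    """Runs saved relative to a league-average pitcher in the same innings."""
    expected_runs = innings * league_ra / 9
    return expected_runs - runs


# Scherzer: 62 runs in 183 1/3 innings.
scherzer_pr = pitching_runs(183 + 1 / 3, 62)
print(f"Scherzer Pitching Runs: {scherzer_pr:.1f}")
```

Passing earned runs and the league ERA instead of runs and the league RA gives Palmer's original earned-run version.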

The AL leaders are listed in Table 2 below.  Darvish is still the leader with 28.8, but Scherzer moves up to second on this metric. 

Table 2: AL Pitching Runs Leaders

| Pitcher | Team | IP | R | PR |
|---|---|---|---|---|
| Yu Darvish | Rangers | 174.2 | 55 | 28.8 |
| Max Scherzer | Tigers | 183.1 | 62 | 26.1 |
| Bartolo Colon | Athletics | 159.1 | 53 | 23.5 |
| Anibal Sanchez | Tigers | 144.2 | 46 | 23.4 |
| Felix Hernandez | Mariners | 187.2 | 67 | 23.1 |
| James Shields | Royals | 189.0 | 68 | 22.9 |
| Hisashi Iwakuma | Mariners | 184.0 | 66 | 22.5 |
| Hiroki Kuroda | Yankees | 171.1 | 60 | 22.3 |
| Derek Holland | Rangers | 180.0 | 66 | 20.6 |
| Chris Sale | White Sox | 180.1 | 67 | 19.6 |
| Justin Masterson | Indians | 188.1 | 74 | 16.5 |
| Ervin Santana | Royals | 180.2 | 72 | 14.7 |
| John Lackey | Red Sox | 155.0 | 61 | 13.6 |
| Jered Weaver | Angels | 128.1 | 49 | 12.6 |
| Chris Tillman | Orioles | 167.0 | 68 | 12.3 |

Data source: FanGraphs.com

You can go one step further by considering ballpark environment.  According to the FanGraphs.com ballpark factors, Comerica Park allows about 2% more runs than average, so we would multiply Scherzer's Pitching Runs by 1.02, yielding 26.6 Adjusted Pitching Runs.  Table 3 shows that Darvish leads with 30.5 Adjusted Pitching Runs, followed by Scherzer (26.6) and Sanchez (23.8).

Table 3: AL Adjusted Pitching Runs Leaders

| Pitcher | Team | IP | R | PR Adj |
|---|---|---|---|---|
| Yu Darvish | Rangers | 174.2 | 55 | 30.5 |
| Max Scherzer | Tigers | 183.1 | 62 | 26.6 |
| Anibal Sanchez | Tigers | 144.2 | 46 | 23.8 |
| James Shields | Royals | 189.0 | 68 | 23.4 |
| Bartolo Colon | Athletics | 159.1 | 53 | 22.8 |
| Hiroki Kuroda | Yankees | 171.1 | 60 | 22.8 |
| Derek Holland | Rangers | 180.0 | 66 | 21.8 |
| Felix Hernandez | Mariners | 187.2 | 67 | 21.7 |
| Hisashi Iwakuma | Mariners | 184.0 | 66 | 21.2 |
| Chris Sale | White Sox | 180.1 | 67 | 20.4 |
| Justin Masterson | Indians | 188.1 | 74 | 15.8 |
| Ervin Santana | Royals | 180.2 | 72 | 15.0 |
| John Lackey | Red Sox | 155.0 | 61 | 14.3 |
| Chris Tillman | Orioles | 167.0 | 68 | 12.7 |
| Jered Weaver | Angels | 128.1 | 49 | 12.1 |

Data source: FanGraphs.com
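
The park adjustment described before Table 3 is a single multiplication. In this sketch, the function name `adjusted_pitching_runs` is hypothetical, the 1.02 factor is the Comerica Park figure quoted in the post, and the 26.1 input is Scherzer's Pitching Runs from Table 2; other parks would need their own factors from FanGraphs.

```python
def adjusted_pitching_runs(pitching_runs: float, park_factor: float) -> float:
    """Scale Pitching Runs by a park factor (1.00 = neutral park)."""
    return pitching_runs * park_factor


# Scherzer: 26.1 Pitching Runs (Table 2), Comerica Park factor of about 1.02.
scherzer_adj = adjusted_pitching_runs(26.1, 1.02)
print(f"Scherzer Adjusted Pitching Runs: {scherzer_adj:.1f}")
```

A factor above 1.00 rewards a pitcher for working in a run-friendly park, while a factor below 1.00 trims his total.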

Pitching analysis does not end with runs allowed, of course.  We need to try to isolate a pitcher's responsibility for runs allowed from that of his defenders, but that's complicated and to some extent unknown.  I'll talk more about that later, but in terms of pure runs allowed, Scherzer is currently a little behind Darvish with a month to go.

26 comments:

  1. Well Scherzer blows away Darvish on my OE% calculation. Last time I did it Scherzer had a .298 when I compared it to Buchholz and Sale. Scherzer is holding steady at .300. But Darvish scores in at .381. And this statistic is a performance and efficiency based measure so that would make the difference for me if I was voting. I think Scherzer is going to win.

  2. I'm guessing that Scherzer wins OE% because he has lower walk and home run rates. He also beats him in FIP for the same reason. However, if you include situational pitching (RE24), Darvish takes the lead so I don't think it's clear cut. Right now, I'd probably vote for Scherzer, but I think there are reasonable arguments for both.

    I think Scherzer is going to win the Cy Young, but not necessarily for the right reasons.

    1. Well plus the 19-1 posting will score him some points in the process. He has too much of an edge. Stepping up when Verlander has a "down" season is a great story. The dude is pitching great every time and he has done one hell of a job. He's the winner unless he blows it.

    2. And yeah the HRs are part of the score, all the numbers really do is explain the score of what happened and thus suggest how dangerous the pitcher is with respect to what he has produced. So if he pitches the exact same relative to Darvish, then that would represent a nice statistical advantage in a heads-up competition. This is the story that the numbers of this particular statistic have to say.

    3. Hi Lee: Why is RE24 relevant to starting pitching? I mean a starting pitcher doesn't inherit a high level situation. He is responsible for all baserunners. Maybe that he can *leave* an inning with less than three outs and runners on (i.e. higher leveraged situation)? If this is so then what the stat is capturing (beyond FIP or PR) is sequencing? Cheers, Kevin

    4. Kevin, you are right that RE24 is less relevant for starters than relievers. It does capture sequencing such as how he pitches with runners on base. I was trying to capture Darvish's performance with runners in scoring position. In a later comment, I used LOB% which is probably more relevant.

    5. Is performance with runners in scoring position really a skill? I'm sure there are some small aspects of it that are, but isn't it mostly luck? Over a large enough sample size, isn't a pitcher's performance in a particular situation going to be very close to their performance overall? If so, it doesn't seem any more meaningful to me than W/L record.

    6. Jeff, I don't know how much of a skill it is, but if we are measuring how much value a player added to his team in a given year, I think clutch hitting/pitching is useful to track even if it might be largely luck. I wouldn't use it to predict future performance, but I might factor it into awards voting.

      It's not really like W/L record. The biggest problem with W/L record is that it's largely dependent on offensive support which is beyond the control of the pitcher. Pitching with runners on base is something that a pitcher could have some control over. There are pitchers (like Porcello) who consistently pitch poorly with runners on base, so it's not all luck.

    7. I don't see the difference. The key to a good W/L is to consistently allow fewer runs than the opponent. Obviously Scherzer has no control over what the offense provides him, but once those runs are on the board, the only thing that matters is that he keeps the opponent's number smaller. Whether he does that by hurling a 1-2-3 inning, by stranding a runner at third, or by letting a meaningless run score in order to avoid a big inning makes little difference to me.

      I'm sure there's *some* amount of skill to any particular player's handling of any particular situation, but it's virtually always dwarfed by their overall skill level.

    8. Winning and losing is something that a team does. It doesn't really measure anything specific about an individual pitcher. If a pitcher gets 10 runs of support in every game, he can win a lot of games with a 7.00 ERA, but it wouldn't make him a good or valuable pitcher. What makes him a good pitcher is preventing runs.

      RE24 does measure something about a pitcher's performance. If two pitchers give up the same numbers and kinds of hits, same walks, same strikeouts but one pitcher pitches better with men on base, it will help him prevent runs better. You can argue whether it's a skill or not, but it's something that the pitcher did.

      Now, if you argued that ERA with a one or two run lead was similar to ERA with RISP, then you might have an argument. However, I don't see the RE24 versus wins analogy. RE24 may or may not be useful, but it's not like wins/losses.


      I don't think the MVP award should measure skill. It should measure production, so I think it's reasonable to give a pitcher more credit for

    9. Well one scenario that Jeff is bringing up is what if you know you have a big lead and you willingly allow or plan to allow a cheap easy run in exchange for getting a critical out. If the game was won purely by the pitching statistics then maybe the pitcher wouldn't be so quick to concede a run and will take a more aggressive pitching strategy. So if you get double the run support of a normal pitcher, then perhaps you are more prone to giving up runs in exchange for outs whereas a pitcher with less run support might be forced to try for a better statistical performance because they are desperate.

      So if there is that meaningless man on 3B and the game is tied, you might pitch hard for a strikeout and do anything you can to avoid that run scoring. But if you have a big lead then maybe your focus on the runner is less. So with more run support could come more intentional bad statistical pitching on purpose. Just like an NFL team will go into a Prevent Defense knowing it's worth it to get gashed for a lot of extra easy yards so long as you are making it trickier for the other team to get enough points to overtake you, but you might concede a lot of the irrelevant points along the way which you wouldn't otherwise do if it was a tight game.

    10. W/L is certainly a team effort, but so is preventing runs. All the pitching in the world won't help you if you have a bunch of little leaguers in the field. Even if you completely adjust for the fielders, pitching stats are still influenced by the offense. Pitchers are more apt to pitch to or away from contact depending on the score and inning. On top of that, opposing managers are certain to adjust their lineup, hitting, and running strategies based on the score and inning. All of those adjustments will impact run scoring.

      Stats have come a long way in the last few decades (and I'm a big fan of the improvements) but you can't simply dismiss a stat as being "team based" because they're *all* team-based to varying degrees. The perfect stat may arrive one day, but it isn't here yet.

      With that in mind, it takes wins to get to the postseason. Runs may be more reliable for predictive purposes, but you're right that production is what matters for awards. The starting pitcher has a far greater impact on the game than anyone else, so I don't have a problem with crediting him for leading the team to victory, even if he got a lot of help.

      I'm not a "wins are all that matters" guy by any stretch of the imagination, but if you ask me to pick between two guys with similar peripherals, W/L seems like the only sensible tie breaker to me.

    11. Yes, preventing runs is a team effort which is why defense needs to be considered which I mentioned in the post. This kind of gets away from your argument though. Your argument was that RE24 is limited in the way that W/L is limited which I don't agree with. It does not seem that likely to me that team defense will make a pitcher pitch better with runners on base than he does with the bases empty. It's possible that could happen, but it's not a natural correlation. On the other hand, the number of runs an offense scores has a fairly strong correlation with W/L record. So, I don't really see wins as a useful measure for an individual player.

      TSE pointed out that a pitcher could pitch to the score, which is true. Baseball is not like football because there is no time limit, but it's possible a pitcher pitches to the score to some extent. Jack Morris insists that he pitched to the score, but there is no evidence that his high ERA was a result of pitching to the score. It seems that he pitched the same with big leads as small. If one could prove for a particular pitcher that his ERA was higher because he was pitching to the score, you could make some kind of adjustment to his ERA.

      I understand the tradition of W/L record and I'll look at it for the fun of it, but I don't see it as a useful measure of individual performance. Not just because it's inaccurate, but because it doesn't really make sense to assign a team victory to an individual player. I might use it as a tie breaker, but it would be a last resort tie breaker.

      One other thing is the perfect stat is not going to ever arrive and no good analyst would pretend that it has or will. I always recommend looking at more than one stat and using different stats to answer different questions.



    12. Anyway, I have a feeling this whole discussion is about Scherzer's W/L. I won't just look at W/L record. However, if you can show me evidence that he is doing something like pitching to the score which makes his W/L record better than his ERA, then I'll look at that.

    13. Yeah I agree with that philosophy. Every stat has its own place because every stat tells its own unique story. The key is in wrapping your head around all the stats and being able to see the bigger picture for what it truly is to the best of your ability to pierce that complicated and mysterious veil.

    14. My argument is that *every* decision is influenced by the score and the inning. Whether or not any particular situation impacts any particular stat is way above my paygrade, but you must acknowledge that the score and the inning color everything.

      * Batters on both sides are more or less willing to swing at different pitches in different situations
      * Batters on both sides are more or less willing to guess that a particular pitch is coming in different situations
      * Runners on both sides are more or less willing to steal bases in different situations
      * Runners on both sides are more or less willing to pursue an extra base on a ball in play in different situations
      * Pitchers are more or less willing to throw certain pitches and/or certain locations in different situations
      * Fielders are more or less willing to position themselves in certain ways in different situations
      * Managers are more or less willing to call certain plays in different situations
      * Heck, even the umps call things differently in different situations, though you'd never get them to admit it

      My point is that *every* decision made on a ballfield on any given night is colored by the situation. Even the team's position in the pennant race plays a factor. I'm not saying that any of this supports Scherzer in the debate for Cy Young; I'm just saying that it's pure folly to look at one piece of context (RE24) while simultaneously ignoring all of the others.

      It may certainly be true that given equal run support, Darvish is the better man, but it's also very plausible that he wouldn't be. Unless you have a simulator that allows us to scientifically examine all of the different permutations, all we can honestly say is "hypotheticals, shmypotheticals."

    15. "Shmypotheticals"....I love it and it should be the name of a movie starring Will Ferrell as a baseball manager.

    16. You're right Jeff. We can't capture everything, but we should track what we can measure. I think that RE24 is cleaner than what you are making it out to be. It is a natural extension of runs, hits, walks, etc allowed. It captures something that is easily measurable - how the pitcher pitched in different base/out situations. I don't know how much of a skill it is, and wouldn't use it for forecasting, but in looking back at a season, it describes something meaningful to me.

      If the other variables you are talking about were clean enough, I would measure them too, but they don't seem useful to me in their current state.

  3. The two places where Darvish has an edge are strikeouts and stranding base runners. He has held batters to a .136 batting average with runners in scoring position and has an 84.3 left on base percentage. Those numbers don't carry a huge amount of weight for me, but they are too extraordinary to ignore.

    1. Well also Scherzer makes almost $3MM less this year, so that could score some intangible tiebreaker points with some voters.

      Oh and what about the double that Scherzer hit this year, shouldn't that count too? I don't see Darvish providing that! Maybe it's just easier to subtract one double from Scherzer's pitching stats before running any numbers! Darvish is 0-3 and Scherzer is 1-3.

  4. I agree with everything you've said, Lee. I just don't see why you don't apply the same logic to W/L. While probably not useful for forecasting, it's *at least* as meaningful as any other stat when it comes to measuring production.

    It should probably be divided by run support somehow in order to transform it into a rate stat, but even as a counting stat, it's a great way to measure how well a pitcher has protected the leads he's been given.

  5. My problem with W/L is I don't know that it adds anything to a pitcher's value beyond what other stats already do better. A pitcher with a really good W/L record is a good pitcher more often than not, especially when you look at his entire career, but what is it really measuring? It's combining many things at the same time, too many of which have nothing to do with the pitcher. If there is a way to isolate out the pitcher responsibility for wins, I'd be interested in that. As it stands now though, I see it getting in the way of the story instead of adding to it.

    As you say, everything is interconnected and complex. Runs allowed also combines a lot of things, but I believe the percentage a pitcher contributes to runs allowed is much greater than what he contributes to wins. He potentially contributes to all of runs allowed in some way, because he's the one who allows the batter to hit the ball. Obviously, defense also has an impact, but it all starts with the pitcher. With wins, half the equation is totally out of the hands of the pitcher.

  6. >What is [W/L record] really measuring?

    It measures how well a pitcher protects the leads he is given. It's a catch-all stat, so you're right that there's a lot of noise in it, but it also captures a lot of things (such as the list I gave above) that aren't measured independently yet.

    > With wins, half the equation is totally out the hands of the pitcher

    That's certainly true, but I think anyone with a serious interest in the game already adjusts for that fact intuitively. For a mathematical approach, all you'd really have to do is divide the wins by the total run support to get an indication of how well a pitcher turns his run support into wins.

  7. It could measure that in some cases. However, a pitcher could be good at protecting leads but not have very many leads. Or a pitcher could be bad at protecting leads, but his offense keeps bailing him out. If a stat measures something and the pitcher is good at that something, he should do well on that stat. I'd rather see splits that look at his RA when he has a small lead versus when it's a blowout.

    To your second point, you could look at how many wins a pitcher would normally have given his runs allowed and run support and see if his win total exceeds that. I'd consider that, but not just W/L.

    Sometimes you can tweak the traditional stats to make them a little more meaningful. Another example is RBI. RBI is a team stat (although not to the extent of wins), but becomes more meaningful as an individual measure if you divide by RBI opportunities.

  8. Good points. Any idea how to define a "win opportunity"? Maybe take the Save % approach with something like 1 - (Leads Surrendered / Leads)?

  9. I'm not sure how I'd define an opportunity. I'd probably want to look at a bunch of real pitchers and see what made sense first, but something like that might work. You could make it like a save where you only give a pitcher credit for the win if he has a 3-run lead or 4-run lead or something and has an ERA under 4.00 for the rest of the game. I don't know if that would work - just throwing out some numbers.

