Friday, February 22, 2013

OPS+ versus wRC+

Over the last few years, fans have become increasingly familiar with Adjusted OPS (OPS+), a statistic first introduced by Pete Palmer back in the 1980s.  OPS+ is OPS adjusted for league average and home ballpark.  It is useful because it allows us to compare players who played in different ballparks and/or different eras.  It is calculated using the following items:

OBP = On-Base Percentage
SLG = Slugging Average
MLB OBP = MLB Average OBP (with pitchers removed)
MLB SLG = MLB Average SLG (with pitchers removed)
BPF = Ballpark Factor

The formula is

OPS+ = (OBP/MLB OBP + SLG/MLB SLG - 1) x 100 / (BPF/100)

Using Miguel Cabrera's 2012 season as an example yields

OPS+ = ((.393/.324 + .606/.413 - 1) x 100) / (102/100) = 165

In general, an OPS+ of 100 is average, an OPS+ above 100 is above average and an OPS+ below 100 is below average.  There is a popular misconception that OPS+ closely matches the ratio of a player's OPS to league OPS.  However, an OPS+ of 165 does not mean that Cabrera had an OPS 65% better than league average (his .999 OPS was about 36% above the .737 league figure implied by the numbers above).  We know it's a really high OPS+ because it was the highest in the American League, but it has no concrete meaning.
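For readers who like to check the arithmetic, here is a minimal Python sketch of the OPS+ calculation.  The function name `ops_plus` is my own, and I am assuming the ballpark factor is applied as a ratio (102 becomes 1.02), since that is what reproduces Cabrera's 165.  The sketch also computes the plain OPS-to-league-OPS ratio to show that the two numbers are not the same thing.

```python
def ops_plus(obp, slg, lg_obp, lg_slg, bpf=100):
    """Adjusted OPS; bpf is a ballpark factor on the 100 = neutral scale."""
    raw = (obp / lg_obp + slg / lg_slg - 1) * 100
    # Assumption: the park factor is applied as a ratio (102 -> 1.02).
    return raw / (bpf / 100)


# Miguel Cabrera, 2012, using the figures quoted in this post
cabrera = ops_plus(0.393, 0.606, lg_obp=0.324, lg_slg=0.413, bpf=102)

# The common misreading: player OPS as a percentage of league OPS
ops_ratio = (0.393 + 0.606) / (0.324 + 0.413) * 100

print(f"OPS+ = {cabrera:.0f}")
print(f"OPS as a percentage of league OPS = {ops_ratio:.0f}")
```

The second number comes out well below the first, which is the point: OPS+ is on its own scale rather than being a simple percentage of league OPS.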

Another limitation of OPS+ is that it counts OBP and SLG the same when OBP actually contributes about 80% more to run scoring than SLG.  Thus, players who get most of their production from OBP will be shortchanged by both OPS and OPS+.  At any rate, the OPS+ figures can be found at Baseball-Reference.com for all players.

The OPS+ metric is OK for many purposes as long as you understand the shortcomings.  If you want a more reliable statistic, you can use Weighted Runs Created Plus (wRC+), a creation of Tom Tango.  Because it is based on Weighted On-Base Average (wOBA), wRC+ more accurately weights batting events (1B, 2B, 3B, HR, BB, HBP, outs) than OPS does.

It is calculated as:

a = MLB Runs per PA (with pitchers removed)
b = Park Adjusted wRAA/PA (with pitchers removed)
c = b/a + 1
wRC+ = c x 100

Cabrera had the following numbers in 2012:

a = .1164
b = 53.7/697 = .0770
c = .0770/.1164 + 1 = 1.66
wRC+ = 1.66 x 100 = 166
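The same steps can be written as a short Python sketch.  The function name `wrc_plus` and its parameter names are mine, not FanGraphs'; the inputs are the Cabrera figures listed above.

```python
def wrc_plus(park_adj_wraa, pa, lg_runs_per_pa):
    """wRC+ from park-adjusted wRAA, plate appearances and league runs per PA."""
    b = park_adj_wraa / pa          # park-adjusted wRAA per PA
    c = b / lg_runs_per_pa + 1      # 1.00 means exactly league average
    return c * 100


cabrera = wrc_plus(park_adj_wraa=53.7, pa=697, lg_runs_per_pa=0.1164)
print(f"wRC+ = {cabrera:.0f}")
```

A hitter with zero park-adjusted wRAA comes out at exactly 100, which is why 100 is league average on this scale.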

Another benefit of wRC+ beyond its accuracy is that it has a concrete interpretation.  In Cabrera's case, he created 66% more runs than would be expected by an average hitter in 697 PA.  The numbers for all players can be found at FanGraphs.com.

The wRC+ metric is on the same scale as OPS+ and does not generally produce wildly different results.  The biggest difference I found for the Tigers was Austin Jackson (135 wRC+ versus 130 OPS+).  The OPS+ vs. wRC+ comparison for the rest of the Tigers in 2012 is shown in Table 1 below.  The last column (PCTL) is the percentile among players with 250 or more PAs.

The lesson to be learned here is that wRC+ is a little better than OPS+ and should be used for more serious evaluation of players.  However, if you prefer using the Baseball-Reference site, OPS+ is a reasonably good estimate of a player's relative hitting value in most cases.

Table 1: wRC+ versus OPS+ for Tigers, 2012


Player              OPS+   wRC+   PCTL
Cabrera              165    166     99
Fielder              152    153     97
Jackson              130    135     88
Dirks                130    133     87
Hunter (LAA)         132    130     85
Avila                100    104     52
Infante (DET/MIA)     93     92     34
Berry                 86     89     29
Young                 89     89     29
Peralta               85     86     24
Boesch                77     77     13
Santiago              52     55      1
Raburn                30     28      0


Wednesday, February 20, 2013

Champ Summers Tribute Video

Earlier in the year, I wrote an article on former Tigers outfielder Champ Summers, who passed away last fall.  Today, I received an e-mail from Stan Beaubouef, a good friend of Champ's, who had read the piece.  Mr. Beaubouef gave the eulogy at Summers' memorial service, and part of the service was a slide show which he later posted to YouTube.  The video is well done.  I hope it brings back some memories for Tigers fans of the late 1970s and early 1980s.

Monday, February 18, 2013

What is the Best Tigers Line-up?

Every fan has his own idea of the ideal line-up.  Traditionalists tend to like to have a speedster lead off, a bat-control guy hit second, the best hitter third and the best slugger (who is not also the best hitter) bat fourth.  Some just want the numbers one and two hitters to get on base a lot and don't care as much about speed.  Others follow The Book by Tom Tango, Mitchel Lichtman and Andrew Dolphin, which claims that the best hitter should not bat third, but rather first, second or fourth.  Still others toy with the idea of having the best hitter on the team lead off, the second best hitter bat second, etc., with the reasoning that the best hitters should get the most at bats.

One thing I like to do before every season is check out the line-up tool at Baseball Musings.  Developed by analysts Cyril Morong, Ken Arneson and Ryan Armbrust, it estimates the number of runs a line-up would score based on every batter's on-base percentage (OBP) and slugging average (SLG).  Since getting on base (OBP) and advancing runners with hits (SLG) are the two most important elements of run scoring, their method makes some sense.

However, the line-up algorithm also has limitations.  Perhaps most importantly, it does not consider the speed of base runners.  It also does not address psychological factors such as batters feeling comfortable in certain spots.  What it does do is try to determine the best line-ups based purely on hitting which is a good place to start.

Using the Bill James Handbook projections, I plugged OBP and SLG for the nine Tigers starters into the line-up analyzer.  The Handbook projections tend to be optimistic, but this is the time of the year to be optimistic.  Anyway, one possible line-up is shown in Table 1 below.  The line-up tool says that line-up would score 5.687 runs per game or 921 runs in 162 games.  That's a lot of runs, but that's because we are assuming that all nine players are going to play 162 games which, of course, won't happen.  That's OK though.  The goal is just to compare different line-ups.

Table 1: Tigers Projected Line-up


Player 1:
Player 2:
Player 3:
Player 4:
Player 5:
Player 6:
Player 7:
Player 8:
Player 9:

The line-up tool considers every possible permutation of those nine batters and estimates that the best line-up would score 5.766 RPG or 934 runs, while the worst would score 5.502 RPG or 891 runs.  That is a difference of 43 runs which is not huge, but not insignificant either - between four and five wins.
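Here is a rough Python sketch of the arithmetic in that paragraph.  It does not reproduce the line-up tool itself; it only counts the possible batting orders and converts the runs-per-game figures into season runs and wins.  The 10-runs-per-win conversion is a common rule of thumb that I am assuming here, not something the tool reports.

```python
import math

GAMES = 162
RUNS_PER_WIN = 10  # assumption: rule-of-thumb conversion, not from the tool

n_lineups = math.factorial(9)       # every ordering of nine batters

best_runs = round(5.766 * GAMES)    # best line-up, runs per season
worst_runs = round(5.502 * GAMES)   # worst line-up, runs per season
gap_runs = best_runs - worst_runs
gap_wins = gap_runs / RUNS_PER_WIN

print(f"{n_lineups:,} possible line-ups")
print(f"best {best_runs}, worst {worst_runs}, gap {gap_runs} runs (~{gap_wins:.1f} wins)")
```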

Table 2 shows that four of the five best line-ups have Prince Fielder leading off!  In fact, eight of the top ten have Fielder at number one and all of the top thirty have either Fielder or Alex Avila leading off.  Remember though that this only looks at hitting and does not consider speed, of which Fielder and Avila have none.  More interesting to me is Cabrera in the two hole in all of the top thirty line-ups.  That actually makes some sense, but I'd probably want someone with at least a little speed (as well as the ability to get on base) in front of him.

You also might notice that all of the long list of "best" line-ups have Omar Infante batting ninth preceded by Andy Dirks, Torii Hunter and Jhonny Peralta in some order.  That also looks good to me, although we already know that Hunter will hit second in Jim Leyland's line-up.   

Table 2: The Five Top Run-Producing Line-ups

5.766 Fielder Cabrera Jackson Avila Martinez Peralta Hunter Dirks Infante
5.766 Avila Cabrera Jackson Fielder Martinez Peralta Hunter Dirks Infante
5.766 Fielder Cabrera Jackson Martinez Avila Peralta Hunter Dirks Infante
5.765 Fielder Cabrera Martinez Avila Jackson Peralta Hunter Dirks Infante
5.765 Fielder Cabrera Jackson Avila Martinez Hunter Peralta Dirks Infante

Table 3 looks at the worst line-ups.  Right away, you see the first problem - that Cabrera is batting ninth, which would obviously never happen.  As bad as those line-ups are, they would still score only about 5% fewer runs than the best line-ups (5.502 versus 5.766 RPG).  We want those five percent though, so those line-ups are out.
 
Table 3: The Five Lowest Run-Producing Line-ups

5.502 Peralta Infante Jackson Dirks Hunter Martinez Fielder Avila Cabrera
5.502 Dirks Infante Jackson Peralta Hunter Martinez Fielder Avila Cabrera
5.502 Peralta Infante Jackson Hunter Dirks Martinez Fielder Avila Cabrera
5.503 Peralta Infante Jackson Dirks Hunter Fielder Martinez Avila Cabrera
5.503 Dirks Infante Jackson Peralta Hunter Fielder Martinez Avila Cabrera


It's doubtful that any manager would ever have Fielder or Avila bat leadoff, but suppose we have Jackson lead off followed by Cabrera, an idea that appeals to me.  The bottom four will be Dirks, Hunter, Infante and Peralta in some order.  Fielder, Martinez and Avila will bat 3-4-5 in some order.  I played around with various combinations and came up with the line-up in Table 4.  This one would score an estimated 930 runs, 9 more runs or about one win better than the Table 1 line-up.  That's probably not worth the uproar caused by having Cabrera batting second, but I like it in theory.

Table 4:  One More Line-up


Player 1:
Player 2:
Player 3:
Player 4:
Player 5:
Player 6:
Player 7:
Player 8:
Player 9:

Sunday, February 17, 2013

Putting Ballpark Effects into WAR

One of the most misunderstood elements of the Wins Above Replacement (WAR) framework is ballpark factors.  Most fans realize that extreme parks like hitter-happy Coors Field in Colorado and pitcher-friendly PETCO Park in San Diego tend to have a significant effect on the production of hitters who play there.  However, some are not grasping how ballpark effects are used in WAR.  It's not important that people comprehend the complex formulas used in measuring ballpark effects.  It's more crucial that they see the purpose of ballpark factors, so I'll try to clarify that here.

Rockies outfielder Carlos Gonzalez and Cardinals catcher Yadier Molina had very similar batting lines in 2012.  Gonzalez hit .303/.371/.510 with 27.2 Runs Above Average (wRAA) in 579 plate appearances (PA), while Molina batted .315/.373/.501 with 27.3 wRAA in 563 PA.  However, Gonzalez played half his games in Colorado's Coors Field, which was the best hitter's park in the majors.  On the other hand, Molina played his home games in Busch Stadium, which was more of a pitcher's park.  These different environments need to be accounted for in our evaluation.

Some fans say they do not like ballpark factors because we don't really know what either Gonzalez or Molina would do in a different park.  In fact, Gonzalez's radical home/road split in 2012 (1.046 OPS versus .706 OPS) suggests that he might have benefited more from Coors than the average player.  In Molina's case, he actually did slightly better at home the last two years, so there is no guarantee that playing in a better hitter's park would have helped him.

It is true that a park can influence players in different ways.  One player's skill set might be better suited to his home park than another player's.  For example, some parks favor left-handed batters over right-handed batters or power hitters over contact hitters.  In other cases, a player just might be more comfortable in one park than another.  These factors would all be important if we were interested in a player's ability to perform in a given park in the future.

WAR, however, is not designed to determine a player's ability or potential to play in a different park.  In calculating WAR, it doesn't matter what Gonzalez would do if he were traded to the Cardinals or if the dimensions at Coors were changed.  WAR is not concerned with a player's ability or innate talent.  The purpose of WAR is to estimate a player's value to his team.

The question of interest is how much is a run worth in Coors versus Busch?  Because it is easier to score runs in Coors than Busch, a run scored in Colorado is less valuable than a run scored in St. Louis.  In other words, the average player will contribute more runs playing in Coors than in Busch, so a Rockies hitter needs to produce more runs in order to have the same value as a Cardinals hitter.

Again, it does not matter if a player like Juan Pierre is not benefiting from hitting in Colorado.  If he is playing for the Rockies, he needs to have his value adjusted downward (as any other player would) because the high run-scoring environment at Coors reduces the value of his runs contributed.

Calculation of ballpark factors is complex.  A detailed description can be found here at Toirtap's sabermetrics site.  This is also the method used by FanGraphs.  In short, five years of home and road runs scored data are used to calculate park effects.  If a park is new or has been renovated within five years, then a shorter time period is used.

It has been determined that it is 26% easier to score runs in Coors Field than in a neutral park.  A Rockies hitter plays half of his games in Coors and half in a combination of parks that averages to near neutral (since schedules are unbalanced, that's not quite right in every case, but we'll keep it simple here).  So, the value of the average run produced by a Rockie is 13% less than if his home park were neutral like Turner Field in Atlanta.

Given the above, we say that the Rockies' ballpark factor (BPF) is 113.  The BPF for a neutral park is 100.  The most pitcher-friendly park is PETCO with a BPF of 92.  See the FanGraphs Guts section for a list of all ballpark factors by year.
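The halving step can be sketched in a couple of lines of Python.  This is only the simplified version described above (half of a team's games at home, road parks averaging out to neutral); the function name `team_bpf` is mine, and the real calculation behind the published factors is more involved.

```python
def team_bpf(home_park_effect_pct):
    """Simplified team ballpark factor on the 100 = neutral scale.

    home_park_effect_pct: how much easier (+) or harder (-) it is to score
    in the home park, in percent.  Assumes half of the games are at home
    and the road parks average out to neutral.
    """
    return 100 + home_park_effect_pct / 2


print(team_bpf(26))   # Coors Field example from the text
print(team_bpf(0))    # a neutral park
```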

Getting back to the opening example, remember that Gonzalez had 27.2 wRAA in 579 PA and Molina 27.3 wRAA in 563 PA.  Let's now calculate their ballpark-adjusted run values:

The average player created .114 runs per PA in 2012, so he would have an estimated .114 x 579 = 66.0 Runs Created in 579 PA in a neutral park.  The same player would have 66.0 x 113/100 = 74.6 Runs Created if his home park was Coors.  So, Gonzalez's value is reduced by 74.6 - 66.0 = 8.6 runs.  Gonzalez had 27.2 wRAA in 2012, so his ballpark-adjusted batting value becomes 27.2 - 8.6 = 18.6.

An average player would create .114 x 563 = 64.2 runs in a neutral park.  Using the Busch Stadium BPF of 97, that player would create 64.2 x 97/100 = 62.3 runs playing for St. Louis.  So, Molina's value is increased by 64.2 - 62.3 = 1.9 runs.  Thus, his ballpark-adjusted batting value is 27.3 + 1.9 = 29.2.  After correcting for ballpark, Gonzalez and Molina no longer look like the same hitter.  Instead, Molina comes out ahead 29.2 to 18.6.
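The two calculations above follow the same recipe, so they can be collected into one small Python sketch.  The function name `park_adjusted_wraa` is my own, and the league rate of .114 runs per PA and the BPF values of 113 and 97 are the ones used in this post.

```python
LG_RUNS_PER_PA = 0.114  # 2012 league rate quoted in the text


def park_adjusted_wraa(wraa, pa, bpf, lg_runs_per_pa=LG_RUNS_PER_PA):
    """Subtract the runs an average hitter gains (or loses) from the park."""
    neutral_runs = lg_runs_per_pa * pa        # average hitter, neutral park
    park_runs = neutral_runs * bpf / 100      # same hitter, this home park
    return wraa - (park_runs - neutral_runs)


gonzalez = park_adjusted_wraa(wraa=27.2, pa=579, bpf=113)
molina = park_adjusted_wraa(wraa=27.3, pa=563, bpf=97)

print(f"Gonzalez: {gonzalez:.1f}")
print(f"Molina:   {molina:.1f}")
```

With a BPF of 100 the adjustment is zero, so a hitter in a neutral park keeps his raw wRAA.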

Monday, February 11, 2013

PECOTA Comparables

Baseball Prospectus has released its Player Empirical Comparison and Optimization Test Algorithm (PECOTA) projections for 2013.  PECOTA is a complicated projection system created by Nate Silver and more recently improved by Colin Wyers.  It uses the statistics and characteristics (age, height, weight, position) of a given player and the statistics and characteristics of similar players to arrive at projections for that player.

Accessing the data requires a subscription, so I can't reveal too much, but I will give you a look at a few players.  Rather than giving you statistical projections, I will present lists of players who were considered the closest comparisons to Tigers players.  These comparisons are fun, but should probably not be taken all that seriously.

Nick Castellanos

Aramis Ramirez, Brooks Robinson, Kevin Bell

Like most projection systems, PECOTA does not believe that Castellanos is ready for the majors yet, but Ramirez is a nice comparison to think about for the future.  Robinson doesn't make as much sense, since Castellanos isn't likely to be an elite defender regardless of where he ends up.  Bell is not a comparison we want to be thinking about.      

Avisail Garcia

Claudell Washington, Jose Guillen, Roberto Clemente

There is certainly a lot of ground between Washington/Guillen and Roberto Clemente!  I'm going to guess that Garcia will be a lot more like the former than the latter.

Andy Dirks

Alejandro De Aza, Darin Erstad, Omar Infante

De Aza is not too interesting as we don't know where he is headed any more than we do with Dirks.  Erstad seems like a reasonable comparison.  The bad news is that Dirks is already older (27 years old) than Erstad was in his career year in 2000.  Dirks will get to play on the same team as his other comparable (Omar Infante).  As an aside, PECOTA likes Infante quite a bit, giving him the fourth highest WARP on the Tigers for 2013.

Bruce Rondon

Ambiorix Burgos, Matt Anderson, Boone Logan

Remember Matt Anderson?  The Rondon/Anderson association may be your greatest fear heading into the 2013 season.  I do believe Rondon has more ability, but it's really hard to project young relievers.

Drew Smyly

Cole Hamels, John Danks, James McDonald


Hamels is probably a little too much to hope for.  John Danks? Perhaps.

Prince Fielder

Miguel Cabrera, Jeff Bagwell, Frank Thomas

You can't go wrong with that trio. 

Sunday, February 10, 2013

Accounting for Situational Hitting with RE24

Many fans grumble that statistics like OPS and Batting Runs don't account for situational hitting.  For example, if Tigers slugger Prince Fielder singles with runners on second and third to drive home two runs, he gets the same credit as he would for a single with the bases empty.  Some will argue that this is not fair because he contributes more to his team in the former scenario than the latter.  In this post, I will introduce a statistic which accounts for a hitter's performance in different circumstances.

Traditional fans like to address situational hitting with the familiar Runs Batted In statistic, but that is a team-dependent measure.  A player has more or less opportunity to drive in runs depending on who is batting in front of him.  In addition, a player gets credit for driving in runs, but is not penalized for failing to drive them in.  So, the RBI count is not an adequate measure of situational hitting.

Other fans point to batting average with runners in scoring position, but that is based on a limited number of plate appearances.  It also doesn't consider the number of outs, the specific base runners (e.g. bases loaded versus second base only) or the type of hit (single, double, triple or home run).  Additionally, it ignores a player's performance when no runners are in scoring position. 

What we want is a statistic which gives a player credit for everything he does including situational hitting.  Batting Runs Above Average by the 24 Base/Out States (RE24) - found at FanGraphs - does just that.  The RE24 statistic is also sometimes referred to as "Value Added".  This metric gives a player credit for his singles, doubles, and all other events, and gives him extra credit for hits occurring with runners on base.  It even gives him points for a scenario which most other metrics ignore - moving a runner over with a ground out.  Conversely, it subtracts extra points for hitting into double plays.

In a recent post, I discussed just plain Batting Runs or Weighted Runs Above Average (wRAA) which is an estimate of how many runs a player contributed to his team beyond what an average hitter would have contributed in his place.  The RE24 metric is similar to wRAA except that it uses base/out states in the calculation.  An example of a base/out state is "runners at first and third and one out".  There are 24 possible base/out states and RE24 takes all of them into consideration.

In the calculation of wRAA, a double with the bases loaded and two outs counts the same (0.770 runs) as a double with the bases empty and no outs.  In contrast, RE24 counts the bases-loaded double more than the bases-empty double (2.544 versus 0.632) because it does more to increase the expected runs scored in the inning.

The RE24 metric for one at bat gives us the difference between run expectancy at the beginning and end of a play.  For example, suppose Fielder bats with a runner on first and one out. In that situation, we would expect 0.556 runs to score by the end of the inning.  Assume that Fielder then doubles, putting runners on second and third with one out. In that situation, we would expect 1.447 runs to score by the end of the inning. Therefore, Fielder's double is worth 0.891 runs.
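A minimal Python sketch of that single-play calculation follows.  The run expectancy table here holds only the two base/out states quoted above (a full table has all 24), and the state labels and the function name `re24_for_play` are my own.  I have also included a `runs_scored` term, since runs that cross the plate on the play are normally added to the change in run expectancy; no run scores in the Fielder example, so it is zero there.

```python
# Run expectancy by (runners, outs); only the two states quoted in the text.
# "1--" = runner on first, "-23" = runners on second and third.
RUN_EXPECTANCY = {
    ("1--", 1): 0.556,
    ("-23", 1): 1.447,
}


def re24_for_play(start_state, end_state, runs_scored=0):
    """Change in run expectancy on one play, plus any runs that scored."""
    return RUN_EXPECTANCY[end_state] - RUN_EXPECTANCY[start_state] + runs_scored


# Fielder doubles with a runner on first and one out
double_value = re24_for_play(("1--", 1), ("-23", 1))
print(f"{double_value:.3f} runs")
```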

Summing RE24 over all of a batter’s plate appearances yields his season total RE24.  For example, Fielder had an RE24 of 51 in 2012.  So, by that measure, he contributed 51 runs above what an average batter would have been expected to contribute given the same opportunities.  This is a little higher than his 46 wRAA, which means that Fielder was especially good in situations with high run expectancy and added more to his team’s runs total than wRAA indicated.  We can estimate that he contributed an extra 5 runs with his situational hitting.

Since situational hitting is largely (although not completely) random, RE24 is less predictive than wRAA and should not be used as a measure of ability.  It is, however, a good alternative to wRAA for looking at past performance.

Table 1 below shows us the RE24 in 2012 for some past and current Tigers.  Other columns in the table include wRAA and the difference between RE24 and wRAA (RE24-wRAA).  Fielder was the Tigers' RE24 leader followed by Cabrera, who had 47 RE24.  Cabrera was the runaway leader in wRAA (57), but cost the Tigers runs with situational hitting.  Much of the reason for that was probably his high double play total.

Other Tigers who added value with situational hitting included Quintin Berry (+10) and newly acquired Torii Hunter (+6).  Batters who tended to cost the Tigers runs with their situational hitting were Delmon Young (-20) and Jhonny Peralta (-7).  Unlike Cabrera, those two did not hit well overall either.
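The situational estimate is just RE24 minus wRAA, so it is easy to reproduce.  Here is a short Python sketch using the RE24 and wRAA figures for a handful of the players in the table; the variable names are mine.

```python
# (RE24, wRAA) pairs copied from the Tigers table in this post
tigers_2012 = {
    "Fielder": (51, 46),
    "Cabrera": (47, 57),
    "Hunter": (25, 19),
    "Berry": (7, -3),
    "Peralta": (-14, -7),
    "Young": (-25, -5),
}

# Runs added (+) or lost (-) through situational hitting
situational = {name: re24 - wraa for name, (re24, wraa) in tigers_2012.items()}

for name, diff in sorted(situational.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<8} {diff:+d}")
```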

Table 1: RE24 for Tigers (2012)

Player              RE24   wRAA   RE24-wRAA
Fielder               51     46          +5
Cabrera               47     57         -10
Hunter (LAA)          25     19          +6
Jackson               23     28          -5
Dirks                 10     15          -5
Berry                  7     -3         +10
Avila                 -1      4          -5
Infante (DET-FLO)     -1     -2          +1
Boesch                -8    -11          +3
Peralta              -14     -7          -7
Santiago             -16    -13          -3
Raburn               -20    -18          -2
Young                -25     -5         -20
