What I wanted to do here is rate the major league Tigers on hitting for average in 2008. I'm not a scout, so I considered only statistics and not tools. Since there are more statistics available for major league players, I used a different algorithm than the one used at TigsTown. I started by looking at batting average. The Tigers' 2008 batting averages can be found in Table 1.
Table 1: Tigers Batting Averages in 2008
player | avg |
Magglio Ordonez | .317 |
Placido Polanco | .307 |
Miguel Cabrera | .292 |
Carlos Guillen | .286 |
Curtis Granderson | .280 |
Edgar Renteria | .270 |
Marcus Thames | .241 |
Gary Sheffield | .225 |
Brandon Inge | .205 |
The problem with batting average is that what happens after the batter hits the ball is largely out of his control. He can hit line drives that are caught or soft bloopers that escape the grasp of infielders. Oftentimes these fortunes and misfortunes even out over the course of a season, but sometimes they don't. Thus, a player's batting average is not very repeatable. That is, it varies a lot from season to season (year-to-year correlation = .43).
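For readers who want to see what a year-to-year correlation measures, here is a minimal sketch. It assumes the Pearson correlation (the post does not say which kind was used), and the numbers in `year1` and `year2` are made-up placeholders, not real player data. Each position in the two lists stands for the same hypothetical player in consecutive seasons.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations (numerator of the correlation).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations (denominator).
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Placeholder values only: the same four hypothetical hitters in two seasons.
year1 = [0.250, 0.270, 0.290, 0.310]
year2 = [0.265, 0.260, 0.300, 0.285]

print(round(pearson(year1, year2), 2))
```

A value near 1 would mean hitters keep nearly the same relative standing from one season to the next, while a value near 0 would mean last year's number says little about this year's.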
A statistic which is much more repeatable than batting average is contact percentage (correlation = .90). Contact% is the percentage of a batter's swings that result in contact. This stat was taken from FanGraphs, which has fast developed into one of my favorite sites on the internet. Another contact hitting statistic is strikeouts per at bat, which has a year-to-year correlation of .80. I could have used K/PA instead, but the FanGraphs database doesn't have all the items needed to calculate plate appearances, and merging with my other database would have been more trouble than it's worth right now. The Tigers' leaders in contact% and K/AB are presented in Tables 2 and 3 below.
Table 2: Contact percentage for Tigers in 2008
player | contact % |
Placido Polanco | .927 |
Carlos Guillen | .862 |
Edgar Renteria | .856 |
Magglio Ordonez | .851 |
Gary Sheffield | .830 |
Curtis Granderson | .796 |
Miguel Cabrera | .775 |
Brandon Inge | .758 |
Marcus Thames | .743 |
Table 3: Strikeouts per at bat for Tigers in 2008
player | K/AB |
Placido Polanco | .074 |
Edgar Renteria | .127 |
Magglio Ordonez | .135 |
Carlos Guillen | .160 |
Gary Sheffield | .199 |
Curtis Granderson | .201 |
Miguel Cabrera | .205 |
Brandon Inge | .271 |
Marcus Thames | .301 |
It probably comes as no surprise that Placido Polanco led the Tigers in both categories. In fact, he led the American League in both. Contact% and K/AB give us information about the ability to make contact, but they tell us nothing about how solid the contact was. Line drive percentage helps us there. You can often tell something about a batter's fortunes by looking at line drive percentage. A player with a high line drive percentage relative to his batting average is possibly hitting into a lot of hard outs. Conversely, a batter who has a low line drive rate relative to his batting average is possibly getting a lot of cheap hits. The Tigers' line drive percentages are listed in Table 4.
Table 4: Line drive percentages for Tigers in 2008
player | line drive % |
Edgar Renteria | .222 |
Magglio Ordonez | .204 |
Carlos Guillen | .202 |
Miguel Cabrera | .196 |
Curtis Granderson | .191 |
Placido Polanco | .187 |
Marcus Thames | .170 |
Brandon Inge | .164 |
Gary Sheffield | .143 |
It's a little surprising to see Edgar Renteria's high line drive rate. This suggests that he may have been unlucky and that his batting average might rebound in 2009.
I combined the above four items to arrive at one statistic which describes the hitting for average skill. First, I normalized each number so that they all had the same scale: an average of 0 and a standard deviation of 1. Then I assigned a weight to each statistic denoting its importance. The most important statistic is batting average (after all, the skill is called hitting for average), so I gave batting average twice as much weight as each of the other numbers:
0.4 x BA + 0.2 x contact% - 0.2 x K/AB + 0.2 x LD%
Finally, I reverse normalized the result so that we get back to the original batting average scale.
The way it works is like this: Edgar Renteria had only a .270 batting average. However, his contact, strikeout and line drive rates were all very good. So, his adjusted batting average goes up to .284. The results for all Tigers are listed in Table 5.
Table 5: Tigers hitting for average summary in 2008
player | avg | contact % | K/AB | line drive % | adjusted avg |
Placido Polanco | .307 | .927 | .074 | .187 | .304 |
Magglio Ordonez | .317 | .851 | .135 | .204 | .299 |
Carlos Guillen | .286 | .862 | .160 | .202 | .285 |
Edgar Renteria | .270 | .856 | .127 | .222 | .284 |
Miguel Cabrera | .292 | .775 | .205 | .196 | .275 |
Curtis Granderson | .280 | .796 | .201 | .191 | .271 |
Gary Sheffield | .225 | .830 | .199 | .143 | .245 |
Marcus Thames | .241 | .743 | .301 | .170 | .239 |
Brandon Inge | .205 | .758 | .271 | .164 | .227 |
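The normalize, weight and reverse-normalize steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact calculation behind Table 5: it normalizes against only the nine Tigers in the table (the post does not say which group of players was used), it uses the population standard deviation, and it maps the weighted score back with the mean and standard deviation of batting average. The names `TIGERS`, `WEIGHTS`, `z_scores` and `adjusted_averages` are made up for this sketch.

```python
from statistics import fmean, pstdev

# Columns copied from Table 5: (avg, contact %, K/AB, line drive %).
TIGERS = {
    "Placido Polanco":   (0.307, 0.927, 0.074, 0.187),
    "Magglio Ordonez":   (0.317, 0.851, 0.135, 0.204),
    "Carlos Guillen":    (0.286, 0.862, 0.160, 0.202),
    "Edgar Renteria":    (0.270, 0.856, 0.127, 0.222),
    "Miguel Cabrera":    (0.292, 0.775, 0.205, 0.196),
    "Curtis Granderson": (0.280, 0.796, 0.201, 0.191),
    "Gary Sheffield":    (0.225, 0.830, 0.199, 0.143),
    "Marcus Thames":     (0.241, 0.743, 0.301, 0.170),
    "Brandon Inge":      (0.205, 0.758, 0.271, 0.164),
}

# Weights from the formula in the post; K/AB is negative because
# fewer strikeouts is better.
WEIGHTS = (0.4, 0.2, -0.2, 0.2)

def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean, sd = fmean(values), pstdev(values)
    return [(v - mean) / sd for v in values]

def adjusted_averages(players):
    """Weighted sum of z-scores, mapped back to the batting average scale."""
    names = list(players)
    columns = list(zip(*players.values()))   # one tuple per statistic
    z_columns = [z_scores(col) for col in columns]
    # Assumption: "reverse normalize" means using BA's own mean and sd.
    ba_mean, ba_sd = fmean(columns[0]), pstdev(columns[0])
    adjusted = {}
    for i, name in enumerate(names):
        composite = sum(w * z[i] for w, z in zip(WEIGHTS, z_columns))
        adjusted[name] = ba_mean + ba_sd * composite
    return adjusted

if __name__ == "__main__":
    ranked = sorted(adjusted_averages(TIGERS).items(), key=lambda kv: -kv[1])
    for name, adj in ranked:
        print(f"{name:18s} {adj:.3f}")
```

Because the sketch uses only nine players as its reference group, its numbers will not match the adjusted averages in Table 5 exactly, although the direction of each adjustment should be broadly similar.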
Renteria and Sheffield both had significantly better adjusted batting averages than real batting averages. Magglio Ordonez and Miguel Cabrera had adjusted averages which were significantly lower than their real averages. Finally, according to this algorithm, Polanco was the most skilled Tiger at hitting for average in 2008 and Brandon Inge was the worst.
The good news is that if hitting for average correlates with batting average, Sheffield is as likely to hit 270 as 220.
The bad news is that if hitting for average correlates with batting average, Inge is as likely to bat 200 as 250.
I have just looked at one year of data so far so I'm not sure how predictive adjusted average is. However, those low line drive percentages for Sheffield and Inge tell me that their low batting averages last year were probably not the result of bad luck.
Lee, I don't see the differences between the actual and projected averages as being significant. The most dramatic difference was Inge's, at 22 points, or roughly 1.1 hits every 50 ABs. It seems that randomness or other factors such as speed and power may be more significant and defeat any predictive value the analysis might hold. I do see its value for projecting minor leaguers' BA. I suspect you would see greater differences in their figures.
The reason there were not big differences between batting average and adjusted batting average is that batting average was part of the calculation. It would be the same for minor leaguers. It might be interesting to try to predict BA from contact rate, line drive rate and K pct.
I don't think the adjusted BA has any significant predictive value. I think there are three basic skills that contribute to statistics like OPS and RC: hitting for average, power and plate discipline. This was an attempt to isolate the hitting for average skill. I think it might work a little better than batting average. Next, I'm going to look at power which is a lot simpler because isolated power is more repeatable than BA and probably doesn't need to be adjusted.
Great work as always on this stuff, Lee.
Mike
www.DailyFungo.com