Many have argued that some players can use their speed to force errors, and that reaching base on a fielder's miscue is more like a hit than an out. Therefore, they believe that reaching on an error should not count against batting average and that it should count as reaching base in on-base percentage. At the very least, they think it should be recorded separately from outs and more commonly reported.
The purpose here is to investigate whether an error is a random event or an event that some players are more likely to create than others. I looked at the Retrosheet play-by-play database from 2000-2008 and found reached-on-error (ROE) data for all players during that period. I considered all ground balls that did not result in hits as opportunities to reach base on error and counted the ROEs. I did not include balls hit in the air because it would be hard to argue that those errors were forced by the batter's speed. I calculated ROE percentage (ROE%) for each player by dividing ROE by opportunities. The MLB average ROE% was .034 (or 3.4%).
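As a quick illustration of the ROE% calculation, here is a minimal Python sketch. The function name roe_pct is just a label chosen for this example, and the counts are the Opps and ROE values for Rondell White and Placido Polanco from the tables in this post.

```python
def roe_pct(roe, opportunities):
    """ROE% = times reached on error / ground-ball opportunities (non-hit ground balls)."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return roe / opportunities

# Counts copied from the tables in this post: (opportunities, ROE)
players = {
    "Rondell White": (883, 57),
    "Placido Polanco": (1449, 49),
}

for name, (opps, roe) in players.items():
    print(f"{name}: {roe_pct(roe, opps):.3f}")
```

Rounded to three decimals, these reproduce the ROE% column in the tables.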
There were 281 players with 500 or more opportunities during that period, and their ROE% ranged from .016 (Alex Cintron) to .065 (Rondell White). Statistically, the distribution of ROE% did not look like one produced by a purely random event: many more players had a ROE% well above .034 than would be expected if reaching on error were random. The more mathematically inclined can see the math at the end of the post*.
The top ROE% from 2000-2008 are listed in Table 1 below. The first thing you might notice is that the list is not made up of speedsters. There is no Juan Pierre or Ichiro Suzuki or any other player who would come to mind when you think of batters who might force fielders to make errors. Rather, it looks to me like a random list of players with no distinguishing quality.
So, while reaching base on errors is probably not a random event, it also doesn't seem to be the result of speed. It could have something to do with the way the ball spins off a player's bat, the ballpark infields, the official scorers or something else. It's worth further investigation.
Table 2 lists the current Tigers. Gary Sheffield, at 4.6%, is a player who reaches base on errors more often than would be expected if it were a random event.
Table 1: ROE% for MLB players 2000-2008
Player | Opps | ROE | ROE% |
Rondell White | 883 | 57 | .065 |
Sammy Sosa | 806 | 49 | .061 |
Gabe Kapler | 578 | 35 | .061 |
Ty Wigginton | 637 | 38 | .060 |
Marlon Byrd | 506 | 30 | .059 |
Jeff Cirillo | 806 | 46 | .057 |
Joe Randa | 945 | 53 | .056 |
Tony Graffanino | 626 | 35 | .056 |
Aaron Boone | 751 | 40 | .053 |
Mike Cameron | 807 | 42 | .052 |
Tim Salmon | 528 | 27 | .051 |
Jeff Bagwell | 745 | 38 | .051 |
Craig Biggio | 1296 | 66 | .051 |
Reggie Sanders | 653 | 33 | .051 |
Jeff Kent | 970 | 49 | .051 |
Benito Santiago | 560 | 28 | .050 |
Table 2: ROE% in 2000-2008 for current Tigers
Player | Opps | ROE | ROE% |
Gary Sheffield | 1017 | 47 | .046 |
Adam Everett | 598 | 22 | .037 |
Carlos Guillen | 997 | 36 | .036 |
Placido Polanco | 1449 | 49 | .034 |
Magglio Ordonez | 1165 | 33 | .028 |
*Math:
There were 281 players with 500 or more opportunities to reach base on an error on a ground ball from 2000 to 2008. The population proportion (p) = .034. To test whether a player's ROE% differed significantly from what chance alone would produce, we can use a normal approximation of the binomial. The z-score is z = (ROE% - p)/SE, where SE (standard error) is SQRT(p(1-p)/n) and n is the player's number of opportunities.
For example, Rondell White had 883 opportunities and a ROE% of .065. Thus,
SE = SQRT((.034*.966)/883) = .0061 and z = (.065-.034)/.0061 = 5.08.
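For anyone who wants to repeat this calculation for other players, here is a minimal Python sketch of the same z-score formula. The function name roe_z_score and its parameter names are my own labels for illustration; the league rate of .034 and the Rondell White numbers come from the post.

```python
import math

def roe_z_score(roe_pct, opportunities, p=0.034):
    """z-score of a player's ROE% against the league rate p,
    using the normal approximation of the binomial."""
    se = math.sqrt(p * (1 - p) / opportunities)  # standard error for n opportunities
    return (roe_pct - p) / se

# Rondell White: 883 opportunities, ROE% of .065
z = roe_z_score(0.065, 883)
print(round(z, 2))
```

Note that plugging in the unrounded rate (57/883) instead of .065 gives a slightly smaller z-score, so small differences from hand calculations are just rounding.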
Z-scores of 1.64 or above suggest that an event may not be random. With 281 players, we would expect about 14 (or 5%) of the players to have z-scores above 1.64 and about 7 (or 2.5%) to have z-scores above 1.96. Instead, we have 37 with z-scores above 1.64 and 16 with z-scores above 1.96. This leads me to believe that reaching base on error is not random (but not necessarily a skill either).
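The number of players expected to clear a given z cutoff by chance alone can be worked out from the upper tail of the standard normal distribution. The sketch below is one way to do it with only the Python standard library; the function names upper_tail and expected_above are hypothetical, and it assumes each player's z-score behaves like an independent standard normal draw when errors are random.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def expected_above(z, n_players=281):
    """Expected count of players with a z-score above z if ROE were purely random."""
    return n_players * upper_tail(z)

for cutoff in (1.64, 1.96):
    print(cutoff, round(upper_tail(cutoff), 4), round(expected_above(cutoff), 1))
```

Comparing these expected counts with the observed counts (37 and 16) is what drives the conclusion above.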
The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at 20 Sunset Rd., Newark, DE 19711.
Lee, your TigerTalesBlog is always informative with these types of stats. Is it possible to add the Tigers players in this time frame?
I just added the five current Tigers who had enough opportunities to qualify. Sheffield has a high percentage.
Batters who get lots of at bats are skewed toward the high end of ROE%. Speed does not seem to be the cause.
So, while the effect is real, I'm having trouble with the explanation that it is a hitter skill. So I'm trying to figure out how it could have something to do with the fact that these batters get a large number of at bats.
I was looking for a distribution of errors by inning. I thought that if teams in general made more errors early or late in a game, it might be a place to start looking for an explanation. I couldn't find the data summarized anywhere I could reach with a search engine.
Jeff, I'm also struggling to see a hitting skill here. I'm thinking it might be a ballpark effect, but I have not looked into it yet. I agree speed is not the reason.
I'm not sure I understand what you mean about ROE% being skewed towards players with a high number of at bats.
RE Skewing: It might be skewed for all batters. But among those who hit over 500 ground balls as specified in your entry, the tail of the distribution is larger than you would expect if it were a normal distribution. (Basically this is just what you said.)
I was just looking for something the group has in common to look for an explanation.
It looks like most of these hitters are RH. I've seen data that shows RH hitters are more likely to reach base on infield hits than LH hitters. It's possible the same is true of creating errors, as the SS and 3B have less time to compensate for a fielding mistake and still record an out.
Good suggestion, Nick, and it's something that could be checked, as Retrosheet gives the hit location and the fielder who made the error. It does make sense that players who hit the ball to the left side would reach base on errors more often.