Sunday, August 21, 2016

Strikeouts, Walks and Comparing Pitchers

In 1973, Nolan Ryan struck out 383 of the 1355 batters he faced, a 28.3% strikeout rate. In 2015, Clayton Kershaw struck out 301 of the 890 batters he faced, a 33.8% strikeout rate. With that raw data, you'd say Kershaw was better than Ryan and, objectively, you'd be right. That, however, doesn't take into account the rest of the league. Strikeout rates now are much higher than they were back in 1973, so how can we compare the two? With any luck, like this.


The rate at which batters struck out in 1973 was 13.7%. By 2015, that number had climbed to 20.4%. Comparing raw numbers from two seasons a little more than 40 years apart doesn't really tell the whole story. Rather than using K% (strikeout rate) on its own, we can take into account what batters did throughout the majors and come up with a statistical comparison that's more equitable.

A simple way of comparing pitchers across eras is to balance the individual K% against the league K%. This takes into account not only how good the pitcher was in his season, but how much better he was than the rest of the league. To do this, you divide the player's K% by the league's K% and multiply by 100, so that a league-average pitcher scores 100. That's the same scale used by OPS+, ERA+ and other + stats, so from here on, I'll refer to it as K%+. The equation written out would look like this:

(Ind. K%/Lg. K%)*100=K%+

Given this equation, we can calculate a K%+ score for pitchers. By doing so for both Kershaw and Ryan, we get this:

Ryan: 207
Kershaw: 166

While Kershaw had the better K%, he was only 66% better than the rest of the league while Ryan was 107% better. Strikeouts in 1973 were much harder to come by than they were in 2015. By using K%+, we can get a better idea of individual performance versus what the rest of the league was doing and can compare two different pitchers from two different eras.
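For anyone who wants to check the math, here is a minimal Python sketch of the K%+ calculation. The function name k_pct_plus is made up for illustration, and rounding each K% to one decimal place before dividing is an assumption on my part that happens to reproduce the figures quoted above.

```python
def k_pct_plus(player_k_pct: float, league_k_pct: float) -> int:
    """K%+ = (player K% / league K%) * 100, rounded to a whole number."""
    return round(player_k_pct / league_k_pct * 100)


# Strikeouts and batters faced as quoted in the post.
# Assumption: K% is rounded to one decimal place before computing K%+.
ryan_k_pct = round(383 / 1355 * 100, 1)     # 28.3
kershaw_k_pct = round(301 / 890 * 100, 1)   # 33.8

# League K% for 1973 and 2015 as quoted in the post.
ryan_score = k_pct_plus(ryan_k_pct, 13.7)
kershaw_score = k_pct_plus(kershaw_k_pct, 20.4)

print(f"Ryan: {ryan_score}")        # Ryan: 207
print(f"Kershaw: {kershaw_score}")  # Kershaw: 166
```

A score of 100 is league average, so subtracting 100 gives the "percent better than the league" figure used above.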

This time, let's use pitchers who had similar K% numbers and then calculate K%+ for each. For the comparison, I started with Aaron Harang's 2015 season and found pitchers with similar K% numbers. For each one, the year, league K%, player, player K% and K%+ are listed below.

Year  League K%  Player        Player K%  K%+
2015  20.4%      Aaron Harang  14.4%      71
1989  14.8%      Dave Stewart  14.3%      97
1965  15.7%      Steve Barber  14.5%      92
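The same arithmetic can be run over all three seasons at once. This is only a sketch: the rows list and the scores dictionary are names I made up, and the inputs are the rounded percentages listed above.

```python
# (year, league K%, player, player K%) using the rounded figures listed above.
rows = [
    (2015, 20.4, "Aaron Harang", 14.4),
    (1989, 14.8, "Dave Stewart", 14.3),
    (1965, 15.7, "Steve Barber", 14.5),
]

scores = {}
for year, league_k_pct, player, player_k_pct in rows:
    # K%+ = (player K% / league K%) * 100, rounded to a whole number.
    scores[player] = round(player_k_pct / league_k_pct * 100)
    print(f"{year} {player}: K%+ {scores[player]}")
```

Nearly identical K% numbers land anywhere from 71 to 97 once the league rate is factored in.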

As you can see, not all years are created equal. Even with similar K% numbers, the K%+ score varies based on how easy it was to strike out hitters that year. I say easy, but you get the idea. Given this information, I'll leave you with this to chew on. In 2015, Aroldis Chapman had a 205 K%+ score. In 1973, Ryan's was 207. Give that a little thought and then continue below for the walks portion, which will be a bit shorter.

Now that we have strikeouts out of the way, what about walks? In essence, it's the same as K%+, but with the equation altered slightly. Instead of dividing the player's BB% by the league's BB%, we do the reverse. Why reverse it? Unlike strikeouts, where the higher the K% the better, walks are the opposite. To be better than the league, you need to be lower than the league. Because of this, the number becomes more of a ratio than an actual percentage. To keep it simple, we'll still call it BB%+ because in the end, we're still comparing percentages. Here's what it would look like:

(Lg. BB%/Ind. BB%)*100=BB%+

What does this mean for the numbers? As with K%+, we multiply by 100 so the league average stays at 100. For anything other than 100, the score is a ratio of the league's walk rate to the pitcher's rather than a percentage of what that specific pitcher has done, but we'll get back to percentages soon. For example, in 2015, Clayton Kershaw had a 4.7% walk rate against the league's 7.7%. After putting the numbers in, his BB%+ would be 164. In essence, for every batter Kershaw walks, the league walks 1.64 batters, so the league's walk rate is 64% higher than his. It becomes a bit complicated, but the upshot is that the higher the number, the better the pitcher's BB% is versus the league. As before, the year, league BB%, player, player BB% and BB%+ are listed below.

Year  League BB%  Player           Player BB%  BB%+
2015  7.7%        Clayton Kershaw  4.7%        164
1992  8.5%        Greg Maddux      4.6%        189
1970  9.2%        Claude Osteen    4.8%        192

Claude Osteen. I didn't target him because I knew he had a decent BB% against the league. He just happened to have a BB% close to Maddux and Kershaw. I plugged the numbers in and, well, there you go. To be clear, I'm not saying Kershaw has terrible control compared to Osteen or Maddux. I'm saying that, in 1970, Osteen had better control relative to the rest of the league than Kershaw did in 2015. To be able to compare players across different eras, you have to include league-wide numbers in your calculation because baseball changes.
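The walks version can be sketched the same way, with the division flipped so that a lower walk rate produces a higher score. The function name bb_pct_plus is hypothetical, and the inputs are the rounded Kershaw and Osteen percentages listed above.

```python
def bb_pct_plus(player_bb_pct: float, league_bb_pct: float) -> int:
    """BB%+ = (league BB% / player BB%) * 100, rounded to a whole number.

    The league rate goes on top because a lower walk rate is better.
    """
    return round(league_bb_pct / player_bb_pct * 100)


# Rounded walk rates as listed in the post.
kershaw_2015 = bb_pct_plus(4.7, 7.7)
osteen_1970 = bb_pct_plus(4.8, 9.2)

print(f"Kershaw 2015: {kershaw_2015}")  # Kershaw 2015: 164
print(f"Osteen 1970: {osteen_1970}")    # Osteen 1970: 192

# Dividing the score by 100 gives league walks per walk issued by the pitcher.
league_walks_per_kershaw_walk = kershaw_2015 / 100
print(league_walks_per_kershaw_walk)    # 1.64
```

Kershaw's raw BB% is actually a touch lower than Osteen's, but Osteen scores higher because walks were more common across the league in 1970.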


Are these perfect? No. Are they a decent start? I'd like to think so. That said, they could evolve in the future, stay the same or wither on the vine. Comparing players is difficult when they're playing in the same era, let alone decades apart. Why not come up with something that can make it just a little bit easier? With any luck, we can get people thinking about it.
