Statcast Goofin’, Vol. 1: Let’s find some regression candidates

I’ve resolved to make this the year I write about something other than basketball. The last two offseasons, I’ve done long interview series across all levels of college hoops; this year, I want to try something different. I’ve always liked baseball to some extent, though my love for the sport waned from the time I started college in 2011 until 2020, when I realized how much I actually do love it when it’s good. With six months to go until the next college basketball game, it makes sense for me to pivot to writing in some form about baseball.

Why not give it a shot? Baseball is the sport that’s had the greatest advancements in statistics, from Bill James’ work in the late 1970s/early 1980s to the FanGraphs posts you may think of. It’s an easier stats-heavy sport to get invested in than nearly anything else.

For my first crack at it in a very long time, I thought I’d apply a similar principle I’ve used in my basketball writing to this MLB season thus far: positive and negative regression. I write frequently about the idea of a player being due or not due for something based on the quality of their offensive/defensive efforts; the same absolutely applies to baseball, which is a sport with higher variance and a healthy amount of luck involved. If you let yourself get too attached game-to-game, you’ll go mad; if you appreciate the madness and embrace it, you can laugh and laugh when small sample sizes go haywire.

Below, I’ve singled out some teams, batters, and pitchers who could be in line for positive/negative regression based on MLB’s amazing depth of tools, including Statcast/Baseball Savant. Whether this proves to be useful or “true” is still to be seen – these are very small sample sizes, after all – but I’d rather start with a rough-draft post than never start at all. Hopefully, it’ll give you something to watch for, if nothing else.

Teams

One of the biggest stories of the first month of the season was the suddenly-moribund New York Yankees offense. After a few years of owning a roster full of mashers, it was widely expected that New York would have the best offense in MLB this season. Why wouldn’t they? If your roster contains Giancarlo Stanton, Aaron Judge, Gleyber Torres, and D.J. LeMahieu, it should be the best in baseball. Even in a shortened season last year, the Yankees were on a 162-game pace of 254 homers and 851 runs scored, which would have been downgrades from their amazing 2019 (306 HRs, 943 runs) but were both great numbers. And yet: April looked like the Yankees would somehow continue this downward turn.
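For anyone who wants to check a "162-game pace" themselves, here is a minimal sketch of the arithmetic. The function name and the 60-game inputs (94 homers, 315 runs) are my own illustrative assumptions, picked because they land near the paces quoted above; they are not figures pulled from this post.

```python
def full_season_pace(total: float, games_played: int, season_games: int = 162) -> float:
    """Scale a counting stat to a full-season pace (simple linear extrapolation)."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total * season_games / games_played


# Illustrative inputs only (assumptions, not numbers quoted in the post).
hr_pace = full_season_pace(94, 60)
run_pace = full_season_pace(315, 60)

print(f"HR pace: {hr_pace:.1f}")
print(f"Run pace: {run_pace:.1f}")
```

A pace is only a straight-line projection, which is exactly why it swings so wildly after 25 games.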

Through one month of play, the Yankees were hitting just .224, scoring barely 3.9 runs a game, and were on pace for 231 homers, a fine total that would’ve ranked barely above the league average in 2019. (There’s discussion to be had on how much the deadened ball MLB has introduced has hurt the Yankees in particular, but I’d prefer to leave that to smarter writers.) Prior to starting a weekend series against Detroit on April 30, New York was 11-14 and in a world of trouble, at least if you pay attention to the New York press. The Yankees’ struggles on offense, strangely enough, could be explained by one simple fact: they were very unlucky.

No team had a worse batting average than the Yankees on balls in play with an expected batting average of .300 or higher, per Statcast. They were hitting just .542 on these; they were expected to hit .623, which was much closer to league average. Along with that, only three teams were less lucky on balls with an exit velocity of 95 MPH or higher (.473 BA, .553 xBA). Regression would come soon enough; it was just a matter of how quickly it arrived. Luckily for the Yankees, few things provide the positive boost you need quite like playing the Detroit Tigers.
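The "luck" here is just actual batting average minus expected batting average on the same set of batted balls. A rough sketch using the four Yankees numbers above; the function name and the bucket labels are mine, not anything Statcast provides:

```python
def luck_gap(actual_ba: float, expected_ba: float) -> int:
    """Actual minus expected batting average, in points (thousandths).

    Negative means the hitters got less than the contact quality suggested.
    """
    return round((actual_ba - expected_ba) * 1000)


# (actual BA, expected BA) for the two buckets quoted in the post.
yankees_april = {
    "xBA of .300 or higher": (0.542, 0.623),
    "exit velocity of 95+ MPH": (0.473, 0.553),
}

gaps = {bucket: luck_gap(ba, xba) for bucket, (ba, xba) in yankees_april.items()}

for bucket, gap in gaps.items():
    print(f"{bucket}: {gap:+d} points")
```

Both buckets come out roughly 80 points short of expectation, which is the whole case for a rebound.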

The Yankees swept that series and, as of this writing, are on a five-game win streak. They trail the division-leading Red Sox by a game-and-a-half, just a week or so after being in last place in the division. Baseball’s a weird sport, and team-wide cold streaks can exist. One did for the league’s probable best offense, and it wouldn’t surprise me to see them return to stardom pretty quickly.

A team with lower expectations and a lower high-end that’s still due for something better offensively: the Cleveland baseball team. Lake Erie’s faves are in a three-way tie for the AL Central lead, which is great and all, but their offense has been quite unbearable to watch. They’re hitting .209 (third-worst in the league) with a 90 OPS+ (22nd), and while batting average is a pretty tired metric at this point, it’s still notable that over half of Cleveland’s main lineup is below .200 on the season. It’s not like anyone expects Cesar Hernandez to be this amazing hitter, but he’s a .277 career hitter with remarkable consistency over the last eight seasons. He’s hitting .187 at the time this sentence is being typed.

In the metric mentioned above with the Yankees – batting average versus expected BA on balls with a 95+ MPH exit velocity – Cleveland was .120 below their expected value in April. It’s really hard for that to stay the same for an entire season. In particular, Hernandez is hitting an unbelievable .235 on balls with an exit velocity of 95+ MPH; his expectation based on the speed and launch angle of these balls in play is .498. Four of Cleveland’s generally-everyday starters have a batting average .163 worse than their expected batting average on these swings. (Jose Ramirez is also due for some serious positive regression.) At a time when the Central looks a little topsy-turvy thanks to Minnesota’s horrid start, Cleveland could be in position to capitalize on some positive regression soon enough.

Because of the nature of the deadened ball, a lot of teams are hitting slightly below what they’re expected to through a month-plus of play. In terms of negative regression, most of it looks to come on the pitching side, particularly from the Washington Nationals and the Seattle Mariners. Consider it the flipside of the offensive issues for Cleveland and New York. Both Washington and Seattle allow an above-average number of hard-hit balls, and per Statcast’s xwOBA metric, Washington should be the team most victimized by hard-hit balls so far this season. Instead, only San Diego has allowed a lower batting average on these pounded pitches.

Considering the Nats were expected to have one of the better pitching staffs in baseball this year, perhaps it’s not a huge surprise they’re doing well in this department, but you still should see some sort of regression coming. The same should doubly go for the Mariners, who were expected to have a bottom-six staff in MLB but have instead allowed fewer runs per game than the Rays, Braves, and Athletics this year. Again, this is because Seattle hasn’t been victimized on hard-hit balls this year. They’re allowing the third-lowest batting average on these swings right now, which would be fine if their expected value weren’t closer to league average. The hardest-hit ball they’ve given up this season was a Shohei Ohtani line drive that became a 342-foot out.

The Mariners being in wild card position is very funny, however, and I hope it lasts.

Batters

Unsurprisingly, if you read the previous section, you’ll be expecting to see Cleveland’s Cesar Hernandez in the positive regression category. Look at Hernandez’s Statcast profile and you see a guy who’s doing a lot of things well. He doesn’t swing and miss often; he draws a lot of walks; he’s got a very good expected batting average. So color me skeptical that a guy who simply appears to be a consistently good offensive player is going to hit .187 forever. Through a month-plus of play, Hernandez actually has his highest average exit velocity (89.4 MPH), hard-hit percentage (39.5%), and xwOBA (.376) in the Statcast era. I don’t play fantasy baseball, but if it’s up your alley, it could be a good time to load up on Hernandez.

Likewise, Houston’s Kyle Tucker is overdue for some good batted-ball luck. Tucker’s metrics aren’t as all-around solid as Hernandez’s, but he still should have a better hitting profile through 30ish games of play than he’s posted. Tucker ranks in the 79th percentile in average exit velocity, the 73rd percentile in hard-hit percentage, and the 81st percentile in whiff percentage. He’s far from a high-average hitter, sitting at .230 for his career, but he appears to be unusually unlucky this year. Tucker’s expected batting average is .279 with an xwOBA of .359; his actual marks are a .183 average and a .255 wOBA. Tucker hits more fly balls than the average hitter, so I can see where his profile could be boom-or-bust, but there still should be something positive coming Tucker’s way.

On the negative side of regression-land: the wonderful Yermin Mercedes (White Sox) and Jared Walsh (Los Angeles Angels). Mercedes has been an absolute joy, posting a hilarious .386/.426/.614 slash line that’s made him one of the best hitters in baseball so far. Mercedes actually has a very high expected batting average (.311) and, if he keeps a similar contact profile, should still be one of the better hitters in baseball this year. However: .311 isn’t .386, and his hard-hit percentage is essentially league average. Still, this is worth enjoying as long as he’s able to ride it.

Walsh is probably a less-likely breakout candidate but a pretty fascinating one in his own right. A 27-year-old, Walsh garnered 63 games’ worth of action across the 2019 and 2020 seasons after a scalding-hot AAA run in 2019 (.325/.423/.686, 161 wRC+, 36 HRs). His first season was awful (.203/.276/.329, -0.2 WAR); his second was great (.293/.324/.646). Year three would’ve fairly been expected to be something in-between. Instead, Walsh has turned into the second-best hitter on a team that employs Mike Trout, Shohei Ohtani, Anthony Rendon, and Justin Upton.

Walsh has a low hard-hit percentage and a low average exit velocity, which normally spells trouble for any batter and is why his current average is nearly 60 points higher than his expected average. Still, even if Walsh falls to merely being an Upton-level contributor, he’s going to be an important piece to an Angels team that desperately needs a playoff bid.
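If you want to sort hitters into buckets the way this section does, a rough sketch follows. It uses only the BA and xBA figures quoted above for Tucker and Mercedes; the 50-point threshold, the `Batter` class, and the label wording are arbitrary assumptions of mine, not a Statcast rule.

```python
from dataclasses import dataclass


@dataclass
class Batter:
    name: str
    ba: float   # actual batting average
    xba: float  # Statcast expected batting average


def regression_label(batter: Batter, threshold: float = 0.050) -> str:
    """Label a hitter by how far actual BA sits from expected BA.

    The 50-point threshold is an arbitrary cutoff chosen for illustration.
    """
    gap = batter.ba - batter.xba
    if gap <= -threshold:
        return "positive regression candidate"
    if gap >= threshold:
        return "negative regression candidate"
    return "roughly in line"


batters = [
    Batter("Kyle Tucker", 0.183, 0.279),
    Batter("Yermin Mercedes", 0.386, 0.311),
]

labels = {b.name: regression_label(b) for b in batters}

for name, label in labels.items():
    print(f"{name}: {label}")
```

This ignores sample size entirely, so treat any label after 30ish games as a thing to watch rather than a prediction.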

Pitchers

At least in the city where I reside (Knoxville, TN), no pitcher has had a more upsetting start to the season than Atlanta’s Max Fried. After a fabulous 2019-20 run that resulted in Fried becoming a consensus top-20 pitcher in the league, he’s been a disaster in 2021. Fried’s ERA sits at an awful 8.44, he’s allowed more runs in four starts than he did in 2020’s eleven, and most surprisingly, his slider has gone from one of the best in the league to an eminently-hittable pitch.

My case for positive regression in Fried is both statistical and hopeful; I would like for a great young player to get back to being great. Fried’s expected ERA is still 5.72, which is way above what he or anyone else would want it to be, and he’s allowing a shockingly high number of hard-hit balls – 40.7% of balls in play, in fact. Fried’s slider has lost a full four inches of break compared to last year, which makes a huge difference in terms of where it lands on the bat.

If he’s able to find his slider again, he should be able to recover this season’s trajectory somewhat. If he doesn’t…well, pretend you didn’t read this. Luckily, Fried just had a pretty solid outing against Washington, so perhaps he’s on the mend.

Another guy I’m hoping to see positive things from soon is Logan Webb (Giants). He’s been middling so far, with his Statcast profile suggesting better fortunes ahead. Webb’s profile is particularly intriguing: his average exit velocity is just 85.8 MPH (league average 88.3 MPH), he induces a ton of ground balls (60.4%; league average 45.3%), and his sinker has gained three inches of drop compared to last season. All of this should add up to a pretty good starting pitcher. It’s added up to a 5.34 ERA, a full two runs above his expected value based on his performance thus far.

I’m curious to see what happens to Webb, as it’s not like he’s had an amazing run of play in his career to this point (5.36 career ERA, .334 xwOBA, .272 xBA). However, he’s a guy who deserves better results than he’s gotten.
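To put a two-run ERA gap in perspective this early in the year, it helps to turn it back into actual earned runs. ERA is earned runs per nine innings, so the sketch below just inverts that formula. The 30-inning figure is a hypothetical workload I picked for illustration, not Webb’s real innings total, and the 3.34 expected ERA is my reading of "a full two runs" below 5.34.

```python
def earned_runs(era: float, innings: float) -> float:
    """Invert ERA = 9 * earned_runs / innings to get earned runs allowed."""
    if innings < 0:
        raise ValueError("innings must be non-negative")
    return era * innings / 9


HYPOTHETICAL_INNINGS = 30.0  # assumption for illustration, not a real total

actual_er = earned_runs(5.34, HYPOTHETICAL_INNINGS)
expected_er = earned_runs(3.34, HYPOTHETICAL_INNINGS)  # 5.34 minus "a full two runs"
extra_runs = actual_er - expected_er

print(f"Earned runs at a 5.34 ERA: {actual_er:.1f}")
print(f"Earned runs at a 3.34 ERA: {expected_er:.1f}")
print(f"Difference: {extra_runs:.1f} runs")
```

Under that assumption, the whole gap is fewer than seven runs, or about one bad inning’s worth of bloops and seeing-eye grounders.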
