Probably the clearest and most obvious difference between Freakonomics and the sequel is that in the original book, Dubner & Levitt were writing about Steven Levitt’s actual research. People like this research! He won important prizes for it. And not only is the research mathematically sophisticated and prize-worthy, it’s often about quirky, interesting subjects. The sequel, by contrast, has basically nothing to do with Levitt’s research. They just decided to deploy the brand to help sell copies of what’s really just a lot of third-rate political punditry. Interestingly, though, Levitt’s still doing the kind of work that made him famous in the first place.
For an example, check out this recent paper “Professionals Do Not Play Minimax: Evidence from Major League Baseball and the National Football League”:

In the perfect world of game theory, two players locked in a zero-sum contest always make rational choices. They opt for the “minimax” solution — the set of plays that minimizes their maximum possible loss – and their play selection does not follow a predictable pattern that might give their opponent an edge. But minimax predictions typically have not fared well in lab experiments. And real-world studies, while more supportive, have often used small samples.
Now a new study, Professionals Do Not Play Minimax: Evidence from Major League Baseball and the National Football League (NBER Working Paper No. 15347), looks at two of the biggest high-stakes examples of zero-sum contests: pitch selection in Major League Baseball and play-calling in the National Football League. Authors Kenneth Kovash and Steven Levitt find that: “Pitchers appear to throw too many fastballs; football teams pass less than they should.” They also find that the selection of pitches or plays is too predictable. The researchers conclude that “correcting these decisionmaking errors could be worth as many as two additional victories a year to a Major League Baseball franchise and more than a half win per season for a professional football team.”
Kovash and Levitt examine all Major League pitches – more than 3 million of them — during the regular seasons from 2002 to 2006 (excluding extra innings). They categorize them as fastballs, curveballs, sliders, or changeups. They measure the outcome of each pitch using the sum of the batter’s on-base percentage and slugging percentage (a measure they label OPS) and they determine that fastballs lead to a slightly higher OPS than other types of pitches.
That’s interesting! With the world mired in the most serious recession in decades, arguably not the most important subject for economists to be focused on. But still interesting. And it suggests additional research issues. Are pitchers and managers just making a mistake in throwing too many fastballs? Or is it maybe that for biomechanical reasons most pitchers can’t throw the optimal number of breaking balls without wrecking their arms?
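The minimax logic in the quoted summary can be made concrete with a toy example. The sketch below solves a 2x2 zero-sum game in which the pitcher picks a pitch and the batter guesses one; the OPS numbers in the matrix are invented for illustration and do not come from the paper. It assumes the game has no saddle point, so both sides must mix. The point it shows is that, at the minimax mix, the expected OPS is the same on every pitch type, which is why a persistent OPS gap between fastballs and other pitches gets read as a departure from minimax.

```python
def solve_2x2_zero_sum(m):
    """Mixed-strategy solution of a 2x2 zero-sum game without a saddle point.

    m[i][j] is the batter's OPS when the pitcher throws pitch i and the
    batter is guessing pitch j. The pitcher wants it low, the batter high.
    """
    (a, b), (c, d) = m
    denom = a - b - c + d
    p = (d - c) / denom            # P(pitcher throws pitch 0); makes the batter indifferent
    q = (d - b) / denom            # P(batter guesses pitch 0); makes the pitcher indifferent
    value = (a * d - b * c) / denom
    return p, q, value


# Hypothetical payoffs (NOT from the paper): rows = fastball, offspeed;
# columns = batter guesses fastball, batter guesses offspeed.
ops_matrix = [[0.900, 0.700],
              [0.600, 0.800]]

p_fastball, q_guess_fastball, game_value = solve_2x2_zero_sum(ops_matrix)

# Expected OPS on each pitch type when the batter guesses at the minimax rate.
ops_on_fastball = q_guess_fastball * 0.900 + (1 - q_guess_fastball) * 0.700
ops_on_offspeed = q_guess_fastball * 0.600 + (1 - q_guess_fastball) * 0.800

print(p_fastball, q_guess_fastball, game_value)
print(ops_on_fastball, ops_on_offspeed)
```

With these made-up numbers the pitcher throws fastballs half the time and both pitch types yield the same expected OPS. Nothing here accounts for pitch sequencing, count, or pitcher quality.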
October 19th, 2009 at 12:30 pm
I think Levitt suffers from a basic epistemic problem here. He analyzes interesting data and concludes that baseball and football strategy doesn’t coincide with the behavioral predictions of a particular game-theoretic model. Fine. That game-theoretic model does not provide a good positive explanation for real-world outcomes.
But then comes this: “correcting these decisionmaking errors could be worth….” Errors? Suddenly what used to be a model becomes the Right Answer and what used to be a semi-interesting empirical research agenda becomes yet another economist whining about inefficiencies and irrationalities and failures of agents to understand economics.
This is a philosophical problem with the field, but it also reflects economic research that doesn’t misses the point about economics. Individuals’ (and aggregate) behavior is supposed to provide evidence of what is in the individual’s interest. Do the authors know what’s in the team manager’s interest better than the team manager?
October 19th, 2009 at 12:31 pm
What about pitch location and sequence? Most pitchers use the fastball to set up other pitches. It’s more difficult to throw breaking stuff for a strike, and if you hang something it gets crushed. Guess I’ll have to read the whole thing.
OPS is a standard statistic.
October 19th, 2009 at 12:34 pm
Actually, this is the most important subject for economists to be focused on. The economic theories that led to all our problems assume perfectly rational players. Proving this is not true is critical in getting professional economists to pull their heads out of you-know-where.
October 19th, 2009 at 12:37 pm
It also reflects economic research that doesn’t misses the point about economics
should be
It also reflects economic research that misses the point about economics
And the problem (IMHO) isn’t that agents are irrational, Mark, it’s that they’re solving a problem that is different than the one Levitt and Kovash think the agents are trying (and failing) to solve.
October 19th, 2009 at 12:37 pm
As a Nebraska fan, I’m going to bet that if these economists studied Mike Leach’s Texas Tech, they’d find that his team throws plenty. -grumble grumble-
2nd, I think that they’re overstating the benefits to the teams in the long run. Early adopters might be able to benefit (and they do, as Mike Leach’s “Throw all the time” takes 2nd tier or third tier talent and gets winning seasons from them) but like “Moneyball” the benefits dry up as more people come to adopt it.
October 19th, 2009 at 12:37 pm
“a measure they label OPS”? That’s what everyone calls it- that’s like saying “the ratio of hits to plate appearances (not counting walks, sacrifices, or HBP,) a measure they label batting average.”
I think the issue here is partially that not all pitchers even throw breaking balls- some relievers only have fastballs. Everyone knows how to throw a fastball, the availability of other pitches depends on who’s pitching.
That said, this is one of those stupid pieces of baseball folklore, like bunting, where people have believed it so long that it almost certainly is wrong. Announcers always say, “You want to beat him with your best pitch,” which is usually a fastball, but that doesn’t work if the batter is expecting it.
October 19th, 2009 at 12:49 pm
Biomechanics.
There’s not an equal opportunity cost for each pitch type, especially for the long-term health of the pitcher and the pitching style that got him to the big leagues (plenty of guys fall with injuries long before).
October 19th, 2009 at 12:51 pm
#2 has it nailed…you can’t decontextualize the fastball and say it should be thrown more often.
There is, to be sure, a political side to it (in the generic sense of the word). Several years ago in ESPN The Magazine there was an article about why football coaches don’t go for it on fourth down, even though there’s pretty compelling evidence that it would be a net utility gain overall. I think it was Bill Cowher who was quoted as saying, more or less, it’s a great idea until that one time it doesn’t work, and then you’re fired.
October 19th, 2009 at 12:51 pm
A worthless study. It’s not the fastball, nor the speed of it. It’s whether there’s any movement on the fastball, or the inability to change speeds on the fastball in order to mess with the hitter’s timing. All Yankee great Mariano Rivera throws is cut fastballs and at 39 he’s still the best closer in the game. Another important factor is that too often major league pitchers give too much credit to hitters and nibble rather than challenging hitters, thereby falling behind in the count and having to come in with something too fat.
Nolan Ryan told this story of a conversation he had with Satchel Paige early in his career:
Satch: Young Ryan, what’s the best pitch?
Ryan: Uh, the fastball.
Satch: No. Bowtie.
Ryan: What? Bowtie?
Satch: Yeah, bowtie. A fastball – right here (drawing a line against his Adam’s apple).
October 19th, 2009 at 12:52 pm
With two strikes, a batter trying to avoid a called strike three is going to swing at a greater variety of pitches, thereby making a breaking pitch more effective than a fastball. Earlier in the count he may *not* have swung at that breaking pitch, which is more likely to be a ball (which with two strikes it may be designed to be). There’s a similar effect with the change-up, which is designed to throw off the batter’s timing. That’s why fastballs follow change-ups–curveballs are closer to the speed and break of a changeup than is the fastball. So if you threw a curve in that situation (as Levitt apparently suggests), there’s a greater chance of the hitter adjusting to the speed and break of the pitch.
That’s not to say that pitchers/catchers maximize their pitch calling. Some do it better than others (Greg Maddux was famous for this). But the pitchers who can throw a curveball for a strike early in the count *do* that–the problem is that batters don’t swing at curveballs on the first pitch. So if you throw it and you miss, you end up often *having* to throw a fastball with two strikes, which is of course a really bad idea.
October 19th, 2009 at 12:53 pm
There are a few other problems with this study. I’d have to look at it more closely, but how are pitches like cut fastballs (movement similar to a slider) or split-finger fastballs (movement similar to a 12-to-6 curve) categorized? A lot of pitchers use these as their 2nd pitch, and so are throwing more non-fastballs than Kovash and Levitt think.
Second, for both of these situations, there is a correlation problem between strings of plays/pitches. On 2nd and 8, a coach is not necessarily aiming to get a touchdown–merely 4-5 yards to set up a manageable third down. Calling a crazy pass play may be the minimax solution, but an off-tackle run that milks the clock and keeps the defense in the dark about your more exotic pass patterns is a better call in the real world.
October 19th, 2009 at 12:53 pm
Then there’s the whole mano a mano thing where sometimes you just gotta throw the heat . . . no shame either way . . . winner takes all on the at bat. Besides, if you can’t control the bender 3-2, you’re going with the fastball even if the batter is sitting on it, especially if walking the batter will lead to a run or more runners on for a more feared hitter up next.
October 19th, 2009 at 12:57 pm
Er, that last sentence should read “having to throw a fastball for a strike.”
October 19th, 2009 at 1:02 pm
Meh. This is the kind of work that can only come from researchers who don’t follow baseball.
Using OPS to measure a pitch’s effectiveness is pretty last century.
October 19th, 2009 at 1:06 pm
I guess I’ll have to read the paper, but this sounds way
too simplistic. If you throw fewer fastballs, you’ll throw
more balls and fewer strikes. And that means not just that
you’ll give up more walks, but also that your pitchers will
throw more pitches per out and last fewer innings; and in
particular, you’ll get fewer innings from your starters.
And going deep into the bullpen every game is not a strategy
that works for long.
For an example of what goes wrong if you take Levitt’s
suggestion, look at Daisuke Matsuzaka. He’s notorious for
nibbling round the edge of the strike zone and not being
aggressive enough, with the result that his pitch counts
are high and he rarely gets through 6 innings.
Furthermore, the choice is governed by the situation. If
you’ve got a 5-run lead in the 7th inning and no-one on base,
then you can be aggressive with a fastball: the worst that
happens is a solo homer.
October 19th, 2009 at 1:08 pm
If you’re going to lump shit pitchers with great pitchers, then I’m not surprised they saw a higher OPS for fastballs. Everyone throws a fastball; only people with good breaking pitches throw breaking pitches. If you throw a crappy curve or slider, you’re going to get absolutely hammered, so no one throws those pitches unless they’re pretty good. It also strikes me as weird to group starting pitchers with relievers since starters generally have 3+ good pitches whereas relievers generally have only 2. I would think that leads to pretty different strategies for the two groups. And do minimax strategies change depending on whether it’s the first time through the order or second, etc.?
From a non-strategy point of view, there’s the whole issue of torque on the elbow joint from throwing breaking balls. Part of the reason why sliders and curves aren’t thrown more frequently is that those pitches will make your arm fall off at the elbow joint after a while.
October 19th, 2009 at 1:09 pm
Yeah, sign me up with the above. Far from proving anything interesting, it just proves that Levitt (and to a degree, economists in general) often try to quantify things that aren’t necessarily perfectly quantifiable and then spend a lot of time making very dumb assumptions off of that.
October 19th, 2009 at 1:10 pm
Seriously, using a straight OPS against is so stupid as to prove Levitt has no clue what he’s talking about.
October 19th, 2009 at 1:11 pm
Actually, examining sports is one of the best ways to test economic theories because sports are excellent, often naturally occurring experimental environments with limited numbers of agents (economic agents not sports agents), factors and so on.
“And the problem (IMHO) isn’t that agents are irrational, Mark, it’s that they’re solving a problem that is different than the one Levitt and Kovash think the agents are trying (and failing) to solve.”
We don’t know whether the agents are irrational (or conversely, that Levitt doesn’t understand rationality) or if they’re rationally optimizing under different constraints than Levitt recognizes.
October 19th, 2009 at 1:12 pm
No. No. No. And No. Also, no. That’s a complete myth pushed by Andrews & Co. that’s been disproven by actual research. A properly thrown curveball or slider places no more strain (and sometimes less strain) on the elbow than throwing a fastball. Improperly thrown breaking balls can be a problem, which is why it’s still a good idea not to overdo it with Little Leaguers and breaking balls, but the idea that breaking balls will wreck an arm per se is just straight up mythology.
October 19th, 2009 at 1:17 pm
There are two pitches. Fastball and loaded fastball. The rest is infotainment.
October 19th, 2009 at 1:18 pm
I just browsed the study itself. It makes lots of bizarre assumptions, such as:
“If a pitching staff were able to reduce the share of fastballs thrown by 10 percentage points while maintaining the observed OPS gap on fastballs, this would reduce the number of runs allowed by roughly 15 per season, or two percent of a team’s total runs allowed.”
AND
“Executives of Major League Baseball teams with whom we spoke estimated that there would be a .150 gap in OPS between a batter who knew for a certain a fastball was coming versus that same batter who mistakenly thought that there was a 100 percent chance the next pitch would not be a fastball, but in fact was surprised and faced a fastball.”
AND
“If one makes the further assumption that the OPS gap is linear in a hitter’s expectations about what type of pitch will be coming, then knowing that a fastball is 4.1 percentage points less likely if the last pitch was a fastball (and conversely more likely if the last pitch was not a fast ball) is worth roughly .006 OPS points to a batter.”
So essentially the paper is saying “if you throw more pitches that the hitters can’t hit, they will score fewer runs.”
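The arithmetic behind the quoted assumptions can be checked in a few lines. This is only a sketch using the figures quoted in the comment above (15 runs, two percent, a .150 OPS gap, 4.1 percentage points); the variable names are mine, and the linear scaling is the paper's stated assumption rather than something verified here.

```python
# Figures as quoted from the paper in the comment above.
runs_saved = 15                  # runs saved per season by throwing 10 points fewer fastballs
share_of_runs_allowed = 0.02     # "two percent of a team's total runs allowed"
full_knowledge_ops_gap = 0.150   # OPS gap between fully expecting and fully not expecting a fastball
predictability_shift = 0.041     # fastball is 4.1 percentage points less likely after a fastball

# The first quote implies a team allows roughly this many runs in a season.
implied_runs_allowed = runs_saved / share_of_runs_allowed

# Linearity assumption: the batter's edge scales with how far his expectation shifts.
ops_edge_from_predictability = full_knowledge_ops_gap * predictability_shift

print(round(implied_runs_allowed))
print(ops_edge_from_predictability)
```

The implied total is about 750 runs allowed per season, and the predictability edge comes out at about .006 OPS points, matching the figure in the third quote.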
October 19th, 2009 at 1:20 pm
There are more kids coming up into MLB now throwing 95 mph+ than ever before. Genetics, workout regimens, superior coaching – they all factor in. And, even though there’s a big difference between the ball coming in 90 or 95, good major league hitters can hit 95 mph+ fastballs, especially if they’re straight as a clothesline. Location, command, movement, two-seam/four-seam unpredictability – they matter more than sheer power. Hitting a 90 mph fastball is very hard. If you can find a batting cage that throws that hard (most don’t) – dig in and see. But, then the motherf breaks 2-4 inches or rises 2 and forget about it.
Rickey Henderson was the last batter up when Nolan Ryan threw the last of his 7 no-hitters and he went down like a dog. When asked about it – he just shook his head and was talking about the ball, not Ryan when he said: “It gave me no chance.”
October 19th, 2009 at 1:25 pm
Assuming that the defensive coordinator was retarded, and never noticed that the other team was pass-happy. Not to mention that there’s only one QB, he might get tired if you increase his passes by a third.
It’s like these guys are assuming that the change in behavior of the pitcher or offense won’t cause a change in behavior of the hitter or defense.
October 19th, 2009 at 1:26 pm
It’s even simpler.
Levitt is very likely confusing correlation with causation.
All the points about bad pitchers, situational pitching, etc point to that.
A more subtle analysis would measure the OPS by pitch type / batting count / on base conditions / outs. (or some subset of those situations.) There may be real meat to be found, but their simple analysis won’t get it.
October 19th, 2009 at 1:29 pm
I don’t know about that, but I would about guarantee you that if a team passed 70% of the time:
a) the offensive linemen would be beat to all hell by the end of the season
b) the opposing defensive line would mount a pretty effective pass rush at some point by pinning its ears back and abandoning the run
October 19th, 2009 at 1:29 pm
And of course, the converse is also true. You see more fastballs in hitter’s counts. A pitcher is going to come with a fastball on a 3-1 count, and everyone in the stadium knows this. It’s more likely to get drilled.
There’s some speculation that Greg Maddux actually used to purposely throw off the plate in 2-2 counts, figuring that if he didn’t get the batter to chase, the worst case scenario was 3-2. He’d then throw a 3-2 change when no one was expecting it (and with Maddux, it often didn’t matter if you were expecting the change).
October 19th, 2009 at 1:31 pm
Dude, that would be work. Famous pop-social scientists don’t work that hard.
October 19th, 2009 at 1:33 pm
One more element perhaps missing from Levitt’s analysis:
a ground ball may lead to a double play, an excellent outcome
for the pitcher but one which is not at all reflected in
the OPS metric.
I found this about the frequency of double plays:
“Using the stats for the 2006 season, taken from the mlb.com site, there were 4649 double plays turned by the 30 major league clubs in 2429 games. That averages 1.913956 DPs per game or a little less than one per team per game.”
So if you’re counting in a way which ignores double plays
(as OPS does), then you’re roughly counting 26 of the 27
outs in the game and you might be 4% or so off in your
statistics. If you then suggest, as Levitt does, that
following his statistics offers a 1.2% advantage (2 games
out of 162), you’re probably talking crap.
I’d bet Bill James has a much better analysis than Levitt,
and if there’s any truth to it we probably would have heard
about it long ago.
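The back-of-envelope numbers in this comment can be reproduced directly. The sketch below uses only the figures quoted above (4649 double plays, 2429 games, 27 outs, 2 wins in 162 games); treating each double play as one out that OPS fails to credit follows the commenter's framing and is an assumption, not an established way to measure the error in OPS.

```python
double_plays = 4649        # 2006 total, as quoted from mlb.com in the comment
games = 2429
outs_per_team_game = 27
claimed_extra_wins = 2     # the paper's "as many as two additional victories"
season_games = 162

dp_per_game = double_plays / games                 # both teams combined
dp_per_team_game = dp_per_game / 2
# Share of a team's outs that come as the "extra" out on a double play.
uncounted_out_share = dp_per_team_game / outs_per_team_game
claimed_edge = claimed_extra_wins / season_games

print(dp_per_game, dp_per_team_game)
print(uncounted_out_share, claimed_edge)
```

That gives about 0.96 double plays per team per game, so roughly 3.5% of outs (close to the "4% or so" above), against a claimed edge of about 1.2% of a season.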
October 19th, 2009 at 1:34 pm
From a baseball standpoint, I’d also point out that one of the more glaring problems is a mechanical one; Levitt assumes all pitches exist independent of one another, which isn’t true at all. As anyone with even a modicum of baseball knowledge knows, the key to an effective breaking pitch (or changeup), is giving the illusion of it being a fastball. A good slider, for example, looks like a fastball (unless the hitter sees “the dot”) and then moves away from the hitting zone at the last minute. But because of that, establishing a fastball is important to being able to throw effective breaking/off-speed pitches. If we’re going to generalize a bit, throwing fewer fastballs would generally make your breaking pitches less effective.
October 19th, 2009 at 1:38 pm
The going-for-it-on-fourth-down thing though, is almost
certainly true, and some NFL teams seem to be waking up
to it. In particular the Patriots seem to be doing a lot
of fourth-down attempts this season. But that also points
out the reason: with 3 Super Bowl victories, Bill Belichick’s
position is not in jeopardy and he can call whatever plays he
likes without risking his job. Most other coaches don’t have
that luxury and take the safer option of punting: that might
reduce their chance of winning but it reduces their chance of
taking the blame for the loss.
October 19th, 2009 at 1:46 pm
I don’t really know about this either, because it’s another thing that seems relatively context dependent, and I don’t see any sort of risk-reward adjustment being made.
October 19th, 2009 at 1:46 pm
“I guess I’ll have to read the paper, but this sounds way
too simplistic. If you throw fewer fastballs, you’ll throw
more balls and fewer strikes. And that means not just that
you’ll give up more walks, but also that your pitchers will
throw more pitches per out and last fewer innings; and in
particular, you’ll get fewer innings from your starters.
And going deep into the bullpen every game is not a strategy
that works for long.”
This highlights the other problem with using raw OPS to measure pitching performance: OPS overstates the value of power hitting and understates the value of OBP. Most “adjusted” OPS measurements use something around 1.8*OBP + SLG to get a more accurate measurement, but anyone willing to go that far is usually going to be smart enough to use something like EqA (or the non-BP versions out there). So if more breaking balls means more walks but fewer extra-base hits, the resulting OPS change will be greater than the actual impact on scoring.
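A small sketch of the point about weighting. The 1.8 multiplier is the one mentioned in this comment; the two OBP/SLG lines are hypothetical numbers chosen to represent "more walks but fewer extra-base hits" and are not taken from the paper or from any real pitcher.

```python
def raw_ops(obp, slg):
    """Plain OPS: on-base percentage plus slugging percentage."""
    return obp + slg


def weighted_ops(obp, slg, obp_weight=1.8):
    """OBP-weighted OPS, using the 1.8 multiplier mentioned in the comment."""
    return obp_weight * obp + slg


# Hypothetical batting lines allowed (illustrative only).
fastball_heavy = {"obp": 0.330, "slg": 0.420}
breaking_heavy = {"obp": 0.340, "slg": 0.400}   # more walks, fewer extra-base hits

raw_change = raw_ops(**breaking_heavy) - raw_ops(**fastball_heavy)
weighted_change = weighted_ops(**breaking_heavy) - weighted_ops(**fastball_heavy)

print(raw_change, weighted_change)
```

With these made-up lines, raw OPS falls by about .010 while the weighted measure falls by only about .002, so the raw figure makes the switch look better than the weighted one does. Different hypothetical lines could shrink or reverse that gap.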
October 19th, 2009 at 1:58 pm
So if more breaking balls means more walks but fewer extra-base hits, the resulting OPS change will be greater than the actual impact on scoring.
Wouldn’t that be the other way around?
October 19th, 2009 at 1:59 pm
The going-for-it-on-fourth-down thing though, is almost
certainly true, and some NFL teams seem to be waking up
to it.
I’m very much a dilettante when it comes to football, but am I correct in saying that Pete Carroll goes for the 4th down more often than other coaches?
October 19th, 2009 at 2:00 pm
Professional athletes adapt. Maybe most teams run too much, but passing more? Name the football team that wins the Super Bowl with a passing ratio significantly higher than the league average. Maybe the Rams, and the Patriots by a hair, and that’s it. Same with the fastball. If every pitch is a fastball, the curve is tough to hit. Once you start getting fed a diet of curves you hit it (or get replaced by someone who can). Did you see what Manny Ramirez did to Hamels’ change-up (considered a very good one) when he got 3 in a row the other day? Or does anyone believe that the fastest fastball is always the best pitch? Not people who play. It has occurred to me that those who do economics have rarely played sports, and think that results occur without adaptations. The better players win the most, and smart aggressive strategy beats reckless or overly cautious (i.e., predictable) ones.
October 19th, 2009 at 2:09 pm
What percentage of fastballs are thrown for strikes vs. other pitches? Giving up one extra walk is worth .3 runs over the marginal expected run total per Tangotiger’s runs formula. If you take Levitt’s research on strikes as gospel you still have to figure the walk rates for varying pitches. If a pitching staff throws 50 extra walks in a season due to throwing more breaking balls and offspeed pitches with 2 strikes then it’s a wash. That’s less than a third of a walk per game extra from throwing breaking balls ahead in the count, which sounds about right.
The ability to locate a pitch is a skill that varies widely among even MLB pitchers but I don’t see that accounted for anywhere in this study. It treats different pitches like they are numbers on a die with an equal consistent chance of occurring in the model. I don’t think that’s how pitchers see them or how they exist in reality.
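The "it’s a wash" arithmetic works out as follows. The .3 runs per walk and the 50 extra walks are the commenter's figures; the 15 runs saved is the paper's estimate as quoted earlier in the thread; the 162-game season is the standard MLB schedule. Names are mine.

```python
run_value_per_walk = 0.3    # commenter's figure, attributed to Tangotiger's runs formula
extra_walks = 50            # commenter's hypothetical season total
paper_runs_saved = 15       # the paper's estimate, quoted earlier in the thread
season_games = 162

runs_given_back = extra_walks * run_value_per_walk
net_runs_saved = paper_runs_saved - runs_given_back
extra_walks_per_game = extra_walks / season_games

print(runs_given_back, net_runs_saved, extra_walks_per_game)
```

Fifty extra walks cost about 15 runs, which cancels the paper's estimated saving, and 50 walks over 162 games is about 0.31 per game, just under a third.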
October 19th, 2009 at 2:19 pm
I don’t think that changes things much.
There are two aspects of “context” for going for it on 4th. One is score and time. When score and time matter – 4th quarter, late – coaches are already more likely to make a rational decision based on the consequences of making it or not.
The other contextual factor concerns the expectation of the final score. If two great defensive teams are playing, the value of possessions is reduced. The yards gained from a punt, or the three points from a FG look pretty good if another 4 downs are likely to lead to the exact same thing. The value of a possession is further decreased if the potential for turnovers is high.
I think coaches already act on the first set of contextual information, and the second set is fairly rare.
October 19th, 2009 at 2:26 pm
It’s all been said. A stupid study. Fangraphs has a similar stat to measure the value of each pitch, but it’s meaningless.
October 19th, 2009 at 2:31 pm
Frankly, writing about baseball stats seems just as out-of-expertise as writing about geoengineering. There are other people who professionally specialize in this field. Even if Levitt’s study had a slightly different objective (i.e. if nobody in baseball stats has ever looked at pitch selection via game theory), statisticians in the field would have a million methodological fixes — some of which are detailed in earlier comments — that he would not have known to incorporate.
October 19th, 2009 at 2:32 pm
My guess is if they changed the sample to exclude pitches to power hitters who draw more walks, wait for fastballs, and get more extra-base hits they’d get a different result. The quality of a pitch is very dependent on who’s batting.
October 19th, 2009 at 2:34 pm
And yeah, Levitt’s jumping into a sport that’s already been subject to too much analysis as is. My guess is that most teams keep track of each pitcher’s track record based on pitch type, batter handedness, batter identity, pitch count at the time of pitch, runners on base, number of outs, count at the time of the pitch, etc.
October 19th, 2009 at 2:38 pm
A more subtle analysis would measure the OPS by pitch type / batting count / on base conditions / outs. (or some subset of those situations.)
Dude, they did that. A lot of the issues everyone is bringing up are actually addressed in the paper. There are still problems but it is not as flawed and simplistic as the comments here imply.
October 19th, 2009 at 2:40 pm
@38
See, I would argue that you’re undervaluing potential risk vs. potential reward. Let’s say it’s 4th and 2 with 5:00 left in the first quarter and you’re on your own 30 yard line. Let’s even say there’s a reasonably high probability (say, 60%), of getting 3 yards. The problem is, what’s the payout? You have a 1st down on your own 33 60% of the time. The other 40% of the time, the other team has a 1st down inside of your 32 yard line. Which is what I mean, it seems to me that the problem with using pure probability measurements in these ways ignores the reality of asymmetrical impacts of results. It might be true in this case that you have a better chance of making the 1st down than not making it, but it may also be the case that the impact of not making it will be orders of magnitude higher than the impact of making it.
It would be like asking me if I want to place a bet where I have a 65% chance of winning; it depends on how much I have to bet. If it’s $5, sure. I’ll do that all day. If I have to put down every asset I own? Probably not a certain enough bet to justify the potential risk.
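The asymmetry argument can be put in expected-value terms. In the sketch below only the 60% conversion chance comes from the comment; the point values assigned to converting, failing, and punting are invented placeholders, not estimates from any study. It shows how a play that succeeds more often than not can still lose to the punt when failure costs much more than success gains, and what conversion rate would be needed to break even.

```python
def expected_value(p_success, value_success, value_failure):
    """Expected points from going for it, given a conversion probability."""
    return p_success * value_success + (1 - p_success) * value_failure


def break_even_probability(value_success, value_failure, value_alternative):
    """Conversion rate at which going for it equals the alternative (the punt)."""
    return (value_alternative - value_failure) / (value_success - value_failure)


p_convert = 0.60        # from the comment
value_convert = 0.5     # HYPOTHETICAL: expected points with a 1st down on your own 33
value_fail = -3.0       # HYPOTHETICAL: expected points after handing over the ball near your 30
value_punt = -0.5       # HYPOTHETICAL: expected points after a punt

ev_go = expected_value(p_convert, value_convert, value_fail)
needed = break_even_probability(value_convert, value_fail, value_punt)

print(ev_go, value_punt, needed)
```

Under these placeholder values, going for it is worth about -0.9 points against -0.5 for the punt, and the conversion rate would need to be a little over 71% to justify the attempt. With real expected-points figures the answer could come out either way.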
October 19th, 2009 at 2:42 pm
Yet another flaw in the use of OPS: it ignores the contribution
of small-ball productive-out tactics such as bunting and
sacrifice flies. And while those may not generate a huge
number of runs, they get used in situations where the runs
are really important to winning the game.
“I don’t really know about this either, because it’s another thing that seems relatively context dependent, and I don’t see any sort of risk-reward adjustment being made.”
I saw some research about 4th-down conversions a while back
(I think MattY linked to it) and mostly it just depends on
field position and yards-to-first-down. If you only need a
couple of yards, running for the first-down is a high
percentage play. If you’re in field-goal range, then
taking the 3 points is fine. But if you’re in the middle of
the field and need a couple of yards, attempting the 4th
down works out better than punting for field position.
Since running for the 4th-down also gives you a good
chance of keeping possession, all you’re risking is field
position and there are few situations where that’s critical.
But this choice really does suffer from the agency problem:
the coach’s interests are not perfectly aligned with the
interests of the team.
October 19th, 2009 at 3:06 pm
The baseball portion of the analysis has very serious flaws, starting with the fact that they calculate OPS incorrectly. Focusing on PAs that end on a given pitch also provides a very inaccurate read on outcomes. See Sabermetric Research discussion: http://sabermetricresearch.blogspot.com/2009/09/game-theory-study-on-pitch-selection.html, and also discussion at The Book blog: http://www.insidethebook.com/ee/index.php/site/comments/game_theory_on_pitch_selection/
The paper really needs to be redone, using 1) a correct measure of offensive performance and 2) all PAs that follow a pitch in a given count. As it stands, it really tells us nothing about whether pitchers are optimizing their pitch selection.
October 19th, 2009 at 3:15 pm
As for the general value of intensive statistical analysis
of baseball, I think the experience of the Red Sox this year
might suggest that statistics have gone about as far as they
can go, and the pendulum may swing back towards old-fashioned
scouting and gut feel. On paper the Red Sox looked like a
pretty good statistical team: they scored a lot of runs
(IIRC 3rd most in baseball), had good starting pitching and
excellent bullpen stats. And then they reached the postseason
and they couldn’t hit against good pitching, and couldn’t
close out an elimination game against good hitting, and
suffered a 3-game sweep.
Going deeper into the regular season stats, they beat up on
the mediocre teams, but didn’t do very well against good
teams.
Now a 95-win season isn’t a disaster, but I suspect you might
see some changes in the Red Sox approach, Bill James or
no Bill James.
October 19th, 2009 at 3:16 pm
A lot of the issues everyone is bringing up are actually addressed in the paper.
Those problems are mentioned, but I wouldn’t say they are addressed properly.
October 19th, 2009 at 3:19 pm
I was going to point to Tango and Birnbaum’s sites, and Guy’s analysis specifically. Then I saw he’d already posted here. Crazy.
Oh, and Guy, I think you’re the Guy who had a run in with Berri that spilled over to the apbrmetrics site a while ago. I just wanted to say that sort of analysis coming from someone not already on apbrmetrics was good to see. I was a little worried that reading the apbr take was clouding my judgement. Having a rational uninterested third party come up with the same concerns was nice.
October 19th, 2009 at 3:23 pm
@47
I’m pretty sure most people who know how to accurately use statistical information in baseball acknowledge that the playoffs are very much a crap shoot because of the short series factor.
October 19th, 2009 at 3:24 pm
I think the experience of the Red Sox this year
might suggest that statistics have gone about as far as they
can go, and the pendulum may swing back towards old-fashioned
scouting and gut feel. On paper the Red Sox looked like a
pretty good statistical team: they scored a lot of runs
(IIRC 3rd most in baseball), had good starting pitching and
excellent bullpen stats. And then they reached the postseason
and they couldn’t hit against good pitching, and couldn’t
close out an elimination game against good hitting, and
suffered a 3-game sweep.
Well, no. The Red Sox lost three games to a very good Angels team. You can’t make any assumptions about the state of statistical analysis from three games.
And what’s more, baseball has never moved away from scouting–it’s just that the scouting parts and the analysis parts are working much better now than before, so depending on your perspective, you might be seeing more trust placed in those scouting reports.
October 19th, 2009 at 3:28 pm
“If you’re in field-goal range, then taking the 3 points is fine.”
Well, “fine” isn’t very precise, is it? Neither is “field-goal range.” Let’s say you go for it on 4th and goal from the 2: you’ll convert quite a lot of the time, but even if you don’t the opposing team gets the ball on their own 2. That’s not worth many points to them; you might even get a safety, and you’re likely to get the ball back with very good field position. And none of that is taking into account the scoreboard, time remaining, defenses…there’s a lot to take in, but many coaches automatically do the safe thing when it’s not necessarily optimal.
Then there’s the high school coach who literally never punts or kicks a field goal (even all of his kickoffs are onside kicks). He’s had good results. But that probably only works in a high school context.
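The 4th-and-goal reasoning above is an expected-points comparison, and a minimal sketch makes the trade-off explicit. Every number below is a made-up placeholder, not a figure from Romer's paper or from NFL data, and the function names are hypothetical. The sketch also ignores score, clock and the kickoff that follows a score.

```python
def expected_points_go(p_convert, td_value, stop_value):
    """Expected points from going for it on 4th and goal.

    stop_value is what a failed attempt is still worth to the offense,
    since the opponent takes over pinned near its own goal line.
    """
    return p_convert * td_value + (1 - p_convert) * stop_value


def expected_points_kick(p_make, fg_value, miss_value):
    """Expected points from attempting the field goal instead."""
    return p_make * fg_value + (1 - p_make) * miss_value


# Placeholder inputs (assumptions for illustration only).
go_for_it = expected_points_go(p_convert=0.5, td_value=7.0, stop_value=0.5)
kick = expected_points_kick(p_make=0.98, fg_value=3.0, miss_value=0.0)

print(f"go for it: {go_for_it:.2f}, kick: {kick:.2f}")
```

With these placeholders going for it comes out ahead, but the answer depends entirely on the inputs, which is the point about "fine" not being very precise.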
October 19th, 2009 at 3:40 pm
Matthew, A pitcher in difficulty will throw fastballs more than breaking balls to avoid walks. So this may explain some of it. Pitchers throwing well use one to setup the other. Also, location, location, location: A fastball high and tight followed by a fastball down and away is a good combination.
October 19th, 2009 at 3:42 pm
“Well, no. The Red Sox lost three games to a very good Angels team. You can’t make any assumptions about the state of statistical analysis from three games.”
That would be fine if it were *just* those 3 games. But
there was evidence of the problems at times in the regular
season – one stretch where the team went something like 35
innings without scoring a run; Papelbon putting a lot of
runners on base and having to pitch out of those jams; a
bullpen that suffered some spectacular implosions (didn’t
they blow a 10-run lead against the O’s ?); beating up
on the Yankees early in the season, and then being swept
twice late in the season; losing Wakefield for the second
half of an All-Star season (though that tends to happen for
43-year-old pitchers).
Maybe it’s just a concatenation of unfortunate circumstances:
Big Papi’s decline, Lowell’s injury, Daisuke’s poor
conditioning and WBC hangover, Theo’s curious inability to
find a productive shortstop. But the statistical mojo
looked shaky this year, compared to 2003-2008. Maybe the
answer is an even more elaborate statistical approach
which breaks down performance according to quality of
opposition. Or maybe the success always depended on a little
sprinkle of old-fashioned baseball magic – Pedro’s
intensity, Manny Ramirez’ classic swing, Big Papi’s awesome
clutchness, the idiots came back from 3-0 down – that the
current team doesn’t quite have.
October 19th, 2009 at 3:53 pm
“Well, “fine” isn’t very precise, is it? Neither is “field-goal range.””
No, but google for “Romer fourth down” and you can get the
pdf of the paper and read it for yourself. I’m not a big
football fan myself, but I get the impression that this
particular research is well-regarded and has even had some
influence on real-world decisions.
October 19th, 2009 at 4:04 pm
@54, I’m not quite sure what you’re expecting from statistical analysis. It’s a tool used to make informed personnel and strategic choices. The final decisions are still made by people with limited options, and the game still has to be played by people against another team with a management using many of the same tools for the same decision-making purposes. Sometimes things go your way, and sometimes they don’t. If I’m a Red Sox fan in the first decade of the 21st century, I don’t complain too much. And if I understand what you’re saying, you think the two WS victories were the result of grittiness and heart, and not statistical analysis to begin with?
October 19th, 2009 at 4:07 pm
Richard, I know about the research. I was saying that “go ahead and take the three points” is actually a very simplistic version of what Romer claims, which is that going for it deep in opponents’ territory can be preferable to taking the FG (even aside from situations where that would be obvious, like if you’re down by 7 late in the game).
October 19th, 2009 at 4:09 pm
@Will
As a fellow Nebraska fan, I concur.
Yet another way in which the 2000s have sucked.
October 19th, 2009 at 4:15 pm
Or, you know, that those guys were really good. That probably helped too.
I get that people like narratives and romanticism, but they should really just keep that to themselves.
October 19th, 2009 at 4:22 pm
“And if I understand what you’re saying, you think the two WS victories were the result of grittiness and heart, and not statistical analysis to begin with?”
I think statistical analysis is very valuable. Youkilis -
featured in Moneyball as a classic OPS prospect – has been
very good for the Sox. But it only gets you so far.
I suppose everyone admits that: seem to remember comments
by Sox management that the goal is to reach 95 wins and get
into the playoffs every year, and after that you need luck
and timely performances to win the World Series.
However, it did seem this year – even before the ALDS loss -
that the team’s statistics were not an accurate reflection
of its quality, because they were achieving blowouts against
weak teams while not doing much against stronger opposition.
Ortiz, for all his heroics in the past, looks like a guy
who can’t hit good pitching any more, in spite of reaching
28 HR and 99 RBI. Beckett and Papelbon were merely
good, rather than being dominant as in 2007. And the team
was mediocre on the road.
I don’t know all the latest sabermetrics, but it will be
interesting to see what changes the Red Sox make. The
current lineup doesn’t scare anyone, even though they
scored a lot of runs. Are there statistics that can
quantify that ?
October 19th, 2009 at 4:32 pm
“Or, you know, that those guys were really good. That probably helped too.”
Sure. So my question is why it would be that the 2009
Red Sox – the result of 6 years of player selection by
a management team using lots of statistical analysis, *and*
with one of the highest payrolls in baseball – appeared
to not have so many “really good” players ?
I know why Pedro and Manny are gone; I know – as much as
anyone does – why Ortiz isn’t so great any more; but why
hasn’t scientific management been able to find equally
great replacements for those talents ? Is there some
X-factor in truly great players that *doesn’t* show up in
statistics ? Or are the Red Sox identifying those players
but not bidding high enough ? Or is there still a big
element of luck ? I think that probably applies to Ortiz,
who wasn’t anything great before he came to the Sox, but
not to Pedro and Manny, who had showed what they could do,
cost a fortune, but were worth it.
October 19th, 2009 at 4:37 pm
However, it did seem this year – even before the ALDS loss -
that the team’s statistics were not an accurate reflection
of its quality
Okay but there’s not a whole lot of evidence that you can support that with.
October 19th, 2009 at 4:43 pm
why hasn’t scientific management been able to find equally
great replacements for those talents ?
Because they’re not readily available on the open market?
October 19th, 2009 at 4:44 pm
“Is there some X-factor in truly great players that *doesn’t* show up in statistics ?”
No. The players you keep coming back to—Pedro, Ramirez, and Ortiz—all have very clear statistical advantages over other players. I don’t think I even know what you’re arguing anymore. If the Red Sox losing a playoff round discredits statistical analysis in your mind, then you never supported statistical analysis in the first place.
October 19th, 2009 at 4:47 pm
Okay, let’s say the Red Sox win the World Series next year. Is statistical analysis meaningful again, or does it mean that the Sox are gritty? What if they win the LCS but then lose the World Series?
October 19th, 2009 at 4:54 pm
You realize that that’s true of basically every good team right?
October 19th, 2009 at 4:56 pm
That’s fairly obvious isn’t it? They’re older in spots, and they replaced Manny with Bay.
October 19th, 2009 at 4:58 pm
Are you for real?
October 19th, 2009 at 6:38 pm
You guys are kidding me, right? Pitching & hitting aren’t just matters of decisionmaking, they’re matters of execution, a word I see nowhere in the comments so far. The reason pitchers throw more fastballs is that most pitchers can’t consistently throw other pitches for strikes.
Jesus…
October 19th, 2009 at 7:02 pm
My “Or is it maybe that for biomechanical reasons most pitchers can’t throw the optimal number of breaking balls without wrecking their arms?”
Yup. I still have this little floating thingy in my elbow region from throwing curveballs when I was twelve. I touch it on occasion, for nostalgic purposes.
It’s about faith, too. The fastball, even when it gets turned around, can always be thrown harder or in a better location. But the off-speed stuff requires mega amounts of faith. When a curveball pops in mid-flight -and hangs in the strike zone for all eternity- oh man, you want to reach out and bring that pitch back and swear never to use that motherfucker again, whether the batter crushes it, or not.
October 19th, 2009 at 7:05 pm
tune in next week, when Levitt establishes that the jab is a more or less useless punch, and that boxers would score far more knockouts if they threw 10% more hooks.
October 19th, 2009 at 7:12 pm
“Pitching & hitting aren’t just matters of decisionmaking, they’re matters of execution, a word I see nowhere in the comments so far.”
Nice use of Ctrl-F, but if you actually read the comments you would find many other people making the same point. See comments 12 and 15 just for starters.
Levitt is such a clown. Of course fewer people reach base on curveballs, because pitchers don’t throw curveballs on 3-ball counts. To Levitt, that would be evidence that the curveball is easier to control!
October 19th, 2009 at 7:54 pm
“No. The players you keep coming back to—Pedro, Ramirez, and Ortiz—all have very clear statistical advantages over other players. I don’t think I even know what you’re arguing anymore. If the Red Sox losing a playoff round discredits statistical analysis in your mind, then you never supported statistical analysis in the first place.”
Well, Pedro and Manny were signed by the pre-Moneyball Dan
Duquette Red Sox. So while they were *obviously* terrific
players, they aren’t any advertisement for sabermetrics.
Ortiz was pretty much a fluke, a minor player picked up
cheap who surprised everyone by becoming great.
I guess I’m re-evaluating the Sox’ success of 2003-2009 and
finding that the whole OPS/Moneyball/sabermetric complex
may have played less of a part than it seemed, compared to
the dumb luck of picking up some players cheap who then
outperformed expectations (Ortiz, Lowell), and doing a good
job of development in the farm system (Pedroia, Lester,
Papelbon) – and of course having enough payroll for a few
superstars (Pedro, Schilling, Manny, Beckett).
That’s not to say that the OPS stuff is no good; just that
it only gets you so far, and some players deliver more or less
than what the stats show (even with Bill James on the
payroll, they picked Renteria and Lugo as shortstops …)
October 19th, 2009 at 8:00 pm
“You realize that that’s true of basically every good team right?”
I’m not sure. I think it’s a question that could use some
statistical analysis: was the variance of runs-scored
particularly high for the Red Sox ? Did they overperform
more than expected against pitchers with high ERA ? Did they
underperform more than expected against pitchers with low ERA ?
It felt like it, but of course feelings can be deceptive.
And the Yankees, after a slow start, beat up on *everyone*
after the All-Star break.
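The questions in the comment above could be checked roughly as follows. This is only a sketch: the game log is invented placeholder data, the 4.00 ERA cutoff is an arbitrary assumption, and `run_scoring_profile` is a hypothetical helper, not anything from an existing library.

```python
from statistics import mean, pvariance


def run_scoring_profile(games, era_cutoff):
    """Summarize run scoring from (runs_scored, opposing_starter_era) pairs.

    Returns the variance of runs scored plus average runs against
    low-ERA (good) and high-ERA (weak) starting pitchers.
    """
    runs = [r for r, _ in games]
    vs_low_era = [r for r, era in games if era < era_cutoff]
    vs_high_era = [r for r, era in games if era >= era_cutoff]
    return {
        "variance": pvariance(runs),
        "mean_vs_low_era": mean(vs_low_era) if vs_low_era else None,
        "mean_vs_high_era": mean(vs_high_era) if vs_high_era else None,
    }


# Invented placeholder game log: (runs scored, opposing starter's ERA).
sample_games = [(10, 5.2), (1, 3.1), (8, 4.9), (2, 3.4), (12, 5.5), (3, 3.0)]
profile = run_scoring_profile(sample_games, era_cutoff=4.0)
print(profile)
```

To answer "more than expected" you would run the same function on every team's real game log and compare the Red Sox gap between the two means against the league-wide distribution, since every lineup scores less against better pitching.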
October 19th, 2009 at 8:01 pm
Um, David Ortiz is pretty much the pinnacle of the scouting model espoused in Moneyball.
I think it’s pretty clear that your problem is that you’re not at all familiar with what’s being talked about.
October 19th, 2009 at 8:15 pm
“Um, David Ortiz is pretty much the pinnacle of the scouting model espoused in Moneyball.”
OPS with Twins 2000-2002 .810 .799 .839
OPS with Red Sox 2003-2007 .961 .983 1.001 1.049 1.066
He wasn’t bad with the Twins. But his OPS jumped 122 points
his first year with the Sox, and then it kept going up.
Big surprise even to his fans.
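For anyone who wants to check the arithmetic, here is a short sketch using only the OPS figures quoted in the comment above. Assigning them to seasons in the order listed is an assumption, and the numbers are not independently verified.

```python
# OPS figures as quoted in the comment, mapped to seasons in listed order.
twins_ops = {2000: 0.810, 2001: 0.799, 2002: 0.839}
red_sox_ops = {2003: 0.961, 2004: 0.983, 2005: 1.001, 2006: 1.049, 2007: 1.066}


def ops_points(delta):
    """Convert an OPS difference to 'points' (thousandths), rounded."""
    return round(delta * 1000)


seasons = {**twins_ops, **red_sox_ops}
years = sorted(seasons)

# Year-over-year change in OPS points for each season after the first.
changes = {
    year: ops_points(seasons[year] - seasons[prev])
    for prev, year in zip(years, years[1:])
}

for year, change in changes.items():
    print(f"{year}: {change:+d} points")
```

The 2003 entry reproduces the 122-point jump, and each later season through 2007 shows a further gain.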
October 19th, 2009 at 8:18 pm
“I guess I’m re-evaluating the Sox’ success of 2003-2009 and finding that the whole OPS/Moneyball/sabermetric complex may have played less of a part than it seemed, compared to the dumb luck of picking up some players cheap who then outperformed expectations (Ortiz, Lowell), and doing a good job of development in the farm system (Pedroia, Lester, Papelbon)”
Well, what the hell else does a front office do, except for draft and develop players and pick them up from other teams? So you’re saying that if we classify every successful move inspired by statistical analysis as “dumb luck,” and pretend that the GM has nothing to do with drafting players (or is that “dumb luck” too?), then statistical analysis doesn’t look so good? Knock me over with a feather!
To put it another way, isn’t picking up players who “outperform expectations” the very definition of a good GM? Isn’t that what Moneyball was actually all about?
Also, and this has been bothering me for a while, why do your comments have such weird line breaks in them?
October 19th, 2009 at 8:22 pm
Again, the problem is pretty clearly that you haven’t read Moneyball, and apparently haven’t even read a good review of it. David Ortiz was a lifetime .310/.383/.533 hitter in the minors, which is what James and Beane held out as the best thing to look at in evaluating players. He didn’t come out of nowhere, he’s the picture of a player who was there to be had relatively cheaply because baseball teams were overvaluing the wrong things.
Plus, Ortiz was always good at driving the ball the other way, something that obviously played well in Fenway.
October 19th, 2009 at 8:23 pm
Or, why don’t I let Fire Joe Morgan speak for me, since these points were made better four freaking years ago:
“If you’re going to make this sarcastic point, do yourself a favor and only cite players who predated Theo’s reign. Picking Ortiz off the scrapheap three years ago was one of the greatest GM moves in history. Did Epstein know Ortiz would be this good? No way. But he got him for $1 million because he saw value in his stroke and batting eye. Then, even better, Epstein locked him up for roughly $6 million/year for three years. Do you understand how amazing that is, Caple? He makes less than 25% of what ARod makes, and their offensive numbers are nearly identical.
Schilling would not be wearing a 2004 Sox ring (and neither would anyone else on the team) had Epstein not flown out to Arizona two years ago, had Thanksgiving dinner with the fam, and used his unique blend of statistical analysis and interpersonal skill to convince Schilling that Fenway was a place he could be effective (he won 21 games) and that the city and team needed him to put them over the top. And do you remember whom Epstein traded for Schilling? No, you don’t, because it was all junk. (Casey Fossum, anyone?) One of the great trades of the past few years.”
October 19th, 2009 at 8:48 pm
“To put it another way, isn’t picking up players who “outperform expectations” the very definition of a good GM?”
Obviously Theo is a pretty good GM – going to the playoffs
5 years out of 6 is not easy when you share a division with
the Yankees. But when you factor in the players inherited
from the previous regime (Pedro, Manny, Wakefield) and the
better-than-*anyone*-expected performance of cheap pickups
like Ortiz and Lowell, and the ongoing can’t-pick-a-shortstop-
for-shit saga, it seems that maybe the statistical analysis
part of the job isn’t giving as big an edge as it seemed.
Not that it’s a bad thing, just that a large part of the
success may be from old-fashioned scouting and coaching
and some dumb luck.
Sure, it’s a good organization, but signing an elite
pitcher like Schilling isn’t about tricky statistical
analysis, it’s mostly about having a big pile of money
and being willing to spend it.
October 19th, 2009 at 8:56 pm
You realize you can’t make that true by saying it over and over while ignoring evidence to the contrary right?
October 19th, 2009 at 9:43 pm
“You realize you can’t make that true by saying it over and over while ignoring evidence to the contrary right?”
I think Theo expected Ortiz to be substantially more
productive than he had been with the Twins. I don’t think
he, or anyone else, expected a 1.050+ OPS and 40+ home runs,
ever, from a guy picked off the scrap heap. Being a good
GM is getting more than *other* people expect from a player;
being lucky is getting more than *you* expect from a player.
The Ortiz pickup was very good (his minor league stats showed
he was better than he looked at Minnesota, and worth taking a
chance on); but also very lucky (because I don’t think Theo
expected the phenomenal stats he achieved at his peak).
As for Lowell, he was picked up as an afterthought in the
Beckett deal, and seemed to be an overpaid player on the
decline (.658 OPS in 2005), then leapt back up to OPS of
.814 .879 .798 and was the World Series MVP to boot.
Now if *everyone* who came to the Red Sox got better like
that, you could put it down to excellence in scouting and
coaching. But the debacles with Renteria and Lugo show
that it’s not so simple (and the jury is still out on the
Daisuke deal – $16M/year for an ace would have been a bargain,
$16M/year for a shaky number-3 starter looks pricy).
Don’t get me wrong: Theo is a good GM, and it’s a good
organization. I just don’t think the statistical analysis
is as big a part of the success as had been claimed. And
I fear the next few years may be less successful, if what
is needed is good management *and* a few strokes of luck.
October 19th, 2009 at 10:08 pm
That seems like a pretty ridiculous rhetorical bar. Ortiz had a career OPS of over .900 in the minors, 1.002 in his final full year there. Someone who thinks like Beane would clearly expect him to potentially put up .850 or so OPS numbers, especially since he managed an .839 mark his last season in Minnesota, and given the hitter-friendly ballpark. Saying “no one could have predicted” he’d be a 50 HR, 1.000+ OPS player is a pretty ridiculous standard to set. Almost no one is ever expected to do that.
1. Claiming Lowell was an afterthought is just flatly incorrect. He was the guy Florida was trying to trade, to dump his salary, and it was understood that he was going to have to be included in a deal for Burnett, the pitcher Florida was looking at moving. Boston only got Beckett instead because they agreed to give up Hanley Ramirez.
2. That Lowell was thought to be on the decline was probably overblown, but nevertheless, there’s the Fenway Effect to be accounted for here as well.
October 19th, 2009 at 10:29 pm
“Saying “no one could have predicted” he’d be a 50 HR, 1.000+ OPS player is a pretty ridiculous standard to set. Almost no one is ever expected to do that.”
Indeed. And yet the fact that he achieved such outlandish
numbers, which is a gift from the gods to a GM (and a .944
postseason OPS, IIRC) was one of the major factors in the
Sox success over 2003-2007.
“1. Claiming Lowell was an afterthought is just flatly incorrect”
From the Red Sox point of view, what they wanted was Beckett,
and they had to take Lowell to get the deal done. No-one
was looking at him and thinking “he could be a World Series
MVP”. Excellent defense, decent offense, good clubhouse guy,
but overpaid. Heck, they already had a damn good third-base
guy in Youkilis, they would have much preferred to keep him
there and pick up a first-base slugger.
“there’s the Fenway Effect to be accounted for here as well.”
I’ll accept that there’s a Fenway Effect. But why hasn’t it
worked for any shortstop since Nomar ??
October 19th, 2009 at 10:38 pm
I don’t really see any reason to think they would have been substantially less successful if he’d only managed to match his .916 career minor league OPS.
Lowell wasn’t overpaid. Here’s his salary for 2003-2005: $3.7m, $6.5m, $7.5m. His OPS for those years? .881, .870, .658. For 2006 he made $9m and OPS’d .814. There were plenty of teams perfectly willing to take Lowell on, which is why Boston had to pay such a steep price for him and Beckett.
October 19th, 2009 at 11:07 pm
“I don’t really see any reason to think they would have been substantially less successful if he’d only managed to match his .916 career minor league OPS.”
If you think 150 points in OPS doesn’t make a difference to
team success, why would anyone bother with statistical
analysis ?? Anyway, besides the high OPS, to anyone who was
watching it was pretty evident that he made a big difference:
when the game was on the line, Ortiz would come through with
clutch hits time and time again. And then you have to factor
in the effect of having two truly scary hitters in Ortiz and
Ramirez back-to-back. That was tough for opposing pitchers.
“Lowell wasn’t overpaid. Here’s his salary for 2003-2005: $3.7m, $6.5m, $7.5m. His OPS for those years? .881, .870, .658”
$7.5M for a .658 OPS didn’t seem cheap. And then the fear
was that he’d be $9M for a similar OPS the next year.
I don’t recall it the same way you do: at least around
Boston all the talk was about getting Beckett.
October 20th, 2009 at 1:20 am
Should have better spent their time at the ball parks learning the sports instead of using the computer cycles to prove their ignorance of it.
October 20th, 2009 at 2:05 am
Isn’t the bottom-line unspoken (but obvious) recommendation of the Bill James approach: take performance-enhancing drugs, like Ortiz and Ramirez did? What gets you a higher OPS than juicing yourself into a powerhitter that pitchers are afraid to throw strikes to?
You’ll notice that Bill James had almost nothing to say about steroids for two decades after Jose Canseco’s 40-40 season in 1988 made steroid abuse obvious:
http://isteve.blogspot.com/2009/08/bill-james-sold-his-soul.html
October 20th, 2009 at 3:20 am
… or maybe we are looking at a betting syndicate.
October 20th, 2009 at 10:56 am
“recommendation of the Bill James approach: take performance-enhancing drugs”
Yes, but so what ? He analyses the game and says you’ll win
more games if you have a high OPS. Which appears to be true.
And we call them “performance-enhancing drugs” because they
enhance performance. It is what it is. So what do you think
Bill James should have done ? Lie about the statistics ?
Put a warning on every page saying “but of course you shouldn’t
cheat” ? That’s not his job or his responsibility. Heck, the
juicing was at its peak years before Bill James had any
official association with MLB.
I think maybe Scott Boras would be a more appropriate target.
Though it seems clear that many organizations turned a
blind eye to the juicing for a long time.
October 20th, 2009 at 1:20 pm
and the jury is still out on the Daisuke deal – $16M/year for an ace would have been a bargain, $16M/year for a shaky number-3 starter looks pricy
I don’t know, those latter figures seem OK to me. </Giants fan>
To anyone who can’t tell already, this argument has gotten mired because it’s treating OPS as a stand-in for sabermetrics when AFAIK it’s just a quick and dirty metric that’s easier to compute (and more intuitive) than VORP or WARP or EqA.
October 20th, 2009 at 1:50 pm
“To anyone who can’t tell already, this argument has gotten mired because it’s treating OPS as a stand-in for sabermetrics”
Fair enough. But maybe what I’m groping my way towards is
that statistical analysis of baseball, like all statistical
analysis, works well for events (and types of players) that
exist in sufficiently large numbers; but doesn’t – and can’t
be expected to – predict improbable long-tail events like
the rise of David Ortiz or the freakishly magnificent career
of Mariano Rivera, with his one pitch.
As such it seems that sabermetric/Moneyball tactics are
very useful for building a good team at low cost, or a
really good team at moderate cost; but a dominant team
usually also includes a few extraordinary players – and
the acquisition of those involves either a large amount
of money, or else a considerable share of luck, above and
beyond the scouting and statistics.
A further suspicion is that too much focus on historical
sabermetric figures might eventually lead you down the
wrong track. Suppose you have a perfectly valid and correct
analysis of the performances of baseball teams before 2000 -
managed on traditional principles – and that leads you to
believe that OPS (or some other metric) is a good predictor
of a player’s value. Then you build a whole team based on
that methodology, where a whole lot of relatively cheap
high-OPS guys are playing *together*. But now you have a
whole new phenomenon that isn’t the same as the pre-2000
teams. Do you get the value you expected ? Or do you get
something that’s a little less (or a little more) than the
sum of its parts ?
Now if you *don’t* get quite what your previous models
predicted – or maybe you get what your previous models
predicted in terms of run differential and regular-season
wins, but the team turns out to be ill-adapted to winning in
the playoffs – then that doesn’t mean that statistics in
general are flawed. But it would mean that you need to
find some *better* kind of statistical analysis and come
up with a more useful model of players and how they’re
going to interact and perform against other groups of
players.
Really what’s fascinating about the sabermetrics phenomenon
is not the mere fact that statistics can be so effective in
analyzing and predicting baseball performance; but that
really *simple* measures like OPS can get you a long way. But
I expect Bill James and his associates are working really hard
on ever-more-elaborate analyses ?