Interesting article arguing that lineup protection in baseball exists. This runs counter to the sabermetric wisdom that lineup protection is a myth: Past research (if I recall correctly) has determined that having a better hitter on deck does not, over a significant number of plate appearances, result in better outcomes for the batter (more hits, walks and bases per plate appearance) than having a worse hitter on deck.

The author sums up this theory and suggests his criticism of it:

[J.C.] Bradbury’s regression analysis [in his book The Baseball Economist] attempts to measure the effect of the on-deck hitter’s quality on the current batter’s outcome (his regression model has the on-deck hitter’s OPS on the right-hand side and the current batter’s outcome on the left-hand side). This approach is intuitive; in fact, my initial instinct might be to perform similar research. However, at bat outcomes involve many moving parts (where the ball lands, reaction of the defense, and luck, to name a few), and Bradbury is trying to measure the effect of an outcome-based rate (OPS) on another outcome. Thus, if there is some noise or randomness within the data, the problem would be compounded in the findings.

Certainly this is true. But this is exactly why such a study needs a sufficiently large sample: with enough plate appearances, the random noise tends to average out. The question is: Is the set of data used to analyze lineup protection inadequate? The author seems to assume that it is, although that’s never been my impression.
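To put a rough number on the sample-size point, here is a minimal sketch. It assumes a simple binomial model in which both groups of batters (good hitter on deck, weak hitter on deck) reach base at the same true rate; the 0.330 rate and the sample sizes are made-up illustrations, and this is not the method used by Bradbury or by the article's author.

```python
import math

def se_of_difference(true_rate, n_per_group):
    """Standard error of the difference between two observed rates,
    assuming both groups share the same true rate (hypothetical
    binomial model) and have n_per_group plate appearances each."""
    variance_one_group = true_rate * (1 - true_rate) / n_per_group
    return math.sqrt(2 * variance_one_group)

# Hypothetical on-base rate and sample sizes, for illustration only.
for n in (500, 5_000, 50_000):
    print(n, round(se_of_difference(0.330, n), 4))
```

Under this model the noise shrinks with the square root of the sample size, so quadrupling the number of observations only halves it.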

He suggests examining pitch-by-pitch data to see whether batters see more “good” pitches (pitches in the strike zone, and fastballs rather than breaking pitches) with a better hitter on deck than with a worse one. His analysis says yes:

The protection production function seems to tell us conflicting stories. The “input” findings show that protection exists, but the “output” evidence suggests that protection does not exist. So, which answer is correct? In addition to the potential randomness issue discussed earlier, outputs suffer from one other relative disadvantage – the mere volume of data being studied is different. Analysis at the per-pitch level (inputs) employs about four times the number of instances as per-at bat level analysis (outputs). Thus, while prior research may (or may not) point us in the right direction, I would argue that the production function’s inputs push us much closer to the truth.

I don’t buy this argument. The question at hand, as I see it, is not “Does having a better hitter on deck cause the pitcher to throw pitches to the batter that are easier to hit (i.e., more advantageous to the batter)?”, but rather, “Does having a better hitter on deck cause the batter to produce more runs?”

If we grant the result of his analysis (if not the conclusion he draws from it), though, then it does raise an interesting question: If a better hitter on deck causes the pitcher to change his approach, then why don’t batters in such situations experience better outcomes than in other situations? Are pitchers changing their approaches in a manner which is not actually useful? Is there something here that players and teams don’t yet understand and which might be exploitable?

He wraps up with a broader point:

I want to be clear about my broader argument. The sabermetric community will benefit as it moves away from its relatively strict reliance on outcomes and outputs. Events on the field of any sport involve a great deal of processes. While outcome data (e.g., much of what you find online at great sites such as retrosheet and baseball-reference) have generally been more widely available, a full picture of economic analysis in the future will rely much more heavily on whole processes and their inputs.

While both inputs and outputs can be interesting, neither is inherently more or less interesting than the other. It depends on what you’re trying to study. This fellow has failed to persuade me that the input side is as important as the output side in the case of lineup protection.

(I learned about this post through the Red Sox Mailing List. And boy does the list’s page need updating!)

One should always be wary of drawing any conclusions based on a single week of the baseball season. However, I do often find it instructive to see which teams are struggling mightily in the first week, only because it’s a lot easier to squander a 4-game lead than it is to overcome a 4-game deficit.

Three teams are currently occupying the cellar in Major League Baseball:

  • The Washington Nationals are 1-6, 4.5 games out of first place. The Nationals are widely expected to be the worst team in baseball in 2007, so this isn’t a surprise: There just isn’t much talent there.
  • The Philadelphia Phillies are 1-6, in the same division. The Phillies were expected to contend in their division, but instead they’ve lost 4 close games (3 runs or fewer) and 2 blowouts, and won one blowout. They’re 4th in runs scored, but next-to-last in runs allowed, with plenty of blame to go around on the latter score. Their pitching’s going to have to be more consistent if they’re really going to contend.
  • The San Francisco Giants are 1-5, 3.5 games back. They’re last in runs scored and third-from-last in runs allowed, which is just all-around awful. They’re also the oldest team in baseball. While there’s some reason to hope their pitching will come around (Barry Zito always seems to be awful in April), their hitting is just not that good: Beyond Barry Bonds and Ray Durham, there isn’t a real good reason to think they’ll be above average at any other position. I picked them to finish behind even the Rockies this year, and they’re off to a correspondingly poor start.

The Phillies might just be having a run of bad luck to start the year, but being 4.5 games out with 25 weeks to play isn’t exactly a way to put yourself into contention. Meanwhile, the Nats and Giants have put themselves in position to be the worst teams in baseball.
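For anyone wondering where the half-games come from, here is a small sketch of the usual games-behind arithmetic. The leader's 5-1 record in the example is a hypothetical stand-in (I haven't listed the actual division leaders here), and `games_behind` is just an illustrative helper name.

```python
def games_behind(leader_wins, leader_losses, team_wins, team_losses):
    """Standard games-behind formula: average of the gap in wins
    and the gap in losses between the leader and the trailing team."""
    win_gap = leader_wins - team_wins
    loss_gap = team_losses - leader_losses
    return (win_gap + loss_gap) / 2

# A 1-6 team against a hypothetical 5-1 division leader.
print(games_behind(5, 1, 1, 6))  # 4.5
```

Half-games show up whenever the two teams haven't played the same number of games.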

Over in the American League, the Indians and Mariners have each played only 3 games, thanks to a goodly dose of snow in Cleveland over the weekend.

No one in the AL is looking really awful so far: Even the teams with the worst offenses have shown good pitching, and vice-versa. But that just means that no one’s separated themselves from the pack. I figure Baltimore, Kansas City and maybe Seattle will start declining before too long. The difference among these three teams is that KC is arguably on the way up, while the other two seem stuck in neutral (and I think the Orioles removed their clutch sometime around the year 2000).

Me, I’m still hoping this is the year that the wheels come off of the Yankees’ pitching train.