Wednesday, May 27, 2009

Best Stuff of the Cubs vs Pirates

With the Dodgers coming in for four, and the roster refreshed, the Cubs managed to take two out of three against the Pirates. Bottom line, the Cubs outpitched the Pirates.

Using "run values", I've rated all the pitches thrown in the series. Run values should add up to zero across a league-season, but my math is not quite right. I'm off about one run every 2500 pitches for 2009, so I'm either weighting events wrong or mis-estimating the run environment. Either way, the measure is good enough to use to make some direct comparisons.

It works by calculating the value of every pitch, based on the count and the outcome. No attention is paid to other aspects of game state, such as base runners, leverage or whatever. It's just about the pitch. 0 is average (or about 0.04 per 100 pitches in my world, for now, given the one-run-per-2500-pitches error), and negative is better for pitchers.
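To make the idea concrete, here is a minimal sketch of a count-and-outcome pitch value. It assumes the value of a pitch is the change in run expectancy it causes, with base runners and outs ignored. Every number in the two tables is a placeholder for illustration, not a figure from my actual weights, and the names COUNT_VALUE, EVENT_VALUE and pitch_run_value are made up for this example.

```python
# Placeholder run expectancy for each (balls, strikes) count.
# Illustrative values only; they are NOT the weights used for this post.
COUNT_VALUE = {
    (0, 0): 0.000, (1, 0): 0.035, (2, 0): 0.090, (3, 0): 0.190,
    (0, 1): -0.040, (1, 1): -0.015, (2, 1): 0.030, (3, 1): 0.130,
    (0, 2): -0.100, (1, 2): -0.085, (2, 2): -0.040, (3, 2): 0.050,
}

# Placeholder run values for events that end the plate appearance.
EVENT_VALUE = {
    "strikeout": -0.28,
    "walk": 0.32,
    "single": 0.47,
    "out_in_play": -0.27,
}


def pitch_run_value(count, outcome):
    """Change in run expectancy from one pitch, ignoring base/out state.

    count   -- (balls, strikes) before the pitch
    outcome -- "ball", "strike", "foul", or a key of EVENT_VALUE
    Negative results favor the pitcher.
    """
    balls, strikes = count
    before = COUNT_VALUE[count]
    if outcome == "ball":
        # Ball four ends the plate appearance with a walk.
        after = EVENT_VALUE["walk"] if balls == 3 else COUNT_VALUE[(balls + 1, strikes)]
    elif outcome == "strike":
        # Strike three ends the plate appearance with a strikeout.
        after = EVENT_VALUE["strikeout"] if strikes == 2 else COUNT_VALUE[(balls, strikes + 1)]
    elif outcome == "foul":
        # A foul with two strikes leaves the count unchanged.
        after = before if strikes == 2 else COUNT_VALUE[(balls, strikes + 1)]
    else:
        after = EVENT_VALUE[outcome]
    return after - before
```

With these placeholder tables, a first-pitch strike is worth the same whether the bases are empty or loaded, which is the whole point of leaving game state out.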

Once I had classified the pitches (I don't use Gameday's IDs), I ranked the best and the worst four ways each: by rv, which is essentially a counting stat (like runs allowed), and by rv100, which is a rate stat (like ERA), at three different minimum pitch counts. For rv100, it's the average pitch value x 100; for the plain rv, it's simply the cumulative total.
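The two stats can be sketched in a few lines. This is only an illustration of the arithmetic described above, assuming each pitch arrives as a (pitcher, pitch type, run value) tuple; the function name summarize and the min_pitches parameter are hypothetical, not part of any tool I'm using.

```python
from collections import defaultdict


def summarize(pitches, min_pitches=1):
    """Aggregate per-pitch run values into rv (total) and rv100 (rate).

    pitches     -- iterable of (pitcher, pitch_type, run_value) tuples
    min_pitches -- drop pitcher/pitch-type pairs thrown fewer times than this
    Returns a list of dicts sorted best-first (most negative rv100 first).
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of values, pitch count]
    for pitcher, pitch_type, value in pitches:
        entry = totals[(pitcher, pitch_type)]
        entry[0] += value
        entry[1] += 1

    rows = []
    for (pitcher, pitch_type), (rv, n) in totals.items():
        if n >= min_pitches:
            rows.append({
                "pitcher": pitcher,
                "pitch": pitch_type,
                "n": n,
                "rv": rv,                # counting stat: cumulative total
                "rv100": 100 * rv / n,   # rate stat: average value x 100
            })
    return sorted(rows, key=lambda row: row["rv100"])
```

Calling summarize(pitches, min_pitches=20)[:5] would give a "best five" list at the 20-pitch cutoff, and reversing the sort gives the worst. The two stats also let you back out pitch counts from the lists below: a total rv of -1.034 with an rv100 of -10.34 implies 10 pitches.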

Data covers the three-game series just completed this afternoon. And, yes, Neal Cotts' fastball leads a counting stat despite not seeing much action. He was that bad in his last bit of work before being demoted.


Best Pitches

rv
Zambrano Cutter -1.232
Marshall Slider -1.214
Duke Curveball -1.081
Duke Fastball -1.043
Ascanio Change-Up -1.034

rv100 (5+ pitches)
Ascanio Change-Up -10.34
Gorzelanny Curveball -10.30
Zambrano Cutter -6.16
Chavez Fastball -6.15
Guzman Fastball -6.02

rv100 (10+ pitches)
Ascanio Change-Up -10.34
Zambrano Cutter -6.16
Duke Curveball -5.69
Zambrano Splitter -5.41
Marshall Slider -3.92

rv100 (20+ pitches)
Zambrano Cutter -6.16
Marshall Slider -3.92
Marshall Curveball -3.48
Duke Fastball -3.36
Duke Change-Up -0.31

Worst Pitches

rv
Cotts Fastball 3.372
Dempster Slider 2.052
Snell Change-Up 1.478
Chavez Change-Up 1.472
Duke Sinker 1.343


rv100 (5+ pitches)
Cotts Fastball 25.94
Dempster Slider 14.66
Ascanio Fastball 13.36
Snell Change-Up 9.85
Meek Fastball 6.98


rv100 (10+ pitches)
Cotts Fastball 25.94
Dempster Slider 14.66
Snell Change-Up 9.85
Maholm Fastball 6.47
Duke Sinker 6.40

rv100 (20+ pitches)
Maholm Fastball 6.47
Duke Sinker 6.40
Dempster Fastball 2.98
Maholm Change-Up 2.95
Dempster Sinker 2.02


2 comments:

cdw said...

It seems that you are normalizing all pitches to one another by excluding base states from your rv calculation. I think this method probably has advantages and disadvantages.

It seems like its advantage is the fact that a FA thrown for a strike on a 0-0 count would be the same for bases empty or bases loaded. Therefore, all pitches are created equal, or based on count have the same potential for change in run expectations.

But at the same time this appears to be a disadvantage as well b/c a FA thrown for a strike with bases loaded is more likely to prevent runs than with bases empty.

Anyway, I would think your method is a better predictor, moving forward, for data sets with small sample sizes.

And now for the question! FG lwts has Carlos's CH/100 being awful this year, -11.25, when it's been good in the past, 2.5. Have you seen the same decrease in effectiveness of this pitch? I'm tempted to believe this decrease is due to small sample size but I think that's just the Big Z fan inside me.

(Sorry to be so long-winded; I'm just trying to think this through and accurately communicate my thoughts.)

Harry Pavlidis said...

I agree, there are pros/cons to this approach. My bigger concern is that it's based heavily on BABIP, so I'm working on a component-based measure (think tRA).

As far as Carlos' CH, I assume you're referring to Big Z. This is an example of the limited utility of the FG data: it is based on Gameday's pitch IDs, and Zambrano rarely throws a change-up. It's a splitter. In general, FB/CH splits can be tricky, and Gameday fails in a lot of situations.

In other words, what good are the linear weights if the pitch classifications are so unreliable?