Sunday, September 16, 2007

Clusters

Early results of k-means cluster analysis aren't too bad. As Dan Fox found, it isn't the best way to do the job, since you have to specify the number of clusters you're looking for up front. It does come pretty close, though. Using four clusters and five variables (Spin Direction, Spin Rate, Start Speed, pfx_x, pfx_z) worked best of the various combinations I've played with so far, with the toy Rich Hill sample as a test.
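For anyone who wants to try this at home, here is a rough sketch of what a k-means run on five pitch variables could look like. It is not the code or the tool behind the chart. The pitch values are invented stand-ins rather than the Rich Hill sample, and the names (`FEATURES`, `FAKE_TYPES`, `make_fake_pitches`, `standardize`, `kmeans`) are hypothetical. Standardizing each variable first is my assumption, made so that spin rate (in the thousands) doesn't swamp the movement values.

```python
import random

# Hypothetical column names for the five variables.
FEATURES = ["spin_dir", "spin_rate", "start_speed", "pfx_x", "pfx_z"]

# Invented pitch-type averages and spreads, for illustration only (not real data).
FAKE_TYPES = [
    (200.0, 2200.0, 90.0, 5.0, 10.0),
    (30.0, 1500.0, 74.0, -4.0, -8.0),
    (120.0, 800.0, 82.0, -1.0, 2.0),
    (220.0, 1700.0, 81.0, 7.0, 6.0),
]
FAKE_SPREAD = (8.0, 120.0, 1.2, 1.5, 1.5)


def make_fake_pitches(per_type=30, seed=1):
    """Draw noisy fake pitches around each invented pitch-type average."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sd) for mu, sd in zip(center, FAKE_SPREAD)]
            for center in FAKE_TYPES for _ in range(per_type)]


def standardize(rows):
    """Rescale each column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest center, then recompute centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k random pitches as starting centers
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's center where it was
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centers


pitches = make_fake_pitches()
labels, centers = kmeans(standardize(pitches), k=4)
sizes = [labels.count(j) for j in range(4)]
print("cluster sizes:", sizes)
```

Changing `k` in the last call shows the limitation mentioned above: the algorithm returns exactly as many groups as you ask for, whether or not the pitcher actually throws that many pitch types. The result also depends on the random starting centers, so different seeds can give different groupings.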



Pretty close. Much more work to do.

Here's the same chart, but grouped by velocity, from an earlier post.



----------------
NLC Standings
Cubs -- (13)
Brewers 1.0
Cardinals 7.0

