Accuracy

Buckeyes and Sabermetrics

When I started this page, I didn't intend to include anything about the accuracy of the various methods other than mentioning it while discussing them. An RMSE test done on a large sample of normal major league teams really does not prove much. There are other concerns which are more important IMO, such as whether or not the method works at the extremes, whether or not it is equally applicable to players as to teams, etc. However, I am publishing this data in response to the continuing assertion I have seen from numerous people that BsR is more accurate at the extremes but less accurate with normal teams than other methods. I don't know where this idea got started, but it is apparently prevalent among uninformed people, so I wanted to present a resource where people could go and see the data disproving this for themselves.

I used the Lahman database for all teams 1961-2002, except 1981 and 1994 for obvious reasons. I tested 10 different RC methods, with the restriction that they use only AB, H, D, T, HR, W, SB, and CS, or stats that can be derived from those. This was for three reasons: one, I personally am not particularly interested in including SH, SF, DP, etc. in RC methods if I am not going to use them on a team; two, I am lazy and that data is not available and I didn't feel like compiling it; three, some of the methods don't have published versions that include all of the categories. As it is, each method is on a level playing field, as all of them include all of the categories allowed in this test. Here are the formulas I tested:

RC: Bill James, (H+W-CS)*(TB+.55SB)/(AB+W)

BR: Pete Palmer, .47S+.78D+1.09T+1.4HR+.33W+.3SB-.6CS-.090(AB-H)

.090 was the proper absolute out value for the teams tested

ERP: originally Paul Johnson, version used in "Linear Weights" article on this site

XR: Jim Furtado, .5S+.72D+1.04T+1.44HR+.34W+.18SB-.32CS-.096(AB-H)

EQR: Clay Davenport, as explained in "Equivalent Runs" article on this site

EQRme: my modification of EQR, using 1.9 and -.9, explained in same article

For both EQR versions, the LgRAW for the sample was .732 and the LgR/PA was .117; these were held constant

BsR: David Smyth, version published in "Base Runs" article on this site

UW: Phil Birnbaum, .46S+.8D+1.02T+1.4HR+.33W+.3SB-.5CS-(.687BA-1.188BA^2+.152ISO^2-1.288(WAB)(BA)-.049(BA)(ISO)+.271(BA)(ISO)(WAB)+.459WAB-.552WAB^2-.018)*(AB-H)

where WAB = W/AB

AR: based on Mike Gimbel concept, explained in "Appraised Runs" article on this site

Reg: multiple regression equation for the teams in the sample, .509S+.674D+1.167T+1.487HR+.335W+.211SB-.262CS-.0993(AB-H)
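For anyone who would rather check the arithmetic in code than in a spreadsheet, here is a minimal sketch of the four formulas above that are written out in full (RC, BR, XR, and Reg). The function names and the sample team line are my own inventions for illustration, not real data; the only things assumed beyond the formulas are the standard definitions S = H-D-T-HR and TB = S+2D+3T+4HR.

```python
def singles(h, d, t, hr):
    """Singles are the hits that are not doubles, triples, or home runs."""
    return h - d - t - hr


def total_bases(h, d, t, hr):
    """Every hit is worth one base; D, T, and HR add 1, 2, and 3 more."""
    return h + d + 2 * t + 3 * hr


def rc(ab, h, d, t, hr, w, sb, cs):
    """Bill James RC as listed above: (H+W-CS)*(TB+.55SB)/(AB+W)."""
    tb = total_bases(h, d, t, hr)
    return (h + w - cs) * (tb + 0.55 * sb) / (ab + w)


def br(ab, h, d, t, hr, w, sb, cs, out_value=0.090):
    """Pete Palmer BR; out_value defaults to the .090 used for this sample."""
    s = singles(h, d, t, hr)
    return (0.47 * s + 0.78 * d + 1.09 * t + 1.4 * hr + 0.33 * w
            + 0.3 * sb - 0.6 * cs - out_value * (ab - h))


def xr(ab, h, d, t, hr, w, sb, cs):
    """Jim Furtado XR, restricted to the categories allowed in this test."""
    s = singles(h, d, t, hr)
    return (0.5 * s + 0.72 * d + 1.04 * t + 1.44 * hr + 0.34 * w
            + 0.18 * sb - 0.32 * cs - 0.096 * (ab - h))


def reg(ab, h, d, t, hr, w, sb, cs):
    """The multiple regression equation fitted to the teams in the sample."""
    s = singles(h, d, t, hr)
    return (0.509 * s + 0.674 * d + 1.167 * t + 1.487 * hr + 0.335 * w
            + 0.211 * sb - 0.262 * cs - 0.0993 * (ab - h))


# A made-up team line, purely for illustration (not from the Lahman database).
team = dict(ab=5500, h=1450, d=280, t=30, hr=150, w=550, sb=100, cs=50)
for name, estimator in [("RC", rc), ("BR", br), ("XR", xr), ("Reg", reg)]:
    print(name, round(estimator(**team), 1))
```

The other methods (ERP, EQR, BsR, UW, AR) would slot in as additional functions with the same signature once you copy their formulas from the articles named above.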

Earlier I said that all methods were on a level playing field. This is not exactly true. EQR and BR both take into account the actual runs scored data for the sample, but only to establish constants. BsR's B component should have this advantage too, but I chose not to give it one so that the scales would not be tipped in favor of BsR, since the whole point is to demonstrate BsR's accuracy. Also remember that the BsR equation I used is probably not the most accurate that you could design; it is one that I have used for a couple of years now and am familiar with. Obviously the Regression equation has a gigantic advantage.
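To make "only to establish constants" concrete, here is one way a constant like BR's out value could be set from the sample. This is a sketch under my own assumption that the "proper" out value is the one that makes total estimated runs equal total actual runs over all teams; the function name and the toy numbers are hypothetical, and this is not necessarily the exact procedure behind the .090 figure.

```python
def fit_out_value(positive_runs, outs, actual_runs):
    """Return the per-out deduction that balances the sample.

    positive_runs: for each team, the sum of the positive linear terms
                   (.47S + .78D + ... - .6CS), i.e. everything except outs
    outs:          for each team, AB - H
    actual_runs:   for each team, runs actually scored

    Assumption: the constant is chosen so that the sum of estimates
    equals the sum of actual runs across the whole sample.
    """
    total_outs = sum(outs)
    if total_outs == 0:
        raise ValueError("sample has no outs")
    return (sum(positive_runs) - sum(actual_runs)) / total_outs


# Toy two-team sample, invented for illustration only.
positive = [1100.0, 1000.0]
team_outs = [4000, 4100]
runs = [740, 630]
out_value = fit_out_value(positive, team_outs, runs)
print(round(out_value, 4))
```

A constant set this way uses the actual runs only to center the estimates; it does not tune the individual event weights the way the regression does.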

Anyway, what are the RMSEs for each method?

Reg-------22.56

XR--------22.77

BsR-------22.93

AR--------23.08

EQRme--23.12

ERP-------23.15

BR--------23.29

UW-------23.34

EQR------23.74

RC--------25.44
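For reference, the RMSE figures above are the kind of number you get from a calculation like the one below. This is a minimal sketch that assumes the plain definition (square root of the mean squared error over all team seasons, dividing by the number of teams); the function name and the two-team example are made up for illustration.

```python
import math


def rmse(estimated, actual):
    """Root mean square error between estimated and actual team runs."""
    if len(estimated) != len(actual):
        raise ValueError("need one estimate per team")
    if not actual:
        raise ValueError("need at least one team")
    squared_errors = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Two invented team seasons: estimates miss by 10 and 20 runs.
print(rmse([700, 800], [710, 780]))
```

Because the errors are squared before averaging, one badly missed team counts for more than several small misses, which is worth keeping in mind when comparing methods that differ by a fraction of a run.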

Again, you should not use these figures as the absolute truth, because there are many other important factors to consider when choosing a run estimator. But the important things to recognize IMO are:

all of the legitimate published formulas have very similar accuracy with real major league teams' seasonal data
if accuracy on team seasonal data is your only concern, throw everything away and run a regression (the reluctance of people who claim to be totally concerned about seasonal accuracy to do this IMO shows that they aren't really as stuck on seasonal team accuracy as they claim to be)
RC is way behind the other methods, although I think if it included W in the B factor as the Tech versions do, it would be right in the middle of the pack
BsR is just as accurate with actual team seasonal data as the other run estimators

Anyway, the spreadsheet is available below, and you can plug in other methods and see how they do. But here is the evidence; let the myths die.

RC Accuracy Spreadsheet

Here are some other accuracy studies that you may want to look at. One is by John Jarvis. My only quibble with it is that he runs a regression to runs on each RC estimator, but it is a very interesting article that also applies the methods to defense, and is definitely worth reading:

http://knology.net/%7Ejohnfjarvis/runs_survey.html

And this is Jim Furtado's article as published in the 1999 BBBA. He uses both RMSE and regression techniques to evaluate the estimators. Just ignore his look at rate stats--it is fatally flawed by assuming there is a 1:1 relationship between rate stats and run scoring rate. That is pretty much true for OBAxSLG only and that is why it comes in so well in his survey:

http://www.baseballstuff.com/btf/scholars/furtado/articles/accuracy.htm