Linear Weights
I certainly am no expert on Linear Weight formulas and their construction; leave that to people like Tango Tiger and Mickey Lichtman. However, I do have some knowledge of LW methods and thought I would explain some of the different methods of generating LW that are in use.
One thing to note before we start is that every RC method implies a set of LW. If you use the +1 technique, you can see the LWs that are embedded in a method like RC, BsR, or RPA. A good way to test non-linear RC formulas is to see how they stack up against LW methods in the context the LW were developed for. LW will vary widely based on the context. In normal ML contexts, though, the absolute out value is close to -.1, and the HR value stays close to 1.4. David Smyth provided the theory (or fact, I guess you could say) that as OBA moves towards 1, the LW values of all events converge towards 1.
Now, here is what I understand about how LW are generated:
Empirical LW
Empirical LW have been published by Pete Palmer and Mickey Lichtman. They can be considered the true Linear Weight values. Empirical LW are based on finding the value of each individual event from the base/out table, and then averaging those values for all singles, etc.; that average is the LW for the single. Another way to look at it is that they calculate the value of an event in each of the 24 base/out situations, multiply that by the proportion of that event that occurs in that situation, and then sum those 24 values. Palmer's weights were actually based on simulation, but as long as the simulation was well-designed, that shouldn't be an issue.
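To make the idea concrete, here is a minimal sketch of the empirical calculation. The run value of one occurrence of an event is the change in run expectancy on the play plus any runs that scored, and the LW is the average over all occurrences. The run expectancy numbers below are invented for illustration; a real calculation would pull them from play-by-play data.

from statistics import mean

# Each tuple is one observed single: (RE before the play, RE after the play,
# runs scored on the play). The numbers are invented; real ones come from
# play-by-play data and the base/out run expectancy table.
observed_singles = [
    (0.48, 0.86, 0),   # e.g. bases empty -> runner on first
    (0.25, 0.52, 0),
    (0.52, 0.28, 1),   # a runner scores, but the remaining RE drops
]

def empirical_weight(occurrences):
    # Average change in run expectancy plus runs scored, per occurrence.
    return mean(after - before + runs for before, after, runs in occurrences)

print(round(empirical_weight(observed_singles), 3))   # about .47 with these made-up values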
One way you could empirically derive different LW is to assume that the events occur randomly, i.e. to assume that the proportion of each event that occurs in a given base/out situation is the same as the proportion of overall PA that come in that situation. For instance, if 2% of PA come with the bases loaded and 1 out, then you assume that 2% of doubles occur with the bases loaded and 1 out as well. This is an interesting idea for a method. If you see a double hit in a random situation, you could make the argument that this method would give you the best-guess weight for this event. But that is only if you assume that the base/out situation does not affect the probability of a given event. Does it work out that way?
Tango Tiger told me that the only event that comes up with a significantly different LW value by the method I have just described is the walk. This is another way of saying that walks tend to occur in lower-leverage situations than most events. But the difference is not that large.
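Here is a small sketch of the difference between the two weighting schemes, again with invented numbers: the standard empirical weight uses the distribution of where the event actually occurs, while the alternative weights each situation by the share of all PA that come there.

# value[s] is a hypothetical average run value of a double in state s;
# pa_freq[s] is the share of all PA in that state, event_freq[s] the share of
# doubles hit there. Only three states are shown to keep the sketch short.
states = ["empty_0out", "loaded_1out", "first_2out"]
value      = {"empty_0out": 0.65, "loaded_1out": 1.60, "first_2out": 0.55}
pa_freq    = {"empty_0out": 0.60, "loaded_1out": 0.02, "first_2out": 0.38}
event_freq = {"empty_0out": 0.58, "loaded_1out": 0.03, "first_2out": 0.39}

standard = sum(value[s] * event_freq[s] for s in states)  # weight by where doubles occur
generic  = sum(value[s] * pa_freq[s] for s in states)     # assume doubles spread like all PA
print(round(standard, 3), round(generic, 3))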
Modeling
You can also use mathematical modeling to come up with LW. Tango Tiger and David Smyth have both published methods on FanHome.com that approach the problem from this direction. Both are approximations, based on some assumptions, and their accuracy will vary slightly in different contexts. Tango, though, has apparently developed a newer method that generates an accurate base/out table and LW from mathematical modeling, and does it quite well.
The original methods published by the two are very user-friendly and can be done quickly. Smyth also published a Quick and
Dirty LW method that works well in normal scoring contexts and only uses the number of runs/game to estimate the value of
events.
Skeletons
Another way to do this is to develop a skeleton that shows the relationships between the events, and then find a multiplier to equate this to the actual runs scored. The advantage of this method is that you can focus on the long-term relationships between walks v. singles, doubles v. triples, etc., and then find a custom multiplier each season by dividing runs by the result of the skeleton for the entity (league, team, etc.) you are interested in. Recently, I decided to take a skeleton approach to a LW method. Working with data for all teams, 1951-1998, I found that this skeleton worked well: TB+.5H+W-.3(AB-H), with a required multiplier of .324. Working SB and CS into the formula, I had: TB+.5H+W-.3(AB-H)+.7SB-CS, with a required multiplier of .322. When I took a step back and looked at what I had done, though, I realized I had reproduced Paul Johnson's Estimated Runs Produced method. If you look at Johnson's method:
(2*(TB+W)+H-.605*(AB-H))*.16
If you multiply my method by 2, you get:
(2*(TB+W)+H-.6*(AB-H))*.162
As you can see, ERP is pretty much equal to my unnamed formula. Since it is so similar to ERP, I will just consider it to be ERP. You can then find the resulting LW by expanding the formula; for example, a double adds 2 total bases and 1 hit, so in the doubled version it has a value of (2*2+1)*.162=.81.
Working out the full expansion of my ERP equations, we have:
ERP = .49S+.81D+1.13T+1.46HR+.32W-.097(AB-H)
ERP = .48S+.81D+1.13T+1.45HR+.32W+.23SB-.32CS-.097(AB-H)
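As a sketch of the mechanics, here is how you might fit the multiplier for the basic skeleton and expand it into per-event weights. The team totals below are invented; the real fit used all teams from 1951-1998.

# Fit the skeleton multiplier from team totals, then expand into weights.
teams = [
    # (AB, H, D, T, HR, W, R) -- invented team-season totals
    (5500, 1450, 280, 30, 160, 520, 750),
    (5450, 1380, 260, 25, 140, 480, 690),
    (5600, 1500, 300, 35, 180, 560, 800),
]

def skeleton(ab, h, d, t, hr, w):
    tb = h + d + 2 * t + 3 * hr          # total bases = H + D + 2*T + 3*HR
    return tb + 0.5 * h + w - 0.3 * (ab - h)

total_skel = sum(skeleton(ab, h, d, t, hr, w) for ab, h, d, t, hr, w, r in teams)
total_runs = sum(r for *_, r in teams)
m = total_runs / total_skel              # custom multiplier for this (made-up) dataset

# Expanding the skeleton gives the implied linear weights:
weights = {
    "1B": (1 + 0.5) * m,                 # 1 TB + .5 H
    "2B": (2 + 0.5) * m,
    "3B": (3 + 0.5) * m,
    "HR": (4 + 0.5) * m,
    "W": 1.0 * m,
    "Out": -0.3 * m,
}
print(round(m, 3), {k: round(v, 3) for k, v in weights.items()})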
I have recently thrown together a couple of versions that encompass all of the official offensive stats:
ERP = (TB+.5H+W+HB-.5IW+.3SH+.7(SF+SB)-CS-.7DP-.3(AB-H))*.322
ERP = (TB+.5H+W+HB-.5IW+.3SH+.7(SF+SB)-CS-.7DP-.292(AB-H)-.031K)*.322
Or:
ERP = .483S+.805D+1.127T+1.449HR+.322(W+HB)-.161IW+.225(SB+SF-DP)+.097SH-.322CS-.097(AB-H)
ERP = .483S+.805D+1.127T+1.449HR+.322(W+HB)-.161IW+.225(SB+SF-DP)+.097SH-.322CS-.094(AB-H-K)-.104K
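For example, here is the first full-stat version above applied to one invented batting line, just a quick sketch to show the bookkeeping:

# Apply the full-stat ERP version from above to a hypothetical season line.
def erp(ab, h, tb, w, hb, iw, sh, sf, sb, cs, dp):
    return (tb + 0.5*h + w + hb - 0.5*iw + 0.3*sh + 0.7*(sf + sb)
            - cs - 0.7*dp - 0.3*(ab - h)) * 0.322

# AB, H, TB, W, HB, IW, SH, SF, SB, CS, DP -- made-up totals
print(round(erp(600, 165, 280, 70, 5, 8, 2, 4, 15, 6, 12), 1))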
Here are a couple of versions you can use for past eras of baseball. For the lively ball era, the basic skeleton of (TB+.5H+W-.3(AB-H)) works fine; just use a multiplier of .33 for the 1940s and .34 for the 1920s and 30s. For the dead ball era, you can use a skeleton of (TB+.5(H+SB)+W-.3(AB-H)) with a multiplier of .341 for the 1910s and .371 for 1901-1909. Past that, you're on your own. While breaking it down by decade is not exactly optimal, it is an easy way to group them. The formulas are reasonably accurate in the dead ball era, but not nearly as accurate as they are in the lively ball era.
Regression
Using the statistical method of multiple regression, you can find the most accurate linear weights possible for your dataset and inputs. However, when you base a method on regression, you often lose the theoretical accuracy of the method, since there are relationships and correlations between various stats, like homers and strikeouts. For instance, since teams that hit lots of homers usually strike out more than the average team, strikeouts may be evaluated as less negative than other outs by the formula, when they should actually have a slightly larger negative impact. Also, since there is no statistic available to measure baserunning skills outside of SB, CS, and triples (for instance, we don't know how many times a team takes two bases on a single), these statistics can have inflated values in a regression equation because of their relationship with speed. Another concern that some people have with regression equations is that they are based on teams, and so should not be applied to individuals.
Anyway, if done properly, a regression equation can be a useful method for evaluating runs created. In their fine book, Curve Ball, Jim Albert and Jay Bennett published a regression equation for runs. They based it on runs/game, but I went ahead and calculated the long-term absolute out value. With this modification, their formula is:
R = .52S+.66D+1.17T+1.49HR+.35W+.19SB-.11CS-.094(AB-H)
A discussion last summer on FanHome was very useful in providing some additional ideas about regression approaches (thanks to Alan Jordan especially). You can get very different coefficients for each event based on how you group them. For instance, I did a regression on all teams 1980-2003 using S, D, T, HR, W, SB, CS, and AB-H, and another regression using H, TB, W, SB, CS, and AB-H. Here are the results:
R = .52S+.74D+.95T+1.48HR+.33W+.24SB-.26CS-.104(AB-H)
The value for the triple is significantly lower than we would expect. But with the other grouping, we get:
R = .18H+.31TB+.34W+.22SB-.25CS-.103(AB-H)
which is equivalent to:
R = .49S+.80D+1.11T+1.42HR+.34W+.22SB-.25CS-.103(AB-H)
which are values more in line with what we would expect. So the way you group events can make a large difference in the resulting formulas. The same effect can be seen with things like taking HB and W together or separately. Or, if there is a set relationship you want to build in (like CS being twice as bad as SB are good), you can use a category like SB-2CS and regress against that.
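Here is a rough sketch of the two set-ups, using ordinary least squares with no intercept (matching the form of the formulas above) on synthetic team-season totals; the real regressions used actual team data for 1980-2003, which I am not reproducing here.

import numpy as np

rng = np.random.default_rng(0)
n = 200

# Invent team-season event totals in roughly realistic ranges.
S  = rng.integers(900, 1100, n)
D  = rng.integers(220, 330, n)
T  = rng.integers(15, 60, n)
HR = rng.integers(100, 220, n)
W  = rng.integers(400, 650, n)
SB = rng.integers(40, 200, n)
CS = rng.integers(20, 90, n)
O  = rng.integers(3900, 4200, n)   # AB - H

# Generate runs from assumed "true" weights plus noise, just to have a target.
R = (.5*S + .78*D + 1.1*T + 1.4*HR + .33*W + .2*SB - .3*CS - .1*O
     + rng.normal(0, 20, n))

def fit(cols):
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)
    return coef

# Grouping 1: each hit type separately.
print(fit([S, D, T, HR, W, SB, CS, O]).round(3))

# Grouping 2: hits and total bases, then translate back to per-event weights.
H, TB = S + D + T + HR, S + 2*D + 3*T + 4*HR
h, tb, w, sb, cs, o = fit([H, TB, W, SB, CS, O])
print({"1B": round(h + tb, 3), "2B": round(h + 2*tb, 3),
       "3B": round(h + 3*tb, 3), "HR": round(h + 4*tb, 3)})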
An example I posted on FanHome drives home the potential pitfalls in regression. I ran a few regression equations for individual
8-team leagues and found this one from the 1961 NL:
R = 0.669 S + 0.661 D - 1.28 T + 1.05 HR + 0.352 W - 0.0944 (AB-H)
Obviously an 8-team league is too small a sample for a self-respecting statistician to use, but it serves the purpose here. A double
is worth about the same as a single, and a triple is worth NEGATIVE runs. Why is this? Because the regression process does
not know anything about baseball. It just looks at various correlations. In the 1961 NL, triples were correlated with
runs at r=-.567. The Pirates led the league in triples but were 6th in runs. The Cubs were 2nd in T but 7th in runs. The
Cards tied for 2nd in T but were 5th in runs. The Phillies were 4th in triples but last in runs. The Giants were last in the
league in triples but led the league in runs. If you, too, knew nothing about baseball, you could easily conclude that triples were a detriment to scoring runs.
While it is possible that people who hit triples were rarely driven in that year, it's fairly certain that an empirical LW analysis of the PBP data would show a triple is worth somewhere around 1-1.15 runs, as it always is. Even if such an effect did exist, there is likely far too much noise in the regression to use it to find such effects.
Trial and Error
This is not so much its own method as a combination of all of the others. Jim Furtado, in developing Extrapolated Runs, used Paul Johnson's ERP, regression, and some trial and error to find a method with the best accuracy. However, some of the weights look silly, like the fact that a double is only worth .22 more runs than a single. ERP gives .32, and Palmer's Batting Runs gives .31. So, in trying to find the highest accuracy, it seems as if the trial and error approach compromises theoretical accuracy, much as regression does.
Skeleton approaches, of course, use trial and error in many cases in developing the skeletons. The ERP formulas I publish
here certainly used a healthy dose of trial and error.
The +1 Method/Partial Derivatives
Using a non-linear RC formula, you add one of each event in turn and see what the difference in estimated runs is. This will only give you accurate weights if you have a good method like BsR; if you use a flawed method like RC, take the custom LWs with a grain of salt or three.
Using calculus, and taking the partial derivative of runs with respect to a given event, you can determine the precise LW
values of each event. See my BsR article for some examples of this technique.
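As an illustration, here is a sketch of both techniques applied to the basic version of RC, (H+W)*TB/(AB+W), for one invented team line. A real application would use a better method like BsR, as noted above; this is just to show the mechanics.

def rc(ab, h, tb, w):
    # Basic version of Runs Created: (H + W) * TB / (AB + W)
    return (h + w) * tb / (ab + w)

team = dict(ab=5500, h=1450, tb=2300, w=520)   # invented team totals
base = rc(**team)

# +1 method: add one of an event and look at the change in estimated runs.
# A HR adds 1 AB, 1 H, and 4 TB; a walk adds 1 W.
hr_plus1 = rc(team["ab"] + 1, team["h"] + 1, team["tb"] + 4, team["w"]) - base
w_plus1 = rc(team["ab"], team["h"], team["tb"], team["w"] + 1) - base

# Partial-derivative version of the HR weight:
# dRC/dHR = dRC/dAB + dRC/dH + 4 * dRC/dTB
ab, h, tb, w = team["ab"], team["h"], team["tb"], team["w"]
d_ab = -(h + w) * tb / (ab + w) ** 2
d_h = tb / (ab + w)
d_tb = (h + w) / (ab + w)
hr_deriv = d_ab + d_h + 4 * d_tb

print(round(hr_plus1, 3), round(w_plus1, 3), round(hr_deriv, 3))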
Calculating the Out Value
You can calculate a custom out value for whatever entity you are looking at. There are three possible baselines: absolute
runs, runs above average, and runs above replacement. The first step in finding the out value for any of these is to find the sum of all the terms in the formula other than the out term. Call this value X. The out term, O, is AB-H here; it could also include other out events (like CS) whose value you want to vary along with it, but in my ERP formula the O component is just AB-H. Then, with actual runs being R, the necessary formulas are:
Absolute out value = (R-X)/O
Average out value = -X/O
For the replacement out value, there is another consideration. First you have to choose how you define replacement level, and calculate the number of runs your entity would score with the same number of outs but replacement-level production. I set replacement level at 1 run per game below the entity's average, so I find the runs/out for a team 1 run/game below average and multiply this by the entity's outs. This is Replacement Runs, or RR. Then you have:
Replacement out value = (R-RR-X)/O
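To tie it together, here is a short sketch of the three out values for one hypothetical team line, using the basic ERP skeleton for the non-out terms; all of the totals are invented.

AB, H, TB, W = 5500, 1450, 2300, 520   # invented team totals
R = 750                                # actual runs scored
G = 162                                # games
O = AB - H                             # outs in the O component

X = (TB + 0.5 * H + W) * 0.322         # sum of the non-out terms of the basic ERP

absolute_out = (R - X) / O
average_out = -X / O

# Replacement level 1 run/game below this team's average, per the text:
# runs/out for a team 1 R/G below average, times this team's outs.
RR = (R / G - 1) / (O / G) * O         # simplifies to R - G for the same outs
replacement_out = (R - RR - X) / O

print(round(absolute_out, 3), round(average_out, 3), round(replacement_out, 3))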