While there are not quite as many winning percentage estimators as there are
run estimators, there is no shortage of them. Maybe one of the reasons is that
the primary method for determining W%, Bill James' Pythagorean method, is fairly good and does not have the obvious
flaws of Runs Created--although it does have some of its own. The inadequacy of Runs Created
has always fueled innovation in the run estimation field.

BenV-L from the FanHome board has provided a classification system for win
estimators, which is a little complex but does indeed make sense. He is a genuine
math/stats guy, so I won't tread on his territory. I will propose a different
classification system that approaches it from a slightly different angle.

First, we have the general area of methods that do not vary based on the run
context. Under this umbrella, we have linear and non-linear methods. We will look at static linear methods first.

The static linear methods all are based in some way on runs minus runs allowed. Most of them take this general form:

W% = RD:G*S + .5

Where RD:G is Run Differential Per Game and S is slope. This is in the form of a basic linear regression, mx + b. Another
way to write this, also very common, is:

W% = RD:G/RPW + .5

Where RPW is Runs Per Win, which is just the reciprocal of the slope. You could call the slope Wins Per Run instead, but I prefer sticking with the regression
lingo.
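As a quick sketch of the static linear form (the .107 slope is the 1970s regression value mentioned here; any slope near .1 behaves similarly):

```python
def linear_wpct(r, ra, g, slope=0.107):
    """Static linear estimate: W% = RD:G * slope + .5 (RPW = 1/slope)."""
    rd_per_game = (r - ra) / g
    return rd_per_game * slope + 0.5

# A team outscoring its opponents by one run per game:
print(round(linear_wpct(810, 648, 162), 3))  # 0.607
```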

It turns out that for average major league play, the slope is about .1, or an
RPW of 10. For instance, I often use a value of .107, which is based on a regression
on 1970s data (more out of habit than anything). However, using regression you
can generate a formula that does not weight R and RA equally. One of these methods
was published by Arnold Soolman based on 1901-1970 data, apparently using multiple
regression: W% = (.102*R-.103*RA)/G + .505. While it is not inevitable that R and
RA be given equal weight, so that a team that scores as many runs as it allows is
predicted at .500, equal weighting seems like the natural choice to me.

Looking at Soolman's formula, a team that scores and allows 4 runs per game
is predicted to play .501 baseball. This doesn't seem like a big deal, but let's
consider the case of a league that has an average of 4 runs per game. The league
would be predicted to play .501 baseball, which is obviously impossible. They
would have to play .500 baseball. That is my logic for R=RA=.500 W%, and whether
it is good enough is up to you.
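The point above is easy to verify, applying Soolman's coefficients on a per-game basis:

```python
def soolman_wpct(r_per_g, ra_per_g):
    # Arnold Soolman (1901-1970 data): W% = .102*(R/G) - .103*(RA/G) + .505
    return 0.102 * r_per_g - 0.103 * ra_per_g + 0.505

# A team scoring and allowing 4 runs per game is predicted above .500:
print(round(soolman_wpct(4.0, 4.0), 3))  # 0.501
```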

We also have non-linear methods that use constants. Earnshaw Cook was the first to actually publish a W% estimator, and it falls in this category:

W% = R*.484/RA

A team with equal R and RA comes out at .484; substitute .5 for the constant
and such a team is properly predicted at .500.

Another example is the work of Bill Kross:

if R<RA, W% = R/(2*RA)

if R>RA, W% = 1 - RA/(2*R)

Another method, one that Bill James speculated would work but never actually
used, is "double the edge". It goes as follows:

W% = (R/RA*2-1)/(R/RA*2)
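A sketch of the three constant-based estimators together; note the Kross formula is coded with the intended parenthesization, RA/(2*R):

```python
def cook_wpct(r, ra):
    # Earnshaw Cook; using .5 instead of .484 centers an even team at .500
    return 0.484 * r / ra

def kross_wpct(r, ra):
    # Bill Kross: piecewise, bounded between 0 and 1
    if r < ra:
        return r / (2 * ra)
    return 1 - ra / (2 * r)

def double_the_edge_wpct(r, ra):
    # "Double the edge": (2*RR - 1)/(2*RR), where RR = R/RA
    rr = r / ra
    return (2 * rr - 1) / (2 * rr)

# A team with a 2:1 run ratio:
print(round(cook_wpct(6, 3), 3), kross_wpct(6, 3), double_the_edge_wpct(6, 3))
# 0.968 0.75 0.75
```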

The problem with many of these methods is that they obviously break down at
the extremes. Using a slope of .1 with the linear method gives a W% of 1 at
an RD:G of 5. But a team that scores 5.1 runs per game more than its opponents
will not play 1.01 baseball. Cook's formula produces a W% over 1 for any run ratio
over 2.07; it does not allow a sub-zero W%, but it isn't accurate at all. The
Kross formula simply does not provide a very accurate estimate, at least in comparison to other methods, although it does
bound W% between 0 and 1. Double the Edge does not allow a W% above 1, but if
the team's run ratio is under .5, it produces a sub-zero winning percentage.

So every method is either inaccurate or produces impossible answers. While all of these formulas will work decently for normal teams in normal scoring contexts, we need methods
that work outside of the normal range. There are real .700 teams, and there are
teams that play in contexts where the two teams average 13 runs a game. And
if we want to apply these methods to individuals at all, we definitely need a more versatile method.

Enter the Pythagorean Theorem. Bill
James' formula, W% = R^2/(R^2 + RA^2), has a high degree of accuracy and fits the constraints of 0 and 1. These attributes and its relative simplicity have made it the standard for many years. James would later conclude that 1.83 was a better exponent.
The formula by which he came to this conclusion was exponent = 2 - 1/(RPG-3). At
the normal RPG of 9, this does produce an exponent of 1.83, but the formula gives 2.33 at 0 RPG, is undefined at 3 RPG, and approaches a ceiling of 2 as RPG increases, which as we shall see later makes it a woefully inadequate and illogical formula.

An off-the-wall sort of formula developed by the author is based on an article
in the old STATS Pro Football Revealed book, which estimated the Z-score of winning percentage for a team and then converted
it back into a W%. I applied this idea to baseball; the result is automatically bounded by 0 and 1. I estimated the
Z-score as 2.324*(R-RA)/(R+RA), and then you can use the normal cumulative distribution function to convert it back into a W%.
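A sketch of the Z-score method using the standard library's normal distribution; the 2.324 coefficient is the one given above:

```python
from statistics import NormalDist

def zscore_wpct(r, ra):
    # Estimate the Z-score of W%, then map it back through the normal CDF.
    # The CDF guarantees the result stays between 0 and 1.
    z = 2.324 * (r - ra) / (r + ra)
    return NormalDist().cdf(z)

print(round(zscore_wpct(800, 700), 3))  # a bit over .56
```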

Now we move to methods that vary based on the scoring context, which is normally expressed in terms of Runs Per Game, (R+RA)/G. First, I should point out that it might be possible to modify the Z-score W% and the Double the Edge
method to account for changing RPG, but no one has done so, and since those methods aren't optimal, it would probably
be a waste of time.

The linear methods that do this simply use a formula based on RPG to estimate
RPW or slope before estimating W%. These linear formulas are still subject to
the same caveats as the static linear methods--they are not bounded by 0 and 1. But
they do add more flexibility, especially within the normal scoring ranges. There
are a number of these methods, all of which produce very similar results, as BenV-L found.
The most famous is one developed by Pete Palmer: RPW = 10*sqrt(RPG/9).
Some others include David Smyth's W% = (R-RA)/(R+RA) + .5, which simply assumes that RPW = RPG. BenV-L published the same formula except with (R-RA)/(R+RA) multiplied by .91, making RPW = 1.099*RPG. Another example is Tango Tiger's simple RPW = RPG/2 + 5. Again, accuracy is improved more by using any reasonable RPG-based slope than by finding the optimum
among these choices.
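A sketch comparing these RPG-based RPW formulas; at the historically normal 9 RPG they all land between 9 and 10 runs per win:

```python
import math

def palmer_rpw(rpg):
    return 10 * math.sqrt(rpg / 9)   # Pete Palmer

def smyth_rpw(rpg):
    return rpg                       # David Smyth: RPW = RPG

def tango_rpw(rpg):
    return rpg / 2 + 5               # Tango Tiger

def dynamic_linear_wpct(r, ra, g, rpw=palmer_rpw):
    rpg = (r + ra) / g
    return (r - ra) / g / rpw(rpg) + 0.5

for f in (palmer_rpw, smyth_rpw, tango_rpw):
    print(f.__name__, f(9))  # 10.0, 9, 9.5
```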

Of course, as we said the problems inherent in linear methods are not resolved
just by using a flexible slope. The Pythagorean model provides the bounds at
0 and 1, and is what we want to build upon. This will take the form of R^X/(R^X
+ RA^X).

There have been several published attempts to base X on RPG. One very simple one is X = RPG/4.6, from David Sadowski. The most
famous is Clay Davenport's "Pythagenport", X = 1.5*log(RPG) + .45. Davenport used
some extreme data and modeling to find his exponent, which he claims is accurate for RPG ranging between 4 and 40.
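A sketch of the two RPG-based exponents (the log in Davenport's formula is base 10):

```python
import math

def sadowski_exponent(rpg):
    # David Sadowski: X = RPG/4.6
    return rpg / 4.6

def pythagenport_exponent(rpg):
    # Clay Davenport's "Pythagenport": X = 1.5*log10(RPG) + .45
    return 1.5 * math.log10(rpg) + 0.45

def pyth_wpct(r, ra, x):
    return r**x / (r**x + ra**x)

# At a normal 9 RPG the two exponents are in the same neighborhood:
print(round(sadowski_exponent(9), 2), round(pythagenport_exponent(9), 2))
# 1.96 1.88
```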

What about RPG under 4, though? Enter
David Smyth. The inventor of Base Runs, the "natural" RC function, came up with
a brilliant discovery, revelation, or what have you that allows for the finding of a better exponent. Although it is a remarkably obvious conclusion once you have been exposed to it, no one other than Mr.
Smyth was able to think it up.

The concept is very simple. The
minimum RPG possible in a game is 1, because if neither team scores, the game keeps going.
And if a team played 162 games at 1 RPG, they would win each game in which they scored a run and lose each game in which they allowed
a run. Therefore, to make W/(W+L) = R^X/(R^X + RA^X), X must be set equal to
1. This is a known point in the domain of the exponent: (1,1). Sadowski's formula gives an exponent of .22 at 1 RPG, causing a team that should go 100-62 (.617) to
be predicted at .526. Davenport's gives
.45, which projects a .554 W% for the team--closer, but still incorrect, and our formula has to work at the only
point that we know to be true.

So the search was on for an exponent formula that would 1) produce 1 at 1 RPG, 2) maintain
accuracy for real major league teams, and 3) be accurate at high RPG. If criteria
1 and 2 were met but 3 was not, then the Davenport method would be preferable
at some times and the new method at others. We want a method
that can give us a reasonable estimate all of the time.

It turns out that this author, while fooling around with various regression
models fed by the known point and Davenport's exponent at other points, found
that RPG^.29 matched Davenport's method in the range where a match was desired. I posted it on FanHome, but nobody really noticed. A few months later, David Smyth posted RPG^.287, saying that he thought it was an exponent that would fit
all of our needs. Bingo. Tango Tiger
ran some tests, which are linked below, and found that RPG^.28 might be the best, but the Patriot/Smyth exponent is the one
that, at least to this time, has been shown to produce near-optimal results. Some
people have taken to calling this Pythagenpat, a takeoff on Pythagenport, but it should always be remembered that Smyth recognized
the usefulness of this method to a greater extent than I did, and that without his (1,1) discovery, I would never have been
attempting to develop an exponent.
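A sketch of Pythagenpat with the .29 exponent; at 1 RPG the exponent is exactly 1, so the known point (1,1) is satisfied:

```python
def pythagenpat_wpct(r, ra, g, z=0.29):
    rpg = (r + ra) / g
    x = rpg ** z                     # exponent grows with the scoring context
    return r**x / (r**x + ra**x)

# The (1,1) check: a 100-62 team at 1 RPG is predicted at exactly 100/162
print(round(pythagenpat_wpct(100, 62, 162), 3))  # 0.617
```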

Let's close by illustrating the differences between the various methods
for a fairly extreme team--one that outscores its opponents by a 2:1 ratio in a 5 RPG context (3.33 R/G, 1.67 RA/G):

Model           EW%
Cook            .968
Kross           .750
10 RPW          .666
Pyth(X=2)       .800
Palmer          .723
Sadowski        .680
Davenport       .739
Patriot/Smyth   .751

Although all of these methods, with the glaring exception of Cook, give a similar
standard error when applied to normal major league teams, the differences are quite large when extreme teams are involved. And while a method like Kross might track Pythagenpat well in this case, there
are other cases where it will not. The same goes for all of the methods, although
Pythagenport and Pythagenpat are basically equivalent from around 5 to 30 RPG, as you can see in the chart linked on this page.

Although linear models do not have the best theoretical accuracy, there are
certain situations in which they can come in handy. What I did was use the Pythagenpat
method as the basis for a slope formula. We can calculate the slope that is in
effect for a team at any given point under the Pythagorean method by knowing the exponent x (which I figured by Pythagenpat),
the Run Ratio, and the RPG. The formula for this, originally published by Smyth
but in a different form, is S = (RR^x/(RR^x+1)-.5)/(RPG*(2*RR/(RR+1)-1)). What
I did was calculate the needed slope for teams with RR of 1.01, 1.05, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, and 2 at
each 1 RPG interval from 1-14. I then attempted to regress for a formula for
slope based on RPG. I eventually decided to cut out the teams from 1-4 RPG because
they simply were too different to fit into the model. But using the teams at
5-14 RPG, I came up with an equation that works fairly well in that range: S = .279-.182*log(RPG).
You can see in another set of charts linked below the needed slope at 1-14 RPG, and a chart showing the actual needed
slope (marked Series 3) and the predicted slope (Series 2).
The fit is pretty good in that range, but caution should be used if you take it outside of the tested region. Applied to actual 1961-2000 teams projected to 162 games, it has an RMSE of 4.015,
comparable to the most accurate methods.
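Both the exact Pythagorean-implied slope and the regressed approximation can be sketched as follows (the log in the fitted equation is taken as base 10 here, which yields slopes near .105 at 9 RPG, consistent with the values used earlier):

```python
import math

def exact_slope(rr, rpg, z=0.29):
    # Smyth's relation: S = (W% - .5)/RD:G for a Pythagenpat team
    x = rpg ** z
    wpct = rr**x / (rr**x + 1)
    rd_per_g = rpg * (2 * rr / (rr + 1) - 1)
    return (wpct - 0.5) / rd_per_g

def fitted_slope(rpg):
    # Regression fit intended for roughly 5-14 RPG
    return 0.279 - 0.182 * math.log10(rpg)

# Near-average team (RR = 1.1) in a 9 RPG context; both come out around .105
print(round(exact_slope(1.1, 9), 4), round(fitted_slope(9), 4))
```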

Finally, I think it would be useful to observe that in all of these methods,
four basic components pop up a lot: Runs Per Game (RPG), Run Ratio (RR), Run Percentage (R%), and Run Differential Per Game (RD:G). I have provided the formulas for each of these, plus formulas you can use to convert
between them--technical math crap rather than real sabermetric knowledge, but I find the conversion
formulas useful:

RPG = (R+RA)/G

RR = R/RA

R% = R/(R+RA)

RD:G = (R-RA)/G

RR = R%/(1-R%)

RR = (RD:G/(2*RPG)+.5)/(.5-RD:G/(2*RPG))

R% = RR/(RR+1)

R% = (RD:G+RPG)/(2*RPG) = RD:G/(2*RPG) + .5

RD:G = RPG*(2*R%-1)

RD:G = RPG*(2*RR/(RR+1)-1)
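These conversion identities are easy to verify numerically; a quick sketch with an arbitrary team:

```python
r, ra, g = 800, 650, 162

rpg = (r + ra) / g
rr = r / ra
rpct = r / (r + ra)          # R%
rd_g = (r - ra) / g          # RD:G

# Each conversion formula should reproduce the directly computed value
assert abs(rr - rpct / (1 - rpct)) < 1e-9
assert abs(rr - (rd_g / (2 * rpg) + .5) / (.5 - rd_g / (2 * rpg))) < 1e-9
assert abs(rpct - rr / (rr + 1)) < 1e-9
assert abs(rpct - (rd_g + rpg) / (2 * rpg)) < 1e-9
assert abs(rd_g - rpg * (2 * rpct - 1)) < 1e-9
assert abs(rd_g - rpg * (2 * rr / (rr + 1) - 1)) < 1e-9
print("all conversion identities hold")
```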