I used to have 2002 stats up here, but there were some problems with a couple
of the formulas I used, and they took up a lot of disk space. If for some reason you have to have those spreadsheets,
I can email them to you. Anyway, there'll be 2003 stats here when the 2003 season is over.
The data is mostly from Doug's Stat Page, which I have linked below. The park adjustments are based on a simple 5 year runs PF(certainly NOT the most optimal approach out there). The data for the PFs came from KJOK's database for the last 4 years and MLB.com for
2003. Here is an explanation of the charts and such:
Team
Expected Winning %(EW%)-estimated from runs and runs allowed
Predicted Winning %(PW%)-estimated from Runs Created and opponent RC(eR)
R and RA are park adjusted
The first set of BA, OBA, and SLG are park adjusted figures for the offense;
the second set are for the defense
RC and eR are park adjusted-eR is opponents RC
The formula for EW% is just a very simple linear formula with a slope of .107.
This is probably too high for today's game and really I should use a custom slope for each team.
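For anyone who wants to play with the slope, here is a minimal Python sketch of a linear estimator of this kind. The page only says the formula is linear with a slope of .107, so the .500 intercept and the use of run differential per game are assumptions (they match the way .107 is used in the WCA formula further down), and the function name is made up for illustration.

```python
def expected_win_pct(runs, runs_allowed, games, slope=0.107):
    """Linear EW% estimate: .500 plus slope times run differential per game.

    Assumption: the .500 intercept and the per-game scaling are not stated
    on the page, only the .107 slope is.
    """
    run_diff_per_game = (runs - runs_allowed) / games
    return 0.5 + slope * run_diff_per_game


if __name__ == "__main__":
    # A team that outscores its opponents 800-700 over 162 games
    print(round(expected_win_pct(800, 700, 162), 3))
```

Passing a different `slope` is the easy way to try a lower value or a custom slope for each team.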
Who's Included
Pitchers with 15 starts are classified as starters. Pitchers with 40
appearances and fewer than 15 starts are classified as relievers, as are pitchers with 50 IP and 50% relief appearances who fall short of the
15 GS/40 G thresholds. Hitters include all players with 300 PA, each assigned to the one position at which
he played the most games. Players who split time between teams are listed with the last team they played for.
This is not optimal, as the park factors and league factors get screwed up, but it's the easy way to go, and I'm lazy.
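The inclusion rules above can be sketched in Python as follows. The thresholds come from the paragraph; reading each one as "at least" (15 GS, 40 G, 50 IP, 50% of appearances in relief, 300 PA) is an assumption, as are the function names and the way ties between positions are broken.

```python
def classify_pitcher(g, gs, ip):
    """Return "starter", "reliever", or None if the pitcher is not listed.

    Assumption: every threshold is read as "at least".
    """
    if gs >= 15:
        return "starter"
    if g >= 40:
        return "reliever"
    relief_share = (g - gs) / g if g else 0.0
    if ip >= 50 and relief_share >= 0.5:
        return "reliever"
    return None


def hitter_qualifies(pa):
    """Hitters need 300 PA to be listed."""
    return pa >= 300


def primary_position(games_by_position):
    """Position with the most games played (ties go to the first one seen)."""
    return max(games_by_position, key=games_by_position.get)


if __name__ == "__main__":
    print(classify_pitcher(g=30, gs=5, ip=60.0))
    print(primary_position({"SS": 90, "2B": 40}))
```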
Relievers
Runs Above Replacement(RAR) are measured above a player who performs at 80%
of the league average and are based on RRA(see the "Baselines" article for the reasoning behind this). I also include
RAA, PRAA for hitters.
RA is park adjusted runs per nine innings
RRA is RA adjusted for performance with inherited runners, based just on the basic
numbers of inherited runners and inherited runners scored. This idea is from Sky Andrecheck, and his formula is published
in the August 1999 BTN. My formula is the same except that it uses the current year's IRS% as the baseline and R instead
of ER. IRSV follows from this and is the number of inherited runs the pitcher saved that an average reliever
would have allowed
eRA is Estimated Run Average(opponents RC per 9 IP), park adjusted. Beginning
with 2005, I have used actual Double and Triple data rather than estimated total bases(for players only, the team estimates
are still based on the old formula).
GRA is Guess Run Average, a park adjusted DIPS estimator in the spirit of Tango
Tiger's DIPS estimator
G-F is Guess-Future, a silly stat that combines eRA and K per IP to represent
future potential; it is an inorganic number that ranges from about 3 to 4.5
IR/G is inherited runners/game
Starters
RAR are above 80% and based on RA
P/S in 2003 is estimated Pitches/Start(based on the STATS formula). For 2004,
it is Pitches/((G+GS)/2)
WCA is Wins Compared to Average, the estimated number of wins the pitcher had beyond what
an average pitcher with the same decisions and run support would have had
Hitters
Hitters are grouped by position, although the position is not indicated in
order for the document to be printer-friendly
PA is AB+W
BA, OBA, and SLG are park adjusted
RC are based on ERP and park adjusted
RG is RC per 25.5 outs
RAR is above 73%
PRAR is runs above a replacement hitter at the position
SEC is the park adjusted Secondary Average, not including SB and CS
SU is an estimate of speed based on frequency of stolen base attempts, stolen
base %, triples per balls in play, and runs scored per times on base; SU is designed to have an average around 50, but this
year an average player has scored around 45
Formulas
eRA:
eRA = ((.162+.324X)(H)+(1.296-.324X)HR+.324W-.274IP)*9/IP
where X = Lg (TB-4HR)/(H-HR)
For 2005 and on, I have used actual double and triple data. This gives eRA = (TB+.5H+W-.3(IP*x))*.324*9/IP, where x = Lg(AB-H)/IP
divided by PF for adjustment
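Here is a rough Python sketch of both versions of eRA. It assumes IP is a plain decimal (200.333, not 200.1), and it reads the out term of the newer formula as .3 times IP*x, so that IP*x stands in for the pitcher's AB-H; that reading is the one under which the newer formula reduces to the older one's -.274IP term when x is about 2.82. The function and parameter names are made up for illustration.

```python
def era_estimate_pre2005(h, hr, w, ip, lg_x, pf=1.0):
    """eRA with total bases estimated from hits and home runs.

    lg_x is the league (TB - 4*HR) / (H - HR).
    """
    runs = ((0.162 + 0.324 * lg_x) * h
            + (1.296 - 0.324 * lg_x) * hr
            + 0.324 * w
            - 0.274 * ip)
    return runs * 9 / ip / pf


def era_estimate_2005(tb, h, w, ip, lg_outs_per_ip, pf=1.0):
    """eRA from actual total bases.

    lg_outs_per_ip is the league (AB - H) / IP. Assumption: the out term
    is .3 * (IP * x), i.e. IP * x estimates the pitcher's AB - H.
    """
    runs = (tb + 0.5 * h + w - 0.3 * (ip * lg_outs_per_ip)) * 0.324
    return runs * 9 / ip / pf


if __name__ == "__main__":
    # With TB built from the same league ratio, the two versions agree
    # when x = .274 / (.3 * .324).
    lg_x = 1.3
    tb = lg_x * (200 - 20) + 4 * 20
    print(era_estimate_pre2005(200, 20, 60, 200.0, lg_x))
    print(era_estimate_2005(tb, 200, 60, 200.0, 0.274 / 0.0972))
```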
GRA:
GRA = (9X)(.326IP+1.46HR+.324W-.168K)/IP
where X is LgR/(.326IP+1.46HR+.324W-.168K)
starting in 2006, new formula used:
x is carried over from eRA formula
y = Lg(R-.324W-1.458HR-.097K)/(AB-HR-K)
z = Lg(IP-K/3)/(AB-HR-K)
GRA = 9*(.324W+1.458HR-.097K+y*(IP*x+H-HR-K))/(K/3+z*(IP*x+H-HR-K))
divided by PF for adjustment
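A minimal Python sketch of the two GRA versions follows. In the older one the league multiplier X is computed from league totals, so a pitcher with exactly league-average components comes out at the league run average. In the 2006 one the league constants x, y, and z are passed in rather than computed, and the denominator is read as K/3 + z*(IP*x+H-HR-K), an estimate of innings. The names are hypothetical and IP is assumed to be a plain decimal.

```python
def _gra_base(ip, hr, w, k):
    """Unscaled linear run estimate shared by pitcher and league."""
    return 0.326 * ip + 1.46 * hr + 0.324 * w - 0.168 * k


def gra_pre2006(ip, hr, w, k, lg_r, lg_ip, lg_hr, lg_w, lg_k, pf=1.0):
    """Original GRA: scale the linear estimate so the league total matches LgR."""
    x = lg_r / _gra_base(lg_ip, lg_hr, lg_w, lg_k)
    return 9 * x * _gra_base(ip, hr, w, k) / ip / pf


def gra_2006(ip, h, hr, w, k, x, y, z, pf=1.0):
    """2006 GRA with league constants x, y, z supplied by the caller."""
    bip = ip * x + h - hr - k                         # estimated balls in play
    runs = 0.324 * w + 1.458 * hr - 0.097 * k + y * bip
    innings = k / 3 + z * bip                         # estimated innings
    return 9 * runs / innings / pf


if __name__ == "__main__":
    print(gra_pre2006(200.0, 20, 60, 150, 500, 1000.0, 100, 350, 650))
    print(gra_2006(200.0, 180, 20, 60, 150, x=2.8, y=0.1, z=0.23))
```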
G-F
G-F = 4.46+.095(eRA)-.113(KG)
P/S
P/S = (4.81K+5.14W+3.27H+3.16(3IP-K))/G
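As a quick sketch, the P/S estimate above and the 2004 definition from the Starters section look like this in Python. IP is assumed to be a plain decimal, the 2004 pitch total is assumed to be supplied by the caller, and the function names are made up.

```python
def estimated_pitches(k, w, h, ip):
    """Estimated pitch total from strikeouts, walks, hits, and other outs."""
    return 4.81 * k + 5.14 * w + 3.27 * h + 3.16 * (3 * ip - k)


def pitches_per_start_2003(k, w, h, ip, g):
    """2003 version: estimated pitches divided by games."""
    return estimated_pitches(k, w, h, ip) / g


def pitches_per_start_2004(pitches, g, gs):
    """2004 version: a pitch total divided by the average of G and GS."""
    return pitches / ((g + gs) / 2)


if __name__ == "__main__":
    print(pitches_per_start_2003(k=150, w=60, h=180, ip=200.0, g=32))
    print(pitches_per_start_2004(pitches=3200, g=34, gs=30))
```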
SEC
SEC = SLG-BA + (OBA-BA)/(1-OBA)
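In Python the SEC formula is a one-liner: isolated power (SLG-BA) plus the (OBA-BA)/(1-OBA) term. The sketch below assumes the three inputs have already been park adjusted; the function name is made up.

```python
def secondary_average(ba, oba, slg):
    """SEC = SLG - BA + (OBA - BA) / (1 - OBA), from park adjusted inputs."""
    return (slg - ba) + (oba - ba) / (1 - oba)


if __name__ == "__main__":
    print(round(secondary_average(0.300, 0.400, 0.500), 3))
```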
SU
Start by figuring for player and league:
SBFrq = (SB+CS)/(H+W-HR)
WSB% = (SB+3)/(SB+CS+7)
R/TOB = (R-HR)/(H+W-HR)
T/BIP = T/(AB-HR-K)
Then figure:
A = (SBFrq-LgSBFrq)/(.91*LgSBFrq)
B = (WSB%-LgSB%)/(.19*LgSB%)
C = (R/TOB-LgR/TOB)/(.2*LgR/TOB)
D = (T/BIP-LgT/BIP)/(.89*LgT/BIP)
Then SU = 50+4.25(A+B+C+D)
There is a method behind that madness; A, B, C, and D are supposed to
represent a Z-score in each of those categories. The denominator is an estimate of the stdev of individual performances
based on a study of a few years of data, but the estimates don't really hold up over time. SU is just supposed
to give a quick read on overall speed; don't take it too seriously.
That formula was used for 2003-2005. A new formula is in use from then on:
SBFrq = (SB + CS)/(H + W - HR)
T/BIP = T/(AB - HR - K)
R/TOB = (R - HR)/(H + W - HR)
WSB% = (SB + 3)/(SB + CS + 7)
Then we subtract the league average from each of these, and divide by the 3-year average standard deviation to get a z-score:
sbf = (SBFrq - LgSBFrq)/.0669 = (SBFrq - LgSBFrq)*14.95
tbip = (T/BIP - LgT/BIP)/.0063 = (T/BIP - LgT/BIP)*158.7
rtob = (R/TOB - LgR/TOB)/.0640 = (R/TOB - LgR/TOB)*15.63
wsb = (WSB% - LgSB%)/.1240 + 1.31 = (WSB% - LgSB%)*8.065 + 1.31
Speed Unit = 50 + 4.25*(sbf + tbip + rtob + wsb)
The logic is similar, but this is a little cleaner.
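Both versions of SU can be sketched in Python with a shared helper for the four component rates. The page does not spell out how LgSB% is figured, so the league values, including `sb_pct`, are simply passed in as a dictionary; that dictionary layout and all of the names are assumptions made for illustration.

```python
def speed_rates(sb, cs, h, w, hr, r, t, ab, k):
    """The four component rates shared by both versions of SU."""
    times_on_base = h + w - hr
    return {
        "sbfrq": (sb + cs) / times_on_base,
        "tbip": t / (ab - hr - k),
        "rtob": (r - hr) / times_on_base,
        "wsb": (sb + 3) / (sb + cs + 7),
    }


def speed_unit_2003(p, lg):
    """2003-2005 version: stdev estimated as a fraction of the league rate."""
    a = (p["sbfrq"] - lg["sbfrq"]) / (0.91 * lg["sbfrq"])
    b = (p["wsb"] - lg["sb_pct"]) / (0.19 * lg["sb_pct"])
    c = (p["rtob"] - lg["rtob"]) / (0.20 * lg["rtob"])
    d = (p["tbip"] - lg["tbip"]) / (0.89 * lg["tbip"])
    return 50 + 4.25 * (a + b + c + d)


def speed_unit_2006(p, lg):
    """Later version: fixed 3-year average standard deviations."""
    sbf = (p["sbfrq"] - lg["sbfrq"]) / 0.0669
    tbip = (p["tbip"] - lg["tbip"]) / 0.0063
    rtob = (p["rtob"] - lg["rtob"]) / 0.0640
    wsb = (p["wsb"] - lg["sb_pct"]) / 0.1240 + 1.31
    return 50 + 4.25 * (sbf + tbip + rtob + wsb)


if __name__ == "__main__":
    player = speed_rates(sb=20, cs=5, h=150, w=50, hr=20, r=90, t=5, ab=550, k=100)
    # Made-up league values, for illustration only
    league = {"sbfrq": 0.08, "tbip": 0.007, "rtob": 0.33, "sb_pct": 0.70}
    print(speed_unit_2003(player, league))
    print(speed_unit_2006(player, league))
```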
RC
RC = (TB+W+.5H+.7SB-CS-.3(AB-H))*.322
divide by PF to adjust
RG
RG = (RC*25.5)/(AB-H+CS)
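Here is a small Python sketch of RC and RG as given above, with the park factor applied by division. The function names are made up for illustration.

```python
def runs_created(ab, h, tb, w, sb, cs, pf=1.0):
    """RC = (TB + W + .5H + .7SB - CS - .3(AB - H)) * .322, divided by PF."""
    raw = (tb + w + 0.5 * h + 0.7 * sb - cs - 0.3 * (ab - h)) * 0.322
    return raw / pf


def runs_per_game(rc, ab, h, cs):
    """RG = RC per 25.5 outs, where outs = AB - H + CS."""
    outs = ab - h + cs
    return rc * 25.5 / outs


if __name__ == "__main__":
    rc = runs_created(ab=550, h=150, tb=250, w=50, sb=20, cs=5)
    print(rc, runs_per_game(rc, ab=550, h=150, cs=5))
```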
RAR
Hitters
RAR = (RG-N*.73)*(AB-H+CS)/25.5
PRAR = (RG-N*.73*PADJ)*(AB-H+CS)/25.5
Pitchers
RAR = (N*1.25-RA)*(IP/9) (use RRA in place of RA for relievers)
N is simply league RPG for one team(used elsewhere on this page as well)
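The RAR formulas above translate directly into Python. In this sketch `outs` is AB-H+CS, `n` is league runs per game for one team, and `padj` is the position adjustment (leave it at 1.0 for plain RAR, pass the position's value for PRAR). The function names are hypothetical.

```python
def hitter_rar(rg, outs, n, padj=1.0, replacement=0.73):
    """Hitter RAR, or PRAR when padj is the position adjustment."""
    return (rg - n * replacement * padj) * outs / 25.5


def pitcher_rar(ra, ip, n, replacement_multiplier=1.25):
    """Pitcher RAR; pass RRA as `ra` for relievers."""
    return (n * replacement_multiplier - ra) * ip / 9


if __name__ == "__main__":
    print(hitter_rar(rg=5.0, outs=408, n=4.5))             # RAR
    print(hitter_rar(rg=5.0, outs=408, n=4.5, padj=0.86))  # PRAR for a SS
    print(pitcher_rar(ra=4.0, ip=180.0, n=4.5))
```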
WCA
WCA = (W%-(RS-N)*.107-.5)*(W+L)
WCA is the number of wins a pitcher has beyond the number of wins an average
pitcher with his decisions and run support would have
Starting in 2006, WCR(Wins Compared to Replacement) is used; first find
RW% = (RS/PF)^2/((RS/PF)^2+(1.25*N)^2)
then
WCR = W - RW%*(W+L)
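A rough Python sketch of WCA and WCR follows. The page does not define RS, so the sketch assumes it is the pitcher's run support per game, on the same scale as N; that assumption and the function names are not from the original.

```python
def wins_compared_to_average(w, l, rs, n, slope=0.107):
    """WCA = (W% - (RS - N) * .107 - .5) * (W + L).

    Assumption: rs is run support per game, on the same scale as n.
    """
    decisions = w + l
    win_pct = w / decisions
    return (win_pct - (rs - n) * slope - 0.5) * decisions


def wins_compared_to_replacement(w, l, rs, n, pf=1.0):
    """WCR = W - RW% * (W + L), with RW% from park adjusted run support."""
    adj_rs = rs / pf
    rw_pct = adj_rs ** 2 / (adj_rs ** 2 + (1.25 * n) ** 2)
    return w - rw_pct * (w + l)


if __name__ == "__main__":
    print(wins_compared_to_average(w=15, l=10, rs=5.0, n=4.5))
    print(wins_compared_to_replacement(w=15, l=10, rs=5.0, n=4.5))
```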
Position Adjustments(PADJ)
These specific numbers were based on RC/O for 1992-2001.
C=.89 1B/DH=1.19 2B=.93 3B=1.01 SS=.86 LF/RF=1.12 CF=1.02
Park Adjustment
The park factors used here are based on 5 years when applicable. I
have regressed as MGL suggested on a 2000 thread on baseballboards.com:
Here's a decent rule of thumb set of formulas
for regressing. For 1-year stats, true PF(TPF)=1-(1-PF)*.6, 2-year stats, TPF=1-(1-PF)*.7, 3-year stats, TPF=1-(1-PF)*.8,
and for 4-year or more stats, TPF=1-(1-PF)*.9.
Anyway, the basic pre-regressed formula is PF = (H*T/((T-1)*R+H)+1)/2
where H is RPG @ Home, R is RPG on the road, and T is the number of teams in
the league. This is from Craig Wright in The Diamond Appraised.
For BA, OBA, and SLG, divide by PF^.438
That's pretty goofy, and it's just a fudge, since the runs PF really shouldn't
be adjusting the OBA, etc. The .438 is derived from a study I did trying to keep park adjusted RC/O equal with RC/O
figured from park adjusted BA, OBA, and SLG. It's not the most scientific thing, but it gives fairly good results.
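The park adjustment steps above (raw PF, regression by years of data, and the PF^.438 fudge for rate stats) can be sketched in Python like this. The sketch takes the home and road RPG figures as given and treats any sample of four or more years as getting the .9 weight; the function names are made up for illustration.

```python
def raw_park_factor(home_rpg, road_rpg, teams):
    """Pre-regressed PF = (H*T / ((T-1)*R + H) + 1) / 2."""
    return (home_rpg * teams / ((teams - 1) * road_rpg + home_rpg) + 1) / 2


def regressed_park_factor(pf, years):
    """Rule-of-thumb regression: TPF = 1 - (1 - PF) * weight."""
    if years < 1:
        raise ValueError("need at least one year of data")
    weight = {1: 0.6, 2: 0.7, 3: 0.8}.get(years, 0.9)
    return 1 - (1 - pf) * weight


def adjust_rate_stat(value, pf, exponent=0.438):
    """Park adjust BA, OBA, or SLG by dividing by PF^.438."""
    return value / pf ** exponent


if __name__ == "__main__":
    pf = regressed_park_factor(raw_park_factor(5.0, 4.5, 14), years=5)
    print(pf, adjust_rate_stat(0.300, pf))
```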
If you want to see the entire page, select all the columns and set the column
width to something greater than 6. You'll then see AB, H, etc.
Any further questions? bcheipp@yahoo.com