Seametry: Aggregate MVPs
TOWER GROVE — For one of the Sunday Hot Corners late in the season, we attempted to calculate the “V” in MVP and illustrate what hitters had the best claim to the award. One thing was missing from those aggregate rankings.
Defense.
The National League MVP will be awarded tonight, and a couple of players with St. Louis ties will finish one-two in the voting. Either St. Louis native Ryan Howard will win or St. Louis Cardinal Albert Pujols will. But who should?
To explore the MVP award, statistically, back in August I lined up all of the candidates (including some surprise names) in both leagues and charted where they ranked in eight categories. Not sure why, but I’m a fan of aggregates and I think using the rankings in those eight categories helps define an argument, if not settle it.
I ran the same numbers last night with the 2006 final stats.
Call the calculations the MVPag.
Wait until you see what Pujols scored.
How it works is simple. Consider New York Yankees shortstop Derek Jeter, a favorite for the American League MVP. In the AL, he ranks second in batting average (.343), 29th in slugging (.483), fourth in on-base percentage (.417), 67th in home runs (14), second in runs scored (114) and 22nd in RBIs (97). Two new-Math categories were also used. Jeter ranked first in Value Over Replacement Player (80.5) and first in Win Shares (33).
Add his rankings together to get his aggregate sum: 128.0.
The lower the sum the better.
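For anyone who wants to check the arithmetic, here is a minimal sketch of the agg-sum using Jeter's eight rankings listed above. The category keys and the function name agg_sum are labels made up for this illustration, not anything official.

```python
# Jeter's 2006 AL rankings as listed in the post (lower is better).
# The dictionary keys are illustrative labels only.
jeter_ranks = {
    "batting_average": 2,
    "slugging": 29,
    "on_base_percentage": 4,
    "home_runs": 67,
    "runs_scored": 2,
    "rbi": 22,
    "vorp": 1,
    "win_shares": 1,
}


def agg_sum(ranks):
    """Add up a player's ranking in every category."""
    return float(sum(ranks.values()))


print(agg_sum(jeter_ranks))  # 128.0
```

Swap in any other player's eight rankings to get his agg-sum the same way.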
The same process with Minnesota first baseman Justin Morneau yields an agg-sum of 73.0. The agg-sums of the 10 AL players I figured last night:
1. David Ortiz, Boston … 57.5
2. Travis Hafner, Cleveland … 59.5
3. Jermaine Dye, White Sox … 62.0
4. Morneau, Minnesota … 73.0
5. Manny Ramirez, Boston … 98.5
6. Vladimir Guerrero, LA Angels … 98.5
7. Alex Rodriguez, NY Yankees … 101.5
8. Jeter, NY Yankees … 128.0
9. Vernon Wells, Toronto … 151.5
10. Joe Mauer, Minnesota … 174
Just look at the rankings, and while it snaps a picture of offensive performance — would anyone doubt that Ortiz and Hafner had two superb offensive years? maybe even most valuable offensive years? — it lacks a defensive quotient. Especially when you’re looking at the AL and you have to weigh DH Hafner against catcher Mauer.
An elementary way to fix that is a plus system.
+ = DH or below average defense
++ = average defense
+++ = exceptional, Gold Glove or demanding defense (catchers would automatically qualify; I’d hear an argument for shortstops as well).
Take the agg-sum and divide it by the number of pluses. Sporting a Gold Glove this past season, Jeter’s MVPag drops to 42.7. As a catcher, Mauer’s MVPag moves to 58.0. I’ll show the final rankings for the AL players below, but today is the NL MVP and it’s time to reveal Pujols’ score:
19.5
That’s his agg-sum. That’s before the pluses are used.
That’s, well, incredible.
Pujols ranked in the top 10 in all eight categories. He was first or second in six of the eight and third in another one. He led the NL in VORP (85.4), Win Shares (39), slugging percentage (.671), and on-base percentage (.431). The next closest agg-sum that I could find in either league was Howard’s 38.0. And Houston first baseman/outfielder Lance Berkman’s agg-sum of 51.5 comes next closest to Pujols’ and Howard’s.
The MVPag confirms what many believe: Pujols, on the heels of what was arguably a career year, should be the runaway NL MVP. How the agg-sums and MVPag break down for seven NL MVP candidates:
1. Pujols, STL … agg-sum: 19.5; MVPag: 6.5
2. Howard, PHI … agg-sum: 38.0; MVPag: 19.0
3. Berkman, HOU … agg-sum: 51.5; MVPag: 25.8
4. Beltran, NYM … agg-sum: 88.5; MVPag: 29.5
5. Cabrera, FLA … agg-sum: 63.5; MVPag: 31.8
6. Reyes, NYM … agg-sum: 196.0; MVPag: 65.3
7. Soriano, CHI … agg-sum: 159.5; MVPag: 79.8
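The division behind the list above can be sketched in a few lines. The agg-sums are the ones published here; the number of pluses per player is an assumption, worked backward by dividing each agg-sum by its listed MVPag, since the post does not spell out every player's defensive grade.

```python
# (agg-sum, pluses) for each NL candidate. Agg-sums come from the post.
# The plus counts are ASSUMED: inferred from agg-sum / published MVPag.
nl = {
    "Pujols": (19.5, 3),
    "Howard": (38.0, 2),
    "Berkman": (51.5, 2),
    "Beltran": (88.5, 3),
    "Cabrera": (63.5, 2),
    "Reyes": (196.0, 3),
    "Soriano": (159.5, 2),
}


def mvpag(agg, pluses):
    """Divide the agg-sum by the number of defensive pluses."""
    return agg / pluses


# Lowest MVPag ranks first, same as the agg-sum.
ranking = sorted(nl, key=lambda name: mvpag(*nl[name]))

for place, name in enumerate(ranking, start=1):
    agg, pluses = nl[name]
    print(f"{place}. {name}: agg-sum {agg}, MVPag {mvpag(agg, pluses):.2f}")
```

Note how Beltran's three pluses lift him past Cabrera even though Cabrera's agg-sum is 25 points lower.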
To be honest, the rankings line up with how I would vote the top five of the NL MVP, with the possible exception being Jose Reyes, whose stolen bases and speed are not included here. Does that mean this exercise told me nothing, or helped confirm my gut vote? Not sure. But it does discern where a player ranks in regards to his peers, his contemporaries. And when it comes to hitters it helps measure just who, by way of comparison, was most valuable to the team.
The hole is the subjectivity of the defensive plus.
When I ran the numbers in August for the AL, Hafner had the lowest agg-sum and, like Pujols’ final figure, it was so low that any defensive adjustment was only going to move one AL player to a lower total than Hafner’s. The AL race is a more compelling one than the NL because there is not a clear-cut favorite, no one player who dominated all of the numbers. You have to pick a performance and dub it “most valuable”. Where the MVPag for the NL mirrored most voters’ instincts, the MVPag ranking for the AL may assist in revealing the MVP.
The MVPag for the 10 players mentioned above:
1. Dye … 31.0
2. Morneau … 36.5
3. Jeter … 42.7
4. Guerrero … 49.3
5. Wells … 50.5
6. A Rodriguez … 50.8
7. Ortiz … 57.5
8. Mauer … 58.0
9. Hafner … 59.5
10. Ramirez … 98.5
Disappointed that Mauer doesn’t rate higher because these rankings, otherwise, look like a sturdy ballot. Sure it’s an oversimplified way to pool and then separate the statistics and the defensive division is a tad clunky. But it’s a tool. It’s a start.
Discuss.
-30-


Derrick Goold said he was going to Mizzou for capital-J journalism, but after growing up in the Time Zone Baseball Forgot he was really drawn to MU sitting between two major-league cities. Goold joined the Post-Dispatch in 2001 after working for The Times-Picayune and Rocky Mountain News, covering sports from LSU to NHL and every level of baseball in between.
One can only hope that the MVP voters, as they correctly did last year, look at all of the numbers. Not just HR/RBI’s.
Also,
Check out the seminal blog Viva el Birdos, where Larry did a similar breakdown of the NL rankings — only with much more readable charts. Click here — http://www.vivaelbirdos.com./ — for his approach. We both had the same tack to the subject.
Same conclusion, too.
dg
-30-
Well, is anyone surprised?? I have no problem with Howard getting it, but, even using Derrick’s MVPag, Pujols is a runaway.. Oh well.. I think Albert will take the ring over ANOTHER MVP..
Voters don’t give a crap about reality and statistics; they vote based off of how the STORY makes them FEEL.
Coop,
Sure there is a lot of truth to what you say. Writers are rightfully accused of rooting for the best story and sometimes the quickest story. But it’s not the sole truth, otherwise Trevor Hoffman would have won the Cy Young (better story than any other candidate) and Dan Uggla would have won the Jackie Robinson Award (way better story than any of the other candidates — Rule 5! Pokey Reese!).
Don’t forget, reality and statistics are the makings of good stories. You have no story without one, the other or often both.
dg
-30-
This system is awful. Once again Goold fails to provide any real insight into the world of baseball. It is pretty obvious to me how to account for defense and speed: look at the player’s WARP. Advanced statistics like WARP account for those types of things, but for whatever reason the mainstream media will not accept them.
Instead, Goold gives a system that a third grader could have thought of.
Oops. Missed my mark. I was going for that old newspaper cliche we’re taught in school – write for a sixth-grade reading level.
I blew it by three grades.
In all seriousness, there’s a fair point made here. For those who don’t speak WARP, VORP, PMLVr, and Win Shares, here’s a quick primer. (Warning: Third Graders may want to turn back, dangerous Sabermetric references ahead.) The VORP mentioned above is Value Over Replacement Player. WARP is the next step, WINS Above Replacement Player. According to Baseball Prospectus’ glossary, WARP is the “number of wins this player contributed, above what a replacement level hitter, fielder and pitcher would have done.”
That’s WARP-1. WARP-2 adds a wrinkle. WARP-3 equalizes older players into a 162-game schedule.
What does it all mean?
Think of it this way. The aggregate charts above are third-grade level VORP, WARP, whatever. Compiling raw, basic numbers is the start of A.P. Baseball Statistics. Above is the first speed on the statistical blender. It’s a start to smoother, cleaner numbers. WARP is 10 grades ahead. It’s more than a puree of stats — like above — it is a distillation of statistics. Beyond my pencil-and-paper ability.
But I tried to find a simple, accessible way to illustrate things.
Want WARP? Here are the WARP totals: Albert Pujols scores an 11.9. Carlos Beltran rates a 10.4 (career high WARP), Miguel Cabrera has a 10.0, Lance Berkman a 9.0, and Ryan Howard slides in at 8.6. In the other league, Derek Jeter is a 9.8 WARP, Jermaine Dye gets an 8.5 and Travis Hafner goes at 8.0.
It tells a more detailed story than I did — than I could — above. It’s a fantastic tool and it is part of a new, exciting paradigm in the game. But it isn’t introductory stuff. That’s what the goal of the above aggregate was. Introductory stuff. Something you can do without plugging into Big Blue. All you need is yesterday’s boxscores.
Like, you know, we did as sixth graders.
dg
-30-
Congratulations to the Cubs on another big free-agent signing, who fits right into their pattern for the last 20 years! A 31-year-old leadoff hitter who’s never had less than 125 strikeouts in a full season (160 in ‘06, 4th in NL), and never had more than 38 walks in a season before ‘06. A defensive liability who’s never played a single inning at center field, and the Cubs plan to put him there. A guy who’s already 31, and whose speed game is likely to start diminishing soon. Another temperamental player who hasn’t shown much commitment to the team concept in the past. Sammy Sosa revisited? Alfonso Soriano! Heck, if Sammy wants to get back into baseball, the Cubs could have signed him for a lot less money, and a much shorter deal.
Cubs ridicule aside, Soriano would look good in anybody’s lineup, even Cardinals white and road gray, though a team with a DH could probably use him more effectively. What’s shocking about the Cubs is not that they are paying him $17 million a year in 2007, but that they’ll pay that much when he’s 39 years old and likely well past his prime. I’d rather give him more money up front when he’s actually worth it. That contract is going to hang over the Cubs like the A-Rod deal. I predict right now that they’ll pay half of Soriano’s salary to play for somebody else by 2012, if not sooner. Strikeouts and bad defense are the secret to the Cubs long run of mediocrity.
You make a good point. I honestly did not think that you had that type of knowledge in you. I remember an article you wrote a while ago at the trade deadline (2 years ago I believe.) You valued each player based on how many RBIs he had at that point. You called Pedro Feliz “practically un-tradable” and Mark Kotsay “at best a fourth OF on a championship” based on their RBI totals. Obviously, Feliz was hitting 3-4-5 spots and had more RBI opportunities than the leadoff hitter Kotsay. The subjectivity of RBIs makes it a fun stat to look at, but a worthless one for player evaluation (remember Ruben Sierra’s 100 RBI 1993 season where he hit .233/.288/.390). Seriously, just that you have heard of WARP and have any understanding of it probably puts you in the top 1% of St. Louis media when it comes to baseball knowledge (how many times have you heard guys like Ramsey, Slaten or Caliborne talk about baseball and just say, “Wow, they are so far away from the reality”?)
Why don’t you incorporate some of that sabermetric knowledge into some of your articles and shape St. Louis’ ultra-old school fan base into some new age baseball thinkers. The bunt, stolen base and hit and run have been so over glorified in this town, based primarily I think on the misperception of their value back in the most overly nostalgic period of St. Louis sports, Cardinals baseball in the 80s. I am tired of the same old subjective old school mentality. If you have the knowledge, use it. I am putting my hope in you, Mr. Goold, change this town.
Sorry, one more thing to remember is that Johan Santana led the AL in WARP with 10.6- MVP in my book.
Happy Birthday Stan, thanks for everything. The best that ever was, an all-around fine human being and role model to all…. Many of today’s pros could stand to model themselves after “the man”.
Fuhrig,
I too am surprised that, according to the reports I’ve seen, Soriano’s deal isn’t frontloaded. Overpay him now, when he’s a 40-40 threat, sure, but also to free up some money down the road when salaries are only going to climb and Chicago will have a chunk of its payroll still dedicated to possibly a part-time outfielder. I’m eager to see a breakdown of the contract.
If anyone has seen one please pop it in here or send it to me.
Nick,
No pressure or anything. I went looking for the article you mentioned because it does not ring a bell and could not find it. It wasn’t one of the blogs and it’s not in the Post-Dispatch morgue. Doesn’t mean it’s not somewhere, but your recall doesn’t seem like something I would have said, except the thing about the importance of RBIs. But, with a home run being the one exception, RBIs are only as important as the runners on base to produce them. Runs are what determine the game.
Which brings me to my feeling on sabermetrics. They are great and exciting tools. People far smarter than me are putting them to good use. Yet, they do not work in a vacuum. The game still has an element of the intangible. I don’t have the numbers to prove it, but I believe in clutch. I cannot cite the statistics that support it, but I believe Scott Rolen is the best defensive third baseman I have ever seen.
I know some numbers refute it, but I defend Derek Jeter’s defense.
More and more sabermetrics are working their way into mainstream media — look no further than Bernie Miklasz. We do still have to put them in terms everybody can understand. We cannot throw around Win Shares as easily as we reference home runs, for example. We also cannot ignore some of the elements you mention — stolen bases (which changed the 2004 postseason), bunts (which won how many games for the Cardinals in 2005) and hit-and-runs (which have one purpose I’m fond of …).
Nick, allow me a question, are you a believer in productive outs? I pose the questions to others, too. Do you think productive outs exist? It’s not a yes or no answer, in my opinion.
dg
-30-
P.S. While we’re on the subject, I would like to invite Nick to read a few of the past blogs about defense. I continue to try to find a way within the realm of new numbers to show how one shortstop would perform on another team.
I’m eager to hear your thoughts on some of these:
http://www.stltoday.com/blogs/sports-bird-land/2006/07/seametry-belliards-glove/
http://www.stltoday.com/blogs/sports-bird-land/2005/11/seametry-good-golden-and-gilded-gloves-vol-2b/
http://www.stltoday.com/blogs/sports-bird-land/2005/11/seametry-good-golden-and-gilded-gloves-vol-i/
dg
-30-
Derrick,
Two reactions to your numbers. In the AL, Jeter illustrates the unfairness, or imbalance, of aggregating ranks. Because he only hit 14 homers, his 67th ranking makes up more than half of his total number of 128. Had you just included steals, David Ortiz’s 1 SB would have added several hundred to his aggregate number, and the other power hitters would have suffered similarly. Perhaps a cap at 20 or 25 for how many rankings points can be added to the aggregate from a single statistical category? Your accounting for defense was also an interesting idea, though it might be too heavy a weight. Perhaps adding half a + for each step beyond DH/bad fielder would be a bit more balanced a measure. How about another + (or 1/2+) for being a team leader? That’s certainly a factor for a guy like Jeter or Albert.
Speaking of Soriano, nobody talked about him for MVP because he was with a last-place team. But given his speed-power numbers, it would be interesting to see where his production would rank in your aggregates or other measures. Especially if SBs were taken into account.
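To see what the half-plus idea would do, here is a rough sketch that ranks the 10 AL players both ways. The agg-sums are from the post. The plus counts are an assumption, inferred by dividing each agg-sum by its published MVPag, and the half-plus divisors (1, 1.5, 2) are one reading of "half a + for each step beyond DH/bad fielder."

```python
# (agg-sum, pluses). Agg-sums come from the post; plus counts are ASSUMED,
# inferred from agg-sum / published MVPag.
al = {
    "Dye": (62.0, 2), "Morneau": (73.0, 2), "Jeter": (128.0, 3),
    "Guerrero": (98.5, 2), "Wells": (151.5, 3), "A Rodriguez": (101.5, 2),
    "Ortiz": (57.5, 1), "Mauer": (174.0, 3), "Hafner": (59.5, 1),
    "Ramirez": (98.5, 1),
}


def full_plus(pluses):
    """Original system: divide by 1, 2 or 3."""
    return float(pluses)


def half_plus(pluses):
    """Hypothetical variant: each step past the first adds only half a plus."""
    return 1.0 + 0.5 * (pluses - 1)


def rank_by(divisor):
    """Return (name, score) pairs sorted from best (lowest) to worst."""
    scores = {name: agg / divisor(p) for name, (agg, p) in al.items()}
    return sorted(scores.items(), key=lambda item: item[1])


for label, divisor in (("full plus", full_plus), ("half plus", half_plus)):
    print(label)
    for place, (name, score) in enumerate(rank_by(divisor), start=1):
        print(f"  {place}. {name} {score:.2f}")
```

Under these assumptions the softer weighting keeps Dye and Morneau on top but moves the two DHs, Ortiz and Hafner, ahead of Jeter.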
Fuhrig,
I messed around with a cap — everybody ranked out of the top 30 got 31 points. But I got the sense when adding stuff up that that skewed it more than anything. I didn’t present this as the foolproof way to judge players, but a way to kickstart the conversation with a sense of where a player ranks against his peers.
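A quick sketch of how a cap changes one player's number, using Jeter's eight rankings from the post. The function name capped_agg_sum is made up for illustration, and the cap of 31 is read as "any rank worse than 30 counts as 31."

```python
# Jeter's eight AL rankings from the post: AVG, SLG, OBP, HR, R, RBI, VORP, Win Shares.
jeter_ranks = [2, 29, 4, 67, 2, 22, 1, 1]


def capped_agg_sum(ranks, cap=None):
    """Sum the rankings, counting no single category as worse than `cap`."""
    if cap is None:
        return sum(ranks)
    return sum(min(rank, cap) for rank in ranks)


print("no cap:", capped_agg_sum(jeter_ranks))      # 128
for cap in (20, 25, 31):
    print(f"cap {cap}:", capped_agg_sum(jeter_ranks, cap))
# cap 20: 70, cap 25: 82, cap 31: 92
```

Even the loosest of the three caps takes 36 points off Jeter's total, all of it from the home run category.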
Jeter isn’t going to hit as many home runs as Ortiz, but Ortiz isn’t going to impact a game with his glove like Jeter. Those two things have to be weighed against each other when voting on the MVP.
As for how heavily to weight defense: it is half the game. Which maybe heightens Santana’s candidacy, as Nick pointed out, for the award because he dominates half the game on the days he pitches.
Soriano? He finished sixth in the voting. He was seventh in the above discussion.
dg
-30-
I believe that productive outs exist, but I do not think that strikeouts are an important statistic for hitters when it comes to his productivity (they can be a useful tool to predict career paths.)
I use a few examples from 2004 to defend my belief on productive outs. First, the 67-95 led the majors in productive outs. Secondly, I use the study of Edmonds vs. Pujols. Edmonds struck out 150 times and Pujols only 50. Pujols had 9 sacrifice flies, and Edmonds had 8. Pujols had 17 productive outs in 67 opportunities for a PO% of .254, while Edmonds logged 19 productive outs in only 58 opportunities for a PO% of .328. Even with Edmonds’ high strikeout total, he made more productive outs than Pujols last season in 9 fewer productive out opportunities.
Edmonds was also the third hardest player to double up last year largely because of his high strikeout total. (Edmonds grounded into 6 double plays in 112 double play opportunities while Pujols grounded into 18 double plays in 141 double play opportunities.) A double play is a much larger rally killer than a strikeout. Although on the surface, a strikeout appears to be the ultimate failure, it really is no more harmful than any other out.
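The two rates in this comparison are simple divisions, sketched here with the counts quoted above. The field names and the rate helper are made up for illustration.

```python
# Counts as quoted in the comment. Field names are illustrative.
players = {
    "Pujols": {"po": 17, "po_opps": 67, "gidp": 18, "dp_opps": 141},
    "Edmonds": {"po": 19, "po_opps": 58, "gidp": 6, "dp_opps": 112},
}


def rate(events, opportunities):
    """Share of opportunities that ended in the event."""
    return events / opportunities


for name, s in players.items():
    po_pct = rate(s["po"], s["po_opps"])    # productive-out percentage
    dp_pct = rate(s["gidp"], s["dp_opps"])  # double-play rate
    print(f"{name}: PO% {po_pct:.3f}, DP rate {dp_pct:.3f}")
```

The double-play rates work out to roughly .128 for Pujols and .054 for Edmonds, which is the gap the argument leans on.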
Obviously productive outs exist (as defined by Buster Olney: a baserunner advances with the first out of an inning, or a pitcher sacrifices with one out, or a baserunner is driven home with the second out of an inning). If you use productive outs to support players with a high contact rate, then I believe that you need to take a closer look at DP rate and make sure to always look at the number of productive opportunities.
The fielding articles were interesting. (Did Bird land come up with this idea?) I have always believed that range factor was flawed for those reasons that you stated. A team with a bunch of strikeout pitchers will not give their team as many chances as a team with low strikeout guys. I think that it is right to adjust for the pitching staff.
I do not know if I agree with the system. As I understand it (please correct me if I am wrong), you take the percentage of outs that a player made and translate it by applying that percentage to the number of outs and assists that the other “fantasy” team had. It seems like it would work to an extent, but I am not sure how much the % of outs and assists translates. It is certainly interesting and I would like to compare it to the Davenport translations (which can admittedly be flawed) and other advanced fielding statistics.
As for the whole percentage of outs made; I do not know if I necessarily agree with this premise.
Nick,
I wrote about this defensive stuff before Bird Land even existed. As far as I know, nobody else has tried to normalize middle infielder defense from strikeout team to groundball team (the David Eckstein Conundrum, as it were). The theory is that if you deal with a massive quantity of numbers — and total chances by team is a big number — you can eliminate some of the teeny details, like field condition, etc.
It’s hardly a bulletproof theory.
But it works with the numbers and time I had to compile them. My hope is, with help, to refine the process this offseason. Found some people who are interested in working with the theory and making it better.
The “fantasy” team you mention is far from a “fantasy”. All of the numbers in there are real numbers. The “average team” is just that. An average team, differentiated by league. When I first did the figuring, I used one season’s numbers. The second time, I compiled all of the numbers for several years to arrive at an average team (like you would to describe a pitcher who is a groundball pitcher — use multiple years, not just one). The average team gives you a touchstone, but the whole process only works if you buy into percentages. A player is going to get to this percentage, this slice, this portion of the groundballs his team produced.
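A bare-bones sketch of the percentage idea, for anyone following along. Every number below is hypothetical, invented only to show the arithmetic, and the function name is made up. It is one plausible reading of the method described here, not the actual worksheet.

```python
def projected_chances(player_chances, team_chances, average_team_chances):
    """Apply the player's share of his own team's chances to the average team."""
    share = player_chances / team_chances
    return share * average_team_chances


# HYPOTHETICAL inputs, not real data:
shortstop_chances = 700       # chances the shortstop handled
his_team_chances = 2000       # total chances his (strikeout-heavy) team produced
average_team_chances = 2200   # total chances for the multi-year league-average team

share = shortstop_chances / his_team_chances
projected = projected_chances(shortstop_chances, his_team_chances, average_team_chances)
print(f"share of team chances: {share:.1%}")
print(f"projected chances on the average team: {projected:.0f}")
```

With these made-up inputs the shortstop handles 35 percent of his team's chances, which projects to about 770 on the average team.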
The next step is to get the numbers on balls put into play.
That’s the data I want access to.
Same theory. Better numbers.
dg
-30-
and productive outs…
Right. I thought you presented the case quite well, especially in the articulation of the double play getting off too easy while the strikeout is ridiculed.
Too often productive outs are just dismissed. The object is to win the game. And if I have the winning run at third base and it’s the bottom of the ninth, there isn’t anything more valuable to me than the ability to get that run in. One way to get that run in is to cause an out — squeeze or whatever. Give me an out that wins the game. Every time.
It’s an extreme example, but where does a walk get you in that situation?
Could set up that killer double play you mentioned.
dg
-30-
I agree that in that situation, nothing is more valuable than a successful bunt. How often does that situation come about- not all that often, maybe a couple of times a season, but having a good bunt will win the game, no question.
My main point on productive outs is that they are not very predictable. Look at Edmonds and Pujols; 150 Ks to 50, yet Edmonds had more productive outs in fewer PO opps. If something like that happens, I’ll take high K guys like Adam Dunn over high contact guys like Dave Roberts. That probably sounds obvious to you to take Dunn over Roberts, but many people have argued on Roberts’ side against me, merely because of the Ks that Dunn puts up and the myth that high Ks lead to significantly fewer productive outs.
That is part of the reason why I don’t agree with the Pujols-over-Howard School’s platform that Howard struck out way more than Pujols and therefore Pujols deserves the MVP. Pujols’ strikeout total was remarkable and if it were a vote for who is the best hitter, Pujols would win. But the vote was most valuable, and there are better numbers to hang that definition on. (Like, oh, RISP, perhaps …) But mingled in with gobs of strikeouts Howard hit 58 home runs and hit for a high average and walked 100 times. Time to get over the stigma of strikeouts and accept an era that has three-true-outcome hitters.
That said, I do like a guy who puts the ball in play when appropriate as opposed to one Hall of Fame-bound DH I can think of who once seemed happier to take a walk than bring the run in from third base with an out.
-30-
http://stl.sabr.org/fungoes/
This blog has 2 very good entries on the subject. Win shares and Win Probability Added are two of the statistics that were cited here.
From my point of view
*Why is the fact that the Phillies didn’t make the play-offs not strongly considered?
* Pujols had less RBI…but played somewhere near 15 games less…had fewer opportunities with RISP…but converted his opportunities at a much higher percentage than did Howard.
* Pujols # of game winning hits dwarfed Howard’s (see also WPA)
* His defense was worth 30 runs this year….that’s off the charts! Meanwhile Howard’s total actually COST the Phillies 20 runs…. (VEB)
Okay, Howard had a better 2nd half, more Home Runs, and more RBI’s….I guess if you’re going to give the MVP to the person with the most home-runs….oh wait, what about Sosa, McGwire? Didn’t Sosa get the MVP because the Cubs, and not the Cards, made the play-offs??
Coop,
Fungoes is an excellent site to visit and obviously one to keep bookmarked with the SABR convention coming to St. Louis in 2007. (So is the APSE convention, come to think of it.) And I’m not just saying that because they once called me “insightful” because the Web site also once needled me for writing about a starting pitcher’s wins.
As with Viva, I am jealous of Fungoes’ charts.
Please, Coop, if you could, explain how Howard’s defense cost the Phillies 20 runs. Seems like there’s more to that number than just runs that scored after he committed or because he committed an error.
dg
-30-
From this website’s AP story on the AL MVP award:
Earning just $385,000 in his third season as a regular, Morneau proved a bargain. Philadelphia’s Ryan Howard, voted NL MVP on Monday, made $355,000.
So this year’s MVPs earned a combined $740,000. Brandon Webb made 2.5 million this year. What is that telling us in an off-season when free-agent salaries are skyrocketing?
Derrick,
Congrats on the big postings. You’re hitting your blog stride in the hot-stove season!
Given baseball’s confusing roster rules, especially in the off-season, what is the significance of placing certain minor-leaguers on the 40-man roster? I’m only asking because I saw that you wrote the story on the Bigbie and Cali releases. Are the careers of Bigbie and Spivey done?
Fuhrig,
As for your first question (25), it reveals the value of player development and the quality of young players storming the majors. Some teams have seen the future and it comes from within their own organization, not from the outside — where, as we are seeing, supply is low, demand is high and prices are lunar.
As to your second question (26), the value is to protect Rule 5 eligible players from the Rule 5 draft, like Cardinals’ prospect Dennis Dove, for example. Larry Bigbie and Carmen Cali aren’t done, they are just free to pursue new deals with other teams — as major- or minor-league free agents. It unburdens Bigbie of the $900,000 contract he carried around, too. Now he’s a free agent and his price is set by negotiations, not by CBA.
dg
-30-