The Problems with the RPI for Baseball

Published Jul 12, 2004

Paul Kislanko

SEBaseball.com Editor

©Copyright 2004 by Paul Kislanko

Broken by Design

Virtually everyone who has studied team rating systems has found the RPI to be deficient in one way or another. The faults are usually identified by example, either by comparison to team performance or by comparison to another, presumably "better" ranking. These arguments are less than compelling, because the former don't address the same purpose as the RPI and the latter are necessarily subjective.

The flaws in the RPI can be identified just by breaking the formula down into its component parts, asking whether each component contributes to the overall objective, and then asking whether the components are combined the right way. When the NCAA have changed the RPI formula, all that has changed is the relative weights of the components; they've never addressed the fundamental flaws in the formula itself.

The Objective

There would be no reason to have any ranking other than winning percentage if every team competing for a championship played every other team, more than once for some sports. The RPI is one way to try to infer which of two teams who haven't played each other should be ranked higher, using the quality of their respective schedules to determine the relative values of their records. In principle, this is correct, but the implementation leaves much to be desired.

The description of the RPI says that it combines a team's winning percentage with its opponents' winning percentage and its opponents' opponents' winning percentage. If it did that, it would very likely satisfy the objective, but it doesn't. The factors that are called OWP and OOWP in the RPI are not percentages at all, so whatever it is, it's not a "Ratings Percentage Index".

The Factors

The RPI formula is rather simple to state, but requires a very complicated explication.

WP is just the usual formula: number of wins divided by number of games played (counting a tie as half a win). One problem is that the other factors are calculated differently from WP and from each other.

OWP is the average of opponents' winning percentages, not counting games involving this team, with each opponent's percentage included once for each game played against that opponent. There are two problems with this approach, and each contributes to the overall problem. Some sports are more affected than others, but in principle the problems exist for all sports.

Averaging percentages doesn't result in a percentage

This would be equivalent to defining a batting average in baseball as the average of the BA for each game played. A 0 for 5 day followed by a 3 for 4 day would give (.000 + .750)/2 = .375 instead of 3 for 9 = .333. In basketball, a player in a 3-game tournament who hits 2 of 10 shots, then 3 of 6, then 4 of 10 would have a shooting percentage of (.200 + .500 + .400)/3 = .367, when in fact for the tournament she was 9 for 26 = .346.
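The batting example can be checked with a few lines of Python. This is only an illustrative sketch; the function names are made up for this example and are not part of any official RPI calculation.

```python
from fractions import Fraction

def average_of_percentages(lines):
    """Mean of per-game rates: the calculation the article objects to."""
    return sum(Fraction(hits, tries) for hits, tries in lines) / len(lines)

def pooled_percentage(lines):
    """Total successes over total attempts: a true percentage."""
    total_hits = sum(hits for hits, _ in lines)
    total_tries = sum(tries for _, tries in lines)
    return Fraction(total_hits, total_tries)

# (hits, at-bats) per game: 0 for 5, then 3 for 4
batting = [(0, 5), (3, 4)]

print(average_of_percentages(batting))  # 3/8, i.e. .375
print(pooled_percentage(batting))       # 1/3, i.e. .333
```

The two numbers agree only when every game has the same number of attempts, which is why averaging per-opponent winning percentages gives something other than a percentage.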

There's no other formula in all of sports statistics that makes this mistake.

Including each opponent's WP (not counting games against this team) once for each game played sounds right, but it means that you can boost your RPI by losing many times to a good team, since the good team's WP against other teams is counted in OWP once for each loss. Also, two good teams who play each other a lot boost each other's OOWP components, so the combined WP+OOWP term makes up for any losses to each other.

Which brings us to the OOWP component. It is just the average of the opponents' OWP values. This is no more a "percentage" than the OWP is, but it introduces an extra problem. When a team's Opponents' Opponents include the team itself or a team that is also an opponent, there is not the same care taken to remove games that are already accounted for in WP or OWP, so the OOWP can be manipulated by careful scheduling to include your own WP in it as many times as you like. Put plainly, by beating lots of weak teams who play each other, you can make your RPI as high as you want without ever playing a good team.
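To make the effect concrete, here is a rough Python sketch of the three components as described in this article. It assumes weights of 25 percent WP, 50 percent OWP and 25 percent OOWP, ignores ties and the unpublished bonus and penalty points, and uses invented team names and results; it is not the NCAA's actual code.

```python
def wp(team, games, exclude=None):
    """Winning percentage of `team`, skipping games against `exclude`."""
    wins = played = 0
    for winner, loser in games:
        if team not in (winner, loser) or exclude in (winner, loser):
            continue
        played += 1
        wins += (winner == team)
    return wins / played if played else 0.0

def opponents(team, games):
    """One entry per game played, so repeat opponents are repeated."""
    return [l if w == team else w for w, l in games if team in (w, l)]

def owp(team, games):
    """Average of opponents' WP, each excluding games against `team`."""
    opps = opponents(team, games)
    return sum(wp(o, games, exclude=team) for o in opps) / len(opps)

def oowp(team, games):
    """Average of opponents' OWP values; nothing is excluded here."""
    opps = opponents(team, games)
    return sum(owp(o, games) for o in opps) / len(opps)

def rpi(team, games):
    # Assumed weights: 25% WP, 50% OWP, 25% OOWP.
    return 0.25 * wp(team, games) + 0.5 * owp(team, games) + 0.25 * oowp(team, games)

# Hypothetical results as (winner, loser) pairs. G is good, W is weak.
others = [("G", "X"), ("G", "Y"), ("X", "W"), ("Y", "W")]
schedule_a = others + [("G", "T"), ("T", "W")]      # T goes 1-1
schedule_b = others + [("G", "T")] * 3 + [("T", "W")]  # T goes 1-3

print(rpi("T", schedule_a))  # 0.5
print(rpi("T", schedule_b))  # 0.625
```

In this made-up schedule, team T's record drops from 1-1 to 1-3 by losing two more games to G, yet its computed RPI rises from .500 to .625, because G's perfect record against everyone else is counted once per loss.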

Bonus and Penalty points are the only part of the formula that can be different for each sport, and are the only ones that are not made public.

The Formula

Even if we were to correct the calculation for OWP to be a true percentage, and correct the OOWP calculation so that no game played contributed to more than one component, the formula would still not be a very good one. A basketball team that is 0-26 or a baseball team that is 0-56 could still have a very high RPI because 75 percent of the formula is based upon who you played, not how you performed against them. And your high OWP value becomes a high OOWP value for every team you lose to, which at least makes you an attractive opponent for the good teams to schedule.
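As a rough illustration, assume the 75 percent is split as one half OWP and one quarter OOWP, and take hypothetical values of OWP = .700 and OOWP = .600 for a winless team:

```latex
\mathrm{RPI}_{\text{winless}}
  = \tfrac{1}{4}(0) + \tfrac{1}{2}\,\mathrm{OWP} + \tfrac{1}{4}\,\mathrm{OOWP}
  = \tfrac{1}{2}(.700) + \tfrac{1}{4}(.600)
  = .500,
\qquad
\mathrm{RPI}_{\text{winless}} \le \tfrac{3}{4}.
```

Under these assumed numbers a team that never won a game would carry the same RPI as a .500 team whose opponents and opponents' opponents were all exactly average.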

Note: This may not be a bad thing. We wouldn't want the emphasis on having the "big boys" play each other more often to cause the "not-as-big" teams who are trying to improve their programs by ambitious scheduling to be shut out of the process. The guarantees from "power teams" to fill home dates are often the most significant sources of revenue for programs that are at least attempting to "move up". While we may not want a system that exploits these programs, we also don't want one that prevents them from taking "the next step".

When the NCAA have changed the RPI formula, they have concentrated on changing the coefficients for adding the various components together. So now it's 25 percent winning percentage plus 75 percent something else. The "something else" can negatively affect teams that ignore it, and positively contribute to teams that exploit it without much effort. There's a much better way to accomplish the objective that doesn't have any of these specific problems. A true "percentage index" wouldn't be based on addition. "Percentages" in sports are more like probabilities, which must be multiplied, not added.

The usual statement is that the RPI is one quarter winning percentage and three quarters strength of schedule (SOS), but it is just as accurately described as one half OWP and one half the average of WP and OOWP (with odd definitions for OWP and OOWP). Some teams' RPI values are where they are because OWP is high, and some because WP+OOWP is high. It turns out that in college baseball almost all of the high RPI values are due to the error in the OOWP.
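The two descriptions are the same weighted sum grouped differently. Assuming the weights implied above (one quarter WP, one half OWP, one quarter OOWP) and leaving out the bonus and penalty points:

```latex
\begin{aligned}
\mathrm{RPI}
  &= \tfrac{1}{4}\,\mathrm{WP} + \tfrac{1}{2}\,\mathrm{OWP} + \tfrac{1}{4}\,\mathrm{OOWP} \\
  &= \tfrac{1}{4}\,\mathrm{WP}
     + \tfrac{3}{4}\left(\tfrac{2}{3}\,\mathrm{OWP} + \tfrac{1}{3}\,\mathrm{OOWP}\right) \\
  &= \tfrac{1}{2}\left(\frac{\mathrm{WP} + \mathrm{OOWP}}{2}\right) + \tfrac{1}{2}\,\mathrm{OWP}.
\end{aligned}
```

The second line is the "one quarter WP, three quarters SOS" reading; the third is the "half OWP, half WP+OOWP" reading.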

In the next installment, we'll define the attributes a rating system should have to accurately compare teams that haven't played the same schedule, and compare the various ones available to those requirements.

This is the first article in a series that describes the Performance Against SOS rating system and sports rating systems in general.