The problem with the Colley Matrix

The Colley Matrix is a system developed by Wes Colley. He makes a good attempt at a truly unbiased ranking system. His explanation of why margin of victory and other factors need to be ignored is very good. However, the math he uses to come up with his ratings is flawed, and his system is, in fact, subjective.

Here’s Colley’s proof:

Line 1	$p = \frac{1 + s}{2 + n}$	This is the Laplace Rule of Succession.
		n = number of events
		s = number of successes
		p = probability of a success
		Assumptions: 0 ≤ s and s ≤ n
Line 2	$r = \frac{1 + n_{w}}{2 + n_{tot}}$	Applying the Rule of Succession to game results
		n_tot = total number of games played
		n_w = number of wins
		r = probability of a success (in this case, a success is a win). Also represents a team’s rating because a higher probability of winning represents a better team, earning it a better rating.
		Assumptions: 0 ≤ n_w and n_w ≤ n_tot
Line 3	$r = \frac{1 + n_{w}}{2 + n_{w} + n_{l}}$	Total games played equals the number of wins plus the number of losses
		n_l = number of losses
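
As a quick illustration of Line 3, here is a minimal Python sketch. The function name `succession_rating` is made up for this example and is not something Colley uses; it just evaluates the formula with exact fractions.

```python
from fractions import Fraction


def succession_rating(wins: int, losses: int) -> Fraction:
    """Evaluate Line 3: r = (1 + n_w) / (2 + n_w + n_l)."""
    # The Rule of Succession only makes sense for whole, non-negative counts.
    if wins < 0 or losses < 0:
        raise ValueError("wins and losses must be non-negative")
    return Fraction(1 + wins, 2 + wins + losses)


print(succession_rating(0, 0))  # 1/2  (no games played yet)
print(succession_rating(3, 0))  # 4/5  (undefeated, but not rated a perfect 1)
print(succession_rating(8, 4))  # 9/14
```

Note that a team with no games starts at 1/2, and an undefeated team never quite reaches 1.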

So far so good. Changing gears a bit:

Line 4	$n_{w} = \frac{n_{w} - n_{l}}{2} + \frac{n_{w} + n_{l}}{2}$	If you simplify this, the n_l terms will cancel, so this is true.

Also:

Line 5	$\frac{\sum r}{n_{w} + n_{l}} = \bar{r}$	This is just calculating an average
		Σr = sum of ratings of all the opponents that a team faces
		r̄ = average rating of all the opponents that a team faces

Now, if we make the assumption that the average rating of opponents is equal to 0.5, it follows that:

Line 6	$\frac{\sum r}{n_{w} + n_{l}} = \frac{1}{2}$	Assumption: r̄ = 0.5
Line 7	$\sum r = \frac{n_{w} + n_{l}}{2}$	Rearranging

Now, if we take the equation from Line 4, and sub in what we know from Line 7, we get:

Line 8	$n_{w} = \frac{n_{w} - n_{l}}{2} + \sum r$	Assumption: r̄ = 0.5

From there, Colley uses the equation in Line 8 to find a team’s effective number of wins, then plugs that into the numerator of the equation in Line 3 to come up with a team’s rating. But there’s a problem: this is all based on the assumption that r̄ = 0.5. When you actually run the calculation, a team’s r̄ is not necessarily equal to 0.5, which invalidates the proof. The result no longer has any basis in the Rule of Succession; it becomes nothing more than another equation that ranks teams, just as arbitrary as RPI.
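
Written out, and assuming Σr is the same sum of opponents’ ratings defined in Line 5, putting the right-hand side of Line 8 into the numerator of Line 3 gives:

	$r = \frac{1 + \frac{n_{w} - n_{l}}{2} + \sum r}{2 + n_{w} + n_{l}}$	Line 8 substituted into the numerator of Line 3

The r on the left is the team’s own rating, and the Σr on the right depends on everyone else’s ratings, so the ratings for all teams have to be solved together.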

Additionally, let’s go back to the assumptions on Line 2. The Rule of Succession requires that the number of successes (here, the effective number of wins) be zero or greater, but when Colley runs his calculations, that’s not always the case. If you feed it a negative number, you won’t get a valid answer from the Rule of Succession. The other assumption in Line 2 is that the effective number of wins cannot exceed the number of games played, which the Colley Matrix also violates. Intuitively, both assumptions make sense: you can’t have fewer than zero wins, and you can’t have more wins than games played. Colley cannot claim his algorithm has a basis in the Rule of Succession because he violates its initial assumptions. There are even situations where teams can be punished for winning a game, and rewarded for losing.
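
To see both complaints on a small example, here is a rough Python sketch. The five-team schedule in `GAMES` is hypothetical (not real results), and the function names are made up for illustration. It assumes the rating for every team is found by repeatedly applying Line 8 inside Line 3 until the numbers stop changing, which is one simple way to solve the system, not necessarily how Colley’s own code does it.

```python
# Hypothetical schedule as (winner, loser) pairs; not real results.
GAMES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def colley_ratings(games, sweeps=200):
    """Iterate r = (1 + (n_w - n_l)/2 + sum of opponent ratings) / (2 + n_w + n_l)."""
    teams = sorted({team for game in games for team in game})
    wins = {t: 0 for t in teams}
    losses = {t: 0 for t in teams}
    opponents = {t: [] for t in teams}
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    ratings = {t: 0.5 for t in teams}  # everyone starts at the no-games rating
    for _ in range(sweeps):
        # Each sweep recomputes every rating from the previous sweep's ratings.
        ratings = {
            t: (1 + (wins[t] - losses[t]) / 2
                + sum(ratings[o] for o in opponents[t]))
               / (2 + wins[t] + losses[t])
            for t in teams
        }
    return ratings, wins, losses, opponents


def report(games):
    """Per team: rating, opponents' average rating, effective wins, games played."""
    ratings, wins, losses, opponents = colley_ratings(games)
    rows = {}
    for t in ratings:
        played = wins[t] + losses[t]
        opp_sum = sum(ratings[o] for o in opponents[t])
        rows[t] = {
            "rating": ratings[t],
            "opp_avg": opp_sum / played,                         # r-bar for this team
            "effective_wins": (wins[t] - losses[t]) / 2 + opp_sum,  # Line 8
            "played": played,
        }
    return rows


for team, row in report(GAMES).items():
    print(team, {key: round(value, 4) for key, value in row.items()})
```

On this schedule only team C ends up with r̄ equal to 0.5. Team A comes out with about 1.12 effective wins from a single game, and team E with about -0.12, so both Line 2 assumptions are broken.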

But, for the sake of argument, let’s pretend that this is all ok. We’ll call the substitution from Line 8 the “Colley Exception”. We still find that the Colley Matrix is a subjective system. Looking back at the equation on Line 4:

Line 4 (repeat)	$n_{w} = \frac{n_{w} - n_{l}}{2} + \frac{n_{w} + n_{l}}{2}$	If you simplify this, the n_l terms will cancel, so this is true.

The first term on the right-hand side represents the component that is based on a team’s win-loss record. The second term represents the component that factors in strength of schedule (after substituting with the Colley Exception). Colley arbitrarily chooses to split these in half, giving each component equal weight (arguably, a 50/50 split puts too much weight on strength of schedule). Instead of splitting 50/50, you could just as easily split it 75/25:

Line 9	$n_{w} = \frac{3 n_{w} - n_{l}}{4} + \frac{1}{2} (\frac{n_{w} + n_{l}}{2})$	If you simplify this, the n_l terms will cancel, so this is true.
Line 10	$n_{w} = \frac{3 n_{w} - n_{l}}{4} + \frac{1}{2} \sum r$	Substituting with the Colley Exception

Or 80/20:

Line 11	$n_{w} = \frac{4 n_{w} - n_{l}}{5} + \frac{2}{5} (\frac{n_{w} + n_{l}}{2})$	If you simplify this, the n_l terms will cancel, so this is true.
Line 12	$n_{w} = \frac{4 n_{w} - n_{l}}{5} + \frac{2}{5} \sum r$	Substituting with the Colley Exception

Or 41/59:

Line 13	$n_{w} = \frac{41 n_{w} - 59 n_{l}}{100} + \frac{59}{50} (\frac{n_{w} + n_{l}}{2})$	If you simplify this, the n_l terms will cancel, so this is true.
Line 14	$n_{w} = \frac{41 n_{w} - 59 n_{l}}{100} + \frac{59}{50} \sum r$	Substituting with the Colley Exception

Or anything else. And mathematically, they’re all just as valid as Colley’s 50/50 split. Colley hangs his hat on the fact that his formula has no “ad hoc” or “biased” adjustments or constants. But contrary to his claim, Colley is putting an arbitrary weight on strength of schedule, which is exactly the mistake he accuses other ranking methods of making.
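
The splits above all follow one pattern, which can be sketched in Python. The weight `alpha` (the share given to the record term) is a name introduced here for illustration: the effective wins become alpha·n_w − (1 − alpha)·n_l + 2·(1 − alpha)·Σr, so `alpha = 0.5` reproduces Line 8, `0.75` gives Line 10, `0.8` gives Line 12, and `0.41` gives Line 14. The schedule is hypothetical, and the simple repeated-substitution solver is an assumption that works for small schedules like this one.

```python
# Hypothetical schedule as (winner, loser) pairs; not real results.
GAMES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def weighted_ratings(games, alpha, sweeps=500):
    """Ratings when the record term gets weight alpha instead of Colley's 0.5."""
    teams = sorted({team for game in games for team in game})
    wins = {t: 0 for t in teams}
    losses = {t: 0 for t in teams}
    opponents = {t: [] for t in teams}
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    ratings = {t: 0.5 for t in teams}
    for _ in range(sweeps):
        # Effective wins for the chosen split, plugged into Line 3's numerator.
        ratings = {
            t: (1 + alpha * wins[t] - (1 - alpha) * losses[t]
                + 2 * (1 - alpha) * sum(ratings[o] for o in opponents[t]))
               / (2 + wins[t] + losses[t])
            for t in teams
        }
    return ratings


for alpha in (0.5, 0.75, 0.8, 0.41):
    ratings = weighted_ratings(GAMES, alpha)
    print(alpha, {t: round(r, 4) for t, r in ratings.items()})
```

Every choice of `alpha` produces a different set of ratings from the same results. Whether the order of the teams also changes depends on the schedule.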

In the end, the Colley Matrix ranks teams no better than other ranking systems. It’s subjective and the equations used are not self-evidently true.
