Strength of Schedule
Strength of schedule can’t be quantified as a single number that allows an apples-to-apples comparison across all teams. The example below is extreme, but it makes the point clear.
We’ll use the 2014 college football season. Let’s say that Team A had a schedule with 6 elite teams and 6 terrible teams.
Team A’s schedule, with each team’s record against FBS opponents:
Georgia State (0-11)
Nevada-Las Vegas (1-11)
Eastern Michigan (1-10)
Southern Methodist (1-11)
New Mexico State (1-10)
Ohio State (14-1)
Florida State (12-1)
Texas Christian (11-1)
Michigan State (10-2)
And we’ll compare it with Team B, who played 12 mediocre teams:
Virginia Tech (6-6)
Penn State (7-6)
Miami (FL) (5-7)
North Carolina (5-7)
East Carolina (7-5)
Western Michigan (7-5)
Texas-El Paso (7-6)
So which schedule is more difficult? The answer is that it depends!
Let’s say that both teams went 9-3. If that’s the case, Team A has a much more difficult schedule. They had 6 easy wins, but then had to go 3-3 against the very best teams in the country. Team B, in order to go 9-3, just had to play above average. So if we’re talking about good teams, Team A has the tougher schedule, because it’s much easier to get 9 wins from Team B’s schedule than Team A’s.
But what about the opposite? What if both teams are 3-9? Team B is clearly below average, but with 3 wins over average teams, it isn’t among the worst. Team A going 3-9 is a different story. Sure, they had 6 elite teams on their schedule, but going 3-3 against the very worst teams in the country demonstrates that Team A isn’t any good. So if we’re talking about bad teams, Team B clearly has the tougher schedule, because it’s easier to get 3 wins from Team A’s schedule than from Team B’s.
So whether a schedule counts as tough or easy depends on whether the team playing it is good or bad. Again, the example above is an extreme case that would never occur in real life, but the effect is present in real schedules too; it’s just less obvious.
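To make the argument above concrete, here is a minimal Python sketch. It is not taken from any particular rating system. It assumes a simple logistic model in which win probability depends only on the difference between two ratings, treats games as independent, and uses made-up ratings: 0 for an average team, +2.5 for elite, -2.5 for terrible, and +1.5 and -1.5 for the good and bad team in question. Under those assumptions it compares how likely each schedule is to produce 9 wins for a good team and 3 wins for a bad one.

```python
import math

def win_prob(team, opp):
    """Logistic win probability from a rating difference (an assumed model)."""
    return 1.0 / (1.0 + math.exp(-(team - opp)))

def prob_at_least(team, schedule, wins):
    """Probability of winning at least `wins` games, treating games as independent."""
    dist = [1.0]  # dist[k] = probability of exactly k wins so far
    for opp in schedule:
        p = win_prob(team, opp)
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1 - p)      # lose this game
            nxt[k + 1] += q * p        # win this game
        dist = nxt
    return sum(dist[wins:])

# Hypothetical ratings: 0 is average, +2.5 is elite, -2.5 is terrible.
schedule_a = [2.5] * 6 + [-2.5] * 6   # 6 elite, 6 terrible opponents
schedule_b = [0.0] * 12               # 12 mediocre opponents

good, bad = 1.5, -1.5                 # hypothetical good and bad teams

print("Good team, P(9+ wins): A =", round(prob_at_least(good, schedule_a, 9), 3),
      " B =", round(prob_at_least(good, schedule_b, 9), 3))
print("Bad team,  P(3+ wins): A =", round(prob_at_least(bad, schedule_a, 3), 3),
      " B =", round(prob_at_least(bad, schedule_b, 3), 3))
```

With these assumed ratings, the good team is less likely to reach 9 wins against Team A’s schedule, while the bad team is less likely to reach 3 wins against Team B’s. That is the same reversal described above: which schedule is harder depends on who is playing it.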
As Ken Massey points out, making an adjustment to account for this effect only introduces other peculiarities. If strength of schedule also factors in the success of the team in question, then schedule strength can only be compared between peer teams. Two teams could have identical schedules, but if one team is more successful than the other, they may end up with different strength of schedule ratings.
The correct way to account for strength of schedule in a rating system is to bypass quantifying it as a single number. Instead, every component of a team’s body of work (who each team beat, who beat them, and the ratings of all of those opponents) needs to be factored in simultaneously to calculate each team’s final rating. Strength of schedule can’t be calculated as an independent step. It must be baked into the rating calculation, because no single number can quantify a schedule’s strength.
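As one illustration of baking schedule strength into the rating itself, the sketch below fits a Bradley-Terry model using its standard iterative update. This is a textbook technique swapped in for brevity, not the method of any specific rating system, and the team names and results are hypothetical placeholders. Each team’s strength is recomputed from its wins and the current strengths of everyone it played, and the process repeats until the numbers settle, so no separate strength of schedule figure is ever calculated.

```python
def bradley_terry(games, iterations=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Assumes every team has at least one win and one loss and that the
    results connect all teams; otherwise the strengths do not settle.
    """
    teams = sorted({team for game in games for team in game})
    strength = {team: 1.0 for team in teams}
    wins = {team: sum(1 for winner, _ in games if winner == team) for team in teams}

    for _ in range(iterations):
        updated = {}
        for team in teams:
            denom = 0.0
            for winner, loser in games:
                if team == winner or team == loser:
                    opp = loser if team == winner else winner
                    denom += 1.0 / (strength[team] + strength[opp])
            updated[team] = wins[team] / denom
        # Rescale so the average strength stays at 1.0.
        total = sum(updated.values())
        strength = {team: s * len(teams) / total for team, s in updated.items()}
    return strength

# Hypothetical results as (winner, loser). X and Y both finish 1-1,
# but X's games were against stronger opponents than Y's.
games = [
    ("Top", "Mid"), ("Top", "Low"), ("Mid", "Low"),
    ("Low", "Top"),                   # one upset, so Top has a loss and Low a win
    ("X", "Mid"), ("Top", "X"),       # X: beat Mid, lost to Top
    ("Y", "Low"), ("Mid", "Y"),       # Y: beat Low, lost to Mid
]

ratings = bradley_terry(games)
for team, value in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{team:4s} {value:.3f}")
```

In this made-up example X and Y have identical 1-1 records, yet X ends up rated above Y because its win and loss both came against stronger opponents. The schedule matters to the result without ever being reduced to a number of its own.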