I previously scored every play from 2007 to 2021 in terms of expected/predicted points. By starting at the play level, I can then roll up each team’s plays to the game or season level to compute their raw offensive/defensive efficiency. Tangibly, this represents a team’s net points per play on offense or defense over the course of the season.
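As a rough illustration, the roll-up is just a grouped average of predicted points added (PPA) per play. The sketch below assumes a data frame `plays` with one row per play and columns SEASON, OFFENSE_ID, DEFENSE_ID, and PPA - placeholder names, not necessarily the exact ones used here.

```r
library(dplyr)

# Average predicted points added per play while on offense
offense <- plays %>%
  group_by(SEASON, TEAM = OFFENSE_ID) %>%
  summarise(OFFENSE = mean(PPA), .groups = "drop")

# Average allowed per play while on defense, sign flipped so higher is better
defense <- plays %>%
  group_by(SEASON, TEAM = DEFENSE_ID) %>%
  summarise(DEFENSE = -mean(PPA), .groups = "drop")

# Overall raw efficiency is the sum of the two
efficiency <- offense %>%
  inner_join(defense, by = c("SEASON", "TEAM")) %>%
  mutate(OVERALL = OFFENSE + DEFENSE)
```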
Here are the top 5 teams in each season by overall efficiency from 2007 to 2015.
SEASON | RANK | TEAM | OFFENSE | DEFENSE | OVERALL |
2007 | 1 | West Virginia | 0.149 | 0.134 | 0.284 |
2007 | 2 | Kansas | 0.143 | 0.103 | 0.246 |
2007 | 3 | Florida | 0.241 | -0.014 | 0.228 |
2007 | 4 | Ohio State | 0.101 | 0.123 | 0.224 |
2007 | 5 | LSU | 0.098 | 0.105 | 0.203 |
2008 | 1 | Florida | 0.211 | 0.161 | 0.373 |
2008 | 2 | Texas | 0.293 | 0.044 | 0.337 |
2008 | 3 | USC | 0.183 | 0.146 | 0.329 |
2008 | 4 | Oklahoma | 0.307 | 0.013 | 0.321 |
2008 | 5 | Penn State | 0.210 | 0.091 | 0.301 |
2009 | 1 | TCU | 0.101 | 0.216 | 0.317 |
2009 | 2 | Florida | 0.150 | 0.153 | 0.303 |
2009 | 3 | Boise State | 0.188 | 0.109 | 0.297 |
2009 | 4 | Texas | 0.054 | 0.213 | 0.267 |
2009 | 5 | Alabama | 0.068 | 0.184 | 0.252 |
2010 | 1 | Boise State | 0.295 | 0.229 | 0.523 |
2010 | 2 | TCU | 0.211 | 0.154 | 0.365 |
2010 | 3 | Ohio State | 0.180 | 0.163 | 0.343 |
2010 | 4 | Alabama | 0.220 | 0.035 | 0.255 |
2010 | 5 | Nevada | 0.241 | 0.006 | 0.246 |
2011 | 1 | Alabama | 0.113 | 0.331 | 0.444 |
2011 | 2 | Boise State | 0.255 | 0.090 | 0.345 |
2011 | 3 | Wisconsin | 0.346 | -0.001 | 0.344 |
2011 | 4 | LSU | 0.132 | 0.158 | 0.290 |
2011 | 5 | Oklahoma State | 0.231 | 0.030 | 0.261 |
2012 | 1 | Alabama | 0.247 | 0.140 | 0.387 |
2012 | 2 | Oregon | 0.264 | 0.071 | 0.335 |
2012 | 3 | Texas A&M | 0.262 | -0.029 | 0.233 |
2012 | 4 | Boise State | 0.116 | 0.113 | 0.229 |
2012 | 5 | Florida State | 0.155 | 0.068 | 0.223 |
2013 | 1 | Florida State | 0.353 | 0.159 | 0.512 |
2013 | 2 | Alabama | 0.265 | 0.072 | 0.337 |
2013 | 3 | Louisville | 0.221 | 0.106 | 0.327 |
2013 | 4 | Baylor | 0.221 | 0.032 | 0.253 |
2013 | 5 | Oregon | 0.225 | 0.019 | 0.244 |
2014 | 1 | Marshall | 0.192 | 0.066 | 0.258 |
2014 | 2 | TCU | 0.134 | 0.111 | 0.245 |
2014 | 3 | Ohio State | 0.225 | -0.003 | 0.221 |
2014 | 4 | Michigan State | 0.184 | 0.034 | 0.218 |
2014 | 5 | Georgia | 0.200 | 0.009 | 0.209 |
2015 | 1 | Ohio State | 0.158 | 0.080 | 0.238 |
2015 | 2 | Alabama | 0.084 | 0.133 | 0.216 |
2015 | 3 | Western Kentucky | 0.268 | -0.067 | 0.201 |
2015 | 4 | Houston | 0.152 | 0.047 | 0.200 |
2015 | 5 | Clemson | 0.167 | 0.027 | 0.194 |
Based purely on their on-field play, this measure highlights some teams that we would expect (2012 Alabama, 2013 Florida State), but it also rates certain teams very highly that we wouldn’t expect (2010 Nevada, 2014 Marshall, 2015 Western Kentucky).
Why is that happening? The issue is that these estimates are simply the average predicted points per play for each team over the course of the season. They don’t take into account the relative strength of the opposition faced - a 10 yard pass against UMass is considered the same as a 10 yard pass against Ohio State.
As an example, this means that in 2015, from a raw offense/defense efficiency perspective, Western Kentucky is rated pretty closely to Alabama. This is mainly because Alabama’s raw offense is rated lower than Western Kentucky’s - in a year in which Alabama went 14-1 and Derrick Henry won the Heisman with over 2,200 yards rushing.
SEASON | TEAM | OFFENSE | DEFENSE | OVERALL |
2015 | Alabama | 0.084 | 0.133 | 0.216 |
2015 | Western Kentucky | 0.268 | -0.067 | 0.201 |
Why is Alabama lower? If we look at the overall (raw) strength of the teams they played that season, we can see that Alabama faced much stronger teams than Western Kentucky.
Western Kentucky’s 2015 opponents:
SEASON | OPPONENT | OVERALL |
2015 | Southern Mississippi | 0.149 |
2015 | LSU | 0.126 |
2015 | Louisiana Tech | 0.125 |
2015 | South Florida | 0.120 |
2015 | Marshall | 0.101 |
2015 | Middle Tennessee | 0.045 |
2015 | Indiana | 0.013 |
2015 | Florida Atlantic | -0.055 |
2015 | Vanderbilt | -0.112 |
2015 | Old Dominion | -0.158 |
2015 | Rice | -0.209 |
2015 | Miami (OH) | -0.221 |
2015 | North Texas | -0.311 |
Alabama’s 2015 opponents:
SEASON | OPPONENT | OVERALL |
2015 | Clemson | 0.195 |
2015 | Wisconsin | 0.185 |
2015 | Ole Miss | 0.175 |
2015 | Tennessee | 0.134 |
2015 | LSU | 0.126 |
2015 | Georgia | 0.113 |
2015 | Michigan State | 0.097 |
2015 | Mississippi State | 0.095 |
2015 | Florida | 0.061 |
2015 | Texas A&M | 0.056 |
2015 | Arkansas | 0.051 |
2015 | Middle Tennessee | 0.045 |
2015 | Auburn | 0.016 |
2015 | Louisiana Monroe | -0.252 |
Alabama had a much tougher schedule than Western Kentucky, and our measures of team offense/defense efficiency should account for that.
How do we adjust team offense/defense efficiency based on the quality of their opponent?
We can adjust by regressing the predicted points per play on the offensive team, the defensive team, and home-field advantage:
\[PPA = Intercept + Offense_i + Defense_j + HomeFieldAdvantage\]
I fit this model separately to each season in the training set.
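A minimal sketch of that per-season fit, using ridge regression via glmnet (the adjustment is described as a ridge regression later in the post). The penalty value and the `fit_season()` helper name are assumptions for illustration; the column names follow the coefficient terms shown below.

```r
library(glmnet)

fit_season <- function(plays, lambda = 0.05) {
  # Dummy-encode the offensive team, defensive team, and home-field indicator
  x <- Matrix::sparse.model.matrix(
    ~ OFFENSE_ID + DEFENSE_ID + HOME_FIELD_ADVANTAGE,
    data = plays
  )
  # alpha = 0 gives ridge regression; the penalty shrinks team estimates
  # toward the league average, which stabilizes small samples
  glmnet(x, plays$PPA, alpha = 0, lambda = lambda)
}

# e.g. one fit per season (hypothetical data frame of plays)
fit_2013 <- fit_season(plays[plays$SEASON == 2013, ])
```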
We can then look at the coefficients. Good offenses will have positive coefficients, representing how many more points than average the team generated on a given play.
SEASON | term | estimate |
2013 | (Intercept) | 0.055 |
2013 | HOME_FIELD_ADVANTAGE | 0.025 |
2013 | OFFENSE_ID_Florida.State | 0.305 |
2013 | OFFENSE_ID_Texas.A.M | 0.260 |
2013 | OFFENSE_ID_Ohio.State | 0.249 |
2013 | OFFENSE_ID_Alabama | 0.218 |
2013 | OFFENSE_ID_LSU | 0.212 |
2013 | OFFENSE_ID_Oregon | 0.211 |
2013 | OFFENSE_ID_Baylor | 0.206 |
2013 | OFFENSE_ID_Auburn | 0.196 |
2013 | OFFENSE_ID_UCF | 0.195 |
2013 | OFFENSE_ID_Georgia | 0.192 |
2013 | OFFENSE_ID_South.Carolina | 0.178 |
2013 | OFFENSE_ID_Arizona.State | 0.168 |
2013 | OFFENSE_ID_Missouri | 0.163 |
Good defenses will have negative coefficients (because they prevented other teams from scoring). For the purpose of evaluating offenses and defenses, I will flip the sign so that positive always means good.
SEASON | term | estimate |
2013 | (Intercept) | 0.055 |
2013 | HOME_FIELD_ADVANTAGE | 0.025 |
2013 | DEFENSE_ID_Florida.State | -0.235 |
2013 | DEFENSE_ID_Alabama | -0.198 |
2013 | DEFENSE_ID_Oklahoma.State | -0.175 |
2013 | DEFENSE_ID_USC | -0.171 |
2013 | DEFENSE_ID_Florida | -0.165 |
2013 | DEFENSE_ID_Utah.State | -0.161 |
2013 | DEFENSE_ID_Iowa | -0.150 |
2013 | DEFENSE_ID_Wisconsin | -0.146 |
2013 | DEFENSE_ID_Michigan.State | -0.142 |
2013 | DEFENSE_ID_TCU | -0.134 |
2013 | DEFENSE_ID_Louisville | -0.133 |
2013 | DEFENSE_ID_Virginia.Tech | -0.114 |
2013 | DEFENSE_ID_Stanford | -0.109 |
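As a hedged sketch (reusing the hypothetical `fit_season()` from above), the ratings come straight out of the coefficient vector, with the defensive sign flipped:

```r
library(dplyr)
library(stringr)
library(tibble)

fit_2013 <- fit_season(plays[plays$SEASON == 2013, ])

ratings_2013 <- as.matrix(coef(fit_2013)) %>%
  as.data.frame() %>%
  rownames_to_column("term") %>%
  rename(estimate = s0) %>%                       # single-lambda fits name the column "s0"
  filter(str_detect(term, "^(OFFENSE|DEFENSE)_ID_")) %>%
  mutate(
    SIDE = if_else(str_starts(term, "OFFENSE"), "OFFENSE", "DEFENSE"),
    TEAM = str_remove(term, "^(OFFENSE|DEFENSE)_ID_"),
    # Flip defense so that positive always means good
    estimate = if_else(SIDE == "DEFENSE", -estimate, estimate)
  )
```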
Putting this together (and reversing the sign on defense), we can get adjusted ratings for each team. How do Alabama and Western Kentucky compare for 2015 after adjusting for opponents?
Western Kentucky gets penalized for the quality of the competition they faced - their offense gets adjusted down while their defense gets a slight improvement. Alabama’s offense receives a slight increase based on their opponents, but their defense is affected the most, as they played a lot of strong offenses. With the adjustments, Bama moves from the #2 team to #1, while Western Kentucky falls from #3 to a still respectable #16 - evidently the AP had them at #24 to finish the season, so this feels pretty reasonable.
SEASON | TEAM | TYPE | OFFENSE | DEFENSE | OVERALL | MARGIN | RANK |
2015 | Alabama | adjusted | 0.108 | 0.232 | 0.340 | 22.103 | 1 |
2015 | Alabama | raw | 0.084 | 0.133 | 0.216 | 14.051 | 2 |
2015 | Western Kentucky | raw | 0.268 | -0.067 | 0.201 | 13.095 | 3 |
2015 | Western Kentucky | adjusted | 0.194 | -0.031 | 0.163 | 10.622 | 16 |
I trained the original expected points model on the years 2007 to 2015, so I can place every team in these seasons based on their offense and defense efficiency. The best teams are in the upper right quadrant, with strong offenses and defenses.
We can rank teams according to each of these dimensions and take a look at the overall distributions. Here are the top teams by each metric (overall, offense, defense) in 2012.
I trained the original expected points model on 2007-2015, but then I predicted the next four seasons (then retrained, then predicted the remaining seasons). I’ll add all seasons from 2016 to 2021 in here to get season-level opponent-adjusted metrics for all teams from 2007 to 2021.
As before, we can look at individual seasons to see where teams stacked up in terms of offensive/defensive efficiency at the end of the year.
Here is what teams looked like at the end of 2021.
We can also use this to track changes in each team’s overall strength year over year.
If we look at Alabama, we should see them near the top for basically this whole damn era.
Georgia should take off in the last few years.
What about each conference? I’ll look at each team’s rankings in their conference over this time period, filtering to the P5 conferences.
In addition to looking at a team’s overall offense/defense, we can also break plays down by passing/rushing/special teams.
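The same adjustment can be run within each play type. Here is a sketch, assuming the play-by-play data carries a PLAY_TYPE column (a hypothetical name) and reusing the `fit_season()` helper from earlier:

```r
library(dplyr)
library(purrr)
library(tidyr)

fits_by_type <- plays %>%
  filter(PLAY_TYPE %in% c("pass", "rush")) %>%   # special teams could be added here too
  group_by(SEASON, PLAY_TYPE) %>%
  nest() %>%
  mutate(fit = map(data, fit_season))            # one ridge fit per season x play type
```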
Look at a team like Wisconsin, which has had fairly weak passing efficiency since the days of Russell Wilson, though evidently their passing game was somewhat okay in 2016 and 2017 while their run game suffered?
SEASON | TEAM | TYPE | OFFENSE_PASS | OFFENSE_RUN | DEFENSE_PASS | DEFENSE_RUN |
2007 | Wisconsin | adjusted | 0.128 | 0.118 | -0.001 | -0.005 |
2008 | Wisconsin | adjusted | 0.006 | 0.061 | 0.053 | 0.012 |
2009 | Wisconsin | adjusted | 0.131 | 0.081 | 0.058 | 0.182 |
2010 | Wisconsin | adjusted | 0.277 | 0.219 | 0.122 | -0.001 |
2011 | Wisconsin | adjusted | 0.628 | 0.321 | 0.038 | -0.009 |
2012 | Wisconsin | adjusted | 0.114 | 0.066 | 0.087 | 0.135 |
2013 | Wisconsin | adjusted | -0.032 | 0.279 | 0.143 | 0.236 |
2014 | Wisconsin | adjusted | -0.054 | 0.284 | 0.096 | 0.107 |
2015 | Wisconsin | adjusted | 0.028 | -0.057 | 0.173 | 0.148 |
2016 | Wisconsin | adjusted | 0.218 | -0.030 | 0.260 | 0.235 |
2017 | Wisconsin | adjusted | 0.232 | 0.060 | 0.330 | 0.180 |
2018 | Wisconsin | adjusted | -0.028 | 0.201 | 0.034 | 0.038 |
2019 | Wisconsin | adjusted | 0.339 | 0.243 | 0.311 | 0.169 |
2020 | Wisconsin | adjusted | -0.073 | -0.060 | 0.213 | 0.315 |
2021 | Wisconsin | adjusted | -0.062 | 0.060 | 0.365 | 0.246 |
Contrast that with a team like Texas Tech, which has often had a spectacular passing offense at the expense of basically everything else.
SEASON | TEAM | TYPE | OFFENSE_PASS | OFFENSE_RUN | DEFENSE_PASS | DEFENSE_RUN |
2007 | Texas Tech | adjusted | 0.254 | 0.020 | 0.129 | -0.060 |
2008 | Texas Tech | adjusted | 0.281 | 0.295 | 0.078 | -0.123 |
2009 | Texas Tech | adjusted | 0.118 | 0.028 | 0.160 | 0.026 |
2010 | Texas Tech | adjusted | 0.084 | -0.074 | -0.025 | -0.029 |
2011 | Texas Tech | adjusted | 0.119 | 0.060 | -0.137 | -0.076 |
2012 | Texas Tech | adjusted | 0.260 | -0.005 | 0.093 | -0.053 |
2013 | Texas Tech | adjusted | 0.176 | 0.001 | 0.104 | -0.059 |
2014 | Texas Tech | adjusted | 0.262 | 0.104 | -0.016 | -0.154 |
2015 | Texas Tech | adjusted | 0.190 | 0.287 | -0.218 | -0.195 |
2016 | Texas Tech | adjusted | 0.328 | 0.025 | -0.243 | -0.175 |
2017 | Texas Tech | adjusted | 0.180 | 0.060 | -0.039 | 0.010 |
2018 | Texas Tech | adjusted | 0.070 | -0.041 | -0.079 | 0.019 |
2019 | Texas Tech | adjusted | 0.011 | 0.082 | -0.095 | 0.023 |
2020 | Texas Tech | adjusted | -0.051 | 0.112 | 0.052 | -0.091 |
2021 | Texas Tech | adjusted | 0.092 | 0.097 | -0.044 | -0.001 |
Alabama is an interesting one to look at, as they’re generally strong across the board but their passing efficiency in recent years is what has really set them apart.
SEASON | TEAM | TYPE | OFFENSE_PASS | OFFENSE_RUN | DEFENSE_PASS | DEFENSE_RUN |
2007 | Alabama | adjusted | -0.035 | 0.066 | 0.102 | 0.001 |
2008 | Alabama | adjusted | 0.036 | 0.166 | 0.134 | 0.260 |
2009 | Alabama | adjusted | 0.248 | 0.078 | 0.438 | 0.167 |
2010 | Alabama | adjusted | 0.287 | 0.217 | 0.191 | 0.130 |
2011 | Alabama | adjusted | 0.189 | 0.201 | 0.469 | 0.409 |
2012 | Alabama | adjusted | 0.256 | 0.292 | 0.314 | 0.292 |
2013 | Alabama | adjusted | 0.371 | 0.147 | 0.171 | 0.275 |
2014 | Alabama | adjusted | 0.436 | 0.212 | 0.121 | 0.221 |
2015 | Alabama | adjusted | 0.125 | 0.096 | 0.374 | 0.263 |
2016 | Alabama | adjusted | 0.059 | 0.251 | 0.530 | 0.367 |
2017 | Alabama | adjusted | 0.115 | 0.293 | 0.385 | 0.258 |
2018 | Alabama | adjusted | 0.640 | 0.292 | 0.317 | 0.339 |
2019 | Alabama | adjusted | 0.571 | 0.157 | 0.296 | 0.075 |
2020 | Alabama | adjusted | 0.652 | 0.179 | 0.100 | 0.162 |
2021 | Alabama | adjusted | 0.373 | 0.108 | 0.284 | 0.209 |
Oklahoma in 2018 is an interesting case because they were so strong passing the ball and so weak at preventing other teams from passing.
SEASON | TEAM | TYPE | OFFENSE_PASS | OFFENSE_RUN | DEFENSE_PASS | DEFENSE_RUN |
2007 | Oklahoma | adjusted | 0.335 | 0.071 | 0.151 | 0.069 |
2008 | Oklahoma | adjusted | 0.561 | 0.252 | 0.166 | 0.083 |
2009 | Oklahoma | adjusted | 0.131 | -0.144 | 0.347 | 0.167 |
2010 | Oklahoma | adjusted | 0.222 | -0.006 | 0.212 | 0.019 |
2011 | Oklahoma | adjusted | 0.163 | 0.125 | 0.219 | 0.189 |
2012 | Oklahoma | adjusted | 0.306 | 0.001 | 0.321 | -0.154 |
2013 | Oklahoma | adjusted | 0.117 | 0.128 | 0.210 | -0.006 |
2014 | Oklahoma | adjusted | 0.196 | 0.163 | 0.101 | 0.125 |
2015 | Oklahoma | adjusted | 0.169 | 0.099 | 0.234 | 0.106 |
2016 | Oklahoma | adjusted | 0.605 | 0.065 | -0.001 | -0.002 |
2017 | Oklahoma | adjusted | 0.635 | 0.269 | 0.078 | -0.001 |
2018 | Oklahoma | adjusted | 0.564 | 0.339 | -0.106 | -0.028 |
2019 | Oklahoma | adjusted | 0.394 | 0.272 | 0.071 | 0.080 |
2020 | Oklahoma | adjusted | 0.299 | 0.059 | 0.254 | 0.109 |
2021 | Oklahoma | adjusted | 0.177 | 0.161 | -0.164 | 0.053 |
Texas A&M is an interesting one here because in Manziel’s Heisman year, A&M generated more points per play running the ball than throwing it - I would imagine this is because of all of the times Manziel scrambled in order to bail out the offense.
SEASON | TEAM | TYPE | OFFENSE_PASS | OFFENSE_RUN | DEFENSE_PASS | DEFENSE_RUN |
2007 | Texas A&M | adjusted | 0.000 | 0.112 | -0.016 | -0.052 |
2008 | Texas A&M | adjusted | 0.041 | -0.063 | -0.141 | -0.175 |
2009 | Texas A&M | adjusted | 0.096 | 0.162 | 0.042 | -0.104 |
2010 | Texas A&M | adjusted | 0.060 | 0.019 | 0.201 | 0.173 |
2011 | Texas A&M | adjusted | 0.114 | 0.160 | 0.140 | 0.104 |
2012 | Texas A&M | adjusted | 0.324 | 0.408 | 0.172 | 0.133 |
2013 | Texas A&M | adjusted | 0.374 | 0.265 | -0.026 | -0.090 |
2014 | Texas A&M | adjusted | 0.234 | 0.102 | 0.022 | -0.130 |
2015 | Texas A&M | adjusted | -0.079 | 0.038 | 0.305 | -0.001 |
2016 | Texas A&M | adjusted | 0.132 | 0.071 | 0.022 | 0.100 |
2017 | Texas A&M | adjusted | 0.144 | -0.018 | 0.111 | 0.075 |
2018 | Texas A&M | adjusted | 0.229 | 0.140 | 0.029 | 0.175 |
2019 | Texas A&M | adjusted | 0.125 | 0.117 | 0.175 | 0.117 |
2020 | Texas A&M | adjusted | 0.122 | 0.152 | 0.090 | 0.154 |
2021 | Texas A&M | adjusted | -0.055 | 0.082 | 0.305 | 0.065 |
So far, everything we’ve computed has been retrospective, evaluating teams on an entire season’s worth of plays to measure their offense and defense strength. But suppose we are now going into the 2016 season. How do we rate teams? Before any games have been played, we could rely on previous seasons’ efficiency data for every team. This is the lineup of Week 1 games using each team’s power rating at the end of the previous season.
SEASON | WEEK | HOME | AWAY | HOME_POWER | AWAY_POWER |
2016 | 1 | Temple | Army | 5.6 | -11.6 |
2016 | 1 | Texas | Notre Dame | 0.7 | 14.5 |
2016 | 1 | Florida | UMass | 8.2 | -9.5 |
2016 | 1 | Vanderbilt | South Carolina | -0.5 | 0.4 |
2016 | 1 | Alabama | USC | 22.1 | 9.4 |
2016 | 1 | Arkansas | Louisiana Tech | 11.6 | 3.2 |
2016 | 1 | Auburn | Clemson | 7.6 | 19.1 |
2016 | 1 | North Carolina | Georgia | 13.3 | 9.9 |
2016 | 1 | Kentucky | Southern Mississippi | -2.3 | 2.7 |
2016 | 1 | Wisconsin | LSU | 10.0 | 13.7 |
2016 | 1 | Mississippi State | South Alabama | 12.9 | -15.2 |
2016 | 1 | West Virginia | Missouri | 7.6 | -0.7 |
2016 | 1 | Tennessee | Appalachian State | 12.7 | 3.8 |
2016 | 1 | Texas A&M | UCLA | 10.5 | 7.3 |
2016 | 1 | Florida State | Ole Miss | 10.9 | 15.7 |
2016 | 1 | California | Hawai'i | 9.5 | -18.6 |
2016 | 1 | Minnesota | Oregon State | 0.8 | -7.1 |
2016 | 1 | Colorado | Colorado State | -3.1 | -6.3 |
2016 | 1 | Stanford | Kansas State | 13.7 | -0.1 |
2016 | 1 | Washington | Rutgers | 9.7 | -6.1 |
2016 | 1 | Louisiana | Boise State | -14.4 | 4.4 |
2016 | 1 | Wyoming | Northern Illinois | -14.9 | -1.9 |
2016 | 1 | Nebraska | Fresno State | 4.8 | -16.7 |
2016 | 1 | Michigan | Hawai'i | 13.6 | -18.6 |
2016 | 1 | Tulsa | San José State | -6.3 | -5.5 |
2016 | 1 | Georgia State | Ball State | -5.6 | -14.0 |
2016 | 1 | Ohio State | Bowling Green | 16.6 | 8.5 |
2016 | 1 | Penn State | Kent State | 5.4 | -16.4 |
2016 | 1 | Iowa | Miami (OH) | 7.4 | -17.3 |
2016 | 1 | Ohio | Texas State | -2.3 | -20.1 |
2016 | 1 | Arkansas State | Toledo | -3.1 | 7.8 |
2016 | 1 | Northwestern | Western Michigan | 3.4 | 1.4 |
2016 | 1 | Louisville | Charlotte | 7.2 | -20.7 |
2016 | 1 | Florida International | Indiana | -10.6 | 3.2 |
2016 | 1 | North Texas | SMU | -22.9 | -13.2 |
2016 | 1 | Western Kentucky | Rice | 10.6 | -16.9 |
2016 | 1 | UTEP | New Mexico State | -20.5 | -18.2 |
2016 | 1 | Wake Forest | Tulane | -5.2 | -15.1 |
2016 | 1 | Boston College | Georgia Tech | -2.3 | 1.5 |
2016 | 1 | Houston | Oklahoma | 10.3 | 15.1 |
We could use the difference in each team’s power rating to predict Week 1 - the power ratings themselves are the expected margin of victory against an average team on a neutral field. So we simply take the difference and add a home-field adjustment to predict the margin of each game (this is a crude mapping from team ratings to a predicted margin; I’ll use a proper model for this later). How does this do in predicting Week 1?
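A sketch of that crude mapping, assuming a `games` data frame with the HOME_POWER and AWAY_POWER columns shown above; the flat 3-point home-field bump is an assumption that roughly reproduces the predicted margins below.

```r
library(dplyr)

home_field <- 3  # illustrative constant, not the fitted value

week1_preds <- games %>%
  mutate(
    PRED_MARGIN   = HOME_POWER - AWAY_POWER + home_field,  # expected home margin
    PRED_HOME_WIN = PRED_MARGIN > 0
  )
```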
SEASON | WEEK | HOME | AWAY | PRED_MARGIN | HOME_MARGIN |
2016 | 1 | Temple | Army | 20.3 | -15 |
2016 | 1 | Texas | Notre Dame | -10.8 | 3 |
2016 | 1 | Florida | UMass | 20.6 | 17 |
2016 | 1 | Vanderbilt | South Carolina | 2.2 | -3 |
2016 | 1 | Alabama | USC | 12.7 | 46 |
2016 | 1 | Arkansas | Louisiana Tech | 11.4 | 1 |
2016 | 1 | Auburn | Clemson | -8.5 | -6 |
2016 | 1 | North Carolina | Georgia | 3.4 | -9 |
2016 | 1 | Kentucky | Southern Mississippi | -1.9 | -9 |
2016 | 1 | Wisconsin | LSU | -3.7 | 2 |
2016 | 1 | Mississippi State | South Alabama | 31.1 | -1 |
2016 | 1 | West Virginia | Missouri | 11.3 | 15 |
2016 | 1 | Tennessee | Appalachian State | 11.9 | 7 |
2016 | 1 | Texas A&M | UCLA | 6.1 | 7 |
2016 | 1 | Florida State | Ole Miss | -4.8 | 11 |
2016 | 1 | California | Hawai'i | 28.1 | 20 |
2016 | 1 | Minnesota | Oregon State | 10.9 | 7 |
2016 | 1 | Colorado | Colorado State | 3.2 | 37 |
2016 | 1 | Stanford | Kansas State | 16.9 | 13 |
2016 | 1 | Washington | Rutgers | 18.8 | 35 |
2016 | 1 | Louisiana | Boise State | -15.8 | -35 |
2016 | 1 | Wyoming | Northern Illinois | -9.9 | 6 |
2016 | 1 | Nebraska | Fresno State | 24.5 | 33 |
2016 | 1 | Michigan | Hawai'i | 35.2 | 60 |
2016 | 1 | Tulsa | San José State | 2.2 | 35 |
2016 | 1 | Georgia State | Ball State | 11.4 | -10 |
2016 | 1 | Ohio State | Bowling Green | 11.2 | 67 |
2016 | 1 | Penn State | Kent State | 24.9 | 20 |
2016 | 1 | Iowa | Miami (OH) | 27.7 | 24 |
2016 | 1 | Ohio | Texas State | 20.8 | -2 |
2016 | 1 | Arkansas State | Toledo | -7.9 | -21 |
2016 | 1 | Northwestern | Western Michigan | 4.9 | -1 |
2016 | 1 | Louisville | Charlotte | 30.9 | 56 |
2016 | 1 | Florida International | Indiana | -10.8 | -21 |
2016 | 1 | North Texas | SMU | -6.7 | -13 |
2016 | 1 | Western Kentucky | Rice | 30.5 | 32 |
2016 | 1 | UTEP | New Mexico State | 0.6 | 16 |
2016 | 1 | Wake Forest | Tulane | 12.9 | 4 |
2016 | 1 | Boston College | Georgia Tech | -3.8 | -3 |
2016 | 1 | Houston | Oklahoma | -4.8 | 10 |
How well do last season’s ratings do at predicting games, in terms of both accuracy and the spread?
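The output below is printed in yardstick’s tibble format; a minimal sketch of computing these metrics, assuming the `week1_preds` frame from above also has each game’s actual HOME_MARGIN joined on, might look like this:

```r
library(dplyr)
library(yardstick)

# Margin error ("the spread")
week1_preds %>% rmse(truth = HOME_MARGIN, estimate = PRED_MARGIN)

# Straight-up accuracy: did we pick the winner?
week1_preds %>%
  mutate(
    actual = factor(HOME_MARGIN > 0, levels = c(TRUE, FALSE)),
    pred   = factor(PRED_MARGIN > 0, levels = c(TRUE, FALSE))
  ) %>%
  accuracy(truth = actual, estimate = pred)
```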
## # A tibble: 1 × 5
## SEASON WEEK .metric .estimator .estimate
## <dbl> <dbl> <chr> <chr> <dbl>
## 1 2016 1 rmse standard 18.6
## # A tibble: 1 × 5
## SEASON WEEK .metric .estimator .estimate
## <dbl> <dbl> <chr> <chr> <dbl>
## 1 2016 1 accuracy binary 0.7
Not so great on the margin of victory (an RMSE of nearly 19 points), but we get about 70% of the games correct. Not amazing, but it’s a start.
But now let’s say we’ve seen a week’s worth of games. How do we predict next week’s games? We could keep on using the power ratings from last season to predict week 2 (and it turns out that when we do, we get close to 90% accuracy - Week 1 in 2016 had a bunch more upsets), but we want our power ratings to update based on new information.
SEASON | WEEK | type | .metric | .estimate |
2016 | 1 | previous season | accuracy | 0.700 |
2016 | 2 | previous season | accuracy | 0.891 |
2016 | 3 | previous season | accuracy | 0.800 |
2016 | 4 | previous season | accuracy | 0.654 |
2016 | 5 | previous season | accuracy | 0.667 |
2016 | 6 | previous season | accuracy | 0.604 |
2016 | 7 | previous season | accuracy | 0.760 |
2016 | 8 | previous season | accuracy | 0.692 |
2016 | 9 | previous season | accuracy | 0.596 |
2016 | 10 | previous season | accuracy | 0.678 |
2016 | 11 | previous season | accuracy | 0.691 |
2016 | 12 | previous season | accuracy | 0.672 |
2016 | 13 | previous season | accuracy | 0.600 |
2016 | 14 | previous season | accuracy | 0.688 |
2016 | 15 | previous season | accuracy | 0.000 |
2016 | 16 | previous season | accuracy | 0.634 |
If we rely only on the previous season’s data, we get less and less accurate the deeper into the season we go - though it is kind of interesting to see how well we do predicting bowl games using nothing other than where teams stood at the end of the previous season.
SEASON | SEASON_TYPE | WEEK | HOME | AWAY | HOME_POWER | AWAY_POWER | PRED_MARGIN | HOME_MARGIN |
2016 | postseason | 16 | UT San Antonio | New Mexico | -13.4 | -9.8 | -3.6 | -3 |
2016 | postseason | 16 | San Diego State | Houston | 4.7 | 10.3 | -5.6 | 24 |
2016 | postseason | 16 | Toledo | Appalachian State | 7.8 | 3.8 | 4.0 | -3 |
2016 | postseason | 16 | Louisiana | Southern Mississippi | -14.4 | 2.7 | -17.0 | -7 |
2016 | postseason | 16 | Tulsa | Central Michigan | -6.3 | -1.5 | -4.8 | 45 |
2016 | postseason | 16 | Western Kentucky | Memphis | 10.6 | 6.1 | 4.5 | 20 |
2016 | postseason | 16 | Wyoming | BYU | -14.9 | 3.0 | -17.9 | -3 |
2016 | postseason | 16 | Colorado State | Idaho | -6.3 | -15.2 | 8.9 | -11 |
2016 | postseason | 16 | Old Dominion | Eastern Michigan | -14.9 | -19.0 | 4.1 | 4 |
2016 | postseason | 16 | Navy | Louisiana Tech | 10.5 | 3.2 | 7.3 | -3 |
2016 | postseason | 16 | Troy | Ohio | -7.0 | -2.3 | -4.7 | 5 |
2016 | postseason | 16 | Middle Tennessee | Hawai'i | -1.8 | -18.6 | 16.8 | -17 |
2016 | postseason | 16 | Mississippi State | Miami (OH) | 12.9 | -17.3 | 30.2 | 1 |
2016 | postseason | 16 | Boston College | Maryland | -2.3 | -2.9 | 0.5 | 6 |
2016 | postseason | 16 | Vanderbilt | NC State | -0.5 | 7.2 | -7.7 | -24 |
2016 | postseason | 16 | North Texas | Army | -22.9 | -11.6 | -11.3 | -7 |
2016 | postseason | 16 | Wake Forest | Temple | -5.2 | 5.6 | -10.8 | 8 |
2016 | postseason | 16 | Washington State | Minnesota | 5.6 | 0.8 | 4.9 | -5 |
2016 | postseason | 16 | Baylor | Boise State | 13.3 | 4.4 | 8.9 | 19 |
2016 | postseason | 16 | Northwestern | Pittsburgh | 3.4 | 4.9 | -1.6 | 7 |
2016 | postseason | 16 | Miami | West Virginia | 2.1 | 7.6 | -5.5 | 17 |
2016 | postseason | 16 | Kansas State | Texas A&M | -0.1 | 10.5 | -10.6 | 5 |
2016 | postseason | 16 | South Carolina | South Florida | 0.4 | 8.4 | -8.1 | -7 |
2016 | postseason | 16 | Virginia Tech | Arkansas | 4.1 | 11.6 | -7.5 | 11 |
2016 | postseason | 16 | Colorado | Oklahoma State | -3.1 | 6.4 | -9.5 | -30 |
2016 | postseason | 16 | TCU | Georgia | 7.3 | 9.9 | -2.5 | -8 |
2016 | postseason | 16 | Tennessee | Nebraska | 12.7 | 4.8 | 7.8 | 14 |
2016 | postseason | 16 | Florida State | Michigan | 10.9 | 13.6 | -2.7 | 1 |
2016 | postseason | 16 | Louisville | LSU | 7.2 | 13.7 | -6.5 | -20 |
2016 | postseason | 16 | Alabama | Washington | 22.1 | 9.7 | 12.4 | 17 |
2016 | postseason | 16 | Clemson | Ohio State | 19.1 | 16.6 | 2.5 | 31 |
2016 | postseason | 16 | Iowa | Florida | 7.4 | 8.2 | -0.7 | -27 |
2016 | postseason | 16 | Wisconsin | Western Michigan | 10.0 | 1.4 | 8.6 | 8 |
2016 | postseason | 16 | Penn State | USC | 5.4 | 9.4 | -4.0 | -3 |
2016 | postseason | 16 | Oklahoma | Auburn | 15.1 | 7.6 | 7.5 | 16 |
2016 | postseason | 16 | Kentucky | Georgia Tech | -2.3 | 1.5 | -3.7 | -15 |
2016 | postseason | 16 | Alabama | Clemson | 22.1 | 19.1 | 3.0 | -4 |
2016 | postseason | 16 | North Carolina | Stanford | 13.3 | 13.7 | -0.4 | -2 |
2016 | postseason | 16 | Utah | Indiana | 7.8 | 3.2 | 4.5 | 2 |
2016 | postseason | 16 | UCF | Arkansas State | -21.7 | -3.1 | -15.6 | -18 |
2016 | postseason | 16 | Air Force | South Alabama | 0.4 | -15.2 | 15.7 | 24 |
How do we incorporate new information about teams? As they play games in the new season, we should expect those results to increasingly outweigh what we knew about them from previous seasons. I can calculate raw ratings for Week 1 based on play-by-play data: here are the top 5 and bottom 5 teams by Week 1 raw efficiency in 2016.
SEASON | TEAM | TYPE | OFFENSE | DEFENSE | OVERALL | MARGIN |
2016 | Louisville | raw | 0.709 | 0.479 | 1.188 | 77.235 |
2016 | Michigan | raw | 0.562 | 0.376 | 0.938 | 60.942 |
2016 | Boise State | raw | 0.606 | 0.154 | 0.760 | 49.375 |
2016 | Washington | raw | 0.343 | 0.381 | 0.724 | 47.077 |
2016 | Colorado | raw | 0.319 | 0.391 | 0.709 | 46.110 |
2016 | San José State | raw | -0.279 | -0.325 | -0.604 | -39.263 |
2016 | Colorado State | raw | -0.391 | -0.319 | -0.709 | -46.110 |
2016 | Rutgers | raw | -0.381 | -0.343 | -0.724 | -47.077 |
2016 | Louisiana | raw | -0.154 | -0.606 | -0.760 | -49.375 |
2016 | Charlotte | raw | -0.479 | -0.709 | -1.188 | -77.235 |
Because we’re looking only at raw data, the best teams are simply the ones that won big in Week 1 (and the worst ones are the teams they beat) - Louisville beat Charlotte 70-14, Michigan beat Hawai'i 63-3, Boise State beat Louisiana 45-10…
One week can only tell us so much about a team, but the bigger problem is the same as before: raw expected points per play doesn’t take into account the quality of the opponent. The method I’ve used for adjusting based on opponent quality so far has been at the season level, and it relies on taking into account all of the games that have been played to partial out how strong or weak each team’s offense/defense is.
I can run the ridge regression week by week to get coefficients for each team based on the in-season results so far. The Week 0 ratings are our prior, which should be pretty strong, as they’re based on the last season’s worth of data (and we can include even more previous seasons in computing that prior).
Each team’s rating for Week 0 is based on previous season(s). If we run the ridge regression after the first week, each team’s rating going into Week 2 is based only on a week’s worth of games. Here is where Notre Dame and Texas end up after their matchup: Texas gets a big jump and Notre Dame drops a bunch.
If we let another week pass, we run the ridge regression on both weeks. Both Notre Dame and Texas beat their (fairly weak) opponents of Nevada and UTEP pretty handily, so both get an increase. This increase is enough for Notre Dame that they bounce above where they were after losing to Texas (quality loss?).
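A sketch of this cumulative weekly re-fit, reusing the hypothetical `fit_season()` from earlier (a WEEK column in the play-by-play data is an assumption):

```r
library(dplyr)

# In-season ratings after week w use every play from the season through week w
ratings_through_week <- function(plays, season, thru_week) {
  plays %>%
    filter(SEASON == season, WEEK <= thru_week) %>%
    fit_season()
}

# e.g. the in-season ratings going into Week 3 of 2016
fit_wk2 <- ratings_through_week(plays, season = 2016, thru_week = 2)
```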
Roll forward a few more weeks and we can see how both teams started to crash around midseason before eventually recovering to reach mediocrity by season’s end.
At the beginning of the season, relying only on games played so far can yield pretty unstable results. As we get more and more information, each team’s rating starts to converge and become more stable, but the early season will see teams bounce around a lot.
This means if we try to predict games in the early part of the year based only on in-season results, we don’t do nearly as well as when we use information from previous seasons.
SEASON | WEEK | .metric | in_season | previous_season | winner |
2016 | 1 | accuracy | 0.700 | 0.700 | tie |
2016 | 2 | accuracy | 0.717 | 0.891 | previous_season |
2016 | 3 | accuracy | 0.727 | 0.800 | previous_season |
2016 | 4 | accuracy | 0.654 | 0.654 | tie |
2016 | 5 | accuracy | 0.702 | 0.667 | in_season |
2016 | 6 | accuracy | 0.604 | 0.604 | tie |
2016 | 7 | accuracy | 0.660 | 0.760 | previous_season |
2016 | 8 | accuracy | 0.731 | 0.692 | in_season |
2016 | 9 | accuracy | 0.673 | 0.596 | in_season |
2016 | 10 | accuracy | 0.746 | 0.678 | in_season |
2016 | 11 | accuracy | 0.655 | 0.691 | previous_season |
2016 | 12 | accuracy | 0.690 | 0.672 | in_season |
2016 | 13 | accuracy | 0.633 | 0.600 | in_season |
2016 | 14 | accuracy | 0.750 | 0.688 | in_season |
2016 | 15 | accuracy | 0.000 | 0.000 | tie |
2016 | 16 | accuracy | 0.610 | 0.634 | previous_season |
In general, the early season sees a lot of swings using only in-season data, but it starts to outperform the previous season’s ratings as the year wears on. So our goal is to blend the two together using a weighted average. I’ll use exponentially decaying weights over the course of the season, heavily weighting the prior-season rating in the early weeks and shifting the weight toward the in-season data as more games are played.
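A sketch of the blend; the exponential form follows the description above, while the decay rate of 0.3 per week and the example ratings are purely illustrative assumptions.

```r
# Weight on the prior-season rating decays exponentially as weeks are played
blend_rating <- function(prior, in_season, week, decay = 0.3) {
  w_prior <- exp(-decay * week)
  w_prior * prior + (1 - w_prior) * in_season
}

# e.g. after Week 2 the prior still carries more than half the weight,
# while by Week 10 the in-season rating dominates
blend_rating(prior = 14.5, in_season = 4.0, week = 2)
blend_rating(prior = 14.5, in_season = 4.0, week = 10)
```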
I’ll now run the weekly team ratings through the weighting function. This is what Texas and Notre Dame look like throughout the 2016 season using the blended ratings versus just the in-season ratings.
The weighted-average approach tends to counterbalance the early-season swings, which means we will be less reactive to early-season results. How much we should react to early-season results is an interesting question - in the case of Texas and Notre Dame, we may have wanted to be more cynical about how good they were by midseason. But for teams that genuinely did emerge early in the 2016 season, such as Louisville, Washington, and Michigan, we could be slow to catch on to them being better than in previous seasons if we weight previous seasons too heavily.
We can also look at other teams.
I can then compare how this does against using purely the previous season’s ratings or purely the in-season ratings.
SEASON | WEEK | .metric | in_season | previous_season | weighted | best |
2016 | 1 | accuracy | 0.700 | 0.700 | 0.700 | 0.700 |
2016 | 2 | accuracy | 0.717 | 0.891 | 0.891 | 0.891 |
2016 | 3 | accuracy | 0.727 | 0.800 | 0.836 | 0.836 |
2016 | 4 | accuracy | 0.654 | 0.654 | 0.692 | 0.692 |
2016 | 5 | accuracy | 0.702 | 0.667 | 0.737 | 0.737 |
2016 | 6 | accuracy | 0.604 | 0.604 | 0.623 | 0.623 |
2016 | 7 | accuracy | 0.660 | 0.760 | 0.780 | 0.780 |
2016 | 8 | accuracy | 0.731 | 0.692 | 0.712 | 0.731 |
2016 | 9 | accuracy | 0.673 | 0.596 | 0.712 | 0.712 |
2016 | 10 | accuracy | 0.746 | 0.678 | 0.763 | 0.763 |
2016 | 11 | accuracy | 0.655 | 0.691 | 0.655 | 0.691 |
2016 | 12 | accuracy | 0.690 | 0.672 | 0.655 | 0.690 |
2016 | 13 | accuracy | 0.633 | 0.600 | 0.650 | 0.650 |
2016 | 14 | accuracy | 0.750 | 0.688 | 0.750 | 0.750 |
2016 | 15 | accuracy | 0.000 | 0.000 | 0.000 | 0.000 |
2016 | 16 | accuracy | 0.610 | 0.634 | 0.610 | 0.634 |
Well, it improves 2016; let’s take a look at how it does in the previous years.
I’ll now apply this to all weeks from 2008 to 2021, but focus on testing it on the previous years.
We can look at a sample of teams and games here.