I've described aspects of the Matrix as it has evolved, but I think it's about time that I give it one coherent description for anyone interested.
The Matrix uses three ratings: a general performance rating based on margin of victory (which is used for rankings), a recent performance rating, and a win/loss rating. The general rating and win/loss rating are calculated with a progressive adjustment model derived from the Elo chess rating system. Ratings are adjusted according to the improbability that a given outcome would occur. The model simulates the season a few hundred times, allowing smaller adjustments with each round, until, through automated trial and error, it arrives at the ratings associated with the least improbability.
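To make the progressive adjustment idea concrete, here is a minimal sketch, not the Matrix's actual code. It replays a small season a few hundred times and nudges ratings toward the observed margins with a step that shrinks each round. The linear error term is a simplified stand-in for the improbability calculation, and the function name, step schedule and sample games are all assumptions made for illustration.

```python
def fit_ratings(games, rounds=300, start_step=0.5, decay=0.99):
    """Fit one rating per team from (team_a, team_b, margin_for_a) tuples.

    Hypothetical sketch: each pass replays every game and moves both
    ratings toward the observed margin, with a smaller step each round.
    """
    ratings = {}
    for team_a, team_b, _ in games:
        ratings.setdefault(team_a, 0.0)
        ratings.setdefault(team_b, 0.0)

    for r in range(rounds):
        step = start_step * decay ** r  # adjustments shrink every round
        for team_a, team_b, margin in games:
            expected = ratings[team_a] - ratings[team_b]
            error = margin - expected  # how far off the ratings were
            ratings[team_a] += step * error / 2
            ratings[team_b] -= step * error / 2
    return ratings


# Made-up results: A beat B by 7, B beat C by 3, A beat C by 10.
sample_games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]
sample_ratings = fit_ratings(sample_games)
```

Because every adjustment is split evenly between the two teams, the ratings stay centered on zero, and only the differences between them carry meaning.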
For both the general performance rating and win/loss rating, the model assumes that a team's performance will vary and that the probability of a particular performance level will fall somewhere on the normal curve. The ratings, therefore, theoretically represent the mean. The larger the point margin, the less effect an additional point will have on ratings, so the effect of "running up the score" is minimal. When estimating the improbability of an event, the model barely differentiates between an 18-point win and a 40-point win.
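One way to picture the diminishing effect of extra points is to pass the margin through the normal curve. This is only a sketch under stated assumptions: the 10-point standard deviation and both function names are mine, not values taken from the Matrix.

```python
import math


def margin_score(margin, sd=10.0):
    """Map a point margin to a 0-1 score using the normal CDF.

    sd is an assumed standard deviation of game performance, in points.
    """
    return 0.5 * (1.0 + math.erf(margin / (sd * math.sqrt(2.0))))


def marginal_value(margin, sd=10.0):
    """How much one additional point adds to the score at this margin."""
    return margin_score(margin + 1, sd) - margin_score(margin, sd)
```

Under these assumptions an extra point is worth far more in a 3-point game than in an 18-point game, and by 40 points it is worth almost nothing, which is why running up the score earns so little.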
The win/loss rating, obviously, uses only wins and losses and ignores the margin of victory. The factor actually has very little effect on the outcome of the model, but I have included it for the sake of comprehensiveness. For the most part, close games really are decided primarily by luck, and so it is best that the model does not overemphasize the winning of the game. Because the model uses a marginal progressive adjustment method, it is able to handle undefeated teams without the problems faced by MLE approaches.
After the general and win/loss ratings have been calculated, a recent performance rating is calculated using the deviation of a team's margin of victory from the expected margin of victory. Obviously, greater weight is given to more recent games.
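A recency-weighted average is one simple way to do this. The sketch below is an illustration only; the exponential decay of 0.8 per game and the function name are assumptions, not the weights the Matrix uses.

```python
def recent_performance(deviations, decay=0.8):
    """Weighted average of (actual margin - expected margin) per game.

    deviations is ordered oldest game first; each step back in time
    multiplies the weight by decay (an assumed value).
    """
    if not deviations:
        return 0.0
    n = len(deviations)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_sum = sum(w * d for w, d in zip(weights, deviations))
    return weighted_sum / sum(weights)
```

A team that beat expectations by 10 points last week scores higher here than a team that did the same thing three games ago.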
The final component of the Matrix is the set of Navy adjustment factors. Essentially, these factors compare a team's upcoming opponent against its past opponents in terms of relative dependency on the pass and the run, and then adjust the expected outcome to reflect any advantages or disadvantages the team may experience in the match-up. For example, if a team plays terrible pass defense and now has to play Texas Tech, it should be expected to under-perform relative to its general, recent and win/loss performance ratings.
The general performance rating, win/loss rating, recent performance rating, and Navy adjustment factors are then weighted and used to estimate the margin of victory (along with an adjustment for home field advantage). Finally, I use a consistency rating (how predictable a team's performance has been) to estimate the probability of a suggested outcome (of a team winning or covering the spread).
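As a sketch of that last step, suppose the consistency rating can be expressed as a standard deviation, in points, around the predicted margin. That mapping is an assumption on my part, as are the function and parameter names.

```python
import math


def cover_probability(predicted_margin, line, sd):
    """Probability that a team beats the line, under a normal model.

    predicted_margin: the model's expected margin for the team
    line: points the team is favored by (use 0 for a straight-up win)
    sd: assumed spread of the team's results; smaller = more consistent
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (predicted_margin - line) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

With a predicted margin of 10 and a line of 7, a consistent team (sd of 5) covers more often than an erratic one (sd of 15), even though the predicted margin is identical.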
Results:
These results reflect only games played before week 11 of the 2007 season.
Top 5 overall:
1. Ohio State
2. Oregon
3. West Virginia
4. LSU
5. Missouri
(Note: After OSU lost and WVU struggled against Louisville, Oregon has taken the top spot and Oklahoma and Kansas have moved into the top 5.)
Oklahoma fans might object that Missouri is ranked higher than their own Sooners. This is a good example, though, of the model punishing Oklahoma more for the greater improbability of its loss to Colorado. Because both teams have only one loss and Missouri lost to a better team than Colorado (which just happens to be Oklahoma), Missouri is ranked higher. Oklahoma is 6th and only 2/10 of a point behind the Tigers.
Top 5 Win/Loss
1. Ohio State
2. Kansas
3. Hawaii
4. LSU
4. Oklahoma
4. Arizona State
Obviously, a win/loss rating should give extra kudos to undefeated teams. The three-way tie for 4th is a bit of an anomaly, but here the Sooners have the advantage over Missouri.
Top 5 Consistency
1. Kansas
2. Florida International
3. Utah State
4. Arizona State
5. Ohio State
Two types of teams find themselves among the most consistent: the surprisingly successful teams that just seem to win every week, and the really, really bad teams that always play poorly against D1A competition. I thought it was interesting that Kansas, the most consistent team this season, is also 9-0 against the spread.
The five most unpredictable teams -
1. UCLA
2. Utah
3. Central Michigan
4. Iowa State
5. UNLV
Fitting.
Navy adjustment factor:
You can't produce a ranking from the adjustment factor, but we can guess which teams are going to have a tough match-up this weekend. The team most likely to get unusually lit up through the air this week was, coincidentally, Navy, which gave up almost 500 passing yards and 62 points in a winning effort against the 1-7 (now 1-8) Mean Green of North Texas.
Recent Performance:
Again, it doesn't make much sense to rank teams on their recent performances, because it is relative to their general performance, but the hottest team going into this weekend was Iowa State (relative to their performance all season). Unfortunately for Boston College, another very hot team is Clemson - and a cold team is, well, BC.
When dealing with all these factors, I think it is important to consider their relative importance. The Matrix has the power to explain about 65% of the variance of point margins for games involving D1A teams this season. About 61% is explained by the general performance rating alone and the other 4% by the other adjustment factors and ratings. The win/loss rating barely makes an appearance, and is really just included so the model can be comprehensive and "hybrid," which is such a popular term in sports ratings these days.
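For anyone who wants to check a "variance explained" figure against their own predictions, a minimal sketch follows. It uses the usual one-minus-residual-over-total definition; I'm assuming that definition here (a squared correlation would give a slightly different number), and the function name is hypothetical.

```python
def variance_explained(actual, predicted):
    """Share of the variance in actual margins captured by predictions.

    Computed as 1 - (residual sum of squares / total sum of squares).
    """
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_resid / ss_total
```

Perfect predictions return 1.0, and always predicting the average margin returns 0.0.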
The model is still somewhat fluid as I make minor adjustments to deal with problems as they arise, but these are the general principles on which it is based. I will continue to publish rankings and predictions, and I will add other stats - consistency, recent performance, match-up warnings, unexpected results, etc.
P.S. According to the Matrix, the most unlikely outcome involving two D1A teams was Notre Dame over UNLV, and #2 was UNLV over Utah.
BPR | A system for ranking teams based only on wins and losses and strength of schedule. See BPR for an explanation. |
EPA (Expected Points Added) | Expected points are the points a team can "expect" to score based on the distance to the end zone and the down and distance needed for a first down, with an adjustment for the amount of time remaining in some situations. Expected points for every situation are estimated using seven years of historical data. The expected points consider both the average points the offense scores in each scenario and the average number of points the other team scores on its ensuing possession. The Expected Points Added is the change in expected points before and after a play. |
EP3 (Effective Points Per Possession) | Effective Points Per Possession is based on the same logic as the EPA, except it focuses on the expected points added at the beginning and end of an offensive drive. In other words, the EP3 for a single drive is equal to the sum of the expected points added for every offensive play in a drive (EP3 does not include punts and field goal attempts). We can also think of the EP3 as points scored + expected points from a field goal + the value of field position change on the opponent's next possession. |
Adjusted for Competition | We attempt to adjust some statistics to compensate for differences in strength of schedule. While the exact approach varies some from stat to stat, the basic concept is the same. We use an algorithm to estimate scores for all teams on both sides of the ball (e.g., offense and defense) that best predict real results. For example, we give every team an offensive and defensive yards per carry score. Subtracting the offensive score from the defensive score for two opposing teams will estimate the yards per carry if the two teams were to play. Generally, the defensive scores average to zero while offensive scores average to the national average, e.g., yards per carry, so we call the offensive score "adjusted for competition"; it roughly reflects what the team would do against average competition. |
Impact | See Adjusted for Competition. Impact scores are generally used to evaluate defenses. The value roughly reflects how much better or worse a team can expect to do against this opponent than against the average opponent. |
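The "Adjusted for Competition" idea can be sketched with a simple alternating-averages fit. This is an illustration, not the algorithm used on the site. It assumes each observed value is roughly an offensive score plus a defensive score, so a negative defensive score marks a defense that holds opponents below their norm; that sign convention, the function name and the sweep count are all assumptions.

```python
def adjust_for_competition(games, sweeps=50):
    """Fit offensive and defensive scores from (offense, defense, value).

    Models value ~ off[offense] + dfn[defense], with the defensive
    scores re-centered on zero after every sweep so the offensive
    scores sit on the scale of the raw statistic.
    """
    off = {o: 0.0 for o, _, _ in games}
    dfn = {d: 0.0 for _, d, _ in games}
    for _ in range(sweeps):
        for team in off:
            resid = [v - dfn[d] for o, d, v in games if o == team]
            off[team] = sum(resid) / len(resid)
        for team in dfn:
            resid = [v - off[o] for o, d, v in games if d == team]
            dfn[team] = sum(resid) / len(resid)
        # Move the average defensive score into the offensive scores.
        shift = sum(dfn.values()) / len(dfn)
        for team in dfn:
            dfn[team] -= shift
        for team in off:
            off[team] += shift
    return off, dfn
```

The offensive score is then the "adjusted for competition" number, and the defensive score plays the role of the impact value: how far above or below average an opponent should expect to land against that defense.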
Includes the top 180 QBs by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 240 RBs by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 300 Receivers by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 180 players by pass attempts
3rdLComp% | Completion % on 3rd and long (7+ yards) |
SitComp% | Standardized completion % for down and distance. Completion % by down and distance are weighted by the national average of pass plays by down and distance. |
Pass <=0 | Percent of pass plays that are negative or no gain |
Pass >=10 | Percent of pass plays that gain 10 or more yards |
Pass >=25 | Percent of pass plays that gain 25 or more yards |
10 to 0 | Ratio of Pass >=10 to Pass <=0 |
%Sacks | Ratio of sacks to pass plays |
Bad INTs | Interceptions on 1st or 2nd down early before the last minute of the half |
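The SitComp% weighting can be sketched as a weighted average in which every passer is graded against the same national mix of situations. The situation labels and function name below are hypothetical, and skipping situations where a player has no attempts (then renormalizing) is my assumption.

```python
def situational_completion_pct(player_pct, national_share):
    """Completion % standardized to the national down-and-distance mix.

    player_pct: {situation: player's completion % in that situation}
    national_share: {situation: share of all pass plays nationally}
    """
    shared = [s for s in national_share if s in player_pct]
    total_share = sum(national_share[s] for s in shared)
    weighted = sum(player_pct[s] * national_share[s] for s in shared)
    return weighted / total_share
```

A passer who completes 60% on 1st and 10 and 40% on 3rd and long comes out at 55% if three quarters of all passes nationally are thrown on 1st and 10, no matter how his own attempts were distributed.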
Includes the top 240 players by carries
YPC1stD | Yards per carry on 1st down |
CPCs | Conversions (1st down/TD) per carry in short yardage situations, where the team needs 3 or fewer yards for a 1st down or touchdown |
%Team Run | Player's carries as a percent of team's carries |
%Team RunS | Player's carries as a percent of team's carries in short yardage situations |
Run <=0 | Percent of running plays that are negative or no gain |
Run >=10 | Percent of running plays that gain 10 or more yards |
Run >=25 | Percent of running plays that gain 25 or more yards |
10 to 0 | Ratio of Run >=10 to Run <=0 |
Includes the top 300 players by targets
Conv/T 3rd | Conversions per target on 3rd Downs |
Conv/T PZ | Touchdowns per target inside the 10 yardline |
%Team PZ | Percent of team's targets inside the 10 yardline |
Rec <=0 | Percent of targets that go for negative yards or no net gain |
Rec >=10 | Percent of targets that go for 10+ yards |
Rec >=25 | Percent of targets that go for 25+ yards |
10 to 0 | Ratio of Rec >=10 to Rec <=0 |
Includes players with a significant number of attempts
NEPA | "Net Expected Points Added": (expected points after play - expected points before play)-(opponent's expected points after play - opponent's expected points before play). Uses the expected points for the current possession and the opponent's next possession based on down, distance and spot |
NEPA/PP | Average NEPA per play |
Max/Min | Single game high and low |
Adjusted | Reports the per game EPA adjusted for the strength of schedule. |
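The NEPA formula above translates directly into code. The expected-point inputs would come from the historical lookup described in the glossary; the function names here are hypothetical.

```python
def nepa(ep_before, ep_after, opp_ep_before, opp_ep_after):
    """Net Expected Points Added for one play.

    (own EP after - own EP before) minus
    (opponent EP after - opponent EP before).
    """
    return (ep_after - ep_before) - (opp_ep_after - opp_ep_before)


def nepa_per_play(plays):
    """Average NEPA over (ep_before, ep_after, opp_before, opp_after) tuples."""
    values = [nepa(*play) for play in plays]
    return sum(values) / len(values)
```

For example, a play that lifts a team's expected points from 1.0 to 2.5 while cutting the opponent's from 0.5 to 0.25 is worth 1.75 net points.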
Defensive Possession Stats
Points/Poss | Offensive points per possession |
EP3 | Effective Points per Possession |
EP3+ | Effective Points per Possession impact |
Plays/Poss | Plays per possession |
Yards/Poss | Yards per possession |
Start Spot | Average starting field position |
Time of Poss | Average time of possession (in seconds) |
TD/Poss | Touchdowns per possession |
TO/Poss | Turnovers per possession |
FGA/Poss | Attempted field goals per possession |
%RZ | Red zone trips per possession |
Points/RZ | Average points per red zone trip. Field Goals are included using expected points, not actual points. |
TD/RZ | Touchdowns per red zone trip |
FGA/RZ | Field goal attempts per red zone trip |
Downs/RZ | Turnovers on downs per red zone trip |
Defensive Play-by-Play Stats
EPA/Pass | Expected Points Added per pass attempt |
EPA/Rush | Expected Points Added per rush attempt |
EPA/Pass+ | Expected Points Added per pass attempt impact |
EPA/Rush+ | Expected Points Added per rush attempt impact |
Yards/Pass | Yards per pass |
Yards/Rush | Yards per rush |
Yards/Pass+ | Yards per pass impact |
Yards/Rush+ | Yards per rush impact |
Exp/Pass | Explosive plays (25+ yards) per pass |
Exp/Rush | Explosive plays (25+ yards) per rush |
Exp/Pass+ | Explosive plays (25+ yards) per pass impact |
Exp/Rush+ | Explosive plays (25+ yards) per rush impact |
Comp% | Completion percentage |
Comp%+ | Completion percentage impact |
Yards/Comp | Yards per completion |
Sack/Pass | Sacks per pass |
Sack/Pass+ | Sacks per pass impact |
Sack/Pass* | Sacks per pass on passing downs |
INT/Pass | Interceptions per pass |
Neg/Rush | Negative plays (<=0) per rush |
Neg/Run+ | Negative plays (<=0) per rush impact |
Run Short | % Runs in short yardage situations |
Convert% | 3rd/4th down conversions |
Conv%* | 3rd/4th down conversions versus average by distance |
Conv%+ | 3rd/4th down conversions versus average by distance impact |
Offensive Play-by-Play Stats
Plays | Number of offensive plays |
%Pass | Percent pass plays |
EPA/Pass | Expected Points Added per pass attempt |
EPA/Rush | Expected Points Added per rush attempt |
EPA/Pass+ | Expected Points Added per pass attempt adjusted for competition |
EPA/Rush+ | Expected Points Added per rush attempt adjusted for competition |
Yards/Pass | Yards per pass |
Yards/Rush | Yards per rush |
Yards/Pass+ | Yards per pass adjusted for competition |
Yards/Rush+ | Yards per rush adjusted for competition |
Exp Pass | Explosive plays (25+ yards) per pass |
Exp Run | Explosive plays (25+ yards) per rush |
Exp Pass+ | Explosive plays (25+ yards) per pass adjusted for competition |
Exp Run+ | Explosive plays (25+ yards) per rush adjusted for competition |
Comp% | Completion percentage |
Comp%+ | Completion percentage adjusted for competition |
Sack/Pass | Sacks per pass |
Sack/Pass+ | Sacks per pass adjusted for competition |
Sack/Pass* | Sacks per pass on passing downs |
Int/Pass | Interceptions per pass |
Neg/Run | Negative plays (<=0) per rush |
Neg/Run+ | Negative plays (<=0) per rush adjusted for competition |
Run Short | % Runs in short yardage situations |
Convert% | 3rd/4th down conversions |
Conv%* | 3rd/4th down conversions versus average by distance |
Conv%+ | 3rd/4th down conversions versus average by distance adjusted for competition |
Offensive Possession Stats
Points/Poss | Offensive points per possession |
EP3 | Effective Points per Possession |
EP3+ | Effective Points per Possession adjusted for competition |
Plays/Poss | Plays per possession |
Yards/Poss | Yards per possession |
Start Spot | Average starting field position |
Time of Poss | Average time of possession (in seconds) |
TD/Poss | Touchdowns per possession |
TO/Poss | Turnovers per possession |
FGA/Poss | Attempted field goals per possession |
Poss/Game | Possessions per game |
%RZ | Red zone trips per possession |
Points/RZ | Average points per red zone trip. Field Goals are included using expected points, not actual points. |
TD/RZ | Touchdowns per red zone trip |
FGA/RZ | Field goal attempts per red zone trip |
Downs/RZ | Turnovers on downs per red zone trip |
PPP | Points per Possession |
aPPP | Points per Possession allowed |
PPE | Points per Exchange (PPP-aPPP) |
EP3+ | Effective Points per Possession |
aEP3+ | Effective Points per Possession allowed |
EP2E+ | Effective Points per Exchange |
EPA/Pass+ | Expected Points Added per Pass |
EPA/Rush+ | Expected Points Added per Rush |
aEPA/Pass+ | Expected Points Allowed per Pass |
aEPA/Rush+ | Expected Points Allowed per Rush |
Exp/Pass | Explosive Plays per Pass |
Exp/Rush | Explosive Plays per Rush |
aExp/Pass | Explosive Plays per Pass allowed |
aExp/Rush | Explosive Plays per Rush allowed |
BPR | A method for ranking conferences based only on their wins and losses and the strength of schedule. See BPR for an explanation. |
Power | A composite measure that is the best predictor of future game outcomes, averaged across all teams in the conference |
P-Top | The power ranking of the top teams in the conference |
P-Mid | The power ranking of the middling teams in the conference |
P-Bot | The power ranking of the worst teams in the conference |
SOS-Und | Strength of Schedule - Undefeated. Focuses on the difficulty of going undefeated, averaged across teams in the conference |
SOS-BE | Strength of Schedule - Bowl Eligible. Focuses on the difficulty of becoming bowl eligible, averaged across teams in the conference |
Hybrid | A composite measure that quantifies human polls, applied to conferences |
Player Game Log
Use the yellow, red and green cells to filter values. Yellow cells filter for exact matches, green cells for greater values and red cells for lesser values. By default, the table is filtered to only the top 200 defense-independent performances (oEPA). The table includes the 5,000 most important performances (positive and negative) by EPA.
EPA | Expected points added (see glossary) |
oEPA | Defense-independent performance |
Team Game Log
Use the yellow, red and green cells to filter values. Yellow cells filter for exact matches, green cells for greater values and red cells for lesser values.
EP3 | Effective points per possession (see glossary) |
oEP3 | Defense-independent offensive performance |
dEP3 | Offense-independent defensive performance |
EPA | Expected points added (see glossary) |
oEPA | Defense-independent offensive performance |
dEPA | Offense-independent defensive performance |
EPAp | Expected points added per play |
Very nice blog. I'm curious as to how the Matrix has fared picking winners against the spread. How does it fare against significantly larger spreads (20+ points)? It would seem logical to me that consistent teams (Kansas) can cover those large spreads because, well, they are so consistent. Perhaps an inconsistent team that is a very large favorite would be a good bet to not cover?
Also, concerning the inconsistency rating, I am not surprised that UCLA, Utah, and UNLV are in the top 5 (how did UNLV beat Utah 27-0?!). What about Central Michigan? It seems they have been pretty consistent in conference play (undefeated) and pretty consistent in nonconference play (lost every game save Army). Do your ratings for consistency adjust for strength of opposition? That may explain some of Central Michigan's inconsistency.
First, on Central Michigan - I honestly don't know enough about the Chippewas or the MAC to do them justice. Central Michigan has been in a lot of lopsided games this season (-56 to Clemson, -45 to Kansas, -31 to Purdue, +30 to ND State, +25 to N. Illinois, +24 to Army) and the Matrix may have been too conservative in these games, pulling their rating in two directions.
As for adjusting the consistency ratings for strength of opposition - they are implicitly adjusted in one sense. I've considered an adjustment for the magnitude of the game, in the sense that teams get up for big games and don't coast at the end, but I haven't found any real statistical evidence for that yet. The problem is that some teams crack under pressure, and I don't know how I could separate the gutless from the gamers.
The Matrix has been averaging about 55-60% against the spread (40% this past weekend), but I don't have enough cases yet to consider that anything significant. I can't apply it retroactively, because I've used the results from past games to develop the model. So far, my real goal has been to mimic the spread, and from there make marginal adjustments to find some advantage. I'm still working out a few bugs, but it's getting there.
You're right that inconsistent teams make bad bets, especially with large expected margins. The problem with consistent teams (except for the perennially underrated like Kansas) is that line setters also have a pretty good idea what to expect from them and set the line accordingly. I'm hoping that, starting next week, I will be able to generate meaningful probabilities that will also take the consistency rating into account.
Thanks for the comments - I'm happy to take any suggestions.