With the Stat-Wise Heisman Rankings I have tried to imitate the more scientific aspects of Heisman voting while avoiding the non-competitive biases. Players earn points for their opponent-adjusted stats (yards, yards per play) and their performance in "important" games, and lose points for poor performances (especially turnovers) in losses. At least for the near future, the Stat-Wise Heisman Rankings are limited to offensive players.
BPR | A system for ranking teams based only on wins and losses and strength of schedule. See BPR for an explanation. |
EPA (Expected Points Added) | Expected points are the points a team can "expect" to score based on the distance to the end zone and the down and distance needed for a first down, with an adjustment for the amount of time remaining in some situations. Expected points for every situation are estimated using seven years of historical data. The estimate considers both the average points the offense scores in each scenario and the average number of points the other team scores on its ensuing possession. Expected Points Added is the change in expected points before and after a play. |
EP3 (Effective Points Per Possession) | Effective Points Per Possession is based on the same logic as the EPA, except it focuses on the expected points added between the beginning and end of an offensive drive. In other words, the EP3 for a single drive is equal to the sum of the expected points added for every offensive play in the drive (EP3 does not include punts and field goal attempts). We can also think of the EP3 as points scored + expected points from a field goal + the value of the field position change on the opponent's next possession. |
Adjusted for Competition | We attempt to adjust some statistics to compensate for differences in strength of schedule. While the exact approach varies somewhat from stat to stat, the basic concept is the same. We use an algorithm to estimate scores for all teams on both sides of the ball (i.e., offense and defense) that best predict real results. For example, we give every team an offensive and a defensive yards per carry score. Subtracting the defensive score from the offensive score for two opposing teams estimates the yards per carry if the two teams were to play. Generally, the defensive scores average to zero while the offensive scores average to the national average (e.g., of yards per carry), so we call the offensive score "adjusted for competition": it roughly reflects what the team would do against average competition. |
Impact | See Adjusted for Competition. Impact scores are generally used to evaluate defenses. The value roughly reflects how much better or worse a team can expect to do against this opponent than against the average opponent. |
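The "Adjusted for Competition" idea can be illustrated with a small alternating least-squares fit. This is only a sketch under stated assumptions, not necessarily the algorithm used here: it assumes the predicted yards per carry equals the offensive score minus the defensive score (so a positive defensive score marks a stingy defense), and the `fit_scores` helper, team labels, and game data are all hypothetical.

```python
from collections import defaultdict


def fit_scores(games, iterations=200):
    """Fit offensive/defensive scores so that off[o] - dfn[d] tracks results.

    games: list of (offense, defense, yards_per_carry) tuples (made-up data).
    Returns (off, dfn); defensive scores are centered to average zero.
    """
    national_avg = sum(y for _, _, y in games) / len(games)
    off = defaultdict(lambda: national_avg)  # start every offense at the average
    dfn = defaultdict(float)                 # start every defense at zero
    by_off, by_def = defaultdict(list), defaultdict(list)
    for o, d, y in games:
        by_off[o].append((d, y))
        by_def[d].append((o, y))

    for _ in range(iterations):
        # Best offensive score given the current defensive scores.
        for o, rows in by_off.items():
            off[o] = sum(y + dfn[d] for d, y in rows) / len(rows)
        # Best defensive score given the current offensive scores.
        for d, rows in by_def.items():
            dfn[d] = sum(off[o] - y for o, y in rows) / len(rows)
        # Re-center so defenses average zero; shifting both sides by the
        # same amount leaves every prediction off[o] - dfn[d] unchanged.
        shift = sum(dfn[d] for d in by_def) / len(by_def)
        for d in by_def:
            dfn[d] -= shift
        for o in by_off:
            off[o] -= shift
    return dict(off), dict(dfn)


# Hypothetical three-team round robin: (offense, defense, yards per carry).
games = [
    ("A", "B", 5.0), ("A", "C", 5.5),
    ("B", "A", 3.5), ("B", "C", 4.5),
    ("C", "A", 2.5), ("C", "B", 3.0),
]
off, dfn = fit_scores(games)
print(off, dfn)
```

In this sketch the offensive score plays the role of the "adjusted for competition" number, because it is what the offense would be predicted to gain against a defense with a score of zero.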
Includes the top 180 QBs by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 240 RBs by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 300 Receivers by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
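The four columns defined in the tables above can be computed directly from a list of per-play yardage. A minimal sketch, assuming the input is simply the yards gained on each play; the `play_distribution` name and the sample numbers are hypothetical.

```python
def play_distribution(yards):
    """Summarize a list of per-play yardage (hypothetical input format)."""
    n = len(yards)
    if n == 0:
        raise ValueError("need at least one play")
    pct_le0 = 100 * sum(1 for y in yards if y <= 0) / n
    pct_ge10 = 100 * sum(1 for y in yards if y >= 10) / n
    pct_ge25 = 100 * sum(1 for y in yards if y >= 25) / n
    # "10 to 0": good plays relative to bad plays; undefined with no bad plays.
    ratio = pct_ge10 / pct_le0 if pct_le0 else float("inf")
    return {
        "Total <=0": pct_le0,
        "Total >=10": pct_ge10,
        "Total >=25": pct_ge25,
        "10 to 0": ratio,
    }


print(play_distribution([-2, 0, 3, 5, 12, 30, 7, 10, 1, 4]))
```

A "10 to 0" value above 1 means the player produces more 10+ yard plays than plays for no gain or a loss.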
Includes the top 180 players by pass attempts
3rdLComp% | Completion % on 3rd and long (7+ yards) |
SitComp% | Standardized completion % for down and distance. Completion % by down and distance are weighted by the national average of pass plays by down and distance. |
Pass <=0 | Percent of pass plays that are negative or no gain |
Pass >=10 | Percent of pass plays that gain 10 or more yards |
Pass >=25 | Percent of pass plays that gain 25 or more yards |
10 to 0 | Ratio of Pass >=10 to Pass <=0 |
%Sacks | Ratio of sacks to pass plays |
Bad INTs | Interceptions on early downs (1st or 2nd) before the last minute of the half |
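The SitComp% row describes a standardization: each quarterback's completion rate in a given down-and-distance situation is weighted by how often that situation occurs nationally, so every passer is judged against the same mix of situations. A minimal sketch, assuming two made-up situation buckets and made-up counts; `sit_comp_pct` and its input format are hypothetical.

```python
def sit_comp_pct(player, national_attempts):
    """Completion % standardized to the national mix of situations.

    player: {situation: (completions, attempts)} for one quarterback.
    national_attempts: {situation: national pass attempts}, used as weights.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for situation, (completions, attempts) in player.items():
        if attempts == 0:
            continue  # no throws in this situation; leave it out of the average
        weight = national_attempts[situation]
        weighted_sum += weight * completions / attempts
        total_weight += weight
    if total_weight == 0:
        raise ValueError("player has no pass attempts")
    return 100 * weighted_sum / total_weight


# Made-up numbers: this passer throws on 3rd and long more often than average.
player = {"1st & 10": (60, 100), "3rd & long": (10, 40)}
national = {"1st & 10": 3000, "3rd & long": 1000}
print(sit_comp_pct(player, national))
```

With these invented numbers the raw completion rate is 70 of 140 (50%), while the standardized figure is higher because the hard 3rd-and-long throws are weighted down to their national share.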
Includes the top 240 players by carries
YPC1stD | Yards per carry on 1st down |
CPCs | Conversions (1st down/TD) per carry in short yardage situations, where the team needs 3 or fewer yards for a 1st down or touchdown |
%Team Run | Player's carries as a percent of team's carries |
%Team RunS | Player's carries as a percent of team's carries in short yardage situations |
Run <=0 | Percent of running plays that are negative or no gain |
Run >=10 | Percent of running plays that gain 10 or more yards |
Run >=25 | Percent of running plays that gain 25 or more yards |
10 to 0 | Ratio of Run >=10 to Run <=0 |
Includes the top 300 players by targets
Conv/T 3rd | Conversions per target on 3rd Downs |
Conv/T PZ | Touchdowns per target inside the 10-yard line |
%Team PZ | Percent of team's targets inside the 10-yard line |
Rec <=0 | Percent of targets that go for negative yards or no net gain |
Rec >=10 | Percent of targets that go for 10+ yards |
Rec >=25 | Percent of targets that go for 25+ yards |
10 to 0 | Ratio of Rec >=10 to Rec <=0 |
...
Includes players with a significant number of attempts
NEPA | "Net Expected Points Added": (expected points after play - expected points before play)-(opponent's expected points after play - opponent's expected points before play). Uses the expected points for the current possession and the opponent's next possession based on down, distance and spot |
NEPA/PP | Average NEPA per play |
Max/Min | Single game high and low |
Includes players with a significant number of attempts
NEPA | "Net Expected Points Added": (expected points after play - expected points before play)-(opponent's expected points after play - opponent's expected points before play). Uses the expected points for the current possession and the opponent's next possession based on down, distance and spot |
NEPA/PP | Average NEPA per play |
Max/Min | Single game high and low |
Adjusted | Reports the per game EPA adjusted for the strength of schedule. |
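The NEPA definition above is simple arithmetic on four expected-point values. A minimal sketch, assuming the expected points for each situation are already known; the function names and sample numbers are hypothetical, and treating Max/Min as the highest and lowest single-game NEPA totals is an assumption.

```python
def nepa(ep_before, ep_after, opp_ep_before, opp_ep_after):
    """Net Expected Points Added for one play, following the definition above."""
    own_change = ep_after - ep_before
    opp_change = opp_ep_after - opp_ep_before
    return own_change - opp_change


def nepa_per_play(plays):
    """Average NEPA over (ep_before, ep_after, opp_before, opp_after) tuples."""
    if not plays:
        raise ValueError("need at least one play")
    return sum(nepa(*p) for p in plays) / len(plays)


def max_min_by_game(games):
    """games: {game label: list of play tuples}. Returns (high, low) game totals."""
    totals = [sum(nepa(*p) for p in plays) for plays in games.values()]
    return max(totals), min(totals)


# Made-up play: own expected points rise by 1.5, the opponent's fall by 0.25.
print(nepa(1.0, 2.5, 0.5, 0.25))
```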
Defensive Possession Stats
Points/Poss | Offensive points per possession |
EP3 | Effective Points per Possession |
EP3+ | Effective Points per Possession impact |
Plays/Poss | Plays per possession |
Yards/Poss | Yards per possession |
Start Spot | Average starting field position |
Time of Poss | Average time of possession (in seconds) |
TD/Poss | Touchdowns per possession |
TO/Poss | Turnovers per possession |
FGA/Poss | Attempted field goals per possession |
%RZ | Red zone trips per possession |
Points/RZ | Average points per red zone trip. Field Goals are included using expected points, not actual points. |
TD/RZ | Touchdowns per red zone trip |
FGA/RZ | Field goal attempts per red zone trip |
Downs/RZ | Turnovers on downs per red zone trip |
Defensive Play-by-Play Stats
EPA/Pass | Expected Points Added per pass attempt |
EPA/Rush | Expected Points Added per rush attempt |
EPA/Pass+ | Expected Points Added per pass attempt impact |
EPA/Rush+ | Expected Points Added per rush attempt impact |
Yards/Pass | Yards per pass |
Yards/Rush | Yards per rush |
Yards/Pass+ | Yards per pass impact |
Yards/Rush+ | Yards per rush impact |
Exp/Pass | Explosive plays (25+ yards) per pass |
Exp/Rush | Explosive plays (25+ yards) per rush |
Exp/Pass+ | Explosive plays (25+ yards) per pass impact |
Exp/Rush+ | Explosive plays (25+ yards) per rush impact |
Comp% | Completion percentage |
Comp%+ | Completion percentage impact |
Yards/Comp | Yards per completion |
Sack/Pass | Sacks per pass |
Sack/Pass+ | Sacks per pass impact |
Sack/Pass* | Sacks per pass on passing downs |
INT/Pass | Interceptions per pass |
Neg/Rush | Negative plays (<=0) per rush |
Neg/Run+ | Negative plays (<=0) per rush impact |
Run Short | % Runs in short yardage situations |
Convert% | 3rd/4th down conversions |
Conv%* | 3rd/4th down conversions versus average by distance |
Conv%+ | 3rd/4th down conversions versus average by distance impact |
Offensive Play-by-Play Stats
Plays | Number of offensive plays |
%Pass | Percent pass plays |
EPA/Pass | Expected Points Added per pass attempt |
EPA/Rush | Expected Points Added per rush attempt |
EPA/Pass+ | Expected Points Added per pass attempt adjusted for competition |
EPA/Rush+ | Expected Points Added per rush attempt adjusted for competition |
Yards/Pass | Yards per pass |
Yards/Rush | Yards per rush |
Yards/Pass+ | Yards per pass adjusted for competition |
Yards/Rush+ | Yards per rush adjusted for competition |
Exp Pass | Explosive plays (25+ yards) per pass |
Exp Run | Explosive plays (25+ yards) per rush |
Exp Pass+ | Explosive plays (25+ yards) per pass adjusted for competition |
Exp Run+ | Explosive plays (25+ yards) per rush adjusted for competition |
Comp% | Completion percentage |
Comp%+ | Completion percentage adjusted for competition |
Sack/Pass | Sacks per pass |
Sack/Pass+ | Sacks per pass adjusted for competition |
Sack/Pass* | Sacks per pass on passing downs |
Int/Pass | Interceptions per pass |
Neg/Run | Negative plays (<=0) per rush |
Neg/Run+ | Negative plays (<=0) per rush adjusted for competition |
Run Short | % Runs in short yardage situations |
Convert% | 3rd/4th down conversions |
Conv%* | 3rd/4th down conversions versus average by distance |
Conv%+ | 3rd/4th down conversions versus average by distance adjusted for competition |
Offensive Possession Stats
Points/Poss | Offensive points per possession |
EP3 | Effective Points per Possession |
EP3+ | Effective Points per Possession adjusted for competition |
Plays/Poss | Plays per possession |
Yards/Poss | Yards per possession |
Start Spot | Average starting field position |
Time of Poss | Average time of possession (in seconds) |
TD/Poss | Touchdowns per possession |
TO/Poss | Turnovers per possession |
FGA/Poss | Attempted field goals per possession |
Poss/Game | Possessions per game |
%RZ | Red zone trips per possession |
Points/RZ | Average points per red zone trip. Field Goals are included using expected points, not actual points. |
TD/RZ | Touchdowns per red zone trip |
FGA/RZ | Field goal attempts per red zone trip |
Downs/RZ | Turnovers on downs per red zone trip |
PPP | Points per Possession |
aPPP | Points per Possession allowed |
PPE | Points per Exchange (PPP-aPPP) |
EP3+ | Effective Points per Possession |
aEP3+ | Effective Points per Possession allowed |
EP2E+ | Effective Points per Exchange |
EPA/Pass+ | Expected Points Added per Pass |
EPA/Rush+ | Expected Points Added per Rush |
aEPA/Pass+ | Expected Points Allowed per Pass |
aEPA/Rush+ | Expected Points Allowed per Rush |
Exp/Pass | Explosive Plays per Pass |
Exp/Rush | Explosive Plays per Rush |
aExp/Pass | Explosive Plays per Pass allowed |
aExp/Rush | Explosive Plays per Rush allowed |
BPR | A method for ranking conferences based only on their wins and losses and the strength of schedule. See BPR for an explanation. |
Power | A composite measure that is the best predictor of future game outcomes, averaged across all teams in the conference |
P-Top | The power ranking of the top teams in the conference |
P-Mid | The power ranking of the middling teams in the conference |
P-Bot | The power ranking of the worst teams in the conference |
SOS-Und | Strength of Schedule - Undefeated. Focuses on the difficulty of going undefeated, averaged across teams in the conference |
SOS-BE | Strength of Schedule - Bowl Eligible. Focuses on the difficulty of becoming bowl eligible, averaged across teams in the conference |
Hybrid | A composite measure that quantifies human polls, applied to conferences |
Player Game Log
Use the yellow, red and green cells to filter values. Yellow cells filter for exact matches, green cells for greater values and red cells for lesser values. By default, the table is filtered to only the top 200 defense-independent performances (oEPA). The table includes the 5,000 most important performances (positive and negative) by EPA.
EPA | Expected points added (see glossary) |
oEPA | Defense-independent performance |
Team Game Log
Use the yellow, red and green cells to filter values. Yellow cells filter for exact matches, green cells for greater values and red cells for lesser values.
EP3 | Effective points per possession (see glossary) |
oEP3 | Defense-independent offensive performance |
dEP3 | Offense-independent defensive performance |
EPA | Expected points added (see glossary) |
oEPA | Defense-independent offensive performance |
dEPA | Offense-independent defensive performance |
EPAp | Expected points added per play |
Saturday, July 31, 2010
Stat-Wise Heisman Rankings
The winner of the Heisman Memorial Trophy is supposed to be the season's most outstanding player. But we all have a different understanding of outstanding. Because there are no specific guidelines about what makes one a Heisman candidate, the process is riddled with problems. A player has a better chance if he spends more time on television, plays for a traditional power, has good weapons around him, etc.
Thursday, July 22, 2010
CFB National Championship Props
Based on historical data, poll voting patterns, and thousands of computer simulations, we can estimate each team's chance of hoisting the crystal football trophy. It turns out that the most difficult part of predicting national champions is not simulating games, but ranking teams to match human polls. In fact, 99% of the processing time in each simulation is devoted to ranking teams, not simulating games, but the payout should be worth it. The modified ranking algorithm has a strong correlation with past voting patterns, especially in identifying the BCS qualifiers.
The results above are based on 6,540 simulations using the data that would have been available in mid-October 2009. To jog your memory, Alabama was 6-0 with Auburn left as the toughest game on the regular season schedule (86% win probability). Florida was also heavily favored the rest of the way, but faced a slightly stickier road. Cincinnati was also undefeated, but simply wasn't as good a team. TCU still had BYU and Utah left, but Boise St. was sitting pretty to finish the regular season 13-0. Texas was just about to play a two-loss Oklahoma.
Given this reality, Alabama was looking at a 63% chance of playing for a national championship and a 43% chance of winning it all. Florida and Texas were coming in with an 18% chance of a national title. The most likely scenario was an SEC champ vs. Big 12 champ championship game, with nearly a quarter of simulations pointing to a Texas vs. Alabama/Florida title game. Boise St had a 1-in-4 chance of playing for a national championship: they were all but guaranteed an undefeated season, would have been given the nod over most one-loss teams, and were often the beneficiaries of a conference title game upset. An undefeated TCU was usually picked over a perfect Boise St team, but the Horned Frogs faced a tougher schedule. A few teams were still in the running, but needed to win out and needed a lot of lucky breaks along the way.
Given all the undefeated challengers and conference championship games, championship game participants averaged 12.5 wins. Boise St and TCU never reached a championship game after a loss. In one simulation, a three-loss Alabama team slipped into the championship game against an undefeated TCU and won.
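The game-simulation half of this process can be sketched as a basic Monte Carlo loop. This is only an illustration under strong assumptions: games are treated as independent coin flips with fixed win probabilities, the poll-matching ranking step (the hard part described above) is left out entirely, and every probability except the 86% Auburn figure quoted above is made up. The `simulate_undefeated` helper is hypothetical.

```python
import random


def simulate_undefeated(win_probs, n_sims=6540, seed=0):
    """Estimate the chance of winning every remaining game.

    win_probs: per-game win probabilities for one team's remaining schedule.
    Each simulated season plays every game independently.
    """
    rng = random.Random(seed)  # fixed seed so a run is repeatable
    undefeated = 0
    for _ in range(n_sims):
        if all(rng.random() < p for p in win_probs):
            undefeated += 1
    return undefeated / n_sims


# 0.86 is the Auburn game quoted above; the other values are placeholders.
remaining = [0.95, 0.92, 0.90, 0.93, 0.97, 0.86]
print(simulate_undefeated(remaining))
```

A full version would simulate every team's schedule in each run, rank the teams the way pollsters would, and count how often each team lands in and wins the title game.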
Wednesday, July 21, 2010
Revised Historical Rankings
I've revised rankings for all college football teams since 1900 that have sufficient sample size (see "Past Rankings" on right panel). These rankings are based on a statistical equation that accounts for wins and losses, margin of victory, and strength of schedule. It is designed to mimic the ranking process, but without the biases and heuristics that limit human voters.
Looking at #1s, the computer and I only disagree once since my birth: Kansas in 2007. Dartmouth in 1970 is also a curious pick (#14 AP that season) and Cornell over Texas A&M in 1939 is personally disagreeable.
Tuesday, July 20, 2010
The Breakdown
The Breakdown is an all-inclusive statistical tour of a college football game. In addition to computer simulated results, including scores, odds, team and individual statistics, the Breakdown sheds some light on the historical data and analytic techniques used to derive those predictions.
The first panel is a pre-game-post-game summary - an "Expected Box Score", including the expected score, odds, and team and individual statistics. Red and green numbers next to the team statistics compare the expected performance to the team's average performance. In this example, Texas' expected 227 passing yards is 46 less than their average, largely because the 59.4 completion percentage is 8 percentage points below the team average. Generally, individual predictions do not account for injuries and suspensions, especially in-game injuries like that to Colt McCoy in this particular game.
The second panel is a summary of the two teams' trend-O-meter, Hybrid, and cRPI (the cRPI* is multiplied by 100) - with national rankings in parentheses. The hybrid rating is the most realistic system for ranking teams. You can see from the trend-O-meter that Alabama came into this game playing relatively well, but Texas did not.
Text boxes in this panel list more team statistics. Ratings (Unit, Rush and Pass) are adjusted for opponent strength. The unit rating is based on points scored/allowed, and the rush and pass ratings are based on yards/play gained/allowed. The bar graphs offer a summary of offensive and defensive match-ups. The portion below zero on each bar represents the opposing team's defensive strength in that area. The portion above zero in the team's color is the predicted yards per run or pass for that team. The gray portion is what the team gains on average. The title of each graph gives the percent of plays on which the team runs or passes. In this case, Texas' defense should be particularly effective against the Alabama pass offense, but because Alabama runs the ball 63% of the time, this advantage will not be as important.
Panel 4 adds individual statistics and information on up to 6 previous meetings.
Next is a comparison of the two teams since 1980 (explanations of the Hybrid and cRPI). In this case, the hybrid ratings across seasons are standardized to 1.
The next panel has even more statistics and national rankings in parentheses. The most important numbers here are the sacks/pass, tackles for loss (TFL)/run, points/possession and TDs/possession.
Explanation of maps. In the maps, teams with similar styles are placed close to one another. In this case, the number in parentheses is the point differential between what that team was expected to do and what they actually did. In this example, Texas gave up 13 more points to Texas A&M than it should have and 3 fewer points to Nebraska. Because Alabama is closer to Nebraska than Texas A&M, this generally suggests that Texas' defense is relatively well-suited for the Alabama-style offense. Looking at the defense map, it seems that Texas' offense is relatively well-suited for the Alabama-style defense as well.
Tuesday, July 13, 2010
Monday, July 12, 2010
Individual Statistics-Quarterbacks
I've now added quarterbacks to the mix. The most important statistic here is the Adjusted+ completion percentage. It adjusts a quarterback's completion percentage for both the quality of the pass defense against which the pass was thrown and the distance the pass was thrown.
Enjoy
Wednesday, July 7, 2010
Individual Statistics-Running Backs
This season I hope to start posting individual player statistics. I've started with running backs. I've defined running backs as players who carry the ball an average of at least 4 times per regular season game (48 carries) and throw fewer than 2 passes per game. To be included in the leaderboard above, a player must have carried the ball 8 times per game (96 carries). In addition to carries and rushing average, I've added a few interesting, statistically derived numbers. First, FD is the probability that, if given the ball three times, the player would gain 10 yards (roughly the requirement for gaining a first down). The next two numbers are the probabilities that the player would lose one yard and gain 10 yards, respectively, in one carry.
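One way to arrive at a number like FD is to treat a player's past carries as an empirical distribution and work out the chance that three independent carries add up to 10 yards. This is a sketch of that idea, not necessarily the method used for the leaderboard; the function names and the toy carry logs are hypothetical, and it ignores the fact that a first down reached after two carries would not need a third.

```python
from collections import Counter


def carry_distribution(carries):
    """Map each observed yardage to its share of the player's carries."""
    n = len(carries)
    return {yards: count / n for yards, count in Counter(carries).items()}


def convolve(a, b):
    """Distribution of the sum of two independent yardage distributions."""
    out = Counter()
    for ya, pa in a.items():
        for yb, pb in b.items():
            out[ya + yb] += pa * pb
    return dict(out)


def first_down_prob(carries, attempts=3, needed=10):
    """Chance that `attempts` independent carries total at least `needed` yards."""
    if not carries:
        raise ValueError("need at least one carry")
    dist = carry_distribution(carries)
    total = {0: 1.0}  # zero yards gained before any carry
    for _ in range(attempts):
        total = convolve(total, dist)
    return sum(p for yards, p in total.items() if yards >= needed)


# Toy log: a boom-or-bust back who gains either 0 or 10 yards on every carry.
print(first_down_prob([0, 10]))
```

Calling the same function with `attempts=1` gives the single-carry chance of gaining 10 yards, which corresponds to the last number described above.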
Enjoy