I feel like I've got some really interesting numbers on my last blog, and suggest that visitors give it a look, but I'm pretty stoked about this blog, too. This week I am presenting the Matrix, the culmination of my experiment into college football prediction models (plus whatever refinements I might want to make later).
A quick overview of the Matrix. It uses two general ratings--play throughout the season and play weighted toward the most recent games. To find the best ratings, I use an automated trial-and-error system that runs the teams through the season a few hundred times to find the best fit. I then adjust these two ratings for matchups in the game--offensive and defensive play against the run and pass--and for home field advantage.
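To make the trial-and-error idea concrete, here is a minimal sketch of one way such a fit could work. It is not the Matrix's actual code: the function name `fit_ratings`, the step size, the 3-point home edge and the three-game schedule are all assumptions made for illustration. Each pass runs through the season, compares the predicted margin with the real one, and nudges both teams' ratings toward a better fit.

```python
def fit_ratings(games, passes=300, step=0.05, home_edge=3.0):
    """Fit one rating per team by repeated passes through the season.

    games: list of (home_team, away_team, home_margin) tuples.
    passes, step and home_edge are illustrative assumptions.
    """
    ratings = {}
    for home, away, _ in games:
        ratings.setdefault(home, 0.0)
        ratings.setdefault(away, 0.0)

    for _ in range(passes):
        for home, away, margin in games:
            predicted = ratings[home] - ratings[away] + home_edge
            error = margin - predicted
            # Nudge both teams so the prediction moves toward the real margin.
            ratings[home] += step * error
            ratings[away] -= step * error
    return ratings


# Hypothetical three-game schedule, not real results.
games = [("A", "B", 10), ("B", "C", 4), ("C", "A", -15)]
ratings = fit_ratings(games)
```

A recency-weighted rating could use the same loop with a larger step for recent games, though that is again only one possible design.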
Before we get to this week's picks, I want to quickly review my picks from last week. I finished the day 2-3, but to my credit, Boston College did not deserve to win. Morelli proved me right in Happy Valley, and Sanchez showed that USC's problems go much deeper than the play of the quarterback. I still have a hard time believing the PAC 10 is any good with Arizona State undefeated and Oregon in second (after the beat down they got last year in the Holiday Bowl). If USC is having an off year, the entire PAC 10 drops a few rungs in my opinion.
Now with a few notable picks (in a less exciting week).
Game 1. LSU @ Alabama
The only reason the spread is under 10 is that the game is at Alabama and people are waiting anxiously for Saban to perform some wonder. Keep waiting. I've heard this talk that Nick will bring his A game against his old team - but the coach doesn't play and his players don't have much of an A game. The hard-hitting, low-scoring game (both teams will complete less than 60% of their passes) will appear closer than it really was.
The Matrix:
LSU by 13 points, 66% chance of covering a -7.5 point spread
Game 2. Oregon @ Arizona State
It's disappointing to think that the Ducks, after franchise-establishing wins at Michigan and against USC, could fall to the Sun Devils and out of national championship contention. It is harder to believe that Arizona State is actually good. I've heard that Oregon has "far and away the best offense in the country." That's a bunch of bologna, but they'll still win this one.
The Matrix:
Oregon by less than a point (51% chance of winning), 31% chance against the 7 point spread
Game 3. Rutgers @ Connecticut
I ragged on Connecticut last week and they proved me wrong. Rutgers' performance, on the other hand, made me look like a prophet. The Matrix favors Rutgers in every statistical category it measures but the score.
The Matrix:
Connecticut by 2.2 (57% chance of winning), toss up against the spread (-2)
Game 4. Wisconsin @ Ohio State
Ohio State will beat up on another very weak Big 10 opponent. The Ohio State University might have a legitimately good team, but the rest of the Big 10 is soft. I understand the South when they moan about a lack of balance in college football - if BC and the OSU play for the national championship at the end of the season, I don't see any good reason to recognize the winner as the best team in the country. In my opinion, if the national championship game does not include West Virginia or LSU (both teams that lost only once, on the road, against more talented teams than any that BC and Ohio State play all year), or if it includes any team other than those two, Oregon or Ohio State, I won't bother to watch. But Ohio State, with powerful wins over Akron and Kent State, which is better known for a shooting than football, has played well this season.
The Matrix:
Ohio State by 23 (94% chance of winning), 70% chance against the spread (-15.5)
Game 5. Texas A&M @ Oklahoma
It's a bad week for college football and I could only find 4 games of real importance, so I picked the 5th game as a homer. Stat of the game--the Matrix sees Oklahoma holding the Aggies to 3 yards per carry. I'd like to console myself by saying that A&M was only one stupid call away from beating Oklahoma last year, but the same coach that made that same stupid call (to kick the field goal) made the same stupid call last week against Kansas and seems, in fact, to be perfecting the art of stupid calls.
The Matrix:
Oklahoma by 20 (92% chance of winning), 47% chance against the spread (-21)
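The win and cover percentages quoted for these games can be related to the predicted margin with a simple model. The sketch below is an assumption, not the Matrix's documented method: it treats the final margin as normally distributed around the prediction with a standard deviation of 14 points, a value picked only because it roughly reproduces the figures above. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_and_cover(predicted_margin, spread, sd=14.0):
    """Chance the favorite wins and chance it covers the spread.

    predicted_margin: points the favorite is predicted to win by.
    spread: points the favorite must win by to cover (e.g. 15.5).
    sd: assumed standard deviation of the final margin (an assumption).
    """
    p_win = normal_cdf(predicted_margin / sd)
    p_cover = normal_cdf((predicted_margin - spread) / sd)
    return p_win, p_cover


# Ohio State: predicted by 23, spread of 15.5.
p_win, p_cover = win_and_cover(23, 15.5)
```

Under these assumptions the Ohio State numbers come out close to the 94% and 70% quoted above; a different spread of outcomes would shift both.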
The most interesting game this week is Navy at Notre Dame. The Midshipmen haven't pulled out a win in a little less than a half-century (1963, I believe), but this year they get to play the JV. Navy is favored by 3.5, the Matrix gives them less than a point, but my gut tells me it won't be that close. Navy's run attack is good enough to put up points against anyone, and Notre Dame's offense is bad enough to stop themselves on an empty field.
Picks of the week:
I'm not a gambler, and I don't suggest it personally, but I do like outsmarting the folks in Vegas, so I present three games where the Matrix believes they are off the mark.
Illinois is favored by 12 and still the Matrix gives them a 72% chance of covering at Minnesota. Not only do the Gophers have an appallingly bad defense, their offense isn't half as good as the Big 10 likes to believe it is - but they did manage 21 against the mighty Bison of North Dakota State.
UTEP is favored by 7 at Rice, but they'll win by more than 15, 70% chance of covering. Even Texas, which has struggled against almost everyone this season (including powerhouses like Arkansas State and UCF), blew out Rice.
Iowa State has played tougher in recent weeks against Missouri and, especially, Oklahoma, but they are still one of the worst teams in FBS. Kansas State will cover the 14 point spread and win by 30+, 90% chance against the spread. Need I mention that Iowa State is the only other team against which Texas has looked competent?
Click the image below to see the rest of the picks. Rankings based on the matrix will be released starting next week.
BPR | A system for ranking teams based only on wins and losses and strength of schedule. See BPR for an explanation. |
EPA (Expected Points Added) | Expected points are the points a team can "expect" to score based on the distance to the end zone and down and distance needed for a first down, with an adjustment for the amount of time remaining in some situations. Expected points for every situation are estimated using seven years of historical data. The expected points value considers both the average points the offense scores in each scenario and the average number of points the other team scores on their ensuing possession. The Expected Points Added is the change in expected points before and after a play. |
EP3 (Effective Points Per Possession) | Effective Points Per Possession is based on the same logic as the EPA, except it focuses on the expected points added at the beginning and end of an offensive drive. In other words, the EP3 for a single drive is equal to the sum of the expected points added for every offensive play in a drive (EP3 does not include punts and field goal attempts). We can also think of the EP3 as points scored+expected points from a field goal+the value of field position change on the opponent's next possession. |
Adjusted for Competition | We attempt to adjust some statistics to compensate for differences in strength of schedule. While the exact approach varies some from stat to stat, the basic concept is the same. We use an algorithm to estimate scores for all teams on both sides of the ball (e.g., offense and defense) that best predict real results. For example, we give every team an offensive and defensive yards per carry score. Subtracting the offensive score from the defensive score for two opposing teams will estimate the yards per carry if the two teams were to play. Generally, the defensive scores average to zero while offensive scores average to the national average (e.g., of yards per carry), so we call the offensive score "adjusted for competition"; it roughly reflects what the team would do against average competition. |
Impact | see Adjusted for Competition. Impact scores are generally used to evaluate defenses. The value roughly reflects how much better or worse a team can expect to do against this opponent than against the average opponent. |
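As a rough illustration of the "adjusted for competition" idea, the sketch below fits an offensive and a defensive score per team by alternating averages. It is a minimal sketch under stated assumptions, not the site's algorithm: the function name is hypothetical, the data are made up, and it uses an additive sign convention (prediction = offensive score + defensive impact, with impacts centered on zero), whereas the glossary describes the combination as a subtraction.

```python
def adjust_for_competition(results, passes=200):
    """Estimate offensive scores and zero-centered defensive impacts.

    results: list of (offense_team, defense_team, observed_value),
             e.g. yards per carry in one game.
    """
    offense = {o: 0.0 for o, _, _ in results}
    defense = {d: 0.0 for _, d, _ in results}

    for _ in range(passes):
        # Offense score: average output after removing each opponent's impact.
        for team in offense:
            vals = [v - defense[d] for o, d, v in results if o == team]
            offense[team] = sum(vals) / len(vals)
        # Defensive impact: average amount opponents beat their own score by.
        for team in defense:
            vals = [v - offense[o] for o, d, v in results if d == team]
            defense[team] = sum(vals) / len(vals)
        # Keep defensive impacts centered on zero, as the glossary describes.
        shift = sum(defense.values()) / len(defense)
        for team in defense:
            defense[team] -= shift
        for team in offense:
            offense[team] += shift
    return offense, defense


# Made-up yards-per-carry results for three hypothetical teams.
results = [
    ("A", "B", 5.0), ("A", "C", 5.5),
    ("B", "A", 3.5), ("B", "C", 4.5),
    ("C", "A", 2.5), ("C", "B", 3.0),
]
offense, defense = adjust_for_competition(results)
```

In this convention a positive defensive impact means opponents gain more than usual against that defense, which matches the Impact definition above.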
Includes the top 180 QBs by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 240 RBs by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 300 Receivers by total plays
Total <=0 | Percent of plays that are negative or no gain |
Total >=10 | Percent of plays that gain 10 or more yards |
Total >=25 | Percent of plays that gain 25 or more yards |
10 to 0 | Ratio of Total >=10 to Total <=0 |
Includes the top 180 players by pass attempts
3rdLComp% | Completion % on 3rd and long (7+ yards) |
SitComp% | Standardized completion % for down and distance. Completion % by down and distance is weighted by the national average of pass plays by down and distance. |
Pass <=0 | Percent of pass plays that are negative or no gain |
Pass >=10 | Percent of pass plays that gain 10 or more yards |
Pass >=25 | Percent of pass plays that gain 25 or more yards |
10 to 0 | Ratio of Pass >=10 to Pass <=0 |
%Sacks | Ratio of sacks to pass plays |
Bad INTs | Interceptions on 1st or 2nd down before the last minute of the half |
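The SitComp% definition can be sketched as a weighted average. This is a minimal illustration, not the site's code: the function name, the four down-and-distance buckets, and every number are assumptions. The player's completion rate in each bucket is weighted by the national share of pass plays in that bucket, so a quarterback is not rewarded or punished for an unusual mix of situations.

```python
def situational_completion_pct(player, national_share):
    """Completion % standardized to the national mix of situations.

    player: bucket -> (completions, attempts)
    national_share: bucket -> share of national pass plays in that bucket
    """
    weighted = 0.0
    total_weight = 0.0
    for bucket, share in national_share.items():
        completions, attempts = player.get(bucket, (0, 0))
        if attempts == 0:
            continue  # no attempts in this bucket: skip it and renormalize
        weighted += share * completions / attempts
        total_weight += share
    if total_weight == 0:
        return float("nan")
    return 100.0 * weighted / total_weight


# Hypothetical buckets and numbers, for illustration only.
national_share = {"1st": 0.4, "2nd": 0.3, "3rd short": 0.1, "3rd long": 0.2}
player = {"1st": (15, 20), "2nd": (18, 30), "3rd short": (5, 10), "3rd long": (8, 20)}
sit_comp = situational_completion_pct(player, national_share)
```

Here the raw completion rate is 46 of 80 (57.5%), while the standardized figure is 61%, because this made-up player threw on 1st down less often than the national mix.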
Includes the top 240 players by carries
YPC1stD | Yards per carry on 1st down |
CPCs | Conversions (1st down/TD) per carry in short yardage situations - the team needs 3 or fewer yards for a 1st down or touchdown |
%Team Run | Player's carries as a percent of team's carries |
%Team RunS | Player's carries as a percent of team's carries in short yardage situations |
Run <=0 | Percent of running plays that are negative or no gain |
Run >=10 | Percent of running plays that gain 10 or more yards |
Run >=25 | Percent of running plays that gain 25 or more yards |
10 to 0 | Ratio of Run >=10 to Run <=0 |
Includes the top 300 players by targets
Conv/T 3rd | Conversions per target on 3rd Downs |
Conv/T PZ | Touchdowns per target inside the 10 yardline |
%Team PZ | Percent of team's targets inside the 10 yardline |
Rec <=0 | Percent of targets that go for negative yards or no net gain |
Rec >=10 | Percent of targets that go for 10+ yards |
Rec >=25 | Percent of targets that go for 25+ yards |
10 to 0 | Ratio of Rec >=10 to Rec <=0 |
Includes players with a significant number of attempts
NEPA | "Net Expected Points Added": (expected points after play - expected points before play)-(opponent's expected points after play - opponent's expected points before play). Uses the expected points for the current possession and the opponent's next possession based on down, distance and spot |
NEPA/PP | Average NEPA per play |
Max/Min | Single game high and low |
Adjusted | Reports the per game EPA adjusted for the strength of schedule. |
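The NEPA formula translates directly into code. The sketch below is only an illustration: the function names are hypothetical and the expected-points values are made up, since the real values come from the site's historical model of down, distance and spot.

```python
def nepa(own_before, own_after, opp_before, opp_after):
    """Net Expected Points Added for one play.

    (change in own expected points) - (change in opponent's expected points)
    """
    return (own_after - own_before) - (opp_after - opp_before)


def nepa_per_play(plays):
    """Average NEPA over a list of (own_before, own_after, opp_before, opp_after)."""
    values = [nepa(*play) for play in plays]
    return sum(values) / len(values)


# Made-up expected-points values for two plays.
plays = [
    (2.0, 2.8, 0.5, 0.3),  # good play: own EP up, opponent's next-possession EP down
    (2.8, 2.1, 0.3, 0.6),  # bad play: own EP down, opponent's EP up
]
first_play = nepa(*plays[0])
average = nepa_per_play(plays)
```

With these numbers the first play is worth about +1 point, the second about -1, and the two average out to roughly zero.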
Defensive Possession Stats
Points/Poss | Offensive points per possession |
EP3 | Effective Points per Possession |
EP3+ | Effective Points per Possession impact |
Plays/Poss | Plays per possession |
Yards/Poss | Yards per possession |
Start Spot | Average starting field position |
Time of Poss | Average time of possession (in seconds) |
TD/Poss | Touchdowns per possession |
TO/Poss | Turnovers per possession |
FGA/Poss | Attempted field goals per possession |
%RZ | Red zone trips per possession |
Points/RZ | Average points per red zone trip. Field Goals are included using expected points, not actual points. |
TD/RZ | Touchdowns per red zone trip |
FGA/RZ | Field goal attempts per red zone trip |
Downs/RZ | Turnover on downs per red zone trip |
Defensive Play-by-Play Stats
EPA/Pass | Expected Points Added per pass attempt |
EPA/Rush | Expected Points Added per rush attempt |
EPA/Pass+ | Expected Points Added per pass attempt impact |
EPA/Rush+ | Expected Points Added per rush attempt impact |
Yards/Pass | Yards per pass |
Yards/Rush | Yards per rush |
Yards/Pass+ | Yards per pass impact |
Yards/Rush+ | Yards per rush impact |
Exp/Pass | Explosive plays (25+ yards) per pass |
Exp/Rush | Explosive plays (25+ yards) per rush |
Exp/Pass+ | Explosive plays (25+ yards) per pass impact |
Exp/Rush+ | Explosive plays (25+ yards) per rush impact |
Comp% | Completion percentage |
Comp%+ | Completion percentage impact |
Yards/Comp | Yards per completion |
Sack/Pass | Sacks per pass |
Sack/Pass+ | Sacks per pass impact |
Sack/Pass* | Sacks per pass on passing downs |
INT/Pass | Interceptions per pass |
Neg/Rush | Negative plays (<=0) per rush |
Neg/Run+ | Negative plays (<=0) per rush impact |
Run Short | % Runs in short yardage situations |
Convert% | 3rd/4th down conversions |
Conv%* | 3rd/4th down conversions versus average by distance |
Conv%+ | 3rd/4th down conversions versus average by distance impact |
Offensive Play-by-Play Stats
Plays | Number of offensive plays |
%Pass | Percent pass plays |
EPA/Pass | Expected Points Added per pass attempt |
EPA/Rush | Expected Points Added per rush attempt |
EPA/Pass+ | Expected Points Added per pass attempt adjusted for competition |
EPA/Rush+ | Expected Points Added per rush attempt adjusted for competition |
Yards/Pass | Yards per pass |
Yards/Rush | Yards per rush |
Yards/Pass+ | Yards per pass adjusted for competition |
Yards/Rush+ | Yards per rush adjusted for competition |
Exp Pass | Explosive plays (25+ yards) per pass |
Exp Run | Explosive plays (25+ yards) per rush |
Exp Pass+ | Explosive plays (25+ yards) per pass adjusted for competition |
Exp Run+ | Explosive plays (25+ yards) per rush adjusted for competition |
Comp% | Completion percentage |
Comp%+ | Completion percentage adjusted for competition |
Sack/Pass | Sacks per pass |
Sack/Pass+ | Sacks per pass adjusted for competition |
Sack/Pass* | Sacks per pass on passing downs |
Int/Pass | Interceptions per pass |
Neg/Run | Negative plays (<=0) per rush |
Neg/Run+ | Negative plays (<=0) per rush adjusted for competition |
Run Short | % Runs in short yardage situations |
Convert% | 3rd/4th down conversions |
Conv%* | 3rd/4th down conversions versus average by distance |
Conv%+ | 3rd/4th down conversions versus average by distance adjusted for competition |
Offensive Possession Stats
Points/Poss | Offensive points per possession |
EP3 | Effective Points per Possession |
EP3+ | Effective Points per Possession adjusted for competition |
Plays/Poss | Plays per possession |
Yards/Poss | Yards per possession |
Start Spot | Average starting field position |
Time of Poss | Average time of possession (in seconds) |
TD/Poss | Touchdowns per possession |
TO/Poss | Turnovers per possession |
FGA/Poss | Attempted field goals per possession |
Poss/Game | Possessions per game |
%RZ | Red zone trips per possession |
Points/RZ | Average points per red zone trip. Field Goals are included using expected points, not actual points. |
TD/RZ | Touchdowns per red zone trip |
FGA/RZ | Field goal attempts per red zone trip |
Downs/RZ | Turnover on downs per red zone trip |
PPP | Points per Possession |
aPPP | Points per Possession allowed |
PPE | Points per Exchange (PPP-aPPP) |
EP3+ | Effective Points per Possession |
aEP3+ | Effective Points per Possession allowed |
EP2E+ | Effective Points per Exchange |
EPA/Pass+ | Expected Points Added per Pass |
EPA/Rush+ | Expected Points Added per Rush |
aEPA/Pass+ | Expected Points Allowed per Pass |
aEPA/Rush+ | Expected Points Allowed per Rush |
Exp/Pass | Explosive Plays per Pass |
Exp/Rush | Explosive Plays per Rush |
aExp/Pass | Explosive Plays per Pass allowed |
aExp/Rush | Explosive Plays per Rush allowed |
BPR | A method for ranking conferences based only on their wins and losses and the strength of schedule. See BPR for an explanation. |
Power | A composite measure that is the best predictor of future game outcomes, averaged across all teams in the conference |
P-Top | The power ranking of the top teams in the conference |
P-Mid | The power ranking of the middling teams in the conference |
P-Bot | The power ranking of the worst teams in the conference |
SOS-Und | Strength of Schedule - Undefeated. Focuses on the difficulty of going undefeated, averaged across teams in the conference |
SOS-BE | Strength of Schedule - Bowl Eligible. Focuses on the difficulty of becoming bowl eligible, averaged across teams in the conference |
Hybrid | A composite measure that quantifies human polls, applied to conferences |
Player Game Log
Use the yellow, red and green cells to filter values. Yellow cells filter for exact matches, green cells for greater values and red cells for lesser values. By default, the table is filtered to only the top 200 defense-independent performances (oEPA). The table includes the 5,000 most important performances (positive and negative) by EPA.
EPA | Expected points added (see glossary) |
oEPA | Defense-independent performance |
Team Game Log
Use the yellow, red and green cells to filter values. Yellow cells filter for exact matches, green cells for greater values and red cells for lesser values.
EP3 | Effective points per possession (see glossary) |
oEP3 | Defense-independent offensive performance |
dEP3 | Offense-independent defensive performance |
EPA | Expected points added (see glossary) |
oEPA | Defense-independent offensive performance |
dEPA | Offense-independent defensive performance |
EPAp | Expected points added per play |
Wednesday, October 31, 2007
Monday, October 29, 2007
Why Some Teams are Good, Part 2 - Population
Obviously, a team has a better chance of landing a recruit if he lives nearby (or, in the case of Joe McKnight, the recruit might end up wishing he had stayed closer to home). In this blog I provide some evidence to support a claim I made in part 1: that a growing population enlarges the talent pool and, therefore, leads to better football teams.
I picked 8 states more or less at random. I tried to include states from a variety of regions, with a variety of sizes and that have experienced a variety of population trends. I have included both Nebraska and Oklahoma, and, honestly, I don't know why.
Ratings come from Soren Sorenson, who you will find listed in the Statistics Hall of Fame. I have added 5000 to all scores so that they are all positive (Sorenson's system ranges from -4000 to +4000, give or take a thousand). Population data is drawn from the Census. Census data is collected every ten years and I have used my own estimates to fill in the gaps.
I have looked at states as a whole, adding together the ratings of all teams in that state, because teams in the same state recruit for players in the same talent pool. For now, I am ignoring population growth in the region (e.g. Georgia benefits from population growth in Florida), and characteristics of the population (e.g. old people in Arizona don't play football), but some day I will look at those issues in more detail.
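The data preparation described above might look something like the sketch below. It rests on assumptions: the function names and sample values are hypothetical, and the straight-line interpolation between census years is a stand-in, since the estimates actually used to fill the gaps are not described.

```python
def state_totals(team_ratings, team_state, offset=5000):
    """Shift every rating by the offset, then add up the teams in each state."""
    totals = {}
    for team, rating in team_ratings.items():
        state = team_state[team]
        totals[state] = totals.get(state, 0) + rating + offset
    return totals


def interpolate_population(census, year):
    """Straight-line estimate between the census years bracketing `year`."""
    years = sorted(census)
    for lo, hi in zip(years, years[1:]):
        if lo <= year <= hi:
            frac = (year - lo) / (hi - lo)
            return census[lo] + frac * (census[hi] - census[lo])
    raise ValueError("year is outside the census range")


# Made-up ratings and populations for a hypothetical state "X".
totals = state_totals({"Team A": 1200, "Team B": -300},
                      {"Team A": "X", "Team B": "X"})
pop_1955 = interpolate_population({1950: 100.0, 1960: 120.0}, 1955)
```

One consequence of summing shifted ratings is that a state gains points simply by adding a program, which is worth keeping in mind for states whose teams did not all exist in 1950.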
So, first we begin in 1950.
The 8 states are Nebraska and Oklahoma, which I already mentioned, Florida, Arizona, New York, Indiana, North Carolina and Alabama. The graphic on the left shows the teams as they were ranked in 1950, color coded by state. Florida State, UCF, USF, and Buffalo did not have Division I programs at the time (or, in some cases, did not have a football team, or just started admitting boys to the school).
This chart is important because, from here on, I will be focusing on indexed values for the state, so that indexed value will always reference back to this starting point. For example, 1950 was a good year for Oklahoma and Army (perhaps the two best teams in the country). This will be important to keep in mind.
This next chart demonstrates an important principle as well. It compares each state's share of all available rating points (with its teams' scores added together) against its share of the US population. So, New York, despite Army's success, was under-performing. Anyone who has been to a high school football game in Dallas and in Rochester knows why this is happening. It shouldn't surprise anyone that Oklahoma performed the best given its population size. Alabama was facing a unique challenge in segregation. It would be another 20 years before Sam "Bam" Cunningham would convince Bear Bryant to integrate, allowing Alabama to dip much deeper into its talent pool.
The population of most states would grow over the next 50 years, but some grew much faster than others. Florida and Arizona are good examples of states that blew up in terms of population, while New York stagnated.
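The comparison in that chart boils down to one ratio, sketched below with a hypothetical function name and made-up numbers (none of these are the 1950 figures). A value above 1 means a state holds a bigger share of the rating points than of the population.

```python
def relative_performance(state_points, total_points, state_population, us_population):
    """Share of rating points divided by share of US population."""
    points_share = state_points / total_points
    population_share = state_population / us_population
    return points_share / population_share


# Illustrative numbers only: 2% of the points with about 1.33% of the population.
ratio = relative_performance(12_000, 600_000, 2_000_000, 150_000_000)
```

In this made-up case the ratio is 1.5, so the state is getting about half again as much football out of its population as the national average.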
In the following charts, I present data for each state in terms of its performance and its population over the 5 decades from 1950 to 2000. The black line is the state's performance. It is a running four-year average, which I use under the assumption that players from a cohort will play for a team for four years. The red line is the indexed population, where 100 is equal to the population in 1950. The blue line is based on the same principle, but represents the percent of the US population living in that state, so that if a state's population is growing, but slower than the entire US, the blue line will fall while the red line rises. The red line, therefore, represents the real talent pool and the blue line the relative talent pool, and because teams are good relative to each other, we should focus on the blue line. (You can click the charts to see a bigger version.)
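For anyone who wants to rebuild the three lines, here is a minimal sketch. The function names and the population figures are made up for illustration, and the running average simply uses as many prior seasons as are available at the start of the series, which is an assumption about how the early years are handled.

```python
def index_to_base(series, base_year=1950):
    """Rescale a {year: value} series so the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


def population_share(state_pop, us_pop):
    """State population as a fraction of the US population, by year."""
    return {year: state_pop[year] / us_pop[year] for year in state_pop}


def running_average(values, window=4):
    """Average of each season with up to window-1 preceding seasons."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# Made-up populations in millions: the state grows, but slower than the US.
state_pop = {1950: 1.0, 2000: 1.5}
us_pop = {1950: 150.0, 2000: 280.0}
red = index_to_base(state_pop)                              # real talent pool
blue = index_to_base(population_share(state_pop, us_pop))   # relative talent pool
black = running_average([4, 8, 6, 2, 10])                   # smoothed performance
```

In this example the red line rises to 150 while the blue line falls to about 80, which is exactly the situation where the two lines tell different stories.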
Nebraska had some kicking teams in the 70's and the mid to late 90's, which shows up in their chart. The population as a percent of the US population was actually going down, but Nebraska kept spitting out world class teams. It makes me think that Osborne may have been a much better coach than we give him credit for. Arizona's performance isn't improving with its rapidly growing population. I think two things are at issue. First, Arizona doesn't have as strong of a football culture as the rest of the South and, second, Arizona's programs might be experiencing a bit of a lag.
I was a little surprised to see how well Indiana fits the pattern. Notre Dame has a unique advantage to recruit nationally and should be able to overcome general demographic shifts. Notre Dame claims their challenges are rooted in high academic standards, so I guess I'll have to look at that claim another day.
Alabama has been generally outplaying its population since the 50's but, like all the others, its performance is generally falling with the decline in its relative population size. The effect of integration on performance is still a little unclear, but something I will definitely look at more closely in the future.
But the overall results from this little experiment are clear--population trends in a region definitely affect the performance of that region's teams. The black lines tend to go wherever the blue lines are going. It also shows that we can't ignore culture, quality of coaches and the power of programs to attract players from long distances.
Sunday, October 28, 2007
Week 9 Rankings
Here are rankings for week 9 from some of the major polls. I should have my own rankings in the next week or two for comparison.
The image on the left (click to see a larger version) contains rankings from a number of different polls. Below are rankings from the major polls.
Team | Mine | BCS | Coaches | AP | Sagarin | Massey |
Ohio St | . | 1 | 1 | 1 | 3 | 1 |
Boston College | . | 2 | 2 | 2 | 10 | 7 |
LSU | . | 3 | 3 | 3 | 1 | 2 |
Arizona St | . | 4 | 7 | 7 | 4 | 6 |
Oregon | . | 5 | 5 | 5 | 6 | 4 |
Oklahoma | . | 6 | 4 | 4 | 9 | 11 |
West Virginia | . | 7 | 6 | 6 | 7 | 5 |
Virginia Tech | . | 8 | 9 | 8 | 20 | 10 |
Kansas | . | 9 | 10 | 12 | 2 | 3 |
South Florida | . | 10 | 12 | 11 | 5 | 9 |
Florida | . | 11 | 11 | 9 | 8 | 8 |
USC | . | 12 | 8 | 9 | 24 | 20 |
Missouri | . | 13 | 13 | 13 | 13 | 12 |
Kentucky | . | 14 | 15 | 14 | 12 | 14 |
Virginia | . | 15 | 18 | 21 | 22 | 18 |
South Carolina | . | 16 | 17 | 15 | 14 | 15 |
Hawaii | . | 17 | 14 | 16 | 48 | 46 |
Georgia | . | 18 | 19 | 20 | 19 | 16 |
Texas | . | 19 | 16 | 17 | 31 | 34 |
Michigan | . | 20 | 21 | 19 | 21 | 21 |
California | . | 21 | 20 | 18 | 17 | 19 |
Auburn | . | 22 | 23 | 23 | 11 | 13 |
Connecticut | . | 23 | | | 16 | 23 |
Alabama | . | 24 | 24 | 22 | 25 | 17 |
Penn St | . | 25 | 22 | 24 | 29 | 24 |
Wake Forest | . | 26 | | | 28 | 31 |
UCLA | . | 27 | | | 15 | 25 |
Rutgers | . | 28 | | 25 | 23 | 35 |
Boise St | . | 29 | | | 45 | 49 |
Purdue | . | 30 | | | 32 | 38 |
Texas A&M | . | 31 | | | 34 | 42 |
Georgia Tech | . | 32 | | | 26 | 26 |
Tennessee | . | 33 | | | 35 | 27 |
Oklahoma St | . | 34 | | | 30 | 33 |
Maryland | . | 35 | | | 43 | 44 |
Clemson | . | 36 | | | 38 | 32 |
Wisconsin | . | 37 | 25 | | 53 | 50 |
Air Force | . | 38 | | | 50 | 48 |
Illinois | . | 39 | | | 44 | 43 |
Kansas St | . | 40 | | | 18 | 22 |
BYU | . | 41 | | | 40 | 36 |
Texas Tech | . | 42 | | | 33 | 30 |
Oregon St | . | 43 | | | 36 | 40 |
Michigan St | . | 44 | | | 41 | 47 |
Florida St | . | 45 | | | 42 | 39 |
Cincinnati | . | 46 | | | 27 | 29 |
Miami FL | . | 47 | | | 47 | 41 |
Vanderbilt | . | 48 | | | 46 | 37 |
Colorado | . | 49 | | | 39 | 51 |
Navy | . | 50 | | | 57 | 62 |
Nebraska | . | 51 | | | 61 | 69 |
Troy | . | 52 | | | 63 | 45 |
Arkansas | . | 53 | | | 37 | 28 |
New Mexico | . | 54 | | | 59 | 56 |
Fresno St | . | 55 | | | 58 | 52 |
Utah | . | 56 | | | 52 | 53 |
Mississippi St | . | 57 | | | 60 | 55 |
Northwestern | . | 58 | | | 68 | 63 |
Wyoming | . | 59 | | | 67 | 71 |
Washington | . | 60 | | | 49 | 57 |
Bowling Green | . | 61 | | | 69 | 72 |
Stanford | . | 62 | | | 56 | 59 |
Indiana | . | 63 | | | 55 | 58 |
East Carolina | . | 64 | | | 71 | 65 |
Ball St | . | 65 | | | 66 | 68 |
FL Atlantic | . | 66 | | | 75 | 70 |
UCF | . | 67 | | | 64 | 64 |
C Michigan | . | 68 | | | 74 | 73 |
Pittsburgh | . | 69 | | | 62 | 66 |
Louisville | . | 70 | | | 51 | 54 |
North Carolina | . | 71 | | | 54 | 61 |
Houston | . | 72 | | | 65 | 60 |
Tulsa | . | 73 | | | 81 | 74 |
TCU | . | 74 | | | 70 | 67 |
Miami OH | . | 75 | | | 80 | 79 |
UTEP | . | 76 | | | 86 | 85 |
Akron | . | 77 | | | 82 | 88 |
Notre Dame | . | 78 | | | 78 | 77 |
W Kentucky | . | 79 | | | 84 | 87 |
Iowa | . | 80 | | | 77 | 80 |
Duke | . | 81 | | | 72 | 84 |
NC State | . | 82 | | | 73 | 76 |
Army | . | 83 | | | 95 | 102 |
Mississippi | . | 84 | | | 76 | 75 |
Kent | . | 85 | | | 98 | 98 |
Southern Miss | . | 86 | | | 93 | 82 |
Baylor | . | 87 | | | 96 | 95 |
Nevada | . | 88 | | | 85 | 86 |
New Mexico St | . | 89 | | | 101 | 96 |
Temple | . | 90 | | | 99 | 100 |
Syracuse | . | 91 | | | 97 | 90 |
Washington St | . | 92 | | | 91 | 81 |
Middle Tenn St | . | 93 | | | 83 | 83 |
W Michigan | . | 94 | | | 89 | 92 |
Buffalo | . | 95 | | | 94 | 101 |
Arizona | . | 96 | | | 79 | 78 |
San Jose St | . | 97 | | | 102 | 93 |
Louisiana Tech | . | 98 | | | 92 | 99 |
Toledo | . | 99 | | | 105 | 111 |
Minnesota | . | 100 | | | 87 | 103 |
San Diego St | . | 101 | | | 88 | 89 |
Arkansas St | . | 102 | | | 103 | 97 |
UNLV | . | 103 | | | 100 | 94 |
Ohio | . | 104 | | | 106 | 109 |
UAB | . | 105 | | | 108 | 105 |
Memphis | . | 106 | | | 107 | 104 |
LA Monroe | . | 107 | | | 110 | 106 |
Iowa St | . | 108 | | | 104 | 110 |
Colorado St | . | 109 | | | 90 | 91 |
Tulane | . | 110 | | | 111 | 107 |
E Michigan | . | 111 | | | 109 | 108 |
Rice | . | 112 | | | 115 | 112 |
North Texas | . | 113 | | | 120 | 116 |
Florida Intl | . | 114 | | | 118 | 120 |
LA Lafayette | . | 115 | | | 116 | 117 |
SMU | . | 116 | | | 119 | 119 |
N Illinois | . | 117 | | | 114 | 114 |
Marshall | . | 118 | | | 112 | 113 |
Idaho | . | 119 | | | 117 | 118 |
Utah St | . | 120 | | | 113 | 115 |
Thursday, October 25, 2007
Week 9 Picks and Prediction Model (PM) 3.0
This week I will start with picks, and then describe the prediction model I used to generate the picks below. I'm also getting a big head so I thought someone might be interested in my own picks--and that way we can see if I'm smarter than my own computer. The prediction model (PM 3.0 this week) and I will go head to head on 5 games a week, picking winners and against the spread, and then I will also post PM 3.0's picks for the rest of Division I-A (aka FBS).
If you are interested in spreads, covers.com is the place to go. I have included the handicap for the home team in parentheses.
Game 1. Ohio State @ Penn State (+4)
I don't think this game will be as close as it looks like it should be. Sure, it's in Happy Valley, and, sure, Ohio State and Penn State statistically look very similar--except in one very important area, the win/loss record. Watch for Morelli to crack like Woodson at SC, and OSU will win this walking away.
Me:
To Win: Ohio State
Against the Spread: Ohio State
PM 3.0:
To Win: Ohio State
Against the Spread: Ohio State
Game 2. West Virginia @ Rutgers (+6.5)
Again, the better team is on the road. Pat White will be healthy (or as healthy as he ever is) and West Virginia will be flying around the field again. It is important in this game to consider match ups. South Florida beat WV (at home) because they had the speed on defense to contain Slaton and White. Rutgers beat South Florida (at home) because that speed didn't translate well when Rice was slamming it down their throats. Rutgers, so far, has been a flat, uninspiring team with the exception of one Thursday night. West Virginia will break it open in the second half and score too many points for Rice to keep up.
Me:
To Win: West Virginia
Against the Spread: West Virginia
PM 3.0:
To Win: Rutgers
Against the Spread: Rutgers
Game 3. South Florida @ Connecticut (+4.5)
I have included this game only because I can. Who would have predicted at the beginning of the year that this game would pit two ranked, one-loss teams against each other with Big East title hopes alive? But seriously, I can't get myself to believe that UConn has a good team--when has Connecticut ever produced a good athlete? And I'm not the only one to think this.[1] South Florida is definitely the better team, but cold weather and inexperience may slow them down. They still win easily.
Me:
To Win: South Florida
Against the Spread: South Florida
PM 3.0:
To Win: South Florida
Against the Spread: South Florida
Game 4. USC @ Oregon (-3)
I was worried that Mark Sanchez off the bench might give SC the spark they needed to be a good football team again. Fortunately, he's not everything he was supposed to be. It looks like Booty's finger will be well enough and he will lead his team to another mediocre performance. A note on USC--their big victories are against Nebraska (cupcake) and Notre Dame (wedding cake). They lost to Stanford (cheese puff) and almost lost to Arizona (lost little child). Oregon's beat down of Michigan was impressive, but that was a Michigan team that was still recovering from the week 1 train wreck. Both teams are talented, but with PAC-10 talent--either could win by 30 or flake out and lose to my high school team. I take USC, because they have more raw talent to start with.
Me:
To Win: USC
Against the Spread: USC
PM 3.0:
To Win: USC
Against the Spread: USC
Game 5a. Boston College @ Virginia Tech (-3)
See Game 4. Two teams that have not been all that impressive, but, to their credit, they have been winning a lot of games. I'm taking Virginia Tech to knock off the first top 10 team this weekend on Thursday, but it will be close.
Me:
To Win: Virginia Tech
Against the Spread: Boston College
PM 3.0:
To Win: Boston College
Against the Spread: Boston College
Game 5b. Kansas @ Texas A&M (+2.5)
I had to include this game for a number of reasons. First, this might be Kansas's only weekend in the top 10, so we must take a moment to recognize it. Second, I would like to note that Kansas is actually very good and undefeated for a reason (the same reason that BC is undefeated but without the same level of respect). After Saturday Kansas will have two cupcakes (Iowa State and Nebraska) and a road game in Stillwater before the final match up against Missouri. Finally, I have included this game so I can point out that, while they are getting no love from the national media, the Aggies have only lost twice and they are tied for first in the South. The outcome of the game depends on the Aggie passing game. If Kansas can put 8 in the box all night, they win and cover the spread; if not, and A&M burns the secondary a time or two at Kyle Field, it could be very interesting. One last quick note on Kansas--they have covered the last five weeks.
Me:
To Win: I abstain
Against the Spread: I abstain
PM 3.0:
To Win: Kansas
Against the Spread: Kansas
The Rest: Click image to see a legible version
It lists, for each team, the probability of winning, the probability of beating the spread, and the projected yards. A team with a better than 50% chance of winning is "favored".
PM 3.0
Prediction Model 3.0 is my first model to account for match ups. The method I have chosen to do this is too simple, but I'm building on trial and error for now. The basic idea is that teams have relatively consistent run-to-pass ratios. I use time of possession and plays per second to estimate how many plays a team will have in a game (adjusted for how long their opponent will have the ball) and then estimate the number of run and pass plays each team will run. Using their average yards per run play, completion percentage, and average yards per completion, and adjusting for the other team's defensive strengths, I can get a figure for the total yards a team should have. I then use the basic rating system I used in PM 2.11 and give a bonus to the team that will generate more yards.
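The yardage estimate described above can be sketched roughly as follows. This is only an illustration in Python, not the actual PM 3.0 code: the names `Offense`, `Defense`, and `estimated_yards` are made up for the example, and the defensive adjustment is assumed to be a simple multiplier (1.0 for an average defense), since the post does not say how the adjustment is done. Incomplete passes are assumed to gain zero yards.

```python
from dataclasses import dataclass


@dataclass
class Offense:
    possession_seconds: float    # expected time of possession, already adjusted for the opponent
    plays_per_second: float      # how quickly the team runs plays
    run_share: float             # fraction of plays that are runs (0..1)
    yards_per_run: float         # average yards per run play
    completion_pct: float        # completion percentage as a fraction (0..1)
    yards_per_completion: float  # average yards per completed pass


@dataclass
class Defense:
    # Hypothetical multipliers: below 1.0 means the defense allows fewer
    # yards than average, above 1.0 means it allows more.
    run_factor: float = 1.0
    pass_factor: float = 1.0


def estimated_yards(offense: Offense, opp_defense: Defense) -> float:
    """Estimate total yards for one team against one defense."""
    plays = offense.possession_seconds * offense.plays_per_second
    run_plays = plays * offense.run_share
    pass_plays = plays - run_plays

    run_yards = run_plays * offense.yards_per_run * opp_defense.run_factor
    pass_yards = (pass_plays * offense.completion_pct
                  * offense.yards_per_completion * opp_defense.pass_factor)
    return run_yards + pass_yards


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    team = Offense(1800.0, 0.04, 0.5, 4.0, 0.6, 12.0)
    print(round(estimated_yards(team, Defense(run_factor=0.9, pass_factor=1.1)), 1))
```

Keeping the run and pass pieces separate is what lets a run-heavy team be penalized more by a good run defense than a pass-heavy team would be.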
In these circumstances, the only real variables that I have to decide on are the adjustments I will be using. I have decided to use a k of 3/sqrt(1+t) where t is the week in which the game took place. The figure, therefore, should stabilize as the season progresses, which I believe mirrors reality.
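To see how quickly that k shrinks, here is a small Python sketch of the formula 3/sqrt(1+t). The function name `k_factor` is just for illustration, and reading k as the size of the adjustment applied after each game is an assumption, since the post does not spell out how k enters the rating update.

```python
import math


def k_factor(week: int) -> float:
    """Return k = 3 / sqrt(1 + t) for a game played in week t."""
    return 3.0 / math.sqrt(1.0 + week)


if __name__ == "__main__":
    # k falls from week to week, so later games move the figure less.
    for week in range(1, 13):
        print(week, round(k_factor(week), 3))
```

By week 8 the value is a third of what it would be at t = 0, which is the sense in which the figure settles down as the season goes on.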
The adjustment of the rating is Rating + 10*(team yards/opponent yards). I chose ten rather at random, but it really means that in the most extreme cases a team may have 5 to 10 points added to their estimated margin of victory.
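As a quick check on that "5 to 10 points" figure, the adjustment can be written out in Python. The function names are hypothetical, and the predicted margin is assumed here to be the difference between the two adjusted ratings, which the post implies but does not state outright.

```python
def adjusted_rating(rating: float, team_yards: float, opp_yards: float,
                    weight: float = 10.0) -> float:
    """Rating + 10 * (team yards / opponent yards), with the 10 as a tunable weight."""
    return rating + weight * (team_yards / opp_yards)


def predicted_margin(rating_a: float, rating_b: float,
                     yards_a: float, yards_b: float) -> float:
    """Assumed margin for team A: difference of the two adjusted ratings."""
    return (adjusted_rating(rating_a, yards_a, yards_b)
            - adjusted_rating(rating_b, yards_b, yards_a))


if __name__ == "__main__":
    # Two equally rated teams, one projected to outgain the other 450 to 300.
    print(round(predicted_margin(20.0, 20.0, 450.0, 300.0), 2))
```

With equal ratings and a 450-to-300 yardage edge, the bonus works out to 10 * (1.5 - 0.667), or about 8.3 points, which sits inside the range quoted above. When the projected yards are equal, the bonus cancels and the margin is just the rating difference.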
The problem with a prediction model that adjusts for match ups is that it cannot be used to rank teams. In a rating system it is necessary that if A>B>C then A>C, but if we take match ups into account then if A>B>C it is still possible that C>A if A matches up poorly against C and well against B. This means that it can't be used for ranking teams, but only for predicting the winner if two teams play. I have thought up a method of getting around that, but programming it will take some time.
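A toy Python example makes the A>B>C but C>A situation concrete. All the numbers are invented: three teams with identical base ratings and hypothetical projected yards for each pairing, run through the Rating + 10*(team yards/opponent yards) adjustment.

```python
# Hypothetical base ratings: all three teams are equal before match ups.
BASE_RATING = {"A": 20.0, "B": 20.0, "C": 20.0}

# Hypothetical projected yards for each pairing: (first team, second team).
PROJECTED_YARDS = {
    ("A", "B"): (420.0, 350.0),
    ("B", "C"): (400.0, 340.0),
    ("C", "A"): (410.0, 360.0),
}

CYCLE = [("A", "B"), ("B", "C"), ("C", "A")]


def margin(first: str, second: str) -> float:
    """Predicted margin for `first` over `second` after the yardage bonus."""
    yards_first, yards_second = PROJECTED_YARDS[(first, second)]
    rating_first = BASE_RATING[first] + 10.0 * (yards_first / yards_second)
    rating_second = BASE_RATING[second] + 10.0 * (yards_second / yards_first)
    return rating_first - rating_second


if __name__ == "__main__":
    for first, second in CYCLE:
        print(f"{first} over {second} by {margin(first, second):.1f}")
```

Every pairing in the loop favors the first team, so A beats B, B beats C, and C beats A. No single ordering of the three teams agrees with all three predictions, which is why the model can pick games but not produce a ranking.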