Random Walker Rankings for NCAA Football
How Well Can Monkeys Rank Football Teams?
We've all experienced befuddlement upon perusing the NCAA Division I-A college football Bowl Championship Series (BCS) standings, given the seemingly divine inspiration that must have gone into their determination. The relatively small number of games played among a large number of teams makes any ranking immediately suspect because of the dearth of head-to-head information. Perhaps you've even wondered whether a bunch of monkeys could have ranked the football teams as well as the expert polls of coaches and sportswriters and the complicated statistical ranking algorithms.
We had these thoughts, so we set out to test this hypothesis, although with simulated monkeys (random walkers) rather than real ones.
Each of our simulated "monkeys" gets a single vote to cast for the "best" team in the nation, making his decision based on only one simple guideline: he periodically looks up the win-loss outcome of a single game played by his favorite team and flips a weighted coin to determine whether to change his allegiance to the other team. To make this process even modestly reasonable, the coin is weighted so that the monkey's allegiance and vote are more likely to go with the team that won the head-to-head contest. For instance, the weighting might be chosen so that 75% (say) of the time the monkey changes his vote to go with the winner of the game, leaving only a 25% chance of voting for the loser.
Each monkey starts by voting for a randomly chosen team. He then meanders around a network describing the collection of teams, randomly changing allegiance from one team to another along connections representing games played between those two teams that year. It's a simple process: if the outcome of the weighted coin flip indicates that he should cast his vote for the opposing team, the monkey stops cheerleading for the old team and moves to the site in the network representing his new favorite. While we let the monkeys change their minds over and over again (indeed, a single monkey voter will forever be changing his vote in this scheme), the percentage of votes cast for each football team quickly stabilizes. By looking at the fraction of monkeys voting for each team, we obtain rankings each week of the season, and at the end of the season, based on the games played to that point.
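The voting process described above can be sketched in a short simulation. What follows is a minimal illustration, not our actual code: the four teams and game results are entirely hypothetical, and the 75% coin weighting is simply the example value mentioned above.

```python
import random

# Hypothetical mini-season: each game is recorded as (winner, loser).
games = [("Miami", "FSU"), ("Miami", "Pitt"),
         ("FSU", "Pitt"), ("Pitt", "Temple"),
         ("FSU", "Temple")]

P_WIN = 0.75  # probability the monkey's vote goes with the game's winner

# Build each team's schedule: a list of (opponent, won?) pairs.
schedule = {}
for winner, loser in games:
    schedule.setdefault(winner, []).append((loser, True))
    schedule.setdefault(loser, []).append((winner, False))

def rank(steps=200000, seed=0):
    """Follow one monkey for many steps; long-run vote fractions rank teams."""
    rng = random.Random(seed)
    team = rng.choice(list(schedule))          # start with a random team
    counts = {t: 0 for t in schedule}
    for _ in range(steps):
        opponent, won = rng.choice(schedule[team])   # look up one random game
        goes_to_winner = rng.random() < P_WIN        # flip the weighted coin
        # Move only when the coin points away from the current favorite.
        if (won and not goes_to_winner) or (not won and goes_to_winner):
            team = opponent
        counts[team] += 1
    total = sum(counts.values())
    return sorted(((counts[t] / total, t) for t in schedule), reverse=True)

for frac, team in rank():
    print(f"{team:7s} {frac:.3f}")
```

With this made-up schedule, the undefeated team collects the largest share of the votes, and the winless team the smallest, just as the text describes: strength of schedule enters automatically because walkers pile up on teams that beat other well-visited teams.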
By mathematically analyzing how these simulated monkeys behave, we examined the resulting rankings for the past 33 seasons of Division I-A football. The calculations involved are related to a class of so-called "direct methods" of ranking, but the interpretation in terms of random walkers appears to be novel. Under this system, winning games is directly rewarded, and strength of schedule is automatically incorporated because games played against highly-ranked opponents lead to more monkeys inquiring about, and making decisions based on, the outcomes of those games. Armed only with the single simple rule of more often voting for the winner of a game than for the loser, the top few teams determined by total vote counts are typically quite reasonable. For instance, it is no surprise that the pre-bowl monkey rankings at the end of the 2002 season choose Miami and Ohio State as the top two teams, nor that they pick Miami as the top team in 2001 and Oklahoma in 2000 (all four were major undefeated teams).
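For readers curious about the connection to direct methods: rather than simulating walkers one step at a time, one can write down the transition probabilities of the random walk and iterate the vote distribution until it stabilizes (a power iteration for the stationary vector). The sketch below reuses the same hypothetical four-team schedule as above; it illustrates the idea and is not our actual implementation.

```python
# Hypothetical mini-season, recorded as (winner, loser) pairs.
games = [("Miami", "FSU"), ("Miami", "Pitt"),
         ("FSU", "Pitt"), ("Pitt", "Temple"),
         ("FSU", "Temple")]
P_WIN = 0.75  # probability a walker's vote goes with a game's winner

teams = sorted({t for g in games for t in g})
idx = {t: i for i, t in enumerate(teams)}
n = len(teams)

# games_of[i] = list of (opponent index, won?) for team i.
games_of = [[] for _ in range(n)]
for w, l in games:
    games_of[idx[w]].append((idx[l], True))
    games_of[idx[l]].append((idx[w], False))

# T[i][j]: probability a walker at team i sits at team j after one inquiry.
# A walker picks one of team i's games uniformly, then stays with probability
# P_WIN if i won that game (1 - P_WIN if i lost), else moves to the opponent.
T = [[0.0] * n for _ in range(n)]
for i in range(n):
    k = len(games_of[i])
    for j, won in games_of[i]:
        stay = P_WIN if won else 1.0 - P_WIN
        T[i][i] += stay / k
        T[i][j] += (1.0 - stay) / k

# Power iteration: repeatedly apply T until the vote distribution stabilizes.
pi = [1.0 / n] * n
for _ in range(1000):
    pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]

for frac, team in sorted(zip(pi, teams), reverse=True):
    print(f"{team:7s} {frac:.3f}")
```

Solving for the stationary vector this way gives exactly the vote fractions that the simulated monkeys settle into, without any simulation noise; this is the sense in which the scheme is a direct method in random-walker clothing.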
More intriguing are the differences between the #2 teams in the pre-bowl monkey rankings and the BCS standings at the ends of the controversial 2000 and 2001 seasons. The results of the monkey ranking system depend on the precise weighting of the flipped coin, but over a wide range of weightings the monkeys select Tennessee to play Miami for the championship at the end of 2001 and Washington to play Oklahoma for the championship at the end of 2000. Both selections are mildly surprising, as neither team was commonly backed in the controversies at the ends of those seasons. The pre-bowl monkey rankings select these two teams in part because our simple rule includes neither information about the dates of games played (no special weight on Tennessee losing the SEC Championship game to LSU) nor margin of victory (Washington won a number of close games). We could have incorporated date of game and margin of victory by modifying the weighting of the coin according to some formula describing these factors, but such redefinitions would require essentially arbitrary choices about how strongly to weight them. In the face of such potential arbitrariness, we prefer the simpler system, with just a single parameter determining the probability of going with a game's winner.
Also important in these controversial #2 selections are the relatively high ratings that the monkey votes accord to the top teams in the SEC in 2001 and the Pac 10 in 2000, in part because all 7 losses by the top three teams in the 2001 SEC were in conference, while 3 of the 4 losses by the top three teams in the 2000 Pac 10 were in conference.
One might very well argue over whether such selections are correct in any sense. We do not claim that this method is superior to any other; rather, our interest was to develop and study a deliberately simple ranking system. Rather than rating the teams directly, the random-walking monkeys are a simplistic behavioral model for voters who get to choose who they believe is the top team. Any arguments about who "should" have been picked for the National Championship game in controversial years remain inconclusive, underscoring the fundamental difficulty of ranking college football teams based on the relatively small number of games played. We should also emphasize that the scheme is likely skewed towards distinguishing the top teams, as opposed to separating, say, #31 from #32, since each random-walking monkey has only a single vote. It may seem ironic that a group of mathematicians would prefer the easier-to-describe algorithm; but in the absence of more complete information (remember that we use only the win-loss outcome of each game), we prefer this simple ranking system of coin-flipping, random-walking monkey voters, with only one number (the weighting of the coin) that needs to be selected.
The virtue of this ranking system lies in its relative ease of explanation, while its performance is arguably on par with the expert polls and the (typically more complicated) computer algorithms employed by the BCS. Can a bunch of monkeys rank football teams as well as the systems in use now? Perhaps they can. But how did they fare in 2003? 2004? 2005? 2006? How are they doing in 2007?
THIS PAGE IS NEITHER A PUBLICATION OF THE UNIVERSITY OF NORTH CAROLINA (UNC) NOR THE GEORGIA INSTITUTE OF TECHNOLOGY (GT), WHERE THIS WORK BEGAN. NEITHER UNC NOR GT ARE RESPONSIBLE FOR EDITING OR EXAMINING ITS CONTENT. THE AUTHOR OF THIS PAGE IS SOLELY RESPONSIBLE FOR THE CONTENT. THE RIGHTS TO ANY AND ALL MATERIALS CREATED BY THE AUTHOR OF THIS PAGE ARE RETAINED BY THAT AUTHOR.