Saturday, October 24, 2009

Got Math?

On a day full of exciting action, including a last-second blocked FG attempt that may turn out to have serious BCS implications, it may seem rather pedestrian to ask a math question. Then again, that's essentially what we do here. So while we're watching the rest of the games, I have a question, brought to my attention by another football ranking fan, Martien Maas.

Martien Maas' Rating System also appears on Kenneth Massey's College Football Ranking Comparison page. Perhaps in part because we ended up very close to each other in the comparisons this week, Martien noted that the RWFL rank order this week is precisely the same as that from Eugene Potemkin's E-Rating System (see also his more detailed discussion). Indeed, the two are the same nearly every week (with some exceptions from last year, including here, here, and here). And there are clearly some philosophical similarities between the two rankings. But I haven't sat down to try to work out whether we're mathematically equivalent, so I'd be happy to hear from anyone with an expert opinion here. My gut instinct is that our p=0.75 bias value choice happens to reduce our rankings to the same linear algebra problem, with perhaps the small differences in the past due to details about how non-FBS teams are handled. But, like I said, I haven't looked at it sufficiently yet. Nevertheless, I thought it was worth mentioning...

----

Addition (October 25): Of course, while I tried to leave this puzzle for others, I couldn't let it go myself. I can never resist a good puzzle. It's probably a good thing that I get to solve puzzles for a living. Plus I received an email from Eugene Potemkin responding to a query I sent him directly.

Eugene and I had a wonderfully pleasant exchange of emails today, in which he shared some of the mathematical details of his E-Rating implementation for college football, including: (1) Whereas he uses ratios of "ratings" and "anti-ratings" to obtain scores in other sports, he uses a difference for American college football (this is the same as the "First-minus-Last" part in RWFL). (2) Like us, he usually treats the collection of all non-FBS teams as effectively one team. (3) To get around the singular nature of random walks on the fully directed graph (sorry for the lingo here, but be thankful I'm not using it to launch into an entire discussion of how this relates to the original PageRank algorithm!), he doesn't treat a win as a full win; rather, he counts each win as effectively 3 wins and 1 loss. This is identical to the "bias value" p=0.75 choice that we've espoused here (3 wins out of 4 is a fraction of 0.75), which is nice for a variety of reasons. So it appears that RWFL(p=0.75) and the E-Ratings are mathematically identical, and the minor differences in past rankings must come down to small round-off or tie-breaking differences.
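
For anyone who wants to poke at the "3 wins and 1 loss" equivalence numerically, here is a minimal sketch. It is not the actual RWFL or E-Rating code. It assumes a simplified random walker who, at each step, picks one of its current team's games uniformly at random and ends up with that game's winner with probability p. The four-team schedule and the function names (transition_matrix, expand_games, stationary) are made up for illustration, and the sketch only computes the walker's long-run distribution, not the "First-minus-Last" difference.

```python
# Toy check: a walker with bias p = 0.75 on the real games behaves the same
# as an unbiased (p = 1) walker on a schedule where every game is replaced
# by 3 wins for the real winner and 1 win for the real loser.
# Assumption: simplified walker model, hypothetical schedule.

def transition_matrix(games, teams, p):
    """Row-stochastic matrix: the walker picks one of its team's games at
    random, then ends up with that game's winner with probability p."""
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    played = [0] * n
    for winner, loser in games:
        played[idx[winner]] += 1
        played[idx[loser]] += 1
    P = [[0.0] * n for _ in range(n)]
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        P[l][w] += p / played[l]          # walker on the loser moves to the winner
        P[l][l] += (1 - p) / played[l]    # ...or stays with the loser
        P[w][w] += p / played[w]          # walker on the winner stays put
        P[w][l] += (1 - p) / played[w]    # ...or defects to the loser
    return P

def expand_games(games, wins=3, losses=1):
    """Replace each game by `wins` copies won by the real winner and
    `losses` copies won by the real loser."""
    expanded = []
    for winner, loser in games:
        expanded += [(winner, loser)] * wins + [(loser, winner)] * losses
    return expanded

def stationary(P, iters=2000):
    """Long-run fraction of walkers on each team, by power iteration."""
    n = len(P)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
    return v

teams = ["A", "B", "C", "D"]                      # hypothetical teams
games = [("A", "B"), ("B", "C"), ("C", "A"),      # (winner, loser) pairs
         ("A", "D"), ("B", "D")]

P_biased = transition_matrix(games, teams, p=0.75)
P_expanded = transition_matrix(expand_games(games), teams, p=1.0)

v_biased = stationary(P_biased)
v_expanded = stationary(P_expanded)

for team, a, b in zip(teams, v_biased, v_expanded):
    print(team, round(a, 6), round(b, 6))
```

Under these assumptions the two transition matrices agree entry by entry (up to floating-point round-off), so the two columns printed for each team should match. Any other split of a win into fractional wins and losses would correspond to a different value of p in the same way.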

Again, a huge thanks to both Martien and Eugene. It's been nice emailing with both of them.

Going forward, we still have value to add, don't worry. For instance, we should spend a lot more time in future posts looking at the plots I post every week that show the top rankings across different choices of this infamous "bias value" p, because those plots are quite useful as a proxy for various kinds of ranking choices.
