I thought I’d take the opportunity to trace through the general strategy here, both to get my brain clear for the season and to review for whoever is reading.

We start out with all the wins and losses for the season so far. The first design choice I made was to ignore points and every stat other than wins and losses. This is the conceit of the site and the idea. We focus only on wins, losses, and who beat whom – and then, usually around Week 4, we start seeing beatloops, like A->B->C->A.

It may be that the initial beatloops are very long, or very short.
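To make the beatloop idea concrete, here’s a toy sketch (not the site’s actual code): each game outcome becomes a directed winner-beats-loser edge, and a beatloop is just a directed cycle. The team names, sample results, and brute-force search are all illustrative assumptions; at real 32-team scale you’d want a proper cycle-enumeration algorithm such as Johnson’s.

```python
from itertools import permutations

# Hypothetical sample outcomes: each pair means "winner beat loser".
beats = {("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")}

def find_beatloops(beats, max_len=6):
    """Enumerate beatloops (directed cycles) of 2..max_len teams.

    Brute force over orderings - fine for a toy example, far too
    slow for a real season's worth of games.
    """
    teams = sorted({t for edge in beats for t in edge})
    loops = set()
    for n in range(2, max_len + 1):
        for combo in permutations(teams, n):
            if combo[0] != min(combo):
                continue  # keep one canonical rotation of each cycle
            edges = set(zip(combo, combo[1:] + combo[:1]))
            if edges <= beats:
                loops.add(combo)
    return sorted(loops, key=len)

print(find_beatloops(beats))  # → [('A', 'B', 'C')]
```

The sort by length means the shortest beatloops surface first, which lines up with the removal order the system uses.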

The next design choice that I made long ago was to remove the smallest beatloops first. I did this for both philosophical and expedient reasons. First, the smallest beatloop is a season split – where two teams beat each other. What this says to me is that we are unable to determine which team is better by looking at only those two game outcomes, so we throw them out. And so after all team splits are removed, I remove remaining 3-team beatloops, and then the remaining 4-team beatloops, etc.
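The smallest-first removal itself can be sketched under the same toy assumptions (winner->loser edges, hypothetical team names): delete every game in any 2-team split, then every game in any remaining 3-team beatloop, and so on. Since removing edges can never create a new loop, one pass up through the loop sizes suffices.

```python
from itertools import permutations

def loops_of_length(beats, n):
    """All beatloops on exactly n teams (one canonical rotation each)."""
    teams = sorted({t for e in beats for t in e})
    found = []
    for combo in permutations(teams, n):
        if combo[0] != min(combo):
            continue  # skip duplicate rotations of the same cycle
        edges = set(zip(combo, combo[1:] + combo[:1]))
        if edges <= beats:
            found.append(edges)
    return found

def remove_smallest_loops_first(beats):
    """Remove all 2-loops, then remaining 3-loops, then 4-loops, etc."""
    beats = set(beats)
    teams = {t for e in beats for t in e}
    for n in range(2, len(teams) + 1):
        for loop_edges in loops_of_length(beats, n):
            beats -= loop_edges  # drop every game in the loop
    return beats

# An A<->B season split plus a B->C->D->B loop; C->E sits in no loop.
games = {("A", "B"), ("B", "A"), ("B", "C"),
         ("C", "D"), ("D", "B"), ("C", "E")}
print(remove_smallest_loops_first(games))  # → {('C', 'E')}
```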

This is arguable, as you may have A->B->C->A and A->B->D->E->A, and C->A might be spectacularly flukey. Removing the smallest beatloop first would yield B->D->E->A. Removing the flukey game and then the longer beatloop would retain B->C. If C->A is flukey, I think you can make a stronger case for B->C than for B->A. But even then, the question is how do you determine that C->A is flukey?

The above paragraph is a good example of the pretzels we enjoy tying our brains up in during the NFL season.

The expedient reason was that removing smaller beatloops first tends to straighten out the longer beatloops. If you don’t remove the smaller ones first, things get so circular, with so many different routes, that any other scheme for prioritizing loop removal involves even more ambiguous tradeoffs.

The other main tradeoff with this approach is that one game can sit in several shared beatloops, so removing those loops eliminates several other game outcomes along with it. This has been the source of much of our commentary in previous seasons – whether we see these games as flukey games or key games. I have never been convinced that a game appearing in more shared beatloops means that it is flukey. But we have developed a few theories (check the comments here, for instance) on how to identify particular links (game outcomes) to remove, thereby de-emphasizing whole-loop removal in favor of breaking loops into segments that we keep.

At any rate, all this discussion is to get down to one data structure, a directed acyclic graph (DAG).

From there, it’s about finding a power ranking. The general approach is to examine all teams that don’t have a beatloss, apply a tie-breaker, pick one, remove it from the graph, and then examine all remaining teams that don’t have a beatloss (which may now include teams whose only beatloss was against the team removed in the previous step). We’ve also discussed many tiebreaker methods, from referring to the previous week’s rankings to other more complex methods.
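That ranking step amounts to a topological sort with a tiebreaker (essentially Kahn’s algorithm). In this sketch, alphabetical order stands in for the site’s real tiebreakers, such as last week’s ranking; the edges and team names are made up.

```python
def power_ranking(dag_edges, teams, tiebreak=None):
    """Rank by repeatedly pulling out a team with no remaining beatloss.

    Assumes the beatloops have already been removed (the graph is a
    DAG), so there is always at least one candidate each round.
    """
    tiebreak = tiebreak or (lambda team: team)  # default: alphabetical
    edges = set(dag_edges)
    remaining = set(teams)
    ranking = []
    while remaining:
        losers = {loser for (_, loser) in edges}
        candidates = remaining - losers  # teams with no beatloss left
        pick = min(candidates, key=tiebreak)
        ranking.append(pick)
        remaining.discard(pick)
        # removing the picked team may free up new no-beatloss teams
        edges = {(w, l) for (w, l) in edges if w != pick}
    return ranking

edges = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}
print(power_ranking(edges, {"A", "B", "C", "D"}))  # → ['A', 'B', 'C', 'D']
```

Swapping in a different `tiebreak` function is where all the tiebreaker debates live; the skeleton stays the same.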

My guiding principle for the system has always been to keep things simple enough that the method can be explained to a sports layperson. There are articles every year on how Podunk University should be ranked ahead of the University of Miami because of a long chain of wins – a long beatpath – so the concept itself is fun and interesting. So far the approach has been not to try to identify single games as flukey and remove them, because what seems like a fluke could instead be a massive clue or harbinger. Instead, the approach has been merely to remove ambiguities and rely on the more clarified parts of the season to fill in the blanks.

The final part of the system has been to have fun using the system in various ways to come up with weekly picks. I’d love the commenters to come up with their own picking systems that use beatpaths either in full or in part. I’m thinking of a variant of the Isaacson-Tarbell predictor that relies on beatpaths just a tad.

One statistic I’m curious about: the historical predictive pick record for matchups that actually have beatpaths, versus ones that don’t.

Producing an alternate ‘fluke’ ranking may be possible by looking at the retroactive pick record from each week. We can then see which team the standard beatpaths ranking gets wrong to the greatest degree. For the end-of-regular-season ranking last year, Denver was the least accounted for. Other teams were more consistent in their play, so their ranking made only a few games look flukey in retrospect. With Denver, fully half of their games looked flukey given their ranking.
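A simplified sketch of that retroactive fluke count, with made-up games and one important simplification: it scores every game against a single final ranking, rather than checking each week’s rankings as described above. A game counts as flukey when the team the ranking says is worse won.

```python
def fluke_counts(ranking, games):
    """Count, per team, the games that look flukey in retrospect:
    games won by the team the final ranking says is worse."""
    pos = {team: i for i, team in enumerate(ranking)}  # 0 = best
    counts = {team: 0 for team in ranking}
    for winner, loser in games:
        if pos[winner] > pos[loser]:  # an upset relative to the ranking
            counts[winner] += 1
            counts[loser] += 1
    return counts

ranking = ["A", "B", "C", "D"]  # hypothetical final ranking
games = [("A", "B"), ("D", "A"), ("C", "B"), ("B", "D"), ("D", "B")]
print(fluke_counts(ranking, games))  # → {'A': 1, 'B': 2, 'C': 1, 'D': 2}
```

The team with the highest count plays the role Denver did last year: the one the ranking accounts for worst, and so the first candidate for the alternate ‘fluke’ loop-busting.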

So since Denver has the worst retroactive pick record, we might produce an alternate ‘fluke’ graph and rankings by busting loops starting with the games that retroactively look flukey for Denver. If that doesn’t bust all the loops, the next two worst retroactive pick records are NY Jets and Tampa Bay.

The downside to this method is that it’s a big hassle.