NFL Beatpaths

This is an attempt to figure out an automatic way to rank NFL teams. Just about every publication out there has a form of “NFL Power Rankings”. So far, they tend to be in two categories:

  • Completely Stats-Driven: Football Outsiders, also featured at Fox Sports, has a system where they break everything down to the average performance per play, per position, per team, and then they adjust a team’s stats by who their opponent is. It’s really rather cool, and completely objective. But there are some things that bug me. For instance, Denver, at 5-1, was ranked 13th in the 2005 Week 6 rankings, behind some teams at or below .500. In other words, a team that is capable of piling up the stats without winning can easily be ranked ahead of another team that doesn’t have great stats, but pulls out wins through gut, will, and exploiting matchups.
  • Completely Subjective: Seems like all the other major power ranking lists out there are in this category. A sportswriter or a committee applies their subjective judgment to all the teams and ranks them however the hell they want. You’ll see huge changes in the lineup every week because of the upsets. The main flaw with these lists is that they aren’t scientific, have huge variance in week-to-week performance, and aren’t really reflective of the overall quality of a team.

Obviously, I like the scientific approach better, although all the lists are fun. But as I said, there’s that one thing about the Football Outsiders approach that bugs me. It doesn’t really pay attention to a team’s wins.

So, I’ve been attempting to devise a rankings system that pays attention to wins. And, since I’m just one guy with no access to actual NFL statistics and no ability to actually improve on Football Outsiders’ system, I’ve decided to create a vastly inferior system by only paying attention to wins. :-)

There’s one rule: a team is ranked ahead of another team if it has a beatpath to that team.

What do I mean by that? Well, that requires graphs. (And that’s really most of the reason I did this, so I could program in perl and play with this graph-generation package I have.) Read on for the graphs…


To explain this visually, we’ll use the first few weeks of the 2005 NFL season. Each week of the NFL season, we end up with clear wins and losses for each team, which give us clues about which teams are better and worse.
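For readers who think better in code, here is a minimal Python sketch of the one rule. It is not the Perl program behind these graphs, and the team names, the `build_graph` helper, and the `has_beatpath` helper are all made up for illustration. Each result is a (winner, loser) pair, and a beatpath is just a chain of wins leading from one team to another.

```python
from collections import defaultdict


def build_graph(results):
    """Map each winner to the set of teams it has beaten."""
    graph = defaultdict(set)
    for winner, loser in results:
        graph[winner].add(loser)
    return graph


def has_beatpath(graph, start, target):
    """Return True if a chain of wins leads from start to target."""
    stack, seen = [start], {start}
    while stack:
        team = stack.pop()
        for beaten in graph.get(team, ()):
            if beaten == target:
                return True
            if beaten not in seen:
                seen.add(beaten)
                stack.append(beaten)
    return False


# Hypothetical results, not real games: A beat B, B beat C, D beat C.
results = [("A", "B"), ("B", "C"), ("D", "C")]
graph = build_graph(results)

a_over_c = has_beatpath(graph, "A", "C")  # A -> B -> C, so A ranks ahead of C
c_over_a = has_beatpath(graph, "C", "A")  # C has beaten nobody
a_over_d = has_beatpath(graph, "A", "D")  # no chain connects A and D
print(a_over_c, c_over_a, a_over_d)  # True False False
```

Note that A and D have no beatpath in either direction, so this rule alone says nothing about which of them is better.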

Week 1 is boring:
1-Clean

Week 2 is where the fun begins:
2-Clean

Now, obviously we’re not able to make much sense of rankings yet – we’ve got some undefeated teams very low in the graph. But this is just to get a very general, vague sense of how all the teams relate to each other in the pecking order.

Week 3 is where things start to get complicated:
3-Snarl

Look closely. See how Denver has an arrow pointing up to Kansas City? We have a loop. Den->KC->NYJ->Mia->Den. As the season goes on and on, we are going to have more and more of these loops – we’ll even have two-team loops as division rivals split series.

But it’s the season splits that give us a clue of what we should do here. If two teams split a pair of games, it’s no longer clear which one is better. It’s a beatloop. So that means we should just rank the teams as if we don’t know who is better – in other words, as if they have never played each other at all. Then we’ll be left with unambiguous wins, or beatpaths.

So each week, I go through the graph to find all the beatloops (well, my perl program does). I find all the smallest beatloops, remove them, and then recalculate to find the next smallest batch of beatloops. In week 3, we only have one beatloop, of 4 teams. When I adjust for that, the graph looks like this:
3-Beatpath
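The removal step described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the actual Perl code: the function names are invented, the graph is a plain dict mapping each winner to the set of teams it beat, and the only games included are the Week 3 loop plus Denver's win over San Diego mentioned in this article.

```python
def loops_of_length(graph, k):
    """Return every edge that lies on some beatloop of exactly k teams."""
    edges = set()

    def walk(start, team, path):
        for beaten in graph.get(team, ()):
            if beaten == start and len(path) == k:
                cycle = path + [start]
                edges.update(zip(cycle, cycle[1:]))
            elif beaten not in path and len(path) < k:
                walk(start, beaten, path + [beaten])

    for start in list(graph):
        walk(start, start, [start])
    return edges


def remove_beatloops(graph):
    """Remove the smallest beatloops first, then recalculate for larger ones."""
    graph = {team: set(beaten) for team, beaten in graph.items()}
    teams = set(graph) | {b for beaten in graph.values() for b in beaten}
    k = 2  # a season split is a two-team beatloop
    while k <= len(teams):
        edges = loops_of_length(graph, k)
        if edges:
            for winner, loser in edges:
                graph[winner].discard(loser)
        else:
            k += 1  # nothing left at this size, look for bigger loops
    return graph


# Only the games named in the article; the real Week 3 graph has many more.
week3 = {
    "DEN": {"KC", "SD"},
    "KC": {"NYJ"},
    "NYJ": {"MIA"},
    "MIA": {"DEN"},
}
cleaned = remove_beatloops(week3)
remaining = sorted((w, l) for w, beaten in cleaned.items() for l in beaten)
print(remaining)  # [('DEN', 'SD')]
```

All four wins in the Den->KC->NYJ->Mia->Den loop disappear, and Denver's win over San Diego survives because it is not part of any loop.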

It’s cleaner, but it still could be less cluttered. Since the full win-loss record of each team isn’t so important to us here, and since getting an idea of the overall ranking is more important, there are some redundant arrows here. Look at Seattle’s long swoopy arrow down to Arizona. Seattle already has another beatpath to Arizona, through Atlanta, Philly, SF, and STL. All the arrows really say is that Team A is better than Team B. We don’t need to be told twice. So let’s remove the redundant arrows:
3-Clean
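Dropping redundant arrows is what graph theory calls a transitive reduction. Below is a small Python sketch of the idea, assuming the beatloops have already been removed so the graph has no cycles. The function names are hypothetical, and the data is just the Seattle example from the paragraph above.

```python
def reachable(graph, start, target, skip_edge):
    """True if target can be reached from start without using skip_edge."""
    stack, seen = [start], {start}
    while stack:
        team = stack.pop()
        for beaten in graph.get(team, ()):
            if (team, beaten) == skip_edge:
                continue
            if beaten == target:
                return True
            if beaten not in seen:
                seen.add(beaten)
                stack.append(beaten)
    return False


def remove_redundant_arrows(graph):
    """Drop each direct arrow that is already implied by a longer beatpath."""
    return {
        team: {b for b in beaten if not reachable(graph, team, b, (team, b))}
        for team, beaten in graph.items()
    }


# Seattle's direct win over Arizona, plus the longer path named in the text.
week3 = {
    "SEA": {"ATL", "ARI"},
    "ATL": {"PHI"},
    "PHI": {"SF"},
    "SF": {"STL"},
    "STL": {"ARI"},
}
clean = remove_redundant_arrows(week3)
print(sorted(clean["SEA"]))  # ['ATL']
```

The long swoopy Seattle-to-Arizona arrow goes away, while every arrow along the Atlanta, Philly, SF, STL chain stays.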

That’s better, and cleaner. We’re starting to see a clearer picture of the ranking of the teams, too. Teams that are very good tend to have beatwins to other teams that are very good, which means they will have very long beatpaths. By beating San Diego (and helped out by that freak New Orleans victory over Carolina), Denver has the longest beatpath in Week 3.
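One way to put a number on "longest beatpath" is to count the arrows in the longest chain of wins below each team. The sketch below is an assumption about how that could be measured, not a description of the real program; the teams A through E are placeholders, and it only works once the graph is loop-free.

```python
from functools import lru_cache


def longest_beatpaths(graph):
    """Return, for each team, the number of arrows in its longest beatpath."""

    @lru_cache(maxsize=None)
    def depth(team):
        # A team with no beatwins has a beatpath of length 0.
        return max((1 + depth(b) for b in graph.get(team, ())), default=0)

    teams = set(graph) | {b for beaten in graph.values() for b in beaten}
    return {team: depth(team) for team in teams}


# Hypothetical loop-free graph: A -> B -> C -> D, and E -> D.
graph = {"A": {"B"}, "B": {"C"}, "C": {"D"}, "E": {"D"}}
lengths = longest_beatpaths(graph)
print(sorted(lengths.items()))
# [('A', 3), ('B', 2), ('C', 1), ('D', 0), ('E', 1)]
```

A and E are both without a beatloss here, but A's wins run through teams that have beatwins of their own, which is the sense in which its record is stronger.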

Week 4 has a lot of beatloops. Don’t even try to look at this closely, it’s just to get an idea:
4-Snarl

The system eliminates all the 3-team beatloops first, and then all the 4-team beatloops, etc, until there are no more beatloops left. Also correcting for the redundant arrows, we’re left with a much cleaner picture:
4-Clean

There are some interesting side effects to removing the beatloops, but so far I haven’t come across a case where it doesn’t make sense after some thought. For instance, in week 4, Denver defeated Jacksonville, but was given no credit for it. The reason is it immediately became a beatloop – Den->Jac->NYJ->Mia->Den. Denver’s loss to Miami keeps them from getting credit for beating Jacksonville. But it goes all ways. You’ll notice that Jacksonville no longer is receiving credit for defeating the Jets, like they had in week 3. Teams should get penalized for inconsistent play and developing beatloops, and this is a perfect example.

For week 5, we’ll go straight to the clean version of the graph:
5-Clean

You can really see the pecking order come into a bit more focus. Denver and Indy both have long beatpaths, although I would suspect that Denver’s beatpath is at higher risk of being broken apart by beatloops in the future, due to Washington’s inconsistency. On the other hand, Denver’s victory over Jacksonville might reappear in future graphs. You’ll notice that the Jets and Miami are off in bubbles by themselves. That’s because they have played so inconsistently that all of their victories are in beatloops. The system is telling us that we don’t have enough data to know where to put them in relation to any other team.

Week 6:
6-Clean

Denver isn’t above Indy for any particular reason, it’s just how the graph laid it out. But you can see that Denver and Indy are putting together good track records of beating good teams, that have in turn beaten other quality teams. Other teams are also without beatlosses, like Tampa Bay, but the teams they’ve beaten aren’t as solid.

We’ve focused on Denver in this article, so let’s take a quick look at their game against the Giants this week. What happens if the Giants win? Well, it depends on how the other games play out, but it’s quite possible that Denver could then lose credit for defeating San Diego, which would give San Diego a boost. Denver would still probably be ranked ahead of the Giants through the Washington path, though – all because the Giants lost to San Diego. Since the Giants lost to a team Denver beat, they would not have a beatpath to Denver. However, if Denver loses to San Diego later this year, then the Giants may get credit for beating Denver!

For a quick illustration of how this kind of scenario plays out, let’s look at Week 7 right after the conclusion of the KC-Miami game.
7-Clean

KC beat Miami. But the graph shows the Jets getting credit for defeating Miami when they didn’t the previous week, and KC getting credit for defeating the Jets! Why is that?

The answer is that earlier in the season, the Jets were in a four-team beatloop: Jets->Miami->Denver->KC->Jets. But since KC just beat Miami, the loop shrank: KC->Miami->Denver->KC. Miami became even more inconsistent. So by removing that smaller beatloop from every team’s record first, we see the following effects. KC’s victory over the Jets is no longer ambiguous – it was only ambiguous because the Jets beat a team (Miami) who was arguably better than KC (due to defeating Denver, who defeated KC), but since KC defeated Miami, that’s no longer in doubt. The Jets previously got no credit for defeating Miami, since Miami was arguably better than KC, who beat the Jets – but KC put that to rest, so the Jets get their victory back.

You’ll notice one other side effect. Denver had been in a 4-team beatloop with Jacksonville – Den->Jac->Jets->Miami->Den. But since KC beat Miami to create a smaller 3-team loop, giving more reason to believe that Denver is actually better than Miami, Miami’s win over Denver is canceled first. With that link gone, the 4-team loop is blown away and Denver regains credit for defeating Jacksonville.
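The Week 7 shuffle is easier to follow when worked through in code. This Python sketch uses only the six games named in this walkthrough plus KC's win over Miami, so it ignores every other game that shapes the real graph. The brute-force `remove_beatloops` function is a hypothetical stand-in that is fine for five teams and far too slow for a full league.

```python
from itertools import permutations


def remove_beatloops(edges):
    """Remove all 2-team loops, then 3-team loops, and so on (brute force)."""
    edges = set(edges)
    teams = sorted({team for edge in edges for team in edge})
    for k in range(2, len(teams) + 1):
        doomed = set()
        for combo in permutations(teams, k):
            loop = set(zip(combo, combo[1:] + combo[:1]))
            if loop <= edges:  # every win in this loop actually happened
                doomed |= loop
        edges -= doomed
    return edges


# (winner, loser) pairs taken from the article's description.
before_kc_mia = {
    ("NYJ", "MIA"), ("MIA", "DEN"), ("DEN", "KC"), ("KC", "NYJ"),
    ("DEN", "JAC"), ("JAC", "NYJ"),
}
after_kc_mia = before_kc_mia | {("KC", "MIA")}

before = sorted(remove_beatloops(before_kc_mia))
after = sorted(remove_beatloops(after_kc_mia))
print(before)  # []
print(after)   # [('DEN', 'JAC'), ('JAC', 'NYJ'), ('KC', 'NYJ'), ('NYJ', 'MIA')]
```

Before the KC-Miami game, both 4-team loops cancel and nobody in this group gets credit for anything. Afterwards, the 3-team loop KC->Miami->Denver->KC is removed first, which breaks both larger loops and restores the Jets' win over Miami, KC's win over the Jets, and Denver's win over Jacksonville.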

In review, it’s definitely true that I started looking at rewarding wins because I was dissatisfied with Denver’s low ranking in the Football Outsiders’ power rankings. But I didn’t nudge this system to emphasize Denver in any way. I don’t control the placement of the bubbles (AT&T’s graphviz package does that). It appears that thus far in the season, Denver is stacking up quite a record – not just of wins, but of quality wins.

I’ll continue to create more graphs for each week of the rest of the season. Be sure to read the main site to keep tabs on the season’s developments!

19 Responses to NFL Beatpaths

  1. MDS says:

    I love what you’ve done here. Your graphs provide a great way of looking at the entire NFL season in a simple format. Keep it up.

  2. tt says:

    Cool. Future graphs and power rankings will be their own entries on this weblog. If I get enough support I’ll put them on their own site.

  3. Trogdor says:

    This is an excellent work. I’ve tried to do similar things before, but doing it by hand is nigh impossible (for anyone with a job or life, that is). I’m glad you had the computer skillz to pull this off – it should be an outstanding ranking tool.

    One question about week 7 rankings (pre-MNF of course). It has a Denver-Washington-Dallas-Giants beatpath. Why isn’t that counted as a loop and removed, since the Giants just beat Denver? What other games render this unambiguous?

  4. tt says:

    The latest graph is here.

    As for the longer beatloops: what I do is remove the smaller beatloops first and then recalculate. Denver and the Giants are in a smaller beatloop with San Diego.

    Here’s a better illustration of what I mean. That beatloop is Denver=>San Diego=>Giants=>Denver, so it gets canceled out. The Giants don’t get credit for beating Denver, because they lost to San Diego, who Denver is apparently better than. It’s ambiguous, so it’s removed.

    With me so far? So, look ahead – later this season, San Diego will host Denver. If San Diego wins, they’ll split the series. This will mean that it will be legitimately unclear who is better. It’s a smaller beatloop of only two teams and should be canceled out.

    So at that point, the Giants should have their win over Denver recognized, because Denver is no longer apparently better than San Diego. Similarly, San Diego’s win over the Giants would show up again.

    So yes. There is a DEN=>WAS=>DAL=>NYG=>DEN beatloop, but when I remove the NYG=>DEN=>SD beatloop, the longer one disappears. I always remove all smallest beatloops first, because that removes ambiguity for calculating the larger ones.

  5. tyler says:

    interesting stuff for the visual learners….

    question though, just to further illuminate: i notice SF and Seattle on the same line (2 to 3 down) and Seattle has teams below them, but SF does not. Are your “beat paths” and team strengths balanced horizontally or are they simply beside each other to make clearer the paths of the teams above them.

    also… while i love what you are doing here, any chance in addition to the cool little graphic, you could also do a “power ranking” in list form that essentially summarizes your findings?

    thanks,
    -T

  6. tt says:

    Yeah, it’s the former – the lines are drawn as such to make a compact graph. Everyone should read the main weblog to find things like future graphs and power rankings.

  7. Dallas Trinkle says:

    Mark Newman at Univ. of Michigan has published an article outlining a related method to ranking college teams (which would obviously be applicable to NFL):

    http://aps.arxiv.org/abs/physics/0505169/

    Loops are treated rather by diminishing the importance of “separated” defeats–that is, A beating B is given more importance than A beating C which beat B. And, of course, a loop (A beats B beats C beats A) doesn’t help anyone particularly. It’s all handled automatically, and based on analysis of directed graphs using simply “who beats who.”

    Thought it might be worth mentioning.

  8. Ken says:

    if I beat you and you beat me, and you’re better than I am, then I should get more credit for it – it shouldn’t just eliminate the existence of the game… still, it’s interesting – You should look at how chess ratings are handed out – it has the same basic idea – when you beat an opponent you get a certain number of points averaged into your rating – the better the person you beat (i.e., the person with the higher rating) the more points you get and the more your rating goes up – you also get points for beating worse players, just not as many, because you were expected to beat them – as you play more and more games, it’s harder to move your rating because there is a bigger base against which new data is averaged. You could do this with football – it would avoid the loops.

  9. looj says:

    Seems to me you could just use matrix decomposition. Just tabulate points scored in a square matrix of team x team. Then convert the result into a symmetric matrix plus an anti-symmetric matrix. Discard the symmetric matrix. Rank the teams according to their entries in the remaining matrix.

  10. ThunderThumbs says:

    #7, thanks I’ll check it out. Sounds like he was thinking of the same sorts of things as me, with the difference that he has a PhD. Heh.

    #8, at various times in the development of the algorithm, I’ve thought of things like that, but the problem is how to determine the strength of the opponent you beat. They’re weaker since you beat them. If you go by what their ranking was before you beat them, then things start to not work as well.

    #9, I came across mention of something called a Markovian matrix that sounds like that, and I heard it’s used to handicap race horses. But I haven’t found any good resources online on how to actually do it. Would probably need a question-answerer to volunteer to answer some questions for me on how to actually create and subtract and interpret the matrices. I took a matrix theory course in college but it’s been a while.

  11. Will says:

    As both a math grad and sports fan, I find your method to be fantastic. I tried to create a similar system back in high school (early 1990s) with a friend as part of a math project. However, as noted by a previous poster on your blog, this was difficult without the computer graphing tools. (However, we did successfully use Excel to pick the winners of both the Rose Bowl and Superbowl that year!) At any rate, I would love to see this system used on NCAA football — both for the whole IA league and more condensed versions for each conference. And I’m quite sure I’m not alone, especially after your publicity on King Kaufman. What do you say?

  12. Ryan Waddell says:

    As a fellow nerd and a football fan, I salute you, good sir. I’d be fascinated to see the code, but I would understand if you wanted to keep it to yourself for possible sell-it-in-the-future reasons :)

  13. Paul says:

    Nice work. I’ll be interested to see how the graph changes as the season grinds on. And I too would love to see an NCAA I-A graph … if there’s enough computing power available. :)

  14. [...] The “How does it work” page gives a good summary of how charts are created and how that leads to rankings. NFL fans who happen to be stat nerds as well (i.e. anyone who reads Football Outsiders) and also those who dismiss stats as the work of Nerdy McNerdersons and say only “wins and losses should count” should both find this method interesting. [...]

  15. otbricki says:

    Hi -

    Interesting problem there with removal of beatloops. What you have is the classical approach from graph theory (out of combinatorial mathematics) for determining the winners of tournaments. It also turns out that there is a mathematical solution known as the minimum feedback arc set which can be used as a fair way to remove what you call beatloops. Just removing the small loops first won’t necessarily give you the correct answer for your rankings, while a minimum feedback arc calculation will.

    I do have one problem with your rankings, and that is that if a team is undefeated it may still have a low ranking because it has not played anyone with significant wins. The Patriots are an example of this. Even though they could not possibly have a better beatpath than they currently have they end up with a low ranking. This is not college football where a team has some control over its schedule.

    Any method that leads to a ranking of the Lions and 49ers ahead of the Patriots is obviously a bit questionable.

  16. ThunderThumbs says:

    Yeah, I have to explain this one a lot. otbricki… you’re right, as is everyone who makes related points. But the real aim of the system is to graphically represent the entire season of a league, showing who beat who, who appears to be better than who based off of wins and losses.

    So it isn’t exactly that the Patriots are 12th, even though that’s where I have them. It’s more that just based off of the existing wins and losses so far, there is more reason to rank those eleven teams ahead of the Patriots, than there is to rank the Patriots ahead of them. I’m not saying that the Patriots are the 12th-best team in the league, though. This is a tool – if someone is convinced the Patriots are the best team in the league, and then they come here and say, “oh wow, well hold on a second – it turns out they really haven’t beaten anyone of value yet…” then the beatpaths graph/ranking has served its purpose.

    I’ll look up the minimum feedback arc set. It might not fit what I’ve wanted to do here – one thing is that I kind of have perversely enjoyed creating a system that relies on absolutely zero math. It makes intuitive sense to me that a team split should cancel itself out before a three-team beatloop, and so on.

    And yes, it is early in the season. There is just something interesting in watching the graph change from round to round. Like it or not, teams get prematurely anointed or rejected from people not paying enough attention to short-term schedule difficulty. I think this point is actually a plus for beatpaths rather than a minus.

    But what’s even more interesting is when the paradigm shifts. For instance, a seemingly good team can lose to a couple of seemingly crappy teams, making the seemingly good team seem bad. But then it turns out those bad teams are good. So the first good team is still good. We see that a lot – last year, New Orleans was seen as good here, long before the main media came around.

  17. Suresh says:

    Actually, otbricki, I was thinking along the same lines, but the minimum feedback arc set is not quite what is going on here. In the feedback arc set problem, you want to delete the fewest number of edges so that there are no cycles (i.e., beatloops). However, in that problem, you can do this by deleting any edge from the beatloop. In this setting, that’s not the right thing to do, because according to the rationale, any cycle means that you can’t infer anything about rankings from that ordering, and deleting an edge creates an artificial ranking.

    For example, if INDY beats NE beats CINCY beats INDY, then the feedback arc set problem would recommend deleting (say) the INDY beats NE edge. But this would create a scenario where NE is superior to INDY and CINCY, not borne out by the data.

    Actually, what you need here is a min-cost flow. This is a lot more mathematical than what you might like, but the basic idea (see the book by Ahuja, Magnanti and Orlin) is that you want to compute a flow from the “source” (in this case all undefeated teams) to the “sink” (all winless teams); if there are no undefeated or winless teams, that’s not a problem either. Moreover, you want the flow (think of pushing water along the edges) to have minimum cost (because pushing water is expensive). In that case, the min-cost flow will automatically exclude all cycles, essentially deleting them, and the resulting graph (where we only keep edges with flow on them) will be the graph you need.

  18. thebobster says:

    It’s been a couple years since I posted, but I do stop by here several times during the football season.

    It seems that one of the largest factors that could improve the accuracy of the beatpath ranking is to get home field advantage into the equation somehow.

    I’m in Seattle, and our Seahawks have one of the more dramatic differences between home record and away record.

    My first thought was to split the league’s thirty-two teams into sixty-four “virtual” teams, like “Seahawks_HOME” and “Seahawks_AWAY” for each team… then run the same analysis you’ve got now and see what pops out.

    But you’d always have one team’s HOME version playing another team’s “AWAY” version, so the graph might be disjointed. I can’t quite visualize whether that would work or not.

    Another (more labor-intensive) way to explore home-field advantage would be to split just one team into its home and away versions… so have “Seahawks_HOME” and “Seahawks_AWAY” in the graph with the other thirty-one (non-split-into-home-and-away) teams. That would give a useful graph, and useful rankings, with Seattle’s two versions each showing up in the graph… possibly with a beatpath from one to the other!

    I don’t know how you could parlay that into a single useful graph that helps visualize all teams’ home field advantage into the beatpaths system…

  19. [...] team of our site’s proprietor, whose low reputation at one time was basically responsible for the creation of this site, end up at the bottom of our rankings. They do have the Tim Tebow future to look forward to, [...]
