2009 NFL Week 10 Beatpath Power Rankings

Kenneth here again, folks! Sorry this is coming up late; I had some problems getting everything together. Anyway, I hope you like chaos, 'cause baby, we got it in spades. Lots of teams moving, lots of loops. Let's just get right to it.

Each entry below gives the team's rank and name, the notes, last week's rank, and the EdgePower number along with the counts shown next to it and the team's possible range of ranking spots (see the note after the table for a guess at how EdgePower seems to be calculated).

1. Indianapolis Colts

(Beat NE) One way to stay on top, of course, is to win your game that week. And one way to avoid loops is to win ALL your games. This is what the Colts have done, and it lands them in the top spot. It also helps that this is a win over a particularly strong team, which is something that was holding Indianapolis back somewhat before. Personally, I’m still trying to work up a conspiracy theory as to why Bill Belichick WANTS the Colts to finish undefeated.

Last week: 5. EdgePower: 73.68, (36 – 0) / 76, possible rank range #1 – #14.

2. New Orleans Saints

(Beat STL) Saints avoid all the ruckus through the same “undefeated” method, but the flattening of the graph causes them to lose some power (the Colts, on the other hand, didn’t have that power in the tall graph). Personally, I was rooting for the upset, just because I like upsets. But I guess I got what I was hoping for elsewhere…

Last week: 2. EdgePower: 68.42, (28 – 0) / 76, possible rank range #1 – #17.

3. Cincinnati Bengals

(Beat PIT) Cincy regains a beatpath to Pittsburgh. They still have two of their division games wiped out (PIT and BAL by Denver), so the second path was very important in both cases.

Last week: 4. EdgePower: 64.47, (22 – 0) / 76, possible rank range #1 – #19.

4. Minnesota Vikings

(Beat DET) No real gain by beating Detroit, as you’d expect. And actually, SF beating the Bears causes the Vikings to lose a beatwin over the 49ers (CHI=>PIT=>MIN=>SF=>CHI). That’s a loop that should go away when Chicago and Minnesota play each other, though.

Last week: 3. EdgePower: 63.82, (21 – 0) / 76, possible rank range #1 – #20.

5. New England Patriots

(Lost to IND) No real loss of power for the Patriots in losing to the Colts. This is really just holding steady while two other teams fall outta sight from above. I like how the stats-centric-y sites all went full-barrel defending the Belichick decision as not just an acceptable choice, but SUPER AWESOME TOTALLY THE INDISPUTABLE RIGHT CHOICE, even though no calculation I saw put the "go for it" percentage more than 9 percentage points higher than "punt"—and all of those had some iffy assumptions on percentages that you could easily sway one way or the other. I get the desire to push back against the mainstream "always be conservative" belief, but if your numbers basically say "it's a wash", say that. Even if you do think coaches should go for it on 4th more often (which I do). Personally, I don't think I would have gone for it—the NE defense seemed like they had a decent shot at stopping a long drive—but it's a valid call.

Last week: 7. EdgePower: 59.87, (16 – 1) / 76, possible rank range #2 – #20.

6. Arizona Cardinals

(Beat SEA) Whoosh! I had to check to see if the Cardinals really only had 3 losses. If you're thinking that beating Seattle—for the second time—shouldn't have this much effect on a team, you're (checks the data) right! So what's the skinny? Stay with me, this might get confusing. *deep breath, puts on Micro Machines guy voice* Last week Arizona was stuck under Carolina due to their direct loss to them, but this week that loss got looped away in a 5 team beatloop of ARI=>CHI=>CLE=>BUF=>CAR=>ARI. But wait, you say, none of those teams played each other this week, so how come this loop wasn't in the graph last week? Well, it wasn't, because CLE=>BUF was looped away in a 4 team beatloop of TB=>GB=>CLE=>BUF=>TB, but now that loop is broken up—how, you say? Well, Green Bay beat Dallas this week, causing a 3 team beatloop of TB=>GB=>DAL=>TB to form, which means that GB=>CLE=>BUF=>TB was able to stay in the graph long enough for CLE=>BUF to contribute to the huge beatloop we were talking about originally! *gasp* And that's why Arizona is out from under the Panthers, which lets them shoot up the rankings. I knew that CAR=>ARI path wouldn't stick for long.

Last week: 19. EdgePower: 57.24, (12 – 1) / 76, possible rank range #2 – #23.

7. Pittsburgh Steelers

(Lost to CIN) Pittsburgh falls back underneath the Bengals, but otherwise the loss doesn't hurt that much.

Last week: 8. EdgePower: 56.58, (11 – 1) / 76, possible rank range #2 – #24.

8. Baltimore Ravens

(Beat CLE) Beating the Browns doesn't really help the Ravens, so they stay relatively close. I feel like I'm hearing a fair amount of stories about how the Steelers are in the driver's seat for the playoffs and the Ravens are in trouble and have to save their season, but they're only 1 game apart. Is there really a big difference between these teams?

Last week: 9. EdgePower: 56.58, (12 – 2) / 76, possible rank range #3 – #24.

9. Atlanta Falcons

(Lost to CAR) Atlanta and Carolina split their season series. How concerned should people be about the loss of Turner? I thought last year more credit went to him (instead of his line) than was deserved, so I feel like this Snelling kid should be able to hold it down; am I off on that?

Last week: 10. EdgePower: 56.58, (13 – 3) / 76, possible rank range #4 – #22.

10. Dallas Cowboys

(Lost to GB) Losing to Green Bay only cost the Cowboys their win over Tampa Bay, which you might guess doesn’t really hurt Dallas all that much (and you’d be right). More important was Washington beating Denver, which causes the Cowboys to lose wins over Kansas City, Carolina, and Atlanta (yes, one game can loop away 3 other games—or more!). The first 2 aren’t awful, but Atlanta was the team that was really holding the ‘Boys up.

Last week: 6. EdgePower: 51.97, (3 – 0) / 76, possible rank range #1 – #29.

11. San Diego Chargers

(Beat PHI) The Chargers have their win over the Eagles looped away by their loss to the Broncos. The good news about that is that if they win this week, they'll restore that path, along with two others (over the Giants and Chiefs). I can't think of any bigger motivation.

Last week: 11. EdgePower: 51.97, (8 – 5) / 76, possible rank range #5 – #27.

12. Philadelphia Eagles

(Lost to SD) On the same note, Philly loses something by losing to the Chargers—their win over Washington. Yes, that counts as something to lose. I didn’t say it was big, just that it was something.

Last week: 12. EdgePower: 51.97, (4 – 1) / 76, possible rank range #2 – #29.

13. Denver Broncos

(Lost to WAS) So, the big drop of the week (well, not the absolute biggest, but the one we care about the most). This game is in 6 beatloops, which takes a huge amount of support out from under Denver. This is kind of an anomaly, I think, so we’ll see how it shakes out. In real football terms, as a Chicago Bear fan I’d be scared if I were a Broncos fan. Last year Orton had a foot (leg?) injury, and he wasn’t the same the rest of the year.

Last week: 1. EdgePower: 51.32, (2 – 0) / 76, possible rank range #1 – #30.

14. Houston Texans

(Bye) Not much happens to the Texans on the bye. They do get their win over the Titans back, due to the 49ers beating the Bears breaking up a longer loop and replacing it with a shorter one.

Last week: 15. EdgePower: 50.66, (2 – 1) / 76, possible rank range #2 – #30.

15. Green Bay Packers

(Beat DAL) So, beating the Cowboys causes Green Bay to loop away their loss to the Bucs with a shorter loop (TB=>GB=>DAL=>TB), restoring Green Bay's path to…Cleveland. Um, yay?

Last week: 16. EdgePower: 50.66, (3 – 2) / 76, possible rank range #3 – #29.

16. Jacksonville Jaguars

(Beat NYJ) The win over the Jets is nice, but as far as I can tell Jacksonville really benefits from Arizona moving up and the graph flattening out, which gets them out from under a lot of teams.

Last week: 24. EdgePower: 50.66, (4 – 3) / 76, possible rank range #3 – #28.

17. Washington Redskins

(Beat DEN) For each and every action et cetera et cetera. After beating the Broncos the Redskins now have NO BEATLOSSES. I’ll let that sink in a bit.

Last week: 31. EdgePower: 50.66, (1 – 0) / 76, possible rank range #1 – #31.

18. San Francisco 49ers

(Beat CHI) The win over the Bears loops away losses to the Titans and the Vikings. The Titans loss was already looped away last week, but this loop is shorter, so the 49ers also get a beatwin over the Seahawks back.

Last week: 18. EdgePower: 49.34, (4 – 5) / 76, possible rank range #5 – #29.

19. New York Giants

(Bye) Another rising tide story here; the Cardinals’ success benefits the Giants, as well.

Last week: 23. EdgePower: 49.34, (2 – 3) / 76, possible rank range #4 – #30.

20. Carolina Panthers

(Beat ATL) Beating Atlanta is nice, but it gets looped away as part of a series split. What’s nicer is the beatloss to Buffalo getting looped away, as described above. Sure, they lost their beatwin over the Cardinals, but that was helping other teams more than it was helping the Panthers, and being out from under the Bills is almost as helpful.

Last week: 17. EdgePower: 48.68, (1 – 3) / 76, possible rank range #3 – #31.

21. Chicago Bears

(Lost to SF) Chicago’s loss to the 49ers just loops away their win over Pittsburgh, which doesn’t hurt them since they had already looped that way, and the big long loop takes away their loss to the Cardinals, which helps. I’ve noticed people in Chicago seem to have developed their own version of “BeatPicks”, where you try to come up with explanations to loop away Jay Cutler’s interceptions. On the third one the official got in Devin Hester’s way! The last one was the last play of the game so it was more like a very very short Hail Mary! I like playing this too but at some point you are still left with some ugly picks in that graph.

Last week: 25. EdgePower: 48.03, (1 – 4) / 76, possible rank range #5 – #31.

22. Tennessee Titans

(Beat BUF) The Titans aren't moving up in the rankings from their great winning streak yet, but they are moving to a higher class of direct beatloss. Last week it was the Jets, now it's the Texans and Patriots! Fancy!

Last week: 22. EdgePower: 46.71, (0 – 5) / 76, possible rank range #4 – #32.

23. Miami Dolphins

(Beat TB) Easy come, easy go. I don’t mean wins against the Buccaneers, I mean positions in the rankings. The fact that Buffalo is no longer on top of Carolina spells disaster for Miami, and they return from whence they came. Well, heck, actually further than they were.

Last week: 13. EdgePower: 45.39, (5 – 12) / 76, possible rank range #10 – #28.

24. Seattle Seahawks

(Lost to ARI) Hey, you really had me going there for awhile, Seahawks! Yeah, I thought you were going to expose the Cardinals, but then Velma pulled the mask off and you were really just Old Farmer Brown. Those meddling kids!

Last week: 26. EdgePower: 44.08, (2 – 11) / 76, possible rank range #8 – #30.

25. Cleveland Browns

(Lost to BAL) It’s getting to that point of the year where I don’t care much at all about the bad teams and have little to say about them. At least the Browns did have some interesting graph movement, losing their beatloss to Chicago. That makes them look better graph-wise, but they still don’t have any beatwins.

Last week: 27. EdgePower: 42.11, (0 – 12) / 76, possible rank range #7 – #32.

26. Kansas City Chiefs

(Beat OAK) Splitting the season series gets the Chiefs out from under the Raiders, which is good, I guess. I thought I saw a stat that said the Raiders accounted for something like 3 of the Chiefs' last 4 wins (going back a few seasons). You might want to expand your repertoire, boys.

Last week: 30. EdgePower: 42.11, (0 – 12) / 76, possible rank range #10 – #32.

27. Buffalo Bills

(Lost to TEN) Technically, according to my quick and dirty calculations this is the biggest drop of the week. But come on, do you care more about this than what happened to Denver? In case you missed it, you can get most of the details in the Cardinals' entry. Suffice it to say, not having a beatwin over Carolina has hurt the Bills greatly.

Last week: 14. EdgePower: 41.45, (1 – 14) / 76, possible rank range #11 – #31.

28. New York Jets

(Lost to JAC) The Jets end up under the Jaguars due to the loss, which is unfortunate for them. It actually makes their edgepower worse than last week, while most everyone else regressed to the mean due to the graph flattening.

Last week: 20. EdgePower: 38.82, (1 – 18) / 76, possible rank range #13 – #31.

29. Detroit Lions

(Lost to MIN) I know there was no way they were going to win this game, but at some point this season wouldn’t you like to see the Lions trade moral victories for actual victories? Playing the Vikings well was nice, now how about you beat someone (else)?

Last week: 28. EdgePower: 35.53, (0 – 22) / 76, possible rank range #14 – #32.

30. St. Louis Rams

(Lost to NO) I knew I should have held onto that Scooby Doo joke. You know, nothing against the fine people of New Orleans, but I was really rooting for the Rams this weekend. I always love an underdog. Except for New England this week against Indy, I guess…if that counts.

Last week: 29. EdgePower: 35.53, (0 – 22) / 76, possible rank range #13 – #32.

31. Oakland Raiders

(Lost to KC) Congratulations, Oakland! By losing your beatpath to Kansas City you eliminated any pretense that you had value as a football team. Question of the week: which draft bust was more obvious coming out, JaMarcus Russell or Darrius Heyward-Bey?

Last week: 21. EdgePower: 34.21, (0 – 24) / 76, possible rank range #16 – #32.

32. Tampa Bay Buccaneers

(Lost to MIA) It coulda been magnificent, Tampa. Magnificent.

Last week: 32. EdgePower: 32.24, (0 – 27) / 76, possible rank range #17 – #32.
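A quick note on the EdgePower column: the percentage works out to the first number in the parentheses, plus half of whatever's left of the 76, all divided by 76 (at least, that arithmetic reproduces every row above; treat it as a guess at the bookkeeping, not the official definition). A rough sketch:

    # Guessed reconstruction of the EdgePower column, checked only against the
    # numbers in the table above (not against the actual beatpaths code).
    def edge_power(edges_for, edges_against, total=76):
        undecided = total - edges_for - edges_against
        return round(100.0 * (edges_for + undecided / 2.0) / total, 2)

    print(edge_power(36, 0))    # 73.68 -- Colts
    print(edge_power(16, 1))    # 59.87 -- Patriots
    print(edge_power(1, 0))     # 50.66 -- Redskins
    print(edge_power(0, 27))    # 32.24 -- Buccaneers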

24 Responses to 2009 NFL Week 10 Beatpath Power Rankings

  1. rabub says:

    Discovered your website through an FO comment; it's a nice one to add to my RSS feed!

    Meanwhile, you should change your "How it works" section: your rant against FO for ranking Denver too low back when they were 6-0 treated it as a mistake on their part, but that call is looking better and better as the season goes on.

  2. The MOOSE says:

    My graphs are up, sorry for being so late this week. I’ve been in the process of moving the last couple of weeks so I’ve been short on time.

  3. doktarr says:

    Thanks for getting them up Moose; I had been hitting your site regularly looking for the update.

    I don’t want to beat a dead horse any more than I have to, but speaking as a Redskins fan and a beatpaths fan: WAS with no beatlosses speaks to a fundamental flaw in the approach. One well-placed upset should not be able to wipe away an entire slate of losses. Sure it’s “interesting”, but it’s also pretty plainly wrong. As analysts of the alorithm, we should be able to recognize that there’s a lack of consistency here.

    You have to recognize two things about that 9% edge:

    1) The people who actually do the probability analysis tend to be pretty conservative in their estimates. That 9% is based on league average offenses. The better you make the Pats' chances to convert (they are 75% in similar situations over recent years) or the Colts' chance to score (they were killing the Pats on offense in the 4th quarter), the bigger the edge is. The idea that you could swing the percentages the OTHER way, towards punting, really doesn't hold much weight. In a Browns vs. Raiders game, sure, but not Pats vs. Colts.

    2) 9% is NOT, NOT a small edge. Really, think about that for a second. After 58 minutes of football, one decision can swing your chance of winning by nearly a tenth. That's a huge edge. It's like being able to guarantee your opponent doesn't score on their first two possessions of the game.

    To me, the decision was a no-brainer, which is why it’s so infuriating to hear those who argue against it talk mindlessly about “playing the percentages”.

  4. doktarr says:

    (BTW, Kenneth, I’m not grouping you in with “those who argue against it talk mindlessly about “playing the percentages””. You’re at least making an argument informed by the probabilities, although I think you’re still allowing conventional wisdom to push you the wrong way.)

    Miami versus Carolina offers an interesting test case here. In standard, this is a game with no beatpaths. IT picks Carolina based on identical records and Carolina being the home team. Standard beatpaths narrowly agrees, based on the ranking algorithm, making it a unanimous pick. However, in iterative, Miami actually maintains a beatpath to Carolina.

    Of course, iterative doesn’t know that Carolina just had their best win of the year, and that Miami just lost their best player. If I had to bet my life on the game, I’d take Carolina. But it’s interesting to see iterative’s more vertical graph put to the test here.

    A sort of mirror image test case is San Diego at Denver. Here, IT and IT-beatpaths pick Denver, and Denver maintains a beatpath to SD in iterative. Only pure beatpaths picks San Diego, based on ranking.

    Of course, none of these systems realize that the Broncos have dropped three straight and just lost their QB. Again, I’d pick SD, but it’s based on those temporal factors.

  5. JT says:

    I too, as a Redskins fan, was surprised to see Washington with no beatlosses, just because of a win over Denver. It really points out what happens when two teams play, one who has a lot of beatwins, one who has a lot of beatlosses, and the second team wins. At this point of the season, a situation like that can create a fair number of loops. And looking at the possible range of spots in the rankings that Washington could be in (#1 – #31) and still fit in the graph seems rather strange ten weeks into the season.

    I do expect that within a week or two (without looking at the schedule), some of these paths are going to be restored, either by some smaller loops being formed or some previous win being reinforced. The Iterative graph on MOOSE’s site keeps a lot of the paths, while the Weighted graph really hurts Denver (and SD) because of this one game. I wonder how hard it would be to take these three graphs, normalize them somehow, and then combine the results to get an overall ranking. Kind of a BCS-style ranking across beatpath algorithm methods.

  6. Kenneth says:

    I admit I wasn’t watching the game super closely, but I didn’t have the impression that the defense was doing that poorly. I mean, they had given up a TD just before, but then just before that they had a pick.

    Also, the numbers people are using are adjusted somewhat, but maybe not enough. For one thing, I don't know where 75% comes from; I saw in the PFR blog ( http://www.pro-football-reference.com/blog/?p=4671#more-4671 ) that they had a 75% record with Matt Cassel, but Cassel is a pretty good scrambler, so I don't think you can use that. According to PFR, since 2007 Brady has a 58% conversion rate (small sample size alert!), and you can make a somewhat convincing case that this was a harder play than normal because Indy doesn't care how many yards the Pats get past the 2 if they convert, so the defense has less field to care about. So, maybe it's lower than that, even.

    But look, the one thing I think is true is that you can't say that punting was DEFINITELY the way to go. It's true that you can't plausibly swing the stats to make the percentages line up that way. I'm just not convinced that the percentages are OBVIOUSLY statistically significant the other way, either. I think it's perfectly plausible that the Patriots' chances were about even either way. Which of course means that you still can't blame Belichick from a tactical standpoint. I might not have made that call, but I'm a defensive guy and a bit conservative anyway. And I don't really have a problem with the call; you pays your money and you takes your chances, IMHO.

    (Also, I wasn’t trying to say that 9% wasn’t a significant advantage, but just that that seemed to be the upper bound of estimates, and the real difference is probably lower–in the 2-3% range, maybe, which is not a difference I’d worry too much about in a one-time situation.)

  7. doktarr says:

    JT, I don’t see the point in normalizing and combining these graphs. Anyone who’s been around here for a while knows that the “weighted” graph is very unreliable. The fact that Detroit is ranked above Denver and San Diego on that graph pretty much tells you all you need to know.

    As for standard vs iterative, I’ve been making this argument for at least three years now, but I think the way standard deals with loops is fundamentally flawed, and the iterative approach is simply a fix of that flaw. Allowing one game to wipe out many games means you’re saying some games are more important than others, but you’re deciding which are the important ones based on, essentially, quirks of the schedule. There’s simply no reason that should be the case.

    At the far extreme, imagine a 17-team league where each team plays every other team once. If a 1-15 team has its only win against a 15-1 team, then both of those teams have their ENTIRE SCHEDULE looped away by that one upset. Standard beatpaths is essentially saying that we know nothing about either of those teams. That simply doesn’t make sense.
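    To make that concrete, here's a toy sketch of that 17-team league (hypothetical teams, obviously). Every one of the 15-1 team's other wins and the 1-15 team's other losses sits in exactly one 3-team beatloop, while the single upset sits in all 15 of them:

        # Toy 17-team round robin: S is the 15-1 team, W is the 1-15 team whose
        # only win is the upset over S. Teams T0..T14 are the rest of the league.
        from itertools import combinations

        others = ['T%d' % i for i in range(15)]
        edges = {('W', 'S')}                      # the upset
        edges |= {('S', t) for t in others}       # S beat everyone else
        edges |= {(t, 'W') for t in others}       # everyone else beat W

        loops = []
        for a, b, c in combinations(['S', 'W'] + others, 3):
            for x, y, z in [(a, b, c), (a, c, b)]:   # both orientations of the triangle
                if (x, y) in edges and (y, z) in edges and (z, x) in edges:
                    loops.append((x, y, z))

        counts = {}
        for x, y, z in loops:
            for game in [(x, y), (y, z), (z, x)]:
                counts[game] = counts.get(game, 0) + 1

        print(len(loops))            # 15 three-team loops, every one through the upset
        print(counts[('W', 'S')])    # 15 -- the upset game
        print(counts[('S', 'T0')])   # 1  -- each of S's other wins
        print(counts[('T0', 'W')])   # 1  -- each of W's other losses

    Since standard throws out every game that appears in any loop, both schedules vanish.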

  8. doktarr says:

    My point is that I don’t think 9% is the upper bound of estimates, and 2-3% is pretty much right at the reasonable lower bound.

    Sure, I’d say this was a slightly tougher situation than a typical 4th-and-2, but this effect is pretty marginal. I’d be fine with saying 55%, but that’s as far as I’d go.

    The real issue here is that most people are placing WAY, WAY too much of a probabilistic gap between the Colts' chance of scoring from the 28, and the Colts' chance of scoring after a punt. League average numbers in each case, based on time and field position estimates, are 53% and 30%, respectively. Plugging that in gives you a bit over a 6% edge – and you still have a 3% edge if you go hyperconservative and assume that the 4th-and-2 play was a pure coinflip.

    The only way you can make it seem like a bad decision, really, is if you assume the Colts score WAAAY more often from the 28 than they do after a punt. But there’s really no reason to think that – either based on the numbers or based on the flow of the game. If you increase both of those percentages from league average, it makes going for it look a lot better.

    Personally, I’d say the Colts were about 75% from the 28, and 50% after a punt. Since I give the Pats about a 58% chance of making the conversion, that means the edge was over 18%, by my reckoning. That’s an agressive estimate, but it shows that 9% is far from a reasonable upper bound.

  9. Kenneth says:

    But consider the common case, too. IIRC*, if there are just two 3-team beatloops–say A=>B=>C=>A and A=>B=>D=>A, then A=>B is completely removed and B=>C, B=>D, D=>A, and C=>A are left untouched. I know their strength is reduced to 50%, but assuming nothing else touches them they go back to full strength when the graph gets made and rankings are done…right? So that means that the A=>B game has no effect on the rankings or graph, which doesn't seem right either.

    *We really need to do some better organizing of introductory data here. Trying to find the explanations for what's currently being done is really hard, and I think the rant that comment #1 is talking about was made in 2005. 🙂 I wonder if we could set up a wiki…

  10. Kenneth says:

    Geez, I should reload before I post. Also, I should stop posting and get back to work.

  11. The MOOSE says:

    JT, last year I did a similar thing with the NCAA standings to see which teams deserved to be in the BCS tournament that most people wish existed. Basically, I averaged the rankings of each team from each method, and the team with the lowest average was considered the #1 team. Any ties would go to the team which had the better rank in 2 of the 3 methods.

    And to further support doktarr's contention that 9% isn't a small difference: in blackjack the house edge under basic strategy is 0.5%. Card counting shifts the edge by about 1.5%, and that 1.5% is enough to make casinos want to kick you out if they catch you doing it. The point is, in the long run, given two choices, you should always choose the option more likely to succeed, regardless of by how much the option is more likely. As an NE fan, I was begging for him to punt. But it had more to do with the fact that I knew we had no running game (which Belichick admitted by going with an empty backfield) and so IND knew we would go for a short pass. It almost worked anyway. What bothered me more about Belichick was that he blew the timeout he needed for a challenge to the play, and the play he selected allowed IND to ignore the possibility of a run, which wouldn't totally be out of the question with only 2 yards to gain.
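    Going back to JT's combining question for a second, the averaging step was nothing fancier than this (the method names and ranks here are made up for illustration):

        from functools import cmp_to_key

        # Rough sketch of the rank-averaging described above: lowest average rank
        # across the three methods wins, and ties go to whichever team has the
        # better rank in 2 of the 3 methods.
        ranks = {
            'Team A': {'standard': 1, 'iterative': 2, 'weighted': 3},
            'Team B': {'standard': 2, 'iterative': 1, 'weighted': 1},
            'Team C': {'standard': 3, 'iterative': 3, 'weighted': 2},
        }

        def compare(a, b):
            avg_a = sum(ranks[a].values()) / 3.0
            avg_b = sum(ranks[b].values()) / 3.0
            if avg_a != avg_b:
                return -1 if avg_a < avg_b else 1
            better_a = sum(1 for m in ranks[a] if ranks[a][m] < ranks[b][m])
            return -1 if better_a >= 2 else 1

        print(sorted(ranks, key=cmp_to_key(compare)))   # ['Team B', 'Team A', 'Team C']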

  12. The MOOSE says:

    Kenneth: I have instructions on the methods I use on my site if you needed a full explanation of the Iterative method.

  13. ThunderThumbs says:

    A lot of the justification for the Iterative method is because of the effect it has in cases like these, which isn’t really the best way to judge its relative effectiveness. I’m definitely open to being affected by relative backtesting performance though, and I just haven’t gotten a chance to code in support for the method yet. I promise I really do intend to at some point. 🙂

    The thing is that it seems conceptually related to the beatflukes method in terms of what sorts of games to exclude and restore from loops, and I was surprised to find that beatflukes was overall less accurate than the current approach. It looks weird for Denver and Washington to have such huge differences in the graph this week, but the graph itself isn’t saying that Denver is “worse” or Washington is “better” – just that there is a lot more ambiguity in the data, which makes sense because the game outcomes of these two teams are pretty varied. Iterative attempts to go a little further than standard in removing this ambiguity by judging/determining which of these games rise to the level of being included in the graph.

  14. Tom says:

    @doktarr & Kenneth

    I think the most problematic loops are the really large loops. I think everyone finds it intuitive that an A->B->C->A loop can be removed without doing too much damage to our assessments of relative team strength.

    But loops that get into the five, six, or seven team range strain credulity. These loops are often the ones formed by the 15-1 team getting beaten by the 1-15 team. The question is what to do about it–is it really a fluke? is it a sign of one or both teams' inconsistency? is it a sign of a fundamental change in team strength (new quarterback, devastating injury)?

    It’s hard to know which link to break in the loops of 4+ teams. Iterative is one solution but there may be others that are more supportable. For instance, breaking the oldest link large loops may better capture the trending nature of team performance. Or breaking loops that we can somehow objectively define as “flukes” (TT’s old beatloops variant did this, I think).

    The ultimate problem is that different flukey games may mean different things, and therefore a different loop-breaking method may be relevant in some cases but not others. If the flukey game is because a team was missing a key player early in the season or because they were working out kinks in their game plan before the bye, then removing older links might be the best thing to do. If the 'flukey' games are recent, it's more difficult–is it really just a fluke, or is it a sign of declining performance? Has Denver's performance gone down? As the injury gets old and the season wears on, will games lost due to Favre's shoulder/groin be accurately represented or looped away? Is the team truly inconsistent, playing down to bad opponents and beating good opponents (like the Broncos and Eagles last season), in which case they fly up and down the rankings as embarrassing losses are looped away and then restored every other week?

    Not clear what method is best, since flukey losses can come for a variety of reasons. Beatpaths is not alone in having a hard time trying to work them into the system–statistical sites also grapple with them poorly. I often feel bad for the FootballOutsiders folks who are constantly apologizing for the Eagles’ high ranking, despite their loss to the Raiders, for example.

    Iterative is one solution, but it too can produce bizarre results, like the Texans being the #2 team in the NFL last week (Week 9). So I guess the best thing to do is to keep looking for new and meaningful ways of representing the data.

  15. Kenneth says:

    Well, there is a difference between an aggregated action like playing in a casino, where you will definitely see the effects of the changed probability, and a one shot deal like making that 4th down call, which has a decent shot of not happening ever again. I mean, yes, the higher percentage is still a higher percentage, but there’s no guarantees, so if you feel better about a lower percentage shot for some reason it’s not exactly indefensible. Although I guess the reason you would do so is that you think the assumed percentages are wrong for some reason (like betting on a horse that you think is being underrated by the odds, or something), so in truth you are betting what you believe to be the “real odds”, but the real odds are unknowable so who knows. 🙂

  16. Kenneth says:

    @The MOOSE: Yeah, I saw that, it’s good. I checked it before I wrote my post, but I wasn’t 100% sure on how rankings were generated–if the power was totalled just by adding up the paths, or if the weakened path number was added.

  17. doktarr says:

    Tom,

    It’s actually not really the case that only the larger loops can be problematic. In the example I gave, the 1-15 team is in 15 different 3-team beatloops with the 15-1 team, all of which are erased, leaving each team 0-0.

    The problem is that there are 30 games that appear in a beatloop once, and 1 game that appears in a beatloop 15 times, yet we decide that ALL these games are equally unreliable. From a statistical perspective as well as an intuitive perspective, it’s clear that one game is the outlier there. A good algorithm should be able to recognize this.

    I do like the idea of weighting more recent games more strongly, but this sort of thing can easily be integrated into the iterative approach by applying a time decay function to all the path weights before you start removing loops.

    Of course it’s true that no method only looking at limited data can perfectly divine which losses are the flukey ones (assuming there were some perfect ideal anyway, which there isn’t). But that’s not a good reason to not look for the best method – the one that does the most consistent job at retaining good data and removing bad/inconsistent data.

  18. doktarr says:

    Kenneth/TT,

    You’re right that iterative just breaks that one game in the loop. That’s the whole point! Say those are the only games we have to look at for those four teams. If you were computing a ranking of those teams, would you be more inclined to say that all four teams are equally good? In all likelihood, you would not – in stead you would probably guess that B is the best and A is the worst. Sure, either case could be true, but both statistically and intuitively, the former case seems more likely.

    It’s reasonable to compare iterative to beatflukes in one sense – that being, both approaches were attempting to consider more data by throwing out fewer games. However, beatflukes shared the same flaw as standard – that is, how much influence a game has is highly dependent on how many loops it is present in. Also, beatflukes could suffer from the opposite problem – that is, a team with a couple long paths could throw out an arbitrary number of bad losses. Again, the idea of iterative is to treat each link of the graph equally. At the end of loop resolution, the total weights of paths in and out of a team will always equal the win-loss differential of that team. Every game gets the same weight.

    As far as doing a comparison of the various methods over a wider data-set, I’m all for it. However, note that we should only compare games where one algorithm produces a path one way, and the other algorithm produces a path the other way. Otherwise, we’re not really comparing the algorithms – we’re comparing the “tie breakers”.

    This is probably a relatively small set of games, but it would be really interesting to see the record in those games.

  19. The MOOSE says:

    I have to say that I am strongly against a time oriented method of loop breaking. If a team goes undefeated for 10 weeks, then loses a game that makes loops with those 10 wins, you’re saying that the one loss is more important than the 10 wins because it happened last. That doesn’t make sense, and is worse than the Standard method which would at least eliminate the loss too.

    If you’re instead suggesting some sort of Itero-Dative method, you might be able to find a way for it to work. But it would probably be more effective in a league with 82 or 162 games where the schedule isn’t as big of a factor, and each team plays a higher sampling of teams.

  20. doktarr says:

    Yes, MOOSE, I’d be arguing for an “itero-dative” method. First, apply an exponential decay function to the strength of every link. Then, apply the iterative algorith. In the 10 loop case, the one loss goes first, unless your decay function is so strong that the win from the first week has a weight of less than one tenth. (Which it probably shouldn’t.)

    That said… this gets us back to that old question of what the point of this is. Are beatpaths a predictive measure, a descriptive measure, or something in between? Using a decay function ONLY makes sense as part of a predictive measure.

    I see a value in both predictive and descriptive accuracy, but if we ever start looking at time decays we will probably need to separate the rankings into one measure for each purpose.

    Incidentally, Kenneth, this is another way of looking at your A=>B=>C=>A and A=>B=>D=>A example. Say those five games were the only games of the season. Then, we were asked to use beatpaths to retroactively “predict” the results of the season.

    Standard would yield a record of 0-0. No games would be predicted, because no paths are preserved.

    Iterative would yield a record of 4-1. The preserved structure would correctly predict B=>C, B=>D, D=>A, and C=>A, and only get A=>B wrong.
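    Spelled out as a toy calculation (just the five hypothetical games, with the preserved edges taken from the description above):

        # The five games from Kenneth's A=>B=>C=>A / A=>B=>D=>A example.
        games = [('A', 'B'), ('B', 'C'), ('C', 'A'), ('B', 'D'), ('D', 'A')]

        # Preserved edges under each approach, as described above: standard keeps
        # nothing (every game is in a loop), iterative drops only A=>B.
        standard_edges = set()
        iterative_edges = {('B', 'C'), ('B', 'D'), ('C', 'A'), ('D', 'A')}

        def beatpaths(edges):
            # transitive closure of the preserved edges
            paths = set(edges)
            added = True
            while added:
                added = False
                for a, b in list(paths):
                    for c, d in list(paths):
                        if b == c and (a, d) not in paths:
                            paths.add((a, d))
                            added = True
            return paths

        def retro_record(edges, games):
            paths = beatpaths(edges)
            right = sum(1 for g in games if g in paths)
            wrong = sum(1 for w, l in games if (l, w) in paths)
            return right, wrong   # games with no path either way aren't picked

        print(retro_record(standard_edges, games))    # (0, 0) -- nothing predicted
        print(retro_record(iterative_edges, games))   # (4, 1) -- only A=>B missed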

  21. JT says:

    I know this got shot down as somewhat unworkable before, but I'll throw it out again. We use the graph to make rankings, then use both the graph and rankings to make picks. We've even gone so far as to use the ranking numbers to make "confidence" numbers on picks, based on the state of the graph at the time those picks were made.

    What I’m wondering is if we could use those confidence numbers, based on the graph immediately prior to the week a particular game was played, to find the game in a beatloop that most goes against the confidence number we had for that game. That game would be considered the most “flukey” game in the loop and removed, retaining the rest of the loop. This would remove only one game, retaining the rest of the games in the loop, thus (potentially) leaving more games in the graph. It’s somewhat like the previous beatflukes (as best I remember them), but based on a different set of data.

    And yes, I agree that the earlier idea I threw out about the combined ranking from all three methods wasn’t entirely thought out.
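    In pseudocode-ish terms, the idea is something like this (the confidence numbers here are invented, and the plumbing to the real graph is hand-waved):

        # Sketch of the suggestion above: within a beatloop, remove the one game
        # that most contradicts the confidence we had in that pick before it was
        # played. 'confidence' maps (winner, loser) -> pre-game confidence that
        # the winner would win; negative means we actually liked the loser.
        def flukiest_game(beatloop, confidence):
            return min(beatloop, key=lambda game: confidence.get(game, 0.0))

        loop = [('ARI', 'CHI'), ('CHI', 'CLE'), ('CLE', 'BUF'),
                ('BUF', 'CAR'), ('CAR', 'ARI')]
        confidence = {('ARI', 'CHI'): 3.0, ('CHI', 'CLE'): 5.0, ('CLE', 'BUF'): 1.0,
                      ('BUF', 'CAR'): -2.0, ('CAR', 'ARI'): -4.0}

        print(flukiest_game(loop, confidence))   # ('CAR', 'ARI') goes; the rest stay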

  22. […] The Winning Ways of Winners « 2009 NFL Week 10 Beatpath Rankings November 19th, 2009 2008 NFL Picks, […]

  23. ThunderThumbs says:

    JT, I’m also curious about something akin to that… or just taking a week’s rankings, finding how many of the season’s games directly contradict those rankings… I haven’t thought that one through.

    Doktarr, regarding checking the accuracy of graph-building algorithms, you kind of do have to check their performance for all the season’s games by picking a tiebreaker. For instance, beatflukes actually outperforms standard for BeatPicks. But, the overall record is worse – beatflukes damages the accuracy of non-BeatPicks enough that overall it is a net loss. Weird, huh? I don’t have a theory to explain this yet.

    I also agree that using a larger data set would help. I’ve got all the pieces put together for NBA tracking, I think I probably just need some help keeping the content going.

  24. doktarr says:

    I see what you’re saying about wanting to compare the full data set. I think the most unbiased way to look at it is to divide things into four categories:

    1) Picks where both algorithms produce a beatpath
    2) Picks where alg. A produces a beatpath and alg. B does not.
    3) Picks where alg. B produces a beatpath and alg. A does not.
    4) Picks where neither algorithm produces a beatpath.

    I guess that really breaks down into eight categories, as each category above can be broken down by, “and the final pick is the same” versus “and the final pick is different”.

    Finally, I think that the best way to compare is to go retrospective – look at the final, postseason graph, and use it to retroactively pick every game of the season. This clearly brings us towards a “descriptive” as opposed to a “predictive” algorithm, but I think that’s the best approach for now, given that we’re using wins and losses alone as the measure.
