Pick Confidence – Divisional Playoffs

Tom here with the confidence rankings for the four divisional playoff games this week:

Confidence is out of 100; the BeatPower comparison is predicted winner – predicted loser.

Matchup                  Confidence  BeatPower comparison  Result
Pittsburgh-San Diego     63.2        85.7-22.5             CORRECT
Carolina-Arizona         23.9        93.5-69.6             WRONG
Tennessee-Baltimore      13.8        94.6-80.8             WRONG
NY Giants-Philadelphia   2.3         92.3-90.0             WRONG

Like last week, the BeatPower comparisons of the matchups produce picks that diverge from the conventional wisdom. The system is most confident about Pittsburgh beating San Diego, whereas most spreads seem to think that Carolina has the largest expected margin of victory over Arizona. BeatPower gives Tennessee the edge here, while a number of other assessments seem to give Baltimore the edge, although in either case no one seems to be predicting a blowout by either team. Finally, I think BeatPower captures the NY Giants-Philadelphia matchup well, with only a very slim preference for the NY Giants. Given their regular-season split, injuries, and intense, long-standing division rivalry, there are few analysts taking a strong stance either way on this game.
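
For anyone who wants to check the numbers: the confidence figures in the table above appear to be simply the gap between the two teams' BeatPower scores. Here is a minimal sketch under that assumption. The function name `pick_confidence` and the dictionary layout are made up for illustration and are not part of the actual Beatpaths code.

```python
# BeatPower scores copied from the table above: (predicted winner, predicted loser).
matchups = {
    "Pittsburgh-San Diego": (85.7, 22.5),
    "Carolina-Arizona": (93.5, 69.6),
    "Tennessee-Baltimore": (94.6, 80.8),
    "NY Giants-Philadelphia": (92.3, 90.0),
}


def pick_confidence(winner_power, loser_power):
    """Assumed formula: confidence is the gap between the two BeatPower scores."""
    return round(winner_power - loser_power, 1)


# Confidence for every matchup, keyed by game.
confidences = {
    game: pick_confidence(winner, loser)
    for game, (winner, loser) in matchups.items()
}

# Order the picks from most to least confident.
ranked = sorted(confidences, key=confidences.get, reverse=True)

for game in ranked:
    print(f"{game}: {confidences[game]}")
```

Sorting by the gap reproduces the order of the table, with Pittsburgh-San Diego as the most confident pick and NY Giants-Philadelphia as the least.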

15 Responses to Pick Confidence – Divisional Playoffs

  1. The MOOSE says:

    Well, if we’re going to only get one right, it may as well be the one we’re surest about.

  2. Rick says:

    And once again, the results indicate my problems with the methodology remain correctly placed. IF the Eagles were so poorly ranked, Arizona was so poorly ranked – then why did both win relatively easily (certainly Arizona’s was much easier and Philly struggled a bit in the first half)?

    Because the methodology is based on a small sample that is ignorant of variables and focuses on the end result rather than the reasons (variables) for the result. It is limited logical reasoning.

  3. Alternator says:

    I’m not sure how you can object to Arizona being so poorly ranked–DVOA, which you swear by, has them at #20, which is lower than Beatpaths had them to start the playoffs. DVOA has Carolina at #6, higher than Beatpaths places them.

    Win some, lose some. When a terrible team catches fire during the playoffs, nobody using objective data is going to accurately measure them.

  4. Tom says:

    I’m not sure what’s going on with Philadelphia in Beatpaths. Beatpaths is 1-6 picking them since correctly picking their loss to the Ravens. It’s unclear to me, given the amount of data already in the system, that there are any remaining games that can significantly change the system’s assessment of the Eagles.

    TT has mentioned this problem previously, especially when dealing with the position of the NY Jets, which seems to be resilient no matter what the results of the playoffs are.

  5. doktarr says:

    Rick, once again, you’re missing the entire point of beatpath rankings. Nobody here denies that you can probably get more accuracy by looking at more data. We’re not even considering home field here, and that’s a simple and obvious effect.

    The point of beatpaths is to see how much accuracy can be wrung out of a very small data set.

    In the case of Arizona, frankly, every ranking system in existence, from the most involved to the most basic, was picking Carolina. It was an upset in the purest sense. That beatpaths got that pick wrong is completely unremarkable.

    Moose, are you going to do a post-bowl NCAA ranking? I bet it looks a lot more reasonable now that there is so much more inter-conference play.

  6. Tom says:

    Not only did every rankings system pick Carolina to beat Arizona, but this was invariably the one matchup picked with the highest margin of confidence and the biggest point spread of all the division round games.

    That the BeatPower “confidence” for this pick was relatively low (<40), while every other system was far more confident, is somewhat of a silver lining.

    That said, Beatpaths’ playoff record doesn’t look very good so far (2-6), even given the significant number of arguable upsets.

  7. Rick says:

    Alternator AND doktarr:
    DVOA is NOT something I swear by. It’s just far more accurate. While Arizona was a massive upset, it’s pretty interesting that in DVOA it was the ONLY upset. And it’s worth noting that given any statistical set, you’ll have at least one upset. To have a 75% accuracy rate (or even 50% in the playoffs) is very, very good.

    I am not missing the point of Beatpaths. I’m reiterating it. You cannot wring much information out of very little data. There are certain areas where simplicity simply fails as a concept – particularly if you’re relying on limited (and as I said) improper data sets. Winning in the NFL is NOT a variable – it’s an outcome of other things (which are all accounted for in DVOA).

    That said, I’ve also pointed out there is a VALUE in Beatpaths IF certain other variables are accounted for, and at various times have suggested a few of those variables worth considering. Beatloops are a ridiculous concept. It’s quite possible for one team to beat a team it’s lost to already (and happens quite a bit within divisions) for a variety of reasons (home field advantage, improved defensive play, etc.). However, if all you do is look at wins, you’ll never see the reasons why one win or another in a beatloop is fluky or not. Take the ridiculous St. Louis beatloops this year with their wins over Washington and Dallas. Washington was fluky. Dallas was not – Romo was out, and the team was clearly functioning poorly utilizing a washed-up, has-been QB. Certain adjustments can be made to accommodate both these scenarios while maintaining the value of a Beatpath. You’re working with a small data set anyway, so making a subjective shift on a game here or there wouldn’t make a difference.

    In fact, in some ways it would probably add more to the credibility of the data set.

    My comments aren’t meant to undermine or belittle the overall concept of Beatpaths – but to improve the value of the information. As it stands right now, Beatpaths is 2-6 in the playoffs. If it turns out to be right in each remaining game, then it’s 5-6 – a LOUSY playoff performance.

    FWIW, DVOA is 5-3 at this point. Given the relative similarities in the 2 with regard to standings, DVOA will wind up with a better overall assessment regardless of the final outcome of the remaining 3 games.

    I have seen only 3 “upsets” in the post season. SD over Indy and Arizona in its two wins. While one could call Philly’s win an “upset” that would ignore the fact that Philly has had tremendously close games with the Giants historically, and had already beaten this team several weeks ago. At worst, the game should’ve been a pick ’em.

    I will reiterate…I think Beatpaths has value, which is why I keep coming back. But it needs a bit of tinkering to improve its track record and predictive relevance….which ultimately is the goal of these types of exercises.

    I happen to like the basic concept, just not the construct.

  8. Tom says:

    As I’ve said before, I think Rick is right that we have a very small data set compared to other systems (Football Outsiders & Accuscore). I think we continue to disagree on how much information can be gleaned from even a very small data set–but the proof is in the pudding, and I suspect we’ll continue to go back and forth on this.

    I do disagree, however, with your comments that Beatpaths uses “improper data sets” and that “winning … is not a variable.” Using wins to predict games is not improper, and the Isaacson-Tarbell Predictor does just fine using a team’s win-loss record as the key variable. A team’s win-loss record–a quantifiable metric–is most certainly a variable.

    I think what you’re trying to get at is that as a dichotomous variable (trichotomous in the event of an Eagles-Bengals tie…) subject to flukey outcomes, wins-losses might obscure more about a team’s performance than it reveals. Certainly other variables like yards-per-play or pass defense ought to be more continuous and consistent week-on-week, and you’re right that those types of variables end up composing the outcome that Beatpaths uses as its variable. You are arguing that the varied parts that compose the result are better variables to use, while Beatpaths simply uses the result as the variable. I think that’s a legitimate argument, but I think it’s incorrect to say that wins-losses is an “improper data set” or “not a variable.”

    The final point is that Beatpaths is generally more descriptive than it is predictive. That’s a theme that’s come up in our discussions before, and something I think everyone agrees on. The predictive aspect is fun, and does respectably even compared to expert analysts, even though it is based on a small data set.

    I think you’re right that Beatpaths may need to tinker with loops or additional variables in order to do better on the predictive side of things. Past win-loss record is not necessarily the best predictor of future win-loss record. The iterative method tries to break loops by determining which games are flukey. The weighted method adds in the scores as an additional variable. There are certainly many more potential sets of additional variables or loop-breaking methods that could be tried.

    The key question is: what data set is the most predictive of future wins, given strength of schedule? One data set I happened across seems to do a perfect job of predicting all of the playoff matchups so far, using end-of-regular season defensive data: http://www.coldhardfootballfacts.com/Articles/2_1135_Def._Hog_Index.html

    I have no problem with a vanilla Beatpaths as the principal descriptive method, with a Beatpaths variant (using additional variables or loop breaking criteria) as the predictive method. However, I think the beauty of Beatpaths is its parsimony. If we can work together to find one, perhaps two, variables that have the most significant predictive power relative to strength of schedule, I would be 100% behind a new predictive Beatpaths variant.

  9. Alternator says:

    Rick, here’s the key point that you seem to not grasp, taken from your comment:

    You’re working with a small data set anyway, so making a subjective shift on a game here or there wouldn’t make a difference.

    Beatpaths is purely, absolutely without bias and without subjectivity. Any other possible system involves The Management trying to make decisions on what to include or not include as meaningful–Beatpaths strips this down to the bare minimum data possible, disregarding all the bells and whistles.

    It still does a pretty good job overall as a predictive tool, one lousy postseason notwithstanding.

  10. Alternator says:

    I accidentally submitted that before appending the second comment:

    Tom, while I’m a fan of CHFF and (until this year) a regular forum poster, I’m not sure how much you can commend a specific stat (especially when it is merely an average of multiple other stats) when, outside its top five, playoff appearances are essentially randomly scattered among the remaining teams.

  11. Tom says:


    One nitpick about Beatpaths: it’s not 100% objective. After all, the decision to select one variable (win-loss record) over any number of other variables is a subjective decision. After that decision is made–then, yes, everything that follows is objective.

    As far as CHFF, my interest is in the difference between descriptive statistics and predictive statistics. The final 2008 “defensive hog” numbers are not very descriptive of playoff appearances. Then again, neither is Beatpaths. I discussed the distortions introduced by the divisions with Rick last week: http://beatpaths.com/?p=329#comment-96025

    Nevertheless, their aggregation of three defensive statistics seems to do a very good job at predicting the playoff performance of teams–100% so far. I’m just throwing it out as an example here. I don’t know how it did week-on-week during the regular season, and its ability to predict the playoffs this year could just be a fluke (I understand they’ve changed the mix of stats from those used for 2007).

    The key strength I think Beatpaths has is looking at a predictive statistic relative to strength of schedule. Something like CHFF’s “defensive hog” index doesn’t take into account *against whom* those stats were racked up, whereas Beatpaths is well-positioned to do so.

  12. ThunderThumbs says:

    I think using the outcome of this week’s games as proof or evidence of problems with beatpaths is remarkably silly. In order to have had better results this week, we would have had to have Arizona ranked ahead of Carolina, Baltimore ahead of Tennessee, or Philadelphia ranked ahead of the Giants.

    Carolina had beaten Arizona head-to-head, and Arizona hadn’t beaten anyone that had beaten Carolina. There was no reason for beatpaths to have ranked Arizona ahead of Carolina. This was an upset.

    Tennessee had beaten Baltimore head-to-head, and Tennessee had also beaten Pittsburgh, who had beaten Baltimore. This was an upset.

    The Giants and Philadelphia had a season split, but the Giants had beaten more quality teams this year than Philadelphia. Philadelphia had a significantly worse record. There’s been a lot said this year about weird problems with the Giants and with the Jets, but keep in mind – the Giants were favored over Philadelphia, and, if the Giants had been consistently ranked higher this year, the system would have actually had a worse record picking the Giants’ wins/losses. In this case, any beatpath problems relating to the Giants only made it more likely Philadelphia would have been picked. This game outcome was also an upset.

    I think beatpaths was right in all four of these matches – even after the week is out, I do think the three “wrong calls” really were upsets. The victors should not have been favored in these matches. Beatpaths didn’t make the wrong call.

    None of these picks were the result of some kind of strange beatpaths judgment call going the wrong way; a tweak wouldn’t have made this week’s record better.

    Any sort of problem with this system can also be ascribed to *any* system that relies on wins and losses.

    It’s still true that the graph and rankings are going to look odd this coming week – but none of that oddness is going to negatively impact the four remaining teams, because the Jets aren’t in the playoffs. We are going to have an extremely close pick between the NFC teams, and an extremely close pick between the AFC teams. And either way, the AFC team should be favored in the super bowl.

  13. The MOOSE says:

    “But it needs a bit of tinkering to improve its track record and predictive relevance….which ultimately is the goal of these types of exercises.”

    This is just wrong. That is not the goal of this exercise. It is an interesting side-effect. The goal is to come up with a reasonable ranking system with the least amount of data possible in an objective manner. Predictive value is understood to be limited, but we track it out of curiosity. The methods are NOT designed to increase predictive value.

  14. Tom says:

    @ MOOSE

    Are you going to put up post-season graphs?

  15. The MOOSE says:

    I hope to, but I currently have a lot going on that’s keeping me busy. It has kept me from updating the graphs as well as finishing development of the NBA and NHL versions. I hadn’t planned on posting any post-Bowl NCAA games but can get around to it later. Once things settle down for me I’ll have some new things to post.
