Hi folks, Tom here. Last season I did a series of posts looking at the rankings produced by the Beatpaths system, and trying to determine what information we could glean from them. I looked for two things:
- Did the Beatpaths system arrive at more accurate rankings as the season progressed?
- Did the BeatPower score tell us anything about the system’s confidence in picking the winner of a given matchup?
At the end of the 2008 regular season, I also did a retrospective look at how well the rankings reflect a team’s overall record.
The point is to think about Beatpaths as a whole system, instead of just looking at any one team. This is the idea of the Beatpaths system in its essence: to look at each team relative to all other teams.
Stability in the Ranks
Below are two graphs: the final graph of ranking shifts from last season, and the first graph of this season. The purpose of these graphs is to look at how many teams shift in the rankings each week, and how dramatically they shift. The method is simple: add up the difference between each team’s rank from this week and the previous week. The higher the number, the less ‘stable’ the rankings are overall. If the Beatpaths system is working, the rankings should become more and more stable as the season progresses, because we will have more information about each team relative to all the others, and can more accurately place them in context.
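For anyone who wants to play along at home, here is a minimal Python sketch of that weekly shift total. It assumes the "difference" is the absolute change in each team's rank (otherwise the moves up and down would cancel out), and the function name, team names, and ranks are all made up for illustration.

```python
def rank_shift_total(previous, current):
    """Add up the absolute rank change of every team ranked in both weeks."""
    return sum(
        abs(current[team] - previous[team])
        for team in previous
        if team in current
    )


# Hypothetical rankings (1 = best); not real Beatpaths data.
last_week = {"Team A": 1, "Team B": 2, "Team C": 3, "Team D": 4}
this_week = {"Team A": 1, "Team B": 3, "Team C": 2, "Team D": 4}

# Teams B and C swap places, so each moves one spot.
print(rank_shift_total(last_week, this_week))  # 2
```

A total of zero would mean nobody moved at all; the bigger the number, the more the rankings were shuffled that week.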
While this year and last both started out with a high degree of instability in the rankings (typical of the early season), this year’s instability has dropped much more sharply. This may be due to the new tie-breaking criterion being used by TT this year. It also may be due to the lack of beatloops so far this season, because the creation or breaking of beatloops tends to create significant shifts throughout the rankings. We’ll see if this level of stability in the rankings will hold or not.
Making Picks with Confidence
Beyond looking at the stability of the rankings as a whole, I also looked into which matchups the BeatPower scores of each team predicted with the most confidence. You’ll notice when looking at the weekly rankings that they include a BeatPower score on the right-hand side. The method here is simple: compare the matchups for Week 5 by weighing the rival teams’ BeatPower scores against one another. The greater the disparity, the greater the confidence the system has in picking the winner. If the BeatPower scores are close, the system has less confidence in picking a clear winner. Last year I found as a general rule that the top half of the ‘Confidence’ chart (high confidence picks) would have about half as many incorrect picks as the bottom half of the chart (low confidence picks).
Here are the matchups for Week 5 and the ‘confidence’ that the system has in them:
(Confidence is out of 100; each matchup is listed as predicted winner – predicted loser.)
Note the negative confidence pick at the bottom. We ran into these a few times last year. These occur because the Beatpath Rankings and BeatPower scores occasionally diverge (i.e. a team with a lower BeatPower score will nevertheless be ranked higher than another team with a higher BeatPower score). Since these uniformly happen with low confidence games, it doesn’t seem to make a difference that the score is negative: the Beatpaths system isn’t terribly confident about picking a winner one way or the other. My advice is not to bet on those games 😉
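Here is a rough Python sketch of how a confidence number like this could be computed, including how a negative one can sneak in. It rests on assumptions: I take the predicted winner to be the higher-ranked team and the confidence to be the plain difference in BeatPower scores, and I don't try to reproduce the exact out-of-100 scaling. The function name, teams, ranks, and scores are hypothetical.

```python
def pick_confidence(matchups, rank, beatpower):
    """Return (winner, loser, confidence) tuples, most confident pick first.

    Assumption: the pick is the better-ranked team, and confidence is
    the winner's BeatPower minus the loser's BeatPower.
    """
    picks = []
    for a, b in matchups:
        # A lower rank number means a better-ranked team.
        winner, loser = (a, b) if rank[a] < rank[b] else (b, a)
        picks.append((winner, loser, beatpower[winner] - beatpower[loser]))
    return sorted(picks, key=lambda pick: pick[2], reverse=True)


# Hypothetical data: Team B is ranked above Team C despite a lower BeatPower.
rank = {"Team A": 1, "Team B": 2, "Team C": 3, "Team D": 4}
beatpower = {"Team A": 90.0, "Team B": 40.0, "Team C": 45.0, "Team D": 10.0}
matchups = [("Team B", "Team C"), ("Team A", "Team D")]

for winner, loser, confidence in pick_confidence(matchups, rank, beatpower):
    print(f"{winner} over {loser}: {confidence:+.1f}")
```

With these made-up numbers, the Team B pick comes out at -5.0 and sorts to the bottom of the list, which is the same kind of rank-versus-BeatPower disagreement described above.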
Also note that the ‘confidence’ scores are simply something I’ve been playing around with out of curiosity. TT always gives the official Beatpaths picks, and this year he’s comparing Beatpaths, Isaacson-Tarbell, and a hybrid of those two methods (along with his personal picks, as always).
A final way of gaining insight into the Beatpaths system and how well it’s functioning is to look to the recent past. Specifically, how well does the current ranking of each team retroactively predict its win-loss record? Teams that are consistently good or consistently bad will usually be well represented by their rank relative to other teams. However, flukey teams, the ones that win against good opponents but find ways to lose to bad ones, will be difficult to rank no matter what system you’re using. Looking at each team’s retroactive pick record is a good way to identify which teams these are and assess how well the Beatpaths system is handling them.
At this point in the season, without any beatloops, the Beatpath Rankings have a perfect retroactive pick record for every team: 4-0, or 3-0 for the teams that have had their bye week. While the tie-breakers between certain teams may be disputable, the rankings at the moment do not have any team ranked higher than a team that defeated it. Only when beatloops are formed will ambiguities enter into the rankings, which Beatpaths attempts to resolve by simply removing those paths that form a loop. When ambiguous data is removed, the rankings may no longer correspond with who-beat-who, and we can begin to examine the retroactive pick record for each team in detail.
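To make the retroactive pick record concrete, here is a small Python sketch under my own assumptions: a game counts as a correct retroactive pick when the winner is currently ranked above the loser, and each game is tallied for both teams involved. The function name and the three-team example are hypothetical.

```python
def retroactive_record(results, rank):
    """Tally, per team, how many played games the current ranking gets right.

    results: list of (winner, loser) pairs for games already played.
    rank:    team -> current rank (1 = best).
    Returns team -> [correct, incorrect].
    """
    record = {team: [0, 0] for team in rank}
    for winner, loser in results:
        # The ranking "picks" a game correctly if the winner is ranked higher.
        index = 0 if rank[winner] < rank[loser] else 1
        record[winner][index] += 1
        record[loser][index] += 1
    return record


# Hypothetical beatloop: A beat B, B beat C, and C beat A.
rank = {"Team A": 1, "Team B": 2, "Team C": 3}
results = [("Team A", "Team B"), ("Team B", "Team C"), ("Team C", "Team A")]

for team, (right, wrong) in retroactive_record(results, rank).items():
    print(f"{team}: {right}-{wrong}")
```

In this toy loop no ordering of the three teams can agree with all three results, so at least one game has to count against the ranking. That is why the retroactive records only get interesting once beatloops show up.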
That’s it from me this week. I’ll update this post on game day to fill in the pick confidence table with actual game results.