I messed up the previous round of backtesting, although the way I messed up was informative. That round tested the beatflukes variant (not vanilla), and each year’s graph included all the game results from the previous seasons in my dataset. So 2006’s results used a beatpath graph based on game outcomes from 2003 – 2006.
I’ve now configured the backtesting script more correctly, and I’ve also included a first stab at the tiebreaker method that uses paths in and paths out, which is what Moose uses for his non-weighted rankings. Below is the grid for the vanilla variant.
| Year (beatpath picks) | Random | uNet | Prev. Week | Beatloop Str. | Fractional | BPower/Win/Loop | Str. of Beatwins | Bucklin | UPower | UPower/Loop | UNet-Lookahead | Paths In/Out |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2004 (50-33) | 149-118 | 160-107 | 153-114 | 158-109 | 153-114 | 162-105 | 158-109 | 156-111 | 153-114 | 161-106 | 153-114 | 157-110 |
| 2005 (71-30) | 160-107 | 167-100 | 167-100 | 164-103 | 165-102 | 162-105 | 169-98 | 168-99 | 165-102 | 162-105 | 167-100 | 167-100 |
| 2006 (56-46) | 147-120 | 154-113 | 151-116 | 150-117 | 154-113 | 156-111 | 155-112 | 154-113 | 154-113 | 153-114 | 151-116 | 150-117 |
| 2007 (89-35) | 166-101 | 170-97 | 160-107 | 162-105 | 168-99 | 170-97 | 166-101 | 170-97 | 168-99 | 169-98 | 160-107 | 173-94 |
| 2008 (73-44) | 156-110 | 152-114 | 157-109 | 158-108 | 152-114 | 147-119 | 154-112 | 159-107 | 152-114 | 148-118 | 157-109 | 152-114 |
| TOTAL: 64.33% | 58.32% | 60.19% | 59.07% | 59.37% | 59.37% | 59.74% | 60.12% | 60.49% | 59.37% | 59.45% | 59.07% | 59.90% |
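As a sanity check on the TOTAL row: the percentages appear to be pooled wins over pooled games across the five seasons, not an average of the yearly percentages. Here is a minimal Python sketch of that arithmetic, using three columns copied from the grid above. The function name `pooled_accuracy` is just an illustrative helper and not part of the backtesting script, and reading the parenthesized records in the first column as the actual beatpath picks is my assumption (the 64.33% total is consistent with it).

```python
def pooled_accuracy(records):
    """Percent correct over all seasons combined, given (wins, losses) per year."""
    wins = sum(w for w, _ in records)
    games = sum(w + l for w, l in records)
    return 100.0 * wins / games

# (wins, losses) for 2004-2008, copied from the vanilla grid.
# Assumption: the parenthesized records in the first column are the beatpath picks.
beatpath_picks = [(50, 33), (71, 30), (56, 46), (89, 35), (73, 44)]
random_tiebreak = [(149, 118), (160, 107), (147, 120), (166, 101), (156, 110)]
bucklin = [(156, 111), (168, 99), (154, 113), (170, 97), (159, 107)]

for name, records in [
    ("Beatpath picks", beatpath_picks),
    ("Random", random_tiebreak),
    ("Bucklin", bucklin),
]:
    print(f"{name}: {pooled_accuracy(records):.2f}%")
```

Run as written, this reproduces the 64.33%, 58.32%, and 60.49% shown in the TOTAL row for those columns.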
And here are all the tiebreaker variants for single-year data sets using beatflukes graphs:
| Year (beatpath picks) | Random | uNet | Prev. Week | Beatloop Str. | Fractional | BPower/Win/Loop | Str. of Beatwins | Bucklin | UPower | UPower/Loop | UNet-Lookahead | Paths In/Out |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2004 (54-34) | 151-116 | 157-110 | 152-115 | 161-106 | 156-111 | 161-106 | 159-108 | 156-111 | 156-111 | 158-109 | 152-115 | 158-109 |
| 2005 (76-35) | 152-115 | 169-98 | 168-99 | 168-99 | 168-99 | 165-102 | 172-95 | 166-101 | 168-99 | 162-105 | 168-99 | 165-102 |
| 2006 (67-53) | 146-121 | 145-122 | 151-116 | 148-119 | 148-119 | 153-114 | 146-121 | 148-119 | 148-119 | 150-117 | 151-116 | 147-120 |
| 2007 (89-37) | 161-106 | 172-95 | 166-101 | 164-103 | 167-100 | 175-92 | 166-101 | 167-100 | 167-100 | 171-96 | 166-101 | 173-94 |
| 2008 (80-49) | 154-112 | 156-110 | 159-107 | 161-105 | 156-110 | 153-113 | 159-107 | 165-101 | 156-110 | 152-114 | 159-107 | 155-111 |
| TOTAL: 63.76% | 57.27% | 59.90% | 59.67% | 60.12% | 59.60% | 60.49% | 60.12% | 60.12% | 59.60% | 59.45% | 59.67% | 59.82% |
So, conclusions? First, it’s odd and a bit discouraging that the accuracy from a single season of data is pretty much indistinguishable from the accuracy based on multiple seasons of games all mixed together. As for the accuracy levels, I found one resource saying that Isaacson-Tarbell went 158-97-1 in the 2008 regular season. That’s 61.9%. None of the variants’ overall totals beat that, but the actual beatpath picks (a subset of all the games) do. But who’s to say that Isaacson-Tarbell wouldn’t pick that same subset even more accurately? At any rate, further investigation is required to see if some combination of beatpath picks, better record, and home team could beat Isaacson-Tarbell.
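For anyone checking the Isaacson-Tarbell number: 61.9% is what you get if the tie counts as half a win. That convention is my assumption about how the figure was derived, not something the resource stated. A quick Python sketch, with `win_pct` as a made-up helper name, comparing it against the 2008 beatpath-pick records from the two grids (again assuming the parenthesized records are the beatpath picks):

```python
def win_pct(wins, losses, ties=0, tie_value=0.5):
    """Percent correct, counting each tie as `tie_value` of a win (assumed convention)."""
    games = wins + losses + ties
    return 100.0 * (wins + tie_value * ties) / games

# Isaacson-Tarbell, 2008 regular season, as quoted above.
isaacson_tarbell_2008 = win_pct(158, 97, ties=1)

# 2008 beatpath-pick records from the first column of each grid.
vanilla_picks_2008 = win_pct(73, 44)
beatflukes_picks_2008 = win_pct(80, 49)

print(f"Isaacson-Tarbell 2008: {isaacson_tarbell_2008:.1f}%")
print(f"Vanilla beatpath picks 2008: {vanilla_picks_2008:.1f}%")
print(f"Beatflukes beatpath picks 2008: {beatflukes_picks_2008:.1f}%")
```

Keep in mind these aren't the same sets of games: the beatpath picks cover only a subset of each season, so this is a rough yardstick rather than a head-to-head result.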