Backtesting, Revised

I messed up a bit on the previous round of backtesting, although the way I messed up was informative. That round tested the beatflukes variant (not vanilla), and each year's graph included all the game results from the previous seasons in my dataset. So 2006's results were using a beatpath graph based on game outcomes from 2003 through 2006.

I've now managed to configure the backtesting script more correctly, and I've also included a first stab at the tiebreaker method that uses paths in and paths out, which is what Moose uses for his non-weighted rankings. Below is the grid for the vanilla variant.

| Year (beatpath picks) | Random | uNet | Prev. Week | Beatloop Str. | Fractional | BPower/Win/Loop | Str. of Beatwins | Bucklin | UPower | UPower/Loop | UNet-Lookahead | Paths In/Out |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2004 (50-33) | 149-118 | 160-107 | 153-114 | 158-109 | 153-114 | 162-105 | 158-109 | 156-111 | 153-114 | 161-106 | 153-114 | 157-110 |
| 2005 (71-30) | 160-107 | 167-100 | 167-100 | 164-103 | 165-102 | 162-105 | 169-98 | 168-99 | 165-102 | 162-105 | 167-100 | 167-100 |
| 2006 (56-46) | 147-120 | 154-113 | 151-116 | 150-117 | 154-113 | 156-111 | 155-112 | 154-113 | 154-113 | 153-114 | 151-116 | 150-117 |
| 2007 (89-35) | 166-101 | 170-97 | 160-107 | 162-105 | 168-99 | 170-97 | 166-101 | 170-97 | 168-99 | 169-98 | 160-107 | 173-94 |
| 2008 (73-44) | 156-110 | 152-114 | 157-109 | 158-108 | 152-114 | 147-119 | 154-112 | 159-107 | 152-114 | 148-118 | 157-109 | 152-114 |
| TOTAL (64.33%) | 58.32% | 60.19% | 59.07% | 59.37% | 59.37% | 59.74% | 60.12% | 60.49% | 59.37% | 59.45% | 59.07% | 59.90% |
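For anyone checking the arithmetic, here is a minimal sketch of how the TOTAL row can be reproduced from the yearly records. It assumes the totals pool wins and losses across all five seasons rather than averaging the yearly percentages; that assumption does reproduce the published 58.32% and 64.33%. The function name `accuracy` is made up for illustration and is not part of the backtesting script.

```python
def accuracy(records):
    """Pooled win percentage for a list of 'W-L' strings."""
    wins = losses = 0
    for record in records:
        w, l = record.split("-")
        wins += int(w)
        losses += int(l)
    return 100.0 * wins / (wins + losses)


# Random-tiebreaker column of the vanilla grid, 2004 through 2008.
random_col = ["149-118", "160-107", "147-120", "166-101", "156-110"]
# Parenthesized beatpath-pick records from the same grid.
picks_col = ["50-33", "71-30", "56-46", "89-35", "73-44"]

print(f"Random: {accuracy(random_col):.2f}%")
print(f"Beatpath picks: {accuracy(picks_col):.2f}%")
```

Averaging the five yearly percentages instead would give a different figure for the pick records, since the number of picks varies from year to year.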

And here are all the tiebreaker variants for single-year data sets using beatflukes graphs:

| Year (beatpath picks) | Random | uNet | Prev. Week | Beatloop Str. | Fractional | BPower/Win/Loop | Str. of Beatwins | Bucklin | UPower | UPower/Loop | UNet-Lookahead | Paths In/Out |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2004 (54-34) | 151-116 | 157-110 | 152-115 | 161-106 | 156-111 | 161-106 | 159-108 | 156-111 | 156-111 | 158-109 | 152-115 | 158-109 |
| 2005 (76-35) | 152-115 | 169-98 | 168-99 | 168-99 | 168-99 | 165-102 | 172-95 | 166-101 | 168-99 | 162-105 | 168-99 | 165-102 |
| 2006 (67-53) | 146-121 | 145-122 | 151-116 | 148-119 | 148-119 | 153-114 | 146-121 | 148-119 | 148-119 | 150-117 | 151-116 | 147-120 |
| 2007 (89-37) | 161-106 | 172-95 | 166-101 | 164-103 | 167-100 | 175-92 | 166-101 | 167-100 | 167-100 | 171-96 | 166-101 | 173-94 |
| 2008 (80-49) | 154-112 | 156-110 | 159-107 | 161-105 | 156-110 | 153-113 | 159-107 | 165-101 | 156-110 | 152-114 | 159-107 | 155-111 |
| TOTAL (63.76%) | 57.27% | 59.90% | 59.67% | 60.12% | 59.60% | 60.49% | 60.12% | 60.12% | 59.60% | 59.45% | 59.67% | 59.82% |
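Since the Paths In/Out column is new, here is a rough sketch of one plausible reading of that tiebreaker. It is a guess, not necessarily what the script or Moose's rankings actually do: it assumes the beatpath graph is already loop-free, counts how many teams each team has a path to ("out") and how many have a path to it ("in"), and scores a team as out minus in. The function names and the single-letter team names are invented for the example.

```python
from collections import defaultdict


def reachable(graph, start):
    """Return every team reachable from start by following edges."""
    seen = set()
    stack = [start]
    while stack:
        team = stack.pop()
        for nxt in graph.get(team, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def paths_in_out_scores(beats):
    """Score teams from (winner, loser) edges of a loop-free graph.

    Assumed scoring: teams reachable from t minus teams that reach t.
    """
    forward = defaultdict(set)   # winner -> teams it beat
    backward = defaultdict(set)  # loser -> teams that beat it
    teams = set()
    for winner, loser in beats:
        forward[winner].add(loser)
        backward[loser].add(winner)
        teams.update((winner, loser))
    return {
        t: len(reachable(forward, t)) - len(reachable(backward, t))
        for t in teams
    }


# Hypothetical four-team graph: A beat B, B beat C, A beat D.
scores = paths_in_out_scores([("A", "B"), ("B", "C"), ("A", "D")])
print(scores)
```

Under this reading, a game between two teams with no beatpath between them (B and D in the toy graph) would go to the team with the higher score.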

So, conclusions? First, it's odd and a bit discouraging that accuracy from a single season of data is pretty much indistinguishable from accuracy based on multiple seasons of games all mixed together. As for the accuracy levels, I found one resource saying that Isaacson-Tarbell was 158-97-1 in the 2008 regular season. That's 61.9%. None of the variants' five-year totals beat that (the best is 60.49%), but the actual beatpath picks (a subset of all the games) do. But who's to say that Isaacson-Tarbell wouldn't pick that same subset even more accurately? At any rate, further investigation is required to see if some combination of beatpath picks, better record, and home team could beat Isaacson-Tarbell.
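To make that last idea concrete, here is a hypothetical sketch of what such a combined picker might look like. Nothing here has been backtested. The order of the fallbacks (beatpath pick first, then better record, then home team) is only an assumption, and the `Game` fields and the team names are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Game:
    home: str
    away: str
    beatpath_pick: Optional[str]  # None when no beatpath connects the teams
    home_win_pct: float
    away_win_pct: float


def pick(game):
    """Pick a winner using the assumed fallback order."""
    # 1. Use the beatpath pick whenever the graph offers one.
    if game.beatpath_pick is not None:
        return game.beatpath_pick
    # 2. Otherwise take the team with the better record.
    if game.home_win_pct > game.away_win_pct:
        return game.home
    if game.away_win_pct > game.home_win_pct:
        return game.away
    # 3. Records are tied: fall back to the home team.
    return game.home


# Made-up games covering each branch.
print(pick(Game("Home", "Away", "Away", 0.750, 0.250)))  # beatpath pick wins out
print(pick(Game("Home", "Away", None, 0.400, 0.600)))    # better record
print(pick(Game("Home", "Away", None, 0.500, 0.500)))    # home team
```

Backtesting this would mean running `pick` over every game in a season and tallying the record the same way as in the grids.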
