One of the things I’m toying with for future algorithm variants is how to measure “Strength Of Beatgroups”.

Here’s an example. For this last week in the NFL, only four teams have no beatlosses: Indianapolis, Denver, Tampa Bay, and Carolina. These are the only possible options for who can be ranked #1.

The question becomes who should be number 1? Here are some beatpath stats about each team. Remember, we’re only paying attention to wins and losses here: beatwins, beatlosses, and beatloops.

| Team | Beatwins | Beatlosses | Beatloops | Beatpower | Avg beatpower of beatwins | Avg beatpower of beatloops |
|---|---|---|---|---|---|---|
| IND | 25 | 0 | 0 | 1 | .355 | N/A |
| DEN | 27 | 0 | 1 | .964 | .39 | .844 |
| TB | 11 | 0 | 4 | .733 | .189 | .567 |
| CAR | 6 | 0 | 5 | .545 | .131 | .524 |

Other possible data to look at: the average beatpower of each team’s **direct** beatwins and beatlosses.

Questions I’m mulling: Denver’s beatwins are more impressive than Indianapolis’s. Is there an argument for Denver being ranked first instead of Indianapolis?

In this example it’s pretty clear that under any measurement, TB and CAR are below DEN and IND. But as soon as one team gets placed in the rankings, other teams can become eligible in its place. For instance, if we rank IND #1, then the next comparison is between DEN, TB, CAR, and NE.

| Team | Beatwins | Beatlosses | Beatloops | Beatpower | Avg beatpower of beatwins | Avg beatpower of beatloops |
|---|---|---|---|---|---|---|
| NE | 12 | 1 | 3 | .688 | .235 | .852 |

In some ways, NE is better than TB, and in other ways they’re worse.

The principles I’m playing with so far are that the average strength of a team’s beatloop teams should pull the team towards that average, and that teams should be rewarded for having higher-quality beatwins. There’s an argument for ignoring the average power of a team’s beatlosses, since each time you’re evaluating a team, you’ve already decided who they’re ranked behind.

The current algorithm just sorts on beatpower – it’s pretty solid, but if there’s a way to mathematically relate the “strength of beatgroup” data without making the rules too arbitrary and subjective, I’d consider it.

One possibility to reward a team for strength of beatwins is to keep the 0-1 scale and use beatpower + (1 - beatpower) * (avg beatpower of beatwins). Since the avg beatpower of beatwins never gets too high (by definition, those teams have beatlosses), it won’t cluster everyone near 1. In fact, in the above example, it would not switch any ranks, but it would bring NE closer to TB.
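As a rough check of that formula against the numbers in the tables above, here is a minimal Python sketch. The function name `reward_beatwins` and the dict layout are made up for illustration; they are not part of the actual ranking code.

```python
# Stats copied from the tables above:
# team -> (beatpower, avg beatpower of beatwins)
stats = {
    "IND": (1.000, 0.355),
    "DEN": (0.964, 0.390),
    "TB": (0.733, 0.189),
    "CAR": (0.545, 0.131),
    "NE": (0.688, 0.235),
}


def reward_beatwins(beatpower, avg_bw_power):
    """Move beatpower toward 1 by a fraction equal to the
    average beatpower of the team's beatwins (stays on the 0-1 scale)."""
    return beatpower + (1 - beatpower) * avg_bw_power


adjusted = {team: reward_beatwins(bp, bw) for team, (bp, bw) in stats.items()}

# Print teams from highest to lowest adjusted score.
for team in sorted(adjusted, key=adjusted.get, reverse=True):
    print(f"{team}: {stats[team][0]:.3f} -> {adjusted[team]:.3f}")
```

With these five teams the order stays IND, DEN, TB, NE, CAR, and the gap between TB and NE shrinks from about .045 to about .022.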

Likewise, some similar formula could bring the score towards beatloop beatpower. There you’ll have to be careful about how you apply any formula, to avoid overadjusting. You could relate the strength of the adjustment to the number of beatloops.

OK, if you want the avg beatpower of beatloops to drag the rating towards it, how about just going for a weighted average:

(beatwins*beatpower + beatloops*avgBPofBL) / (beatwins + beatloops)

Then, to make the average beatpower of a team’s beatwins count, just multiply this weighted average by it:

(beatwins*beatpower + beatloops*avgBPofBL) / (beatwins + beatloops) * avgBPofBW
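To see what these two formulas would do to the teams in the post, here is a small Python sketch using the table values. The function names are hypothetical, and treating a team with zero beatloops (IND, where the table says N/A) as simply keeping its beatpower is an assumption, although it is also what the formula reduces to when beatloops is 0.

```python
# Stats copied from the tables in the post:
# team -> (beatwins, beatloops, beatpower, avgBPofBW, avgBPofBL)
stats = {
    "IND": (25, 0, 1.000, 0.355, None),  # N/A in the table
    "DEN": (27, 1, 0.964, 0.390, 0.844),
    "TB": (11, 4, 0.733, 0.189, 0.567),
    "CAR": (6, 5, 0.545, 0.131, 0.524),
    "NE": (12, 3, 0.688, 0.235, 0.852),
}


def loop_weighted(beatwins, beatloops, beatpower, avg_bp_of_bl):
    """Weighted average of beatpower and avg beatloop beatpower."""
    if beatloops == 0:
        # Assumption: with no beatloops there is nothing to pull toward.
        return beatpower
    total = beatwins * beatpower + beatloops * avg_bp_of_bl
    return total / (beatwins + beatloops)


def loop_weighted_times_bw(beatwins, beatloops, beatpower, avg_bp_of_bw, avg_bp_of_bl):
    """The weighted average above, scaled by avg beatpower of beatwins."""
    return loop_weighted(beatwins, beatloops, beatpower, avg_bp_of_bl) * avg_bp_of_bw


weighted = {t: loop_weighted(bw, bl, bp, bl_avg)
            for t, (bw, bl, bp, _, bl_avg) in stats.items()}
scaled = {t: loop_weighted_times_bw(*row) for t, row in stats.items()}

for team in stats:
    print(f"{team}: weighted={weighted[team]:.3f} scaled={scaled[team]:.3f}")
```

On these numbers the weighted average alone moves NE ahead of TB, and multiplying by avgBPofBW moves DEN ahead of IND, so both steps reorder the list.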

Of course, maybe avgBPofBW isn’t so great. I mean, say the graph stays the same, but Denver adds a beatpath to a team with a beatpower lower than .39 … then their stats actually get worse (although I guess then they do have one more BW)… I don’t know.

I think that you have to differentiate between direct and indirect victories – winning head to head is more valuable than a victory over a common opponent.

So you group the teams into sets: all undefeated, all one-loss, all two-loss, etc. Then you rank each set on head-to-head matchups, and after that you can rate them with respect to their indirect games.

While you can fiddle around with comparing Denver and Indianapolis, since they will not play in the regular season, arguing that TB and Carolina should be peers in some sense doesn’t hold up, given that Carolina beat TB head to head.

In terms of graph theory, you are giving equal weight to all edges and equal cost to traversing all nodes, which isn’t necessarily representative of the competitive landscape. A narrow victory is different from a blowout when trying to judge the better of two teams that don’t meet on the field, and maybe long beatpaths filled with garbage teams shouldn’t be weightier than short or medium beatpaths filled with a killing field of potent teams.

So, I guess what I’m saying is that I think that the groups you are picking are the wrong groups – beatpaths are the right way to look at all 32 teams at once, but beatloss groups aren’t the right comparison sets for one-to-one or division sized group comparisons because there are better metrics.

#2 is exactly the kind of on-the-one-hand on-the-other-hand thinking I’ve been torturing myself with the past couple of days. 🙂

I’ve had some fairly good success with this:

Take the team’s beatpower, and multiply it by the number of *direct* beatwins they have. Then take the avg beatloop power and multiply it by the number of beatloop teams (counting only those the team doesn’t already have a beatpath to). Add the two together, and divide by (number of direct beatwins plus number of beatloop teams).

It doesn’t feel extremely elegant but it’s better than a lot of alternatives. The thought is to make a team’s beatpower more stable if they have a lot of direct beatwins, less if they don’t. That way if they have a really unstable place at the top of the graph due to only one (high-quality) beatwin, then it won’t be weighted so high.
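The weighting described above can be sketched in a few lines of Python. This is only an illustration: the function name, the argument layout, and the example numbers (teams "X" and "Y") are all made up, not taken from the actual rankings.

```python
def stabilized_beatpower(beatpower, direct_beatwins, loop_powers, has_beatpath_to=()):
    """Blend a team's beatpower with the avg beatpower of its beatloop teams.

    loop_powers:     dict of team -> beatpower for teams sharing a beatloop
                     with this team.
    has_beatpath_to: teams this team already has a beatpath to; these are
                     left out of the beatloop average.
    """
    # Keep only beatloop teams whose relationship is still ambiguous.
    unresolved = [p for team, p in loop_powers.items() if team not in has_beatpath_to]
    weight = direct_beatwins + len(unresolved)
    if weight == 0:
        # Assumption: nothing to weight by, so leave beatpower unchanged.
        return beatpower
    # sum(unresolved) equals avg beatloop power * number of beatloop teams.
    return (beatpower * direct_beatwins + sum(unresolved)) / weight


# Hypothetical team: beatpower .9, in beatloops with X and Y,
# but it already has a beatpath to Y, so only X counts.
loops = {"X": 0.5, "Y": 0.2}
few_wins = stabilized_beatpower(0.9, 1, loops, has_beatpath_to={"Y"})
many_wins = stabilized_beatpower(0.9, 5, loops, has_beatpath_to={"Y"})
print(round(few_wins, 3), round(many_wins, 3))
```

In this made-up case the same beatpower of .9 is pulled down much further with one direct beatwin than with five, which is the stabilizing effect described above.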

Applying this to this week’s power rankings results in some reshuffling, but mostly in ways that make sense. Tampa Bay is a few slots lower, for instance, because their beatloop teams are so poor.

This approach doesn’t really change what happened with Carolina the last couple of weeks, but I’m really not so sure there’s anything to be done about that.

#3… that’s why I’m thinking that paying attention to average beatloss power doesn’t really make sense. There are just too many ways to break it, and after considering it, there just doesn’t seem to be a way to mathematically relate or mix that number in with the other numbers (such as avg beatloop power).

The avg beatloop power still does make sense to me – especially if it’s limited to only looking at the teams that aren’t otherwise in a beatpath. For instance, right now, Denver has a beatpath to every team in its beatloops except for New England. So it wouldn’t make sense for San Diego’s low rating to pull Denver down. But Denver does have an ambiguous relationship to New England, meaning “unclear if better or worse”, so it does make sense for its rating to be pulled down in the direction of New England’s.

Also, the point of this system is to only pay attention to wins and losses, the theory being that all those smaller stats (point differential etc) tend to cancel themselves out in the wash of a season. TB and CAR were in a beatloop, so there wasn’t a good reason to rank CAR ahead.