The problem with boardgame evaluation is that we have multiple populations with very different ideas of how good a game is, and they are so aware of their preferences that they will not rate many games outside of their favored niches. Therefore, a head to head comparison is not really about rating games, but about the number of people in each group who decide to rate the game at all. Any analysis that doesn't attempt to separate those populations will almost certainly say more about those confounding population effects than about the games. This is easy to see with the "Fans also Like" feature, which IIRC was built in consultation with someone working on recommendations at Netflix. Many top games have little correlation with each other: if you like Brass, you are likely to enjoy heavy eurogames like Terraforming Mars, but you might not like Pandemic Legacy, which happens to be a coop, Gloomhaven, which is more of an American-style game, or War of the Ring, which is a thematic wargame mostly for 2 players. The BGG adjusted ratings also put a significant damper on games that have few ratings, which is why more traditional wargames (like, say, A Victory Lost, or EastFront) are always going to be capped by the genre's lack of popularity among the site's visitors. Analysis that separates the populations is far more computationally heavy, but it's far more relevant when it comes to telling people what they tend to want to know: tell me about good games that I will probably enjoy, and that aren't quite the same thing I am playing right now.
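To make the damper on sparsely rated games concrete, here is a minimal sketch. It assumes the adjustment behaves like a Bayesian average, where a fixed number of dummy votes at a middling score are mixed into every game's ratings. The function name, `prior_mean`, `prior_weight`, and the sample ratings are all made up for illustration and are not BGG's actual formula or constants.

```python
def damped_rating(ratings, prior_mean=5.5, prior_weight=100):
    """Shrink the mean toward prior_mean, as if prior_weight dummy votes were added.

    prior_mean and prior_weight are illustrative assumptions, not BGG's real values.
    """
    return (sum(ratings) + prior_mean * prior_weight) / (len(ratings) + prior_weight)


# Hypothetical data: a niche wargame loved by a few raters,
# and a popular eurogame with a lower raw mean but many more raters.
niche_wargame = [9.0] * 50
popular_euro = [8.0] * 5000

print(round(damped_rating(niche_wargame), 2))
print(round(damped_rating(popular_euro), 2))
```

With these made-up numbers the wargame has the higher raw mean but ends up well below the eurogame after damping, purely because fewer people rated it.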
Flawed: you may not like this, even if you usually like things like this
Good: perfectly serviceable example of this type of thing, if you are interested you will probably like it
Exemplary: people who like this kind of thing tend to agree this is one of the best
Outstanding: you may like this even if you haven’t liked things like this in the past, and if this is your thing then don’t miss it
I find that it’s fairly easy to sort things into the first three buckets by scanning through reviews and comments, but the fourth one is hard because by definition you need atypical reviews. Would be interesting to see an analysis that more or less predicted what reviewers would like and then plucked out positive outliers.
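One rough way to sketch that kind of analysis: predict each rating from what the same reviewer gave other games in the same genre, then rank games by how far they beat that prediction on average. Everything here is an assumption for illustration: the toy data, the genre labels, the `surprise_scores` name, and the very crude "mean of the reviewer's other ratings in the genre" predictor. A real version would need a proper recommender model and far more data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (user, game, genre, rating on a 1-10 scale).
RATINGS = [
    ("ann", "Euro A", "euro", 9), ("ann", "Euro B", "euro", 8),
    ("ann", "Coop A", "coop", 4), ("ann", "Coop B", "coop", 9),
    ("bob", "Euro A", "euro", 8), ("bob", "Euro B", "euro", 9),
    ("bob", "Coop A", "coop", 5), ("bob", "Coop B", "coop", 8),
    ("cy", "Coop A", "coop", 8), ("cy", "Coop B", "coop", 9),
    ("cy", "Euro A", "euro", 5), ("cy", "Euro B", "euro", 4),
]


def surprise_scores(ratings):
    """Per game, average of (rating - the user's mean for other games in that genre)."""
    by_user_genre = defaultdict(list)
    for user, game, genre, rating in ratings:
        by_user_genre[(user, genre)].append((game, rating))

    residuals = defaultdict(list)
    for user, game, genre, rating in ratings:
        others = [r for g, r in by_user_genre[(user, genre)] if g != game]
        if not others:
            continue  # no baseline for this user in this genre
        residuals[game].append(rating - mean(others))
    return {game: mean(res) for game, res in residuals.items()}


scores = surprise_scores(RATINGS)
# Games sorted from most to least positive surprise.
outliers = sorted(scores, key=scores.get, reverse=True)
print(outliers[0], scores[outliers[0]])
```

In this toy data "Coop B" comes out on top because even the reviewers who rate other coops poorly rate it highly, which is the pattern the "Outstanding" bucket describes.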