This notebook provides a basic exploratory analysis of games on BoardGameGeek (BGG) in support of my work [predicting ratings for upcoming games] and [predicting games for individual users].
For this write-up, I examine games published through 2021 that have received at least 30 user ratings at the time of writing.[^ I have additionally excluded some games that were cancelled or never released, or that have data quality issues with their profiles on BGG.]
Who are the designers of games on BGG, and how do they relate to community ratings?
Which designers have published the most games on BGG? Which designers are the highest rated? Most popular?
‘Uncredited’ is by far the most common listing for designer, so I have excluded it from the following visualization.
I’ll then plot the distribution of games for the 25 most frequent designers (restricted to games that have achieved at least params$min_ratings ratings). Which designers are typically associated with heavier games? Which designer is the highest rated (on average)?
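The selection step described above can be sketched as follows. This is an illustrative Python sketch, not the notebook's own code: the record layout, the `top_designers` helper, and the toy data are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical records: (game name, list of designers, number of user ratings).
games = [
    ("Game A", ["Designer X"], 120),
    ("Game B", ["Designer X", "Designer Y"], 45),
    ("Game C", ["Uncredited"], 300),
    ("Game D", ["Designer Y"], 12),
    ("Game E", ["Designer X"], 30),
]

def top_designers(games, min_ratings=30, n=25, exclude=("Uncredited",)):
    """Count games per designer, skipping excluded labels and low-rated games."""
    counts = Counter()
    for _name, designers, users_rated in games:
        if users_rated < min_ratings:
            continue  # game has not reached the minimum number of ratings
        for designer in designers:
            if designer not in exclude:
                counts[designer] += 1
    # most_common returns (designer, count) pairs sorted by descending count
    return counts.most_common(n)

top = top_designers(games)
```

A game with several designers counts once for each of them here, which is one possible convention; the notebook may handle co-designed games differently.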
The following (interactive) table displays the mean of each BGG outcome by designer. Use the filters below the column names to filter through the table by designer or set a minimum number of games.
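As a rough illustration of what sits behind such a table, the sketch below averages each outcome by designer and applies a minimum-games filter. It is written in plain Python under assumed names: the `designer_means` helper and the row layout (designer, average, usersrated) are hypothetical, not taken from the notebook.

```python
from collections import defaultdict
from statistics import mean

def designer_means(rows, min_games=1):
    """Mean of each outcome by designer, keeping designers with enough games.

    rows: iterable of (designer, average, usersrated), one row per game-designer pair.
    """
    grouped = defaultdict(list)
    for designer, average, users_rated in rows:
        grouped[designer].append((average, users_rated))

    table = {}
    for designer, outcomes in grouped.items():
        if len(outcomes) < min_games:
            continue  # mirrors the table's "minimum number of games" filter
        table[designer] = {
            "games": len(outcomes),
            "average": mean(a for a, _ in outcomes),
            "usersrated": mean(u for _, u in outcomes),
        }
    return table

# Hypothetical toy rows for illustration only.
rows = [("Designer X", 7.0, 100), ("Designer X", 8.0, 300), ("Designer Y", 6.0, 50)]
summary = designer_means(rows, min_games=2)
```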
Which designers have the largest effect on a game’s average rating? On its number of user ratings?
We could simply look at the games published by each designer and then take their average rating/average number of user ratings. However, this doesn’t account for the fact that some designers design more complex games than others, or that recently published games tend to have higher averages.
To get an estimate of each designer’s partial effect on an outcome, I run penalized regressions (lasso) of the BGG average and usersrated on dummies for individual game designers, along with the game’s weight and effects for year published.
I handle time effects in two steps: first creating an indicator for games published before 1900, then fitting cubic splines over the truncated publishing range of 1900 to present.
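One way to build such year features is sketched below in Python. The truncated power basis, the knot locations, and the `year_features` helper are all assumptions for illustration; the notebook does not say which spline basis or knots it uses.

```python
def year_features(year, floor=1900, knots=(1950, 1980, 2000, 2010)):
    """Return [pre-floor indicator, cubic spline basis terms] for one game.

    Assumption: a truncated power basis with hypothetical knots.
    """
    pre_floor = 1 if year < floor else 0   # indicator for games before 1900
    truncated = max(year, floor)           # clamp old games to the floor year
    t = (truncated - floor) / 100.0        # rescale to keep cubes small
    basis = [t, t ** 2, t ** 3]            # global cubic polynomial terms
    for knot in knots:
        k = (knot - floor) / 100.0
        basis.append(max(t - k, 0.0) ** 3) # cubic term that switches on past the knot
    return [pre_floor] + basis

old_game = year_features(1850)
modern_game = year_features(2000)
```

With this encoding, every game published before 1900 shares the same spline values and differs from a game published in 1900 only through the indicator.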
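\[ \text{Average} = \beta_0 + \beta_1 \, \text{Weight} + \beta_2 \, \text{Designer}_1 + \beta_3 \, \text{Designer}_2 + \dots + \beta_{n+1} \, \text{Designer}_n + f(\text{YearPublished}) \]
Here \(f(\text{YearPublished})\) stands for the year effects: the pre-1900 indicator and the cubic spline terms.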
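To make the estimation idea concrete, here is a minimal pure-Python sketch of a lasso fit by coordinate descent on a weight column plus designer dummies. It is not the notebook's code: the function names, the penalty value, and the toy data are hypothetical, year effects are left out for brevity, features are not standardized, and every coefficient except the intercept is penalized (an assumption). A real analysis would use a dedicated library.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - b0 - X b||^2 + lam * sum(|b_j|)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                pred_without_j = intercept + sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
        # The intercept is unpenalized: set it to the mean residual.
        intercept = sum(
            y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)
        ) / n
    return intercept, beta

# Hypothetical toy data: (weight, designer, average rating).
games = [
    (2.0, "Designer X", 7.5),
    (3.5, "Designer X", 8.0),
    (1.5, "Designer Y", 6.0),
    (2.5, "Designer Y", 6.8),
    (3.0, "Designer Z", 7.2),
]
designers = sorted({d for _, d, _ in games})
# Design matrix: weight first, then one dummy column per designer.
X = [[w] + [1.0 if d == name else 0.0 for name in designers] for w, d, _ in games]
y = [avg for _, _, avg in games]

intercept, beta = lasso_coordinate_descent(X, y, lam=0.05)
designer_effects = dict(zip(designers, beta[1:]))
```

Because the penalty pushes small coefficients to exactly zero, designers whose games are explained by weight alone drop out, which is what makes the remaining coefficients readable as partial effects.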