How is this calculated?

Most of these charts use two stats tricks: Bayesian shrinkage so rare matchups don't dominate the leaderboards, and Wilson confidence intervals so we can show honest error bars on small samples. Cells with fewer than 5 games are hidden; cells with 5–9 games are shown with low color saturation as a hint that the prior is doing most of the work.

Ancient One difficulty

We compute the loss rate (defeats divided by total games) for each Ancient One that has at least 5 games logged. The error bars are 95% Wilson score intervals, which stay well-behaved on small samples and never run below 0% or above 100%, unlike the naïve normal (Wald) approximation. The list is sorted hardest (highest loss rate) first.
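
As a rough illustration of the error bars described above, here is a minimal sketch of a 95% Wilson score interval. The function name and the example numbers are ours, picked for illustration; the actual pipeline may structure this differently.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    successes: number of defeats (we are estimating a loss rate)
    n:         total games logged
    z:         normal quantile; 1.959964 corresponds to a 95% interval
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    # The interval is centered on a value pulled slightly toward 0.5.
    center = (p_hat + z2 / (2.0 * n)) / denom
    half_width = (z / denom) * math.sqrt(p_hat * (1.0 - p_hat) / n + z2 / (4.0 * n * n))
    # Clamp only to guard against floating-point spill at the edges.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical Ancient One: 4 defeats in 5 logged games.
low, high = wilson_interval(4, 5)
print(f"loss rate 80%, 95% interval roughly {low:.2f} to {high:.2f}")
```

With only 5 games the interval is very wide, which is the point: the bar honestly reflects how little we know.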

Investigator × Ancient One

For each (investigator, Ancient One) pair, the raw win rate is wins / games. The shrunk win rate uses a Beta(α, β) prior centered on the global win rate with a prior strength of 10 (that is, α + β = 10), so each cell is pulled toward the global mean as if 10 extra games at the global average had been added. Cells with fewer than 5 games are omitted entirely.
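
The "10 extra games" description above can be sketched in a few lines. This assumes the shrunk win rate is the posterior mean of the Beta prior after seeing the cell's games; the function and parameter names are illustrative, not taken from the pipeline.

```python
def shrunk_win_rate(wins: int, games: int, global_rate: float, strength: float = 10.0) -> float:
    """Posterior-mean win rate under a Beta prior centered on global_rate.

    strength is alpha + beta, i.e. how many pseudo-games the prior is worth.
    """
    alpha = strength * global_rate          # pseudo-wins
    beta = strength * (1.0 - global_rate)   # pseudo-losses
    return (wins + alpha) / (games + alpha + beta)


# Hypothetical cell: 5 wins in 5 games, with a made-up global win rate of 40%.
raw = 5 / 5
shrunk = shrunk_win_rate(wins=5, games=5, global_rate=0.40)
print(f"raw {raw:.0%} -> shrunk {shrunk:.0%}")
```

A perfect 5-for-5 record is treated as 9 wins in 15 games, so it no longer tops the chart on the strength of five lucky sessions.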

Shrinkage, visually

The leaderboards and tier list use shrunk win rates, computed against a community-mean prior with strength 10. The plot below shows what that actually does to each investigator's number. Investigators with thousands of games barely move; those sitting right at the 30-game minimum get pulled noticeably toward the middle.
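
To put numbers on "barely move" versus "pulled toward the middle": with a prior strength of 10, the shrunk rate is a weighted average of the raw rate and the community mean, where the raw rate gets weight games / (games + 10). The sketch below uses made-up rates (60% raw, 40% community mean) purely for illustration.

```python
def data_weight(games: int, strength: float = 10.0) -> float:
    """Share of the shrunk rate that comes from the investigator's own games."""
    return games / (games + strength)


def shrink(raw_rate: float, games: int, community_mean: float, strength: float = 10.0) -> float:
    """Shrunk rate written as a weighted average of raw rate and community mean."""
    w = data_weight(games, strength)
    return w * raw_rate + (1.0 - w) * community_mean


# Hypothetical investigators with the same raw win rate but different sample sizes.
for games in (30, 300, 3000):
    moved = 0.60 - shrink(0.60, games, community_mean=0.40)
    print(f"{games:>5} games: weight on own data {data_weight(games):.3f}, "
          f"moves {moved * 100:.1f} points toward the mean")
```

At the 30-game floor a quarter of the gap to the community mean is closed; at 3,000 games it is about a third of one percent.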

Shrinkage in action

Each dot is an investigator. A dot on the diagonal would mean no shrinkage at all; the horizontal line marks the community mean that every dot is pulled toward. Small-sample investigators (smaller dots) are pulled hardest.

N = 96,488

Calibration check

A good shrunk estimate should match observed reality in aggregate: if we bucket cells by their predicted (shrunk) win rate, the observed (raw) average inside each bucket should fall close to the diagonal. Systematic deviation would indicate the prior is biased.
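
A minimal sketch of the bucketing described above. It assumes ten equal-width buckets and a games-weighted average inside each bucket; both choices, and all names, are ours for illustration and may not match what the charts actually do.

```python
from collections import defaultdict


def calibration_table(cells, n_buckets: int = 10):
    """Bucket cells by shrunk win rate and compare with the observed rate.

    cells: iterable of (shrunk_rate, wins, games) tuples.
    Returns rows of (bucket_index, mean_predicted, observed_rate, games),
    where both averages are weighted by games.
    """
    totals = defaultdict(lambda: [0.0, 0, 0])  # [sum(predicted * games), wins, games]
    for shrunk, wins, games in cells:
        # A shrunk rate of exactly 1.0 goes into the top bucket.
        bucket = min(int(shrunk * n_buckets), n_buckets - 1)
        entry = totals[bucket]
        entry[0] += shrunk * games
        entry[1] += wins
        entry[2] += games

    rows = []
    for bucket in sorted(totals):
        predicted_sum, wins, games = totals[bucket]
        if games == 0:
            continue
        rows.append((bucket, predicted_sum / games, wins / games, games))
    return rows


# Made-up cells, not real data: (shrunk_rate, wins, games).
example_cells = [(0.42, 4, 10), (0.48, 6, 10), (0.71, 15, 20)]
for bucket, predicted, observed, games in calibration_table(example_cells):
    print(f"bucket {bucket}: predicted {predicted:.2f}, observed {observed:.2f}, games {games}")
```

Plotting predicted against observed for each row gives the calibration points; rows that sit consistently above or below the diagonal are the systematic deviation to watch for.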

Calibration of shrunk win rates

Within each bucket of predicted (shrunk) win rate, we plot the actual observed win rate. Points on the diagonal = well-calibrated. Systematic deviation = the model over- or under-predicts.

N = 96,488

Doom-track distribution

For each Ancient One we build a histogram of the final doom-track value across all games where it was reported. The ridgeline view normalizes each AO's histogram so the densities are comparable across rows. Bimodal shapes (mass near 0 and near 15) indicate a "swingy" AO where games end decisively in either direction; smooth unimodal shapes indicate predictable pacing.
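
One plausible way to do the per-AO normalization is to divide each histogram by its own total so every row sums to 1. The sketch below assumes that convention and a doom track running from 0 to 15 (suggested by the description above); the function name and sample values are hypothetical.

```python
from collections import Counter


def normalized_histogram(final_doom_values, max_doom: int = 15):
    """Histogram of final doom values, scaled so the bars sum to 1.

    Unreported values (None) and values outside 0..max_doom are skipped.
    Returns a list of length max_doom + 1, indexed by doom value.
    """
    counts = Counter(
        v for v in final_doom_values
        if v is not None and 0 <= v <= max_doom
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (max_doom + 1)
    return [counts[d] / total for d in range(max_doom + 1)]


# Made-up games for one Ancient One; None means doom was not reported.
sample = [0, 0, 1, 14, 15, 15, None]
hist = normalized_histogram(sample)
print([round(h, 2) for h in hist])
```

Because each row is scaled by its own game count, an AO with 200 games and one with 20,000 games can be compared by shape rather than by volume.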

Source

The underlying spreadsheet is maintained by the Eldritch Horror community. We fetch the raw submissions tab once a day, normalize it, and rebuild the charts. See the repository for the pipeline source.