The Great Vote Count Analysis (Pre-Discussion)
- Psyche
-
Psyche he/theySurvivor
- Psyche
he/they- Survivor
- Survivor
- Posts: 10653
- Joined: April 28, 2011
- Pronoun: he/they
Let's say hypothetically that I have my hands on a dataset tracking every vote anyone's ever made across hundreds or potentially even thousands of completed non-theme games. I've got each vote's post number, target, and maker, whatever. I even know which Day they happened. And for each game in the data set, I also hypothetically have all the stuff you usually see in a mod's OP - stuff like who won, each slot's role/alliance/fate, replacements, etcetera. I also of course have every one of each game's posts.
What are some useful or interesting questions that I can ask of this data? I know that people examine vote counts a lot to figure out a game - what are some ways we can ground these sorts of analyses in evidence?
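For concreteness, here's one way the hypothetical dataset described above could be laid out in Python. This is only a sketch: the class and field names (`Vote`, `Slot`, `Game`, `votes_on_day`) are made up for illustration and aren't taken from any actual codebase.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Vote:
    post_number: int        # post in which the vote was made
    day: int                # game Day on which it happened
    voter: str              # who made the vote
    target: Optional[str]   # who was voted; None stands for an unvote (assumption)


@dataclass
class Slot:
    players: list[str]      # original player first, then any replacements
    role: str
    alignment: str          # e.g. "town" or "scum"
    fate: str               # e.g. "lynched D2", "survived"


@dataclass
class Game:
    title: str
    winner: str
    slots: list[Slot] = field(default_factory=list)
    votes: list[Vote] = field(default_factory=list)

    def votes_on_day(self, day: int) -> list[Vote]:
        """Return every vote recorded on the given Day, in stored order."""
        return [v for v in self.votes if v.day == day]
```

Most of the questions people might ask then come down to filtering and grouping `Vote` records by the alignment of the voter's and target's slots.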
An organized list of exploratory questions posed as of the linked post
Last edited by Psyche on Sat Feb 08, 2020 4:08 pm, edited 4 times in total.
Relative to a base rate of like .29 for a 2/7 setup, for example, .5 would be great from just looking at votes.
But I don't really think collecting a bunch of statistics comparing town and scum rates of some behavior in a given situation is going to help anyone find scum in a real game. Even if one of these comparisons found some substantial factional difference on average, I dunno how that information would be applied in particular situations.
Think I will do these exploratory analyses, but am hoping I can try building and testing models of voting behavior that apply across situations. Maybe I'll start with a naive model that assumes every player randomly votes for whomever they don't know is in their faction, see where the data diverges from that, and go from there.
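A minimal sketch of what that naive model might look like, assuming town players know only their own alignment and scum know their whole team. The function names and the example setup are hypothetical.

```python
def naive_vote_probabilities(voter, alignments):
    """Uniform vote distribution over everyone the voter doesn't know is on their side.

    alignments maps player name -> "town" or "scum".
    Assumption: town know only themselves; scum know every scum player.
    """
    if alignments[voter] == "scum":
        candidates = [p for p, a in alignments.items() if a != "scum"]
    else:
        candidates = [p for p in alignments if p != voter]
    return {p: 1 / len(candidates) for p in candidates}


def chance_vote_hits_scum(voter, alignments):
    """Probability that a single naive vote from this voter lands on scum."""
    probs = naive_vote_probabilities(voter, alignments)
    return sum(pr for p, pr in probs.items() if alignments[p] == "scum")


# Hypothetical setup: 2 scum among 7 players.
setup = {"s1": "scum", "s2": "scum", "t1": "town", "t2": "town",
         "t3": "town", "t4": "town", "t5": "town"}
```

Under these assumptions a town vote in the example setup lands on scum with probability 2/6 and a scum vote never does, which gives a baseline to compare real voting data against.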
Is there a possibility of some EV model at the resolution of individual votes? For each post, we'd compute the current EV of voting each player given the gamestate and the posting player's knowledge and win condition. It seems to me that if you don't include some model of town scumhunting and scum trying to defeat their scumhunting (which either vastly complicates things or comes out to a wash or both), then it'd just end up w/ town voting mostly randomly and scum voting for the biggest existing wagon on town.
It'd be really convenient for town if this had any ground to it, as it asserts observable differences between factional voting patterns even before roles get flipped - scum almost always jumping to the largest wagon around while town don't go out of their way to sheep at all. If we find that this model doesn't track scum voting patterns well at all, then it becomes way more important to consider their strategies for appearing town, or alternatively to suppose that maybe they have some other voting strategy unconcerned w/ deceiving town. The Charisma model discussed below, for example, asserts that scum will prefer to support the most "charismatic" townie's mislynch instead of whichever mislynch seems closest.
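The scum half of that prediction is easy to write down. A rough sketch, assuming we have a snapshot of who is currently voting whom; `largest_town_wagon` and the alphabetical tie-break are my own illustrative choices, not anything derived from the data.

```python
from collections import Counter


def largest_town_wagon(current_votes, scum):
    """Return the town player with the most votes right now, or None if no town is voted.

    current_votes maps voter -> current target (None if not voting).
    scum is the set of scum player names.
    Ties are broken alphabetically, purely for determinism (assumption).
    """
    counts = Counter(
        target for target in current_votes.values()
        if target is not None and target not in scum
    )
    if not counts:
        return None
    top = max(counts.values())
    return min(name for name, n in counts.items() if n == top)


# Hypothetical snapshot: the biggest wagon overall is on scum, so it is skipped.
snapshot = {"a": "x", "b": "x", "c": "s1", "d": "s1", "e": "s1", "f": "y"}
predicted = largest_town_wagon(snapshot, scum={"s1", "s2"})
```

Comparing `predicted` against where scum actually put their next vote, post by post, would show how well the "scum jump on the largest town wagon" story holds up.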
As another approach, consider RC/MathDino's Charisma Model of Mafia.
It makes predictions about how people vote over the long run, so our hypothetical dataset can conceivably evaluate this model's gains over more simplistic EV calculations (ones that assume random voting/lynching) when it comes to predicting how games really turn out. We have to come up with some way to quantify/infer ability (or perceived ability) to avoid being lynched - perhaps we look at each player's career-long success, perhaps we try to discover what kind of posting is associated w/ success at lynch avoidance, or something else. Each metric is its own model, and we can evaluate each one.
My assumptions:
Gamestates result from predetermined charisma levels, where charisma is defined as "ability to avoid being lynched".
Town will always attempt to lynch the least charismatic (scummiest) player. Town will never no-lynch.
Scum will always attempt to kill the most charismatic (towniest) town player.
PRs will claim at L-1. If uncc'd, they become most charismatic.
Scum will fakeclaim confirmable roles at L-1 in order to out TPRs, and will be lynched if counterclaimed.
Scum are aware of the charisma list, and will counterclaim PRs if they are more charismatic and if doing so would not lose them the game.
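To get a feel for how deterministic these assumptions make a game, here's a stripped-down simulation. It's a sketch, not the Charisma Model itself: it keeps only the lynch and night kill rules, ignores every claim/counterclaim assumption, and assumes scum win once they equal or outnumber town. `Player`, `check_winner` and `simulate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    name: str
    is_scum: bool
    charisma: float  # higher = better at avoiding the lynch


def check_winner(alive):
    """Return "town", "scum", or None if the game continues (parity rule assumed)."""
    scum = sum(p.is_scum for p in alive)
    town = len(alive) - scum
    if scum == 0:
        return "town"
    if scum >= town:
        return "scum"
    return None


def simulate(players):
    """Alternate Day lynches and Night kills until one faction has won."""
    alive = list(players)
    while True:
        # Day: town lynches the least charismatic living player.
        alive.remove(min(alive, key=lambda p: p.charisma))
        winner = check_winner(alive)
        if winner:
            return winner
        # Night: scum kill the most charismatic living town player.
        town = [p for p in alive if not p.is_scum]
        alive.remove(max(town, key=lambda p: p.charisma))
        winner = check_winner(alive)
        if winner:
            return winner
```

With charisma fixed in advance the outcome is fully determined by the ranking, so evaluating the model against real games mostly means evaluating whichever metric we use to estimate charisma.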
I'm sure both these models are too simple to not be pretty shit, but they'll be a nice start maybe.
ok now it's out of my system for a while i hope pls lord
Found a way to port markdown over to bbcode, so a lot easier to produce simply formatted posts. Tried to organize and reflect on the proposed exploratory analyses so far.
Categories of Proposed Exploratory Analyses
Here, exploratory means there's not a particular model in the foreground driving the question. Instead, we're just producing statistics over the data set.
Doable With Initial Dataset
- % of scum self-votes to town self-votes
- % of scum first vote busses vs scum first vote town votes
- a version of the above that excludes self-hammers
- % of scum L-1s that are not hammered
- % of scum hammers on town
- % of scum hammers on scum
- % of town hammers on town
- % of town hammers on scum
- how often scum bus (2x)
- how often there's more than one scum on a wagon (2x)
- how often scum vote right next to each other (2x)
- average frequency of town posts
- average frequency of scum posts
- average frequency of town votes
- average frequency of scum votes
- how often do scum vote someone, vote somewhere else, then return to the original vote, and how often town do it in contrast
- what % of chronological time during a day phase someone was voting no one?
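Most of the items above are simple tallies once every vote is labeled with alignments. As an example, the four hammer percentages could be computed like this; the input format and the name `hammer_breakdown` are assumptions for illustration, and the sample rows are made up rather than real results.

```python
from collections import Counter


def hammer_breakdown(hammers):
    """Share of each faction's hammers that landed on each faction.

    hammers is a list of (hammerer_alignment, target_alignment) pairs,
    one pair per hammer vote in the dataset.
    """
    pair_counts = Counter(hammers)
    totals = Counter(hammerer for hammerer, _ in hammers)
    return {
        pair: count / totals[pair[0]]
        for pair, count in pair_counts.items()
    }


# Made-up sample: three scum hammers and two town hammers.
sample = [("scum", "town"), ("scum", "town"), ("scum", "scum"),
          ("town", "town"), ("town", "scum")]
rates = hammer_breakdown(sample)
```

Here `rates[("scum", "town")]` is the fraction of scum hammers that hit town, and so on for the other three combinations.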
Potentially Sketchy Without Full Dataset
For each player X across the set of games, find out...
- What % of lynches with their vote on them as town flip scum
- What % of lynches with their vote on them as town flip town
- What % of lynches with their vote on them as scum flip scum
- What % of lynches with their vote on them as scum flip town
- "have i ever voted correctly"
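A sketch of how the per-player numbers above could be tallied, assuming each lynch is recorded as the flipped alignment plus the list of players on the final wagon with their own alignments. `wagon_records` and `share` are hypothetical names.

```python
from collections import defaultdict


def wagon_records(lynches):
    """Count, per player, how often they were on a lynch wagon by (own alignment, flip).

    lynches is an iterable of (flip, wagon) where flip is "town" or "scum"
    and wagon is a list of (player, player_alignment) pairs.
    """
    tallies = defaultdict(lambda: defaultdict(int))
    for flip, wagon in lynches:
        for player, alignment in wagon:
            tallies[player][(alignment, flip)] += 1
    return tallies


def share(tallies, player, alignment, flip):
    """Fraction of the player's lynch votes as `alignment` that flipped `flip`."""
    total = sum(n for (a, _), n in tallies[player].items() if a == alignment)
    if total == 0:
        return None  # player never finished on a lynch wagon with this alignment
    return tallies[player][(alignment, flip)] / total
```

Players with only a handful of completed games will give very noisy shares, which is part of why this category is sketchy without the full dataset.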
Reflections
I'd had a category for analyses that might be technically challenging, but no one really proposed any like that just yet that wasn't just modeling. We may want to rework some of these questions to appreciate complexities in the data - for example, for some questions we might want to control for the influence of PRs or other sources of information besides player intuitions that might drive voting patterns.
The analyses above should be a great foundation for work on two kinds of substantive questions.
- Most questions will help explore whether VCA in the systematic sense of the word is possible - whether, without even considering context much, interpretable rules can be applied to public voting patterns to help discriminate town from scum. Finding any alignment-discriminative results in these would be pretty surprising and a big deal, but even then just a first step toward the holy grail.
- The output of RC's(/Schadd's) line of requests is useful for exploring whether mafia is more a game of skill or of chance. We discussed that prospect a bit in site chat; this NYTimes article by Daniel Kahneman is super relevant.
Last edited by Psyche on Tue Feb 25, 2020 11:31 am, edited 1 time in total.
In post 35, DrDolittle wrote: the issue with studying these questions is that the system is fragile. I.e. if you find that with p = 0.01 scum votes right next to each other, the very nature of discovering this result will change how people play, and make the result irrelevant (or will quickly equilibrate to having no predictive power). That's also why (my hypothesis) Elli's computer codes are kept hidden and he rarely if ever discusses findings. Either that or he throws everything into a huge Neural Network and doesn't ask why something happens but just gives an outcome as probability a player is mafia.
otoh i dont really mind if discovering something changes how people play per se and in fact will be interested in studying if gameplay substantially changes when new stats are posted to this subforum.
and on the other im interested in bigger game than particular statistics
professor just delayed an assignment deadline to after spring break
get fked n
In post 40, Ankamius wrote: I'm most interested in how scum lynch timings affect winrate statistically but that seems outside the scope of this(?) As in how often town wins with the first scum lynch at full plist, 80%, etc.
if it can be done w the dataset, it's game
In post 35, DrDolittle wrote: the issue with studying these questions is that the system is fragile. I.e. if you find that with p = 0.01 scum votes right next to each other, the very nature of discovering this result will change how people play, and make the result irrelevant (or will quickly equilibrate to having no predictive power). That's also why (my hypothesis) Elli's computer codes are kept hidden and he rarely if ever discusses findings. Either that or he throws everything into a huge Neural Network and doesn't ask why something happens but just gives an outcome as probability a player is mafia.
https://en.wikipedia.org/wiki/Goodhart%27s_law
let's test it; let's test goodhart's law
To win my avatar bet, the analyses I'm gonna do before month's end are:
% of scum self-votes to town self-votes
% of scum hammers on town
% of scum hammers on scum
% of town hammers on town
% of town hammers on scum
average frequency of town posts
average frequency of scum posts
average frequency of town votes
average frequency of scum votes
how often scum bus
i pick these because they're especially easy to implement given my codebase. more complete/interesting analyses will come later.
the coronavirus stuff put a lot of new work on my plate and i resolved to get ahead on that again before returning to this
but there was a broader problem too and it was that i was starting to do a lot of manual coding/review for a project focused on automating an otherwise overwhelmingly tedious/time-consuming workflow. i think i'll dig in again when i find a way over that hump.
So I've run through the code and reviewed my notes.
My codebase is clearly enough written that I can remember what each part of it was supposed to do, but it doesn't look like I was careful when documenting new problems/gaps in the actual implementation. They're listed in my issue tracker, but descriptions are pretty terse: a given issue might get a short summary sentence and a couple of example outputs proving something's wrong, and that's it. These notes are better than nothing, but the result is that I don't really know where in the 13 currently listed issues I should start, or remember what my approach would've been for addressing these issues. I'll probably have to retrace my steps through the project all over again before I can make changes confidently again. I'll come up with some better practices to avoid this work in the future.
I'll work on this again on Tuesday and then again on Friday or Saturday. Will maybe avoid updates until I, like, have actual new stuff to report.
Okay, here's the plan for getting this project back into gear.
The idea is to follow yessiree's advice: scale down and focus my ambitions.
I've spent a lot of time trying to get my automatic votecounter to perform perfectly on every game in my development dataset. I'll accept that its performance now is probably close to the ceiling possible for my particular approach to the problem, and stop substantial efforts to improve the automatic votecounter.
Even though the votecounter doesn't perform perfectly, I do have a solid way to tell if the votecounter has done a good job: I can check if extracted votes accurately predict 1) who if anyone has been lynched in a given game Day and 2) the post number at which a given game Day has ended. Predicting both of these accurately doesn't guarantee that the votecounter has done its job perfectly, but it comes pretty close (I'll try to back this claim up with numbers at some point). So I can validate voting data collected for a particular game, even if I can't develop a votecounter that codes every game perfectly.
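The check described above could look roughly like this. It's a sketch under simplifying assumptions, not the actual votecounter: it assumes plain majority lynches, no deadline lynches, a Day that ends in the hammer post, and unvotes stored as `None`. The names `predict_day_end` and `day_is_valid` are made up.

```python
def predict_day_end(votes, n_alive):
    """Replay extracted votes and return (lynched player, hammer post number).

    votes is a list of (post_number, voter, target) tuples for one Day;
    target None means an unvote. Returns (None, None) if nobody reaches majority.
    """
    threshold = n_alive // 2 + 1  # simple majority (assumption)
    current = {}                  # voter -> current target
    for post, voter, target in sorted(votes, key=lambda v: v[0]):
        current[voter] = target
        if target is not None:
            count = sum(1 for t in current.values() if t == target)
            if count >= threshold:
                return target, post
    return None, None


def day_is_valid(votes, n_alive, recorded_lynch, recorded_end_post):
    """True if the replayed votes reproduce the outcome recorded for this Day."""
    return predict_day_end(votes, n_alive) == (recorded_lynch, recorded_end_post)


# Made-up Day with 7 players alive, so 4 votes are needed to lynch.
example = [(10, "a", "x"), (12, "b", "x"), (15, "c", "y"),
           (20, "c", "x"), (25, "d", "x")]
```

Games where every Day passes a check like this can go into the validated dataset; games that fail can be set aside until the votecounter handles them.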
We never needed a perfect or near-perfect votecounter for this project. We just needed quality voting data collected over a large sample of games. So far I've been only using ~300 relatively old games to develop my votecounter. I'll collect/preprocess and run my votecounter over data associated with most or all completed games on MS instead. The votecounter will generate valid data for most of those. And if the pool of processed games is big enough, we'll have an adequately sized, validated dataset to support further analyses. And we'll always have the option to add data for further games if new games finish, or if we work out the issues with the votecounter's performance on the games where it has trouble.
I've always thought that the votecounting bit has been the hard part of this project. But this way we can make it a lot easier and get to the fun part a lot faster. And all I have to do is get over myself.
Copyright © MafiaScum. All rights reserved.