Computational Mafia
skitter30 (she/her)
- Last Laugh
- Posts: 36617
- Joined: March 26, 2017
- Pronoun: she/her
- Location: Est
Ya there's a bunch of things i'd love to look at if i had a workable dataset of votes, lynches, and flips; i have some ideas regarding scum voting patterns that i'd like to test
But i don't entirely know how to ~put together~ a dataset like that
You seem to have a p solid game plan.
When you say 'game archives', what is that referring to?
Like the op of each game that (ideally?) contains that info?
Or is that all compiled in one giant spreadsheet somewhere?
Hiatus once more.
'skitter is fucking terrifying' ~ town-bork about scum-me
'Skitter [was] terrifying to play against ngl' ~ scum-bork about town-me
'Going into lylo against scum!skit unprepared is like having someone force feed you dull razor blades. It's painful, and once it starts, you're pretty much dead' ~ NMSA
'Skitter you're a spirit animal's spirit animal' ~ slaxx-
Psyche (he/they)
- Survivor
- Posts: 10905
- Joined: April 28, 2011
- Pronoun: he/they
oh and here's an example archive: viewtopic.php?f=53&t=29549
had to do quite a bit of cleaning to make this something that doesn't create trouble but it's better than collecting it all myself
You can't step in the same river twice.
popsofctown (She)
- Survivor
- Posts: 12356
- Joined: September 23, 2008
- Pronoun: She
Wouldn't it be cool if there was a thingy that would produce links to a player's last five completed games?
Seems possible if like a script perused post history and looked for user's name in vcs to rule out post game commentary, etc
"Let us say that you are right and there are two worlds. How much, then, is this 'other world' worth to you? What do you have there that you do not have here? Money? Power? Something worth causing the prince so much pain for?'"
"Well, I..."
"What? Nothing? You would make the prince suffer over... nothing?"
Psyche (he/they)
this is the absolute worst
yessiree (he)
- Mafia Scum
- Posts: 4480
- Joined: June 6, 2013
- Pronoun: he
it wouldn't be difficult at all to put together something like that, scrape the posts from player X of their last Y games
In post 78, popsofctown wrote:
Wouldn't it be cool if there was a thingy that would produce links to a player's last five completed games?
Seems possible if like a script perused post history and looked for user's name in vcs to rule out post game commentary, etc
even better tho it can be combined with Bob's deception classifier to analyze the tendency to which a player engages in deceptive speech as either alignment, and basically spits out a likelihood that X is scum when given Z number of new posts
chamber
- Cases are scummy
- Posts: 10703
- Joined: November 20, 2005
search.php?keywords=&terms=all&author=p ... mit=Search
In post 78, popsofctown wrote:
Wouldn't it be cool if there was a thingy that would produce links to a player's last five completed games?
Seems possible if like a script perused post history and looked for user's name in vcs to rule out post game commentary, etc
Doesn't take much effort to already do this (approximately) with the search features that exist.
Taking a break from the site.
Psyche (he/they)
it's been almost a week have an update
i partly cleaned up my two main data sets - one the transition spreadsheet identifying where phase transitions happened throughout each game, and the other basically a computer readable and cleaned-up subsection of the mini normal archives already maintained on this site identifying every relevant player in the game (plus mods), their roles, and their fates
the cleaned up stuff covers only about 300 mini normals, and the cleanup mostly only handled missing information - not inaccurate information
furthermore, i still need to get my votecounter working again
As mentioned before, the strategy for identifying errors is to try to combine the three data sources to complete a task that's hard to complete unless they're all accurate:
With the votecounter and the archive indicating which players are still alive on a given Day, i'll try to infer (for every day phase in my data set) where a hammer has happened.
If the next mod post after the hammer isn't the transition point my transition spreadsheet indicates for that Day, then there's a good chance of an issue in my data or code. Similarly, if my player archive and my votecounter disagree on who the hammered player is, that's also a red flag.
If I can successfully infer the transition post and hammered player from voting data and player information for every phase of every game in my data set, I'll proceed to also try to infer which team won the game and compare that against what my archive says (though nothing in my codebase can infer who got killed in a night phase, or handle vote-manipulating PRs, so the code will get some help from the fate-identifying part of the archive).
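For anyone following along, the hammer-inference step described above might look roughly like this. It's a minimal sketch, not the actual votecounter: the `Vote` record, the `find_hammer` name, and the strict-majority threshold are all assumptions for illustration, and it ignores vote-manipulating PRs entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vote:
    post: int               # post number where the vote was cast
    voter: str
    target: Optional[str]   # None means an unvote


def find_hammer(votes, alive):
    """Return (post, player) for the first vote that puts a living player
    at a strict majority of living players, or None if that never happens."""
    threshold = len(alive) // 2 + 1   # assumed hammer rule: strict majority
    current = {}                      # voter -> current target
    for v in sorted(votes, key=lambda v: v.post):
        if v.voter not in alive:
            continue                  # ignore dead players and non-players
        if v.target is None:
            current.pop(v.voter, None)
            continue
        if v.target not in alive:
            continue                  # ignore votes on non-living targets
        current[v.voter] = v.target
        tally = sum(1 for t in current.values() if t == v.target)
        if tally >= threshold:
            return v.post, v.target
    return None


def lynch_matches_archive(votes, alive, archived_lynch):
    """Red-flag check: does the inferred hammer target match the archive?"""
    hammer = find_hammer(votes, alive)
    return hammer is not None and hammer[1] == archived_lynch
```

A phase where `lynch_matches_archive` comes back False is a candidate for a data or code problem, which is all the check is meant to surface.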
There could still be issues in the data set or code after achieving all this, but I imagine that if I expand the data set far enough beyond the initial 300 games while trying to use the data set to do more ambitious things, I'll catch them eventually. I'll hire the filipino lady again and help her provide for her children.
And then a side effect of all this testing will be a cleaned data set tracking every vote that happened across all 300 games. That's when we can start the Great Vote Count Analysis. And then we'll find absolutely no reliable pattern in any of these votes even after applying fancy NLP tools to take "context" into account and I'll finally leave mafiascum.net forever. Oh but at least we'll also have a functioning votecounter that works across even old games w/ loose mods. And no one will use it. Well at least I'll be able to move on.
Psyche (he/they)
fixed the votecounter - it can determine the person lynched on D1 across all 300 mini normals in my sample
it's kinda slow though?
i guess im ok with slow as long as it always works
still need to...
- extend test to determine transition posts from post# of hammer
- extend test to Days beyond D1
- extend test to more games
- convert test results into cleaned data set
- plan out and start analyses
maybe i'll try to solicit ideas a little further down the line
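The "transition posts from post# of hammer" item could be sketched like this, assuming we already have the hammer post number for each Day, a list of the mod's post numbers, and the transition post recorded in the spreadsheet. All the names here (`predicted_transition`, `transition_mismatches`, the dict layouts) are made up for the example.

```python
from bisect import bisect_right


def predicted_transition(hammer_post, mod_posts):
    """First moderator post strictly after the hammer, or None if there is none.
    mod_posts must be sorted in ascending order."""
    i = bisect_right(mod_posts, hammer_post)
    return mod_posts[i] if i < len(mod_posts) else None


def transition_mismatches(hammers, mod_posts, recorded):
    """hammers and recorded both map day number -> post number.
    Return {day: (predicted, recorded)} for every day where they disagree."""
    mod_posts = sorted(mod_posts)
    mismatches = {}
    for day, hammer_post in hammers.items():
        predicted = predicted_transition(hammer_post, mod_posts)
        if predicted != recorded.get(day):
            mismatches[day] = (predicted, recorded.get(day))
    return mismatches
```

An empty result means the votecounter and the transition spreadsheet agree for that game; anything else is a phase worth inspecting by hand.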
Psyche (he/they)
i did a little work
solved the speed bottleneck
seems the votecounter is no longer perfect now that im using the new spellchecker
i might just switch back to 32bit and return to pyenchant again
EDIT: no the new spellchecker is just as accurate but much faster. my problems are even scarier.
Psyche (he/they)
Down the Line: Read Extraction
So the Great VCA is of course on the horizon, but we all know that people's votes are only a small portion of the information people produce in a typical game, and of the basis for most people's reads. Understanding Mafia requires engaging with the information in people's *posts*, but it's hard (just engaging fully w/ the posts in one game is a big effort!). People have historically managed this challenge by either focusing their analyses on specific cues/situations where manual coding/interpretation of each case is feasible, or by emphasizing global textual features like comparative post or wordcount while avoiding deep consideration of the content in players' posts. Projects that train machine learning classifiers over large corpuses of gameplay text to discriminate alignment seem to be the most sophisticated imaginable examples of this latter category of work.
To take content seriously without constraining research scale, I want to try leveraging current state-of-the-art tools for extracting structured, machine-readable representations of the information in text. NLP folk call this Open Information Extraction (OpenIE); tools scan over arbitrary sentences (a simple example might be "Obama was born in Hawaii") and, using grammar rules, extract simple subject-relation-object propositions (like ['Obama', 'was born in', 'Hawaii']) much more amenable to automated analysis.
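To make the target representation concrete, here's a toy pattern-based triple extractor. To be clear, this is not how real OpenIE systems work (they rely on parsers and learned models rather than a hand-written relation list); the `RELATIONS` list and `extract_triple` are hypothetical and only show the shape of the output.

```python
import re

# Hand-picked relation phrases, purely for illustration.
RELATIONS = ["was born in", "is voting for", "is defending", "voted for"]

PATTERN = re.compile(
    r"^(?P<subj>.+?)\s+(?P<rel>"
    + "|".join(re.escape(r) for r in RELATIONS)
    + r")\s+(?P<obj>.+?)[.!?]*$"
)


def extract_triple(sentence):
    """Return [subject, relation, object] if the sentence contains one of the
    known relation phrases, otherwise None."""
    m = PATTERN.match(sentence.strip())
    return [m["subj"], m["rel"], m["obj"]] if m else None
```

Calling `extract_triple("Obama was born in Hawaii")` gives `['Obama', 'was born in', 'Hawaii']`, the same triple as in the example above; sentences without a listed relation just return None.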
A natural extension of my VCA project that could leverage this kind of tool might be Read Extraction. Votes are themselves a good window into each player's beliefs and status at a given moment in a given Day, and in some contexts they might even be the best window. But they're a limited window - at their best they only represent a player's biggest scumread but more often they occur as part of a negotiation between perceived "viable" Day outcomes. Pairing our vote dataset with a reads dataset could enable analysis of how people [pretend to] form and act on their beliefs throughout mafia games that's far more extensive and robust than what might be achieved from studying votes alone.
Skeleton
How do we do that, though? Here's a broad skeleton for a potential Read Extraction tool:
- Player Identification. We need to reliably discern when someone's talking about another player. My VoteCounter already does this to infer the targets of people's votes when the target's exact username isn't mentioned. It already successfully negotiates the ambiguity in abbreviations, acronyms, misspellings and other issues. We'll need more than this - for example, we'll need to tell when players are referenced w/ quotes or replies, and will also need to apply coreference resolution tools to infer who people are talking about when they use pronouns like "He".
- Claim extraction. We'll leverage the best available OpenIE pipeline for extracting people's statements about identified players and converting them into simple, machine-readable representations.
- Read inference. Finally, we'll have to infer from a person's claims about a player what their professed read about that player is. There are a lot of ways to go about doing this that each have their own precision/recall tradeoffs. Sentiment analysis won't be sufficient - people can have negative attitudes about townreads, and positive attitudes about scumreads. We do have the option to focus on explicit read announcements ("Psyche is scum!"), but we'd miss a lot of reads. It's the toughest part of all this.
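As a rough illustration of the Player Identification step, here's a sketch that resolves a free-text mention to a player name. It uses Python's standard `difflib` as a stand-in for whatever spellchecker the real VoteCounter uses, so treat the matching logic, the `resolve_player` name, and the 0.6 cutoff as assumptions rather than the actual method. It handles exact matches, unambiguous prefix abbreviations, and misspellings, but not acronyms, quotes, or pronouns.

```python
import difflib


def resolve_player(mention, players, cutoff=0.6):
    """Map a mention like 'pops' or 'skiter30' to a name in `players`,
    or return None when nothing matches well enough."""
    m = mention.strip().lower()
    if not m:
        return None
    lowered = {p.lower(): p for p in players}

    # 1. Exact match, ignoring case.
    if m in lowered:
        return lowered[m]

    # 2. Prefix abbreviation, accepted only if exactly one player fits.
    prefix_hits = [p for low, p in lowered.items() if low.startswith(m)]
    if len(prefix_hits) == 1:
        return prefix_hits[0]

    # 3. Fuzzy match to catch misspellings.
    close = difflib.get_close_matches(m, list(lowered), n=1, cutoff=cutoff)
    return lowered[close[0]] if close else None
```

Returning None instead of guessing keeps precision up at the cost of recall, which seems like the right default when the output feeds a validation scheme.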
I think I would prospectively focus first on extracting and building a dataset of machine-readable representations of as-close-as-possible-to-*all* mentions of and/or claims about players by players throughout every game I can get good voting data for. That's a substantive problem in and of itself, and the dataset would be interesting in its own right too.
From there, we explore the challenge of classifying extracted claims, or maybe we choose a unique classification scheme to suit each of whichever research questions we decide are interesting.
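The high-precision, low-recall option mentioned above (only catching explicit announcements like "Psyche is scum!") could be sketched as follows. The word lists, the negation list, and `explicit_reads` are illustrative assumptions; a real classifier would need far more than one regex.

```python
import re

SCUM_WORDS = ("scum", "scummy", "mafia")
TOWN_WORDS = ("town", "townie", "towny")
NEGATIONS = {"not", "never", "hardly"}


def explicit_reads(post, players):
    """Return [(player, 'scum' | 'town'), ...] for statements of the form
    '<player> is [one optional word] <label>', skipping negated ones."""
    canonical = {p.lower(): p for p in players}
    # Longest names first so a short name can't shadow a longer one.
    names = "|".join(re.escape(p) for p in sorted(players, key=len, reverse=True))
    labels = "|".join(SCUM_WORDS + TOWN_WORDS)
    pattern = re.compile(
        rf"\b({names})\s+is\s+(?:(\w+)\s+)?({labels})\b", re.IGNORECASE
    )
    reads = []
    for name, modifier, label in pattern.findall(post):
        if modifier.lower() in NEGATIONS:
            continue  # "X is not scum" is not a scumread
        kind = "scum" if label.lower() in SCUM_WORDS else "town"
        reads.append((canonical[name.lower()], kind))
    return reads
```

This deliberately misses hedged, implied, and contracted phrasing ("isn't", "I don't like his vote"), which is exactly the recall problem described above.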
Validation
The thing that's made this votecount dataset effort viable was finding a way to get quick feedback on design/implementation decisions. From there I could just incrementally improve my pipeline's success rate, making the project far more manageable. Is something like that possible in this domain?
This is where focusing on a close extension of my VCA project might be relevant. We expect votes to be a partial extension of a person's reads, so we can try to validate a read extraction pipeline against a player's votes. Using our existing vote dataset, we'll track when a vote seems to conflict or occur in accordance w/ the voter's detected attitude about the target. When the time comes to evaluate performance, we can produce discordance and concordance rates, or inspect particular examples of discordance to inform further development. We can potentially exclude votes that occur near the end of a game's Day to avoid the intrusion of social/viability concerns to sharpen the analysis.
There's basically zero possibility of a 100% concordance rate (people don't only vote their professed top scumreads!), and even a 100% rate wouldn't fully confirm the quality of the pipeline. But it's at least a strategy for highlighting potential issues in my pipeline and driving improvements.
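Here's a minimal sketch of how the concordance/discordance bookkeeping might work, assuming reads and votes are already extracted into simple tuples. The tuple layouts, the `exclude_last` window for dropping late-Day votes, and the function names are assumptions for illustration only.

```python
def latest_read(reads, voter, target, before_post):
    """Most recent read by `voter` on `target` at or before `before_post`.
    reads is a list of (post, reader, subject, label) tuples."""
    best = None
    for post, reader, subject, label in reads:
        if reader == voter and subject == target and post <= before_post:
            if best is None or post > best[0]:
                best = (post, label)
    return best[1] if best else None


def concordance_rate(votes, reads, day_end_post=None, exclude_last=0):
    """votes is a list of (post, voter, target) tuples.
    Votes in the last `exclude_last` posts of the Day are skipped.
    Returns (counts, rate); rate is None if no vote had a matching read."""
    counts = {"concordant": 0, "discordant": 0, "unknown": 0}
    for post, voter, target in votes:
        if day_end_post is not None and post > day_end_post - exclude_last:
            continue  # late-Day votes are shaped by viability concerns
        label = latest_read(reads, voter, target, post)
        if label is None:
            counts["unknown"] += 1
        elif label == "scum":
            counts["concordant"] += 1
        else:
            counts["discordant"] += 1
    judged = counts["concordant"] + counts["discordant"]
    return counts, (counts["concordant"] / judged if judged else None)
```

The "unknown" bucket matters as much as the rate: a pipeline that extracts almost no reads can post a high concordance rate on very few votes.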
Final Note
The most substantive challenge I've seen to efforts like the Great VCA is that it's fruitless if one doesn't take the context of votes into account. Supplementing our votes dataset with a reads dataset like the one I'm proposing seems a serious way to start doing exactly that. There's definitely other information in people's posts beyond indications of their attitudes about everyone else, but we're starting somewhere.
popsofctown (She)
It's a good thing you're planning out.
I'm skeptical that the inaccuracies from the limitation of the IE will be systematic enough for the data to be anything besides just good good stuff.
Psyche (he/they)
votecounter now at peak performance on D1
there are about 5 games over my initial 300 game sample that it just gets wrong (ugh!), but that's still a >98% success rate.
i have to see if that kind of performance extends to other days and other games, too, though
what i'll ultimately do is filter out games i can't get good results on according to my validation scheme
still, fixing as many of the errors i catch as possible is important for minimizing the errors i don't catch
big remaining to-dos:
- take power role interactions into account when processing votes and predicting lynches (especially doublevoters and dayvigs and N0s)
- extend votecountertest to predict phase transitions and test the dataset beyond D1
- add more games beyond my original 300, probably focusing on newer games and everything bob collected. god i wonder where tamuz's newbie dataset is. i know it's somewhere...
- fix all the problems in my dataset or votecounter uncovered from the above expansions
- save and share the whole dataset
- first gamut of analyses
and i gotta do it all before month's end
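For the doublevoter part of the first to-do, one possible shape is a weighted tally. This is only a sketch under assumptions: it treats a doublevoter as a player whose vote counts twice, keeps the hammer threshold based on the number of living players, and ignores dayvigs and other role interactions. `weighted_tally` and `hammered` are hypothetical names.

```python
def weighted_tally(current_votes, weights=None):
    """current_votes maps voter -> target; weights maps voter -> vote weight
    (anyone missing from weights counts as 1)."""
    weights = weights or {}
    tally = {}
    for voter, target in current_votes.items():
        tally[target] = tally.get(target, 0) + weights.get(voter, 1)
    return tally


def hammered(current_votes, alive_count, weights=None):
    """Return the player at or above a strict majority of living players,
    or None if nobody is there yet."""
    threshold = alive_count // 2 + 1
    for target, count in weighted_tally(current_votes, weights).items():
        if count >= threshold:
            return target
    return None
```

With five players alive, two votes on the same target aren't a hammer unless one of the voters is passed in with weight 2.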
Copyright © MafiaScum. All rights reserved.