Computational Mafia

This forum is for discussion related to the game.
User avatar
Psyche
Psyche
he/they
Survivor
User avatar
User avatar
Psyche
he/they
Survivor
Survivor
Posts: 10052
Joined: April 28, 2011
Pronoun: he/they

Computational Mafia

Post Post #0 (ISO) » Tue Feb 12, 2019 5:37 am

Post by Psyche »

I thought maybe it'd be good to have a unitary thread to discuss computational approaches to mafia (running/managing games, playing games, reviewing games) absent any focus on particular implementations. There's a sizable community of people who do this, but I feel that due to differences in strategy we often end up on our own islands and don't really interact much or support one another. Let's use this thread to trade notes from time to time!

We can talk about what we're up to, discuss general/recurring problems, share data, whatever.

Some Relevant Projects
Last edited by Psyche on Thu Jun 13, 2019 9:54 pm, edited 2 times in total.
youtube playlist extracter | donbot | game scraper | vca | setupsim | strategist | llm

Post Post #1 (ISO) » Tue Feb 12, 2019 6:57 am

Post by Psyche »

Here's a summary of where I think we are. Roughly put, I think the big challenge of computational mafia is how messy game threads are and how difficult they are to process automatically. Natural language processing of player posts is one thing, but even detecting and processing relatively standardized happenings like player votes, role flips, and moderator transitions between game phases is profoundly challenging to make a computer do! The result is that the more interesting potential projects in computational mafia require either many hours of coding, many hours of manual data labeling, or forcing players/moderators to change their behavior so that game threads going forward are more streamlined.

We've had some success with each of these approaches, but also a lot of failure. A lot of the necessary hand-prepared data sets exist (e.g. the Game Archive threads), but others often go unprepared because of the huge amount of work involved, or go unshared even after many hours are put into their preparation. Players and mods have standardized their behavior considerably over the years, but many remain rightfully resistant to controls on their in-game behavior for the sake of computationalism, and those who aren't resistant are often undisciplined anyway; this method also leaves many years of already existing game data abandoned.

As for coding work, some key problems like vote detection/counting under relevant conditions have been solved many times over through distinct projects (including mine) around the site. The result is that in most game threads it's possible to automatically count every vote ever made. Ellibereth famously managed to enhance his Mafia play to near-ceiling levels by formalizing some aspects of his scumhunting strategy. Other tools/formal methods exist or nearly exist for generating and reviewing game setups, timing votecount posts, and other tasks essential to modding. Unfortunately, due to interface issues, secrecy, mod preferences or other challenges, utilization of tools like these is rare. Other websites like Mafia Universe cohesively integrate moderator automation tools into the design of their site, and the holy grail of replicating that same success on MafiaScum is a project that remains unfinished but widely anticipated. The loftier dream of public tools supporting the actual classification of scum and town players from their posts seems at turns substantially further off and all but already here.
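
To make "vote detection/counting" concrete, here's a minimal sketch of the idea, not the code of any votecounter mentioned in this thread. It assumes votes appear on their own line as "VOTE: name" and retractions as "UNVOTE" (an assumed convention, real threads are messier), and that posts arrive as hypothetical (post_number, author, text) tuples.

```python
import re
from collections import Counter

# Assumed convention: "VOTE: name" places a vote, "UNVOTE" retracts it.
VOTE_RE = re.compile(r"(?:(UNVOTE)\b.*|VOTE\s*:\s*(\S.*))$", re.IGNORECASE)


def extract_votes(posts):
    """Yield (post_number, voter, target); target is None for an unvote."""
    for number, author, text in posts:
        for line in text.splitlines():
            match = VOTE_RE.match(line.strip())
            if match:
                unvote, target = match.groups()
                yield number, author, None if unvote else target


def tally(posts):
    """Count standing votes, keeping only each voter's most recent action."""
    current = {}
    for _, voter, target in extract_votes(posts):
        current[voter] = target
    return Counter(t for t in current.values() if t is not None)


posts = [
    (1, "alice", "I think Bob is scum.\nVOTE: Bob"),
    (2, "carol", "vote: bob"),
    (3, "alice", "UNVOTE\nVOTE: Carol"),
    (4, "dave", "Voted already, no change"),
]
print(tally(posts))
```

Note that "Bob" and "bob" are still counted as different targets here; resolving raw targets to actual players is a separate step.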

Going forward, absent any big changes, we're likely to see more work in fits and starts along these lines. Big milestones that might change everything? The site upgrade could happen soon and be successful. Maybe further breakthroughs in thread processing will make it a lot easier for people across the site to build scumhunting tools, building them will turn out to be relatively easy, and work will shift to regulating and improving them rather than exploring whether they're possible. Sure, there's a chance that instead the site will eventually die out, but that's far off and regardless Mafia is probably here to stay; it's possible that people will just take advantage of sites that implement the concept of forum mafia more cleanly and thus offer cleaner data.

Post Post #2 (ISO) » Tue Feb 12, 2019 7:32 am

Post by Psyche »

there are a few ways i think we can make our work smoother, though:

Share Data

I think most people with coding projects share their code and then no one bothers to read it. I believe that there's a much bigger demand for data that is not captured in, or is just cleaner than, the public archives. I think the user who was named Tamuz once hand-coded an archive for every Newbie game that's ever happened, cataloguing player information, game outcomes, all of that; it seems to be totally lost now. Ground truth data is absolutely necessary either to implement practically any potential computational mafia project or to evaluate if one actually works.

Generate Data

So obviously it would be nice if someone would do the tedious work of organizing the ground truth data needed to get most computational mafia projects off the ground. Luckily, people volunteer surprisingly often to do this kind of stuff. My interest in computational mafia has been reinvigorated lately because I've finally found an affordable way to pay other people to do this work in the way that's necessary for our purposes, something that could hopefully power much more rapid progress in this domain.

But bigger than that, I think a lot of people (including me) have missed the kind of data that their own computer programs could generate for other people to use if they just made a few tweaks and formatted the output. For example, I mentioned earlier that it's now possible to count every vote in most game threads. People not interested in figuring out how to use my votecounter might find a data set including all of these votecounts really useful, either for their own projects or just for in-game meta-analyses. Similarly, my votecounter includes a tool for matching nicknames and abbreviations to particular players in a game - a dictionary of players and their common nicknames/abbreviations might be similarly broadly useful. I suspect a lot of people doing computational mafia have in their codebases affordances like that but haven't made the effort to extract as much value from them as possible by generating, organizing, and sharing it.
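
As a rough illustration of what a shareable nickname dictionary could plug into, here's a small sketch of a matcher. It is not the votecounter's actual tool: the function name, the alias table and the 0.6 cutoff are all made up for the example, and the fuzzy step uses the standard library's difflib.

```python
import difflib


def resolve_player(raw, players, aliases=None):
    """Map a raw name to a player in the game, or None if nothing is close."""
    aliases = aliases or {}
    key = raw.strip().lower()
    # 1. Known nickname or abbreviation (the shareable dictionary).
    if key in aliases:
        return aliases[key]
    by_lower = {p.lower(): p for p in players}
    # 2. Exact match, ignoring case.
    if key in by_lower:
        return by_lower[key]
    # 3. Unambiguous prefix, e.g. "Flubber" for "Flubbernugget".
    prefixed = [p for low, p in by_lower.items() if low.startswith(key)]
    if len(prefixed) == 1:
        return prefixed[0]
    # 4. Fuzzy match to absorb misspellings (the cutoff is a guess).
    close = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=0.6)
    return by_lower[close[0]] if close else None


players = ["Psyche", "MathBlade", "Bicephalous Bob", "Flubbernugget"]
aliases = {"bob": "Bicephalous Bob", "mb": "MathBlade"}
for raw in ["bob", "Flubber", "Pysche", "zzz"]:
    print(raw, "->", resolve_player(raw, players, aliases))
```

Checking the alias table first means a hand-curated dictionary always wins over guesses, which is the part that would be worth sharing between projects.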

Talk Through Common Challenges and Solutions

No one enjoys looking at other people's code, so the old practice of sharing github repositories seems insufficient for supporting each other through our shared challenges. In general in our big "here's my project" threads I think we should spend more time talking about the problems we faced and how we solved them alongside what our projects achieve. And more specifically, hopefully we can talk about computational mafia as a broad effort here!

And idk, drop more of your lives and just spend more time on this. I'm pretty sure the big reason nothing happens on this project is that the ratio of work required to the work's importance is a bit high. That's...understandable.
Last edited by Psyche on Tue Feb 12, 2019 7:46 am, edited 1 time in total.

Post Post #3 (ISO) » Tue Feb 12, 2019 7:35 am

Post by Psyche »

Going forward I'm personally focusing on finally organizing a comprehensive data set of all the non-post-content information across the game threads on MS that is useful as data. Once that happens, MS becomes much more fun as a playground of behavioral data. Like I said before, I'm pulling from my hobby budget to make the work someone else's problem. Hopefully it's all done in a month. It's so fun to be young and have money.
User avatar
Flubbernugget
Flubbernugget
Survivor
User avatar
User avatar
Flubbernugget
Survivor
Survivor
Posts: 11751
Joined: June 26, 2014

Post Post #4 (ISO) » Tue Feb 12, 2019 7:51 am

Post by Flubbernugget »

How successful was modbot adoption?

Post Post #5 (ISO) » Tue Feb 12, 2019 7:56 am

Post by Psyche »

at some point i realized that all i really cared about was building the API

speaking of which, look how easy to use and thoroughly documented my Donbot is! https://github.com/MafiaScum-Unofficial ... ter/donbot

Post Post #6 (ISO) » Wed Feb 13, 2019 9:41 am

Post by Psyche »

think another way i might help is by writing some kind of review or map of all the progress made so far bc clearly im not aware of it all

Will populate a pile of links and notes here in this post for the time being.

The Mafiascum Dataset
Bicephalous Bob cleaned and prepared a dataset of 700 games/10000 documents that is freely available for academic use, and achieved a good automated role classification model with it, too. Great leap forward for the effort.

The Newbie 2d3 stats thread
Data could support work to predict game outcomes, formalize setup balance/generation, etc. Human performance statistics are useful for evaluating computational model performance. Similarly: viewtopic.php?f=5&t=39739

MathBlade's Vote Scrubber
Automated vote detection, even in cases of misspelled/nicknamed players (meaning it's a username matcher, too). Interesting links to other stuff.

etc
User avatar
Claus
Claus
Mafia Scum
User avatar
User avatar
Claus
Mafia Scum
Mafia Scum
Posts: 1734
Joined: June 1, 2007
Location: Tsukuba
Contact:

Post Post #7 (ISO) » Thu Feb 14, 2019 11:29 pm

Post by Claus »

If anyone is interested, I am helping organize an online contest of computational Mafia. This will be the first English edition of a Japanese competition that has been running for 4 years at the Japanese version of GDC.

This competition uses a fixed communication protocol (No NLP needed), and each agent is allowed 10 messages using the protocol before a voting round is enforced. The supported programming languages are Java, Python and C#.

This competition will have the following game rules: Night start without mafia kill on 0th night. Two game sizes: 15P: 8 vanilla, 1 cop, 1 lynch-role revealer, 1 doc, 3 goons, 1 GF and 5P: 2 vanilla, 1 cop, 1 goon, 1 GF.

Agents will play sets of 100 games against random opponents in the competition, and earn 1 point for their team's victory. Best average score wins. Deadline for agent submission is May 20th.
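
For anyone who wants to sanity-check the numbers, here's a small sketch of the two setups and the scoring rule as described above. This is not the competition's actual code: the role keys, the "town"/"mafia" labels and the function names are just placeholders for illustration.

```python
from collections import Counter

# Role counts as listed in this post; the key names are placeholders.
SETUPS = {
    15: Counter(vanilla=8, cop=1, revealer=1, doc=1, goon=3, godfather=1),
    5: Counter(vanilla=2, cop=1, goon=1, godfather=1),
}
MAFIA_ROLES = {"goon", "godfather"}


def team_of(role):
    """Return the team label for a role."""
    return "mafia" if role in MAFIA_ROLES else "town"


def average_score(results):
    """results: list of (agent_role, winning_team); 1 point per team win."""
    points = sum(1 for role, winner in results if team_of(role) == winner)
    return points / len(results)


# Both setups add up to the advertised player counts.
for size, roles in SETUPS.items():
    assert sum(roles.values()) == size

sample = [("cop", "town"), ("goon", "town"), ("vanilla", "mafia"), ("godfather", "mafia")]
print(average_score(sample))
```

Over a real set of 100 games the list would simply have 100 entries, and agents are ranked by the resulting average.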

If anyone is interested in participating, let me know!
http://www.youtube.com/watch?v=XVVmAG0RXmo
User avatar
Ellibereth
Ellibereth
Deus ex Machina
User avatar
User avatar
Ellibereth
Deus ex Machina
Deus ex Machina
Posts: 9752
Joined: November 6, 2009
Location: Location location location

Post Post #8 (ISO) » Fri Feb 15, 2019 1:14 am

Post by Ellibereth »

Can you send me the details on Discord - I'm Ellibereth#1113.
Thanks.
FLASH OF GREEN
User avatar
yessiree
yessiree
he
Mafia Scum
User avatar
User avatar
yessiree
he
Mafia Scum
Mafia Scum
Posts: 4349
Joined: June 6, 2013
Pronoun: he

Post Post #9 (ISO) » Fri Feb 15, 2019 6:52 am

Post by yessiree »

the main issue with trying to automate anything on mafiascum is the lack of an API. (it's in development?) It's FeelsBadMan to work with <document> response types, which means an additional layer of reading/parsing XML/HTML content which makes the whole process more time-consuming and error prone

It would be much much better if you could do something like query "[b]API[/b]/viewtopic.php?f=5&t=78755" and get back JSON instead

Post Post #10 (ISO) » Fri Feb 15, 2019 11:16 am

Post by Psyche »

Could you be exact about what you’d like for the API to do?

Post Post #11 (ISO) » Fri Feb 15, 2019 11:27 am

Post by Flubbernugget »

In post 7, Claus wrote:If anyone is interested in participating, let me know!
Can i take interest as a spectator?

Post Post #12 (ISO) » Fri Feb 15, 2019 5:17 pm

Post by yessiree »

In post 10, Psyche wrote:Could you be exact about what you’d like for the API to do?
imo, the API would ideally
- require an API key that people need to register and get approval for
- be rate limited
- be RESTful
- return JSON

eg. if you wanted to grab the posts in a thread, now you need to query the url of this page, get back a fully rendered web document, then parse that document to get the content you want
with the API, you would query the API url instead, and get back a JSON of all the posts in a thread without the things you don't need
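
Here's roughly the extra layer everyone currently has to write themselves, as a minimal sketch. The markup it parses is an assumption for the example (elements with class="author" and class="content"), not the site's real HTML, and the class and function names are made up; it only uses the standard library.

```python
import json
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class PostExtractor(HTMLParser):
    """Collect {author, content} dicts from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._field = None  # "author" or "content" while capturing
        self._depth = 0     # nesting depth inside the captured element
        self._buffer = []
        self._author = None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if self._field and tag == "br":
                self._buffer.append("\n")  # keep line breaks as whitespace
            return
        if self._field:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for field in ("author", "content"):
            if field in classes:
                self._field, self._depth, self._buffer = field, 1, []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._field:
            return
        self._depth -= 1
        if self._depth == 0:
            text = " ".join("".join(self._buffer).split())
            if self._field == "author":
                self._author = text
            else:
                self.posts.append({"author": self._author, "content": text})
            self._field = None

    def handle_data(self, data):
        if self._field:
            self._buffer.append(data)


def thread_to_json(html):
    """Turn a rendered page into the JSON an API could return directly."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return json.dumps(parser.posts)


page = (
    '<div class="post"><span class="author">yessiree</span>'
    '<div class="content">first<br>post <b>here</b></div></div>'
    '<div class="post"><span class="author">Psyche</span>'
    '<div class="content">second</div></div>'
)
print(thread_to_json(page))
```

With a JSON endpoint, all of this would collapse into a single request plus json.loads, which is the whole point of the request.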

Post Post #13 (ISO) » Mon Feb 25, 2019 5:57 am

Post by Psyche »

shouldn't be too hard to implement something at least really close

anyway, now have labels for phase number for every mini normal post in bob's data set (plus any newer games that have finished) and will soon have the same for both normals and opens
once it's fully cleaned up i'll share it with everyone!

this'll enable phase-by-phase analysis of game threads that before was impossible without considerable manual work or tiny data sets
and also more detailed game archives to help people doing metas get where they're trying to get faster
with definite day transition post numbers coupled with a good vote detector, we'll be able to generate wagon data through every one of these hundreds of games
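
a minimal sketch of how those two pieces could combine, assuming votes come as hypothetical (post_number, voter, target) tuples with target None for an unvote, and that day_starts is the sorted list of post numbers where each day begins. the names and shapes here are illustrative, not the format of the actual data set.

```python
from bisect import bisect_right
from collections import defaultdict


def wagons_by_day(votes, day_starts):
    """Return {day: {target: [voters]}} as things stand at the end of each day."""
    standing = defaultdict(dict)  # day -> voter -> current target
    for post, voter, target in votes:
        day = bisect_right(day_starts, post)  # 1-based; 0 means before Day 1
        if day == 0:
            continue
        standing[day].pop(voter, None)  # a new action replaces the old vote
        if target is not None:
            standing[day][voter] = target
    wagons = {}
    for day, by_voter in standing.items():
        per_target = defaultdict(list)
        for voter, target in by_voter.items():
            per_target[target].append(voter)
        wagons[day] = dict(per_target)
    return wagons


day_starts = [1, 300]
votes = [
    (5, "alice", "Bob"),
    (12, "carol", "Bob"),
    (40, "alice", None),
    (41, "alice", "Dave"),
    (310, "carol", "Dave"),
    (350, "bob", "Dave"),
]
print(wagons_by_day(votes, day_starts))
```

voters are listed in the order they placed their final vote, so the same loop could be extended to record who joined a wagon when.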
Last edited by Psyche on Mon Feb 25, 2019 6:36 am, edited 1 time in total.
User avatar
MathBlade
MathBlade
He/Him
Technical Support
User avatar
User avatar
MathBlade
He/Him
Technical Support
Technical Support
Posts: 42761
Joined: September 9, 2013
Pronoun: He/Him
Location: Western US

Post Post #14 (ISO) » Mon Feb 25, 2019 6:34 am

Post by MathBlade »

In post 9, yessiree wrote:the main issue with trying to automate anything on mafiascum is the lack of an API. (it's in development?) It's FeelsBadMan to work with <document> response types, which means an additional layer of reading/parsing XML/HTML content which makes the whole process more time-consuming and error prone

It would be much much better if you could do something like query "[b]API[/b]/viewtopic.php?f=5&t=78755" and get back JSON instead
Omg so much this.

Like that’s the reason my VoteCount scrubber is slower than I would like.

Getting all that html and crunching it for hundreds of pages is a pita.
ScumBlade's eloquent performance left me utterly disoriented, debased, depraved and sent me spiraling into a horrific murky abyss with emotional turmoil and immense despair as my only companions until slowly I suffocate in my own gloom, surrounded by failure. I will never recover. -- Zachstralkita about Mini 1841
GTKAS -- MathBlade

Post Post #15 (ISO) » Mon Feb 25, 2019 6:42 am

Post by Psyche »

i mean at the end of the day if that's your goal then maybe you should join the site redesign effort and just build your votecounter into the system

Post Post #16 (ISO) » Mon Feb 25, 2019 9:34 am

Post by MathBlade »

Hypothetically I may be doing just that. :)

Post Post #17 (ISO) » Wed Feb 27, 2019 9:55 am

Post by Flubbernugget »

Idk if this is a pink elephant or a "goes without saying" thing but the best contributions to site health are going to be with backend development, which has the highest barrier to entry

Post Post #18 (ISO) » Wed Feb 27, 2019 10:04 am

Post by MathBlade »

100% agree

And people willing to test stuff once done.

Post Post #19 (ISO) » Wed Feb 27, 2019 10:10 am

Post by Psyche »

whatever happens there’ll always be other sites

Post Post #20 (ISO) » Wed Feb 27, 2019 3:54 pm

Post by Flubbernugget »

In post 18, MathBlade wrote:100% agree

And people willing to test stuff once done.
This...actually might be a lot less time consuming up front? What's involved? Spinning up a docker container and pointing browsers to a loopback address?

Post Post #21 (ISO) » Sun Mar 03, 2019 2:07 pm

Post by Psyche »

https://docs.google.com/spreadsheets/d/ ... sp=sharing

Hi guys. The above spreadsheet is unfinished and will need to be cleaned a lot once done, but it will include the post number of every phase transition in all the games covered in Bob's data set, plus any newly completed games in the relevant subforums since. This kind of data should permit or facilitate a broad array of computational efforts tied to mafia and MS. I'll let you know when it's fully "done", but I'm too excited not to share it now. More later.

Post Post #22 (ISO) » Sun Mar 03, 2019 2:57 pm

Post by Claus »

In post 8, Ellibereth wrote:Can you send me the details on Discord - I'm Ellibereth#1113.
Thanks.
Hey people, I am sorry for the delay, work happened. I don't really access MS Discord, but if you send me a PM or an e-mail I will be happy to send more information.

Right now, I have the following sites to share:
http://web.tuat.ac.jp/~katfuji/ANAC2019/ -- Competition Website for this year
http://aiwolf.org/en/ -- AIWolf project website (lots of information in Japanese, but if you dig around you should find quite some information in English too)
https://github.com/caranha/AIWolfCompo -- a github repository with some documentation and sample code that I am putting together for contestants (WIP).
https://github.com/ehauckdo/AIWoof -- One of my students is making AIWolf bots in python, might be good for some examples.

Again, please do not hesitate to ask me questions.
In post 11, Flubbernugget wrote:
In post 7, Claus wrote:If anyone is interested in participating, let me know!
Can i take interest as a spectator?
Sure! The deadline for submissions will be in May, so there is some time before we actually start making the AIs play against each other, but if there is interest, I could put some game reports and logs here.

(That said, the quality of the robots is not stellar yet ;-) Baby steps!)
Psyche wrote:https://docs.google.com/spreadsheets/d/ ... sp=sharing

Hi guys. The above spreadsheet is unfinished and will need to be cleaned a lot once done, but it will include the post number of every phase transition in all the games covered in Bob's data set, plus any newly completed games in the relevant subforums since. This kind of data should permit or facilitate a broad array of computational efforts tied to mafia and MS. I'll let you know when it's fully "done", but I'm too excited not to share it now. More later.
This is fantastic! If I were a little better at NLP I would love to do some analysis on Bob's data. (If anyone wants to apply for graduate school or an exchange period in Japan to work on this let me know ;-))

Post Post #23 (ISO) » Sun Mar 03, 2019 4:20 pm

Post by Psyche »

any other kinds of data collection too hard to have a computer do well short of actively reading games? haven't quite figured out what if anything i'll have my coder do after she finishes this thing
User avatar
Bicephalous Bob
Bicephalous Bob
Mafia Scum
User avatar
User avatar
Bicephalous Bob
Mafia Scum
Mafia Scum
Posts: 3828
Joined: June 4, 2013
Location: I don't know why you're linking me to pictures of babies on Facebook

Post Post #24 (ISO) » Mon Mar 04, 2019 11:29 am

Post by Bicephalous Bob »

In post 14, MathBlade wrote:
In post 9, yessiree wrote:the main issue with trying to automate anything on mafiascum is the lack of an API. (it's in development?) It's FeelsBadMan to work with <document> response types, which means an additional layer of reading/parsing XML/HTML content which makes the whole process more time-consuming and error prone

It would be much much better if you could do something like query "[b]API[/b]/viewtopic.php?f=5&t=78755" and get back JSON instead
Omg so much this.

Like that’s the reason my VoteCount scrubber is slower than I would like.

Getting all that html and crunching it for hundreds of pages is a pita.
The average game should be around 15 light-weight pages if you use &view=print&ppp=200, of which only the last three pages are probably relevant for a vc
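
A quick sketch of generating those page URLs, in case it helps. The view=print and ppp=200 parameters are the ones above; the host name and the use of a "start" offset for paging are assumptions on my part (it's the usual phpBB convention), and the function name is just for illustration.

```python
from urllib.parse import urlencode

BASE = "https://forum.mafiascum.net/viewtopic.php"  # assumed host


def print_view_urls(topic_id, total_posts, per_page=200):
    """Return one print-view URL per block of `per_page` posts."""
    urls = []
    for start in range(0, total_posts, per_page):
        query = {"t": topic_id, "view": "print", "ppp": per_page, "start": start}
        urls.append(f"{BASE}?{urlencode(query)}")
    return urls


urls = print_view_urls(78755, 450)
for url in urls:
    print(url)
```

For a votecount you'd only need to fetch the tail of that list, e.g. urls[-3:].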

A twenty-line artoo.js game scraper: https://bitbucket.org/bopjesvla/thesis/ ... ew-default