Project ELO (Dataset released)
-
-
Realeo Jack of All Trades
- Jack of All Trades
- Jack of All Trades
- Posts: 5238
- Joined: February 11, 2016
- Location: Indonesia
I do not compute the average score.
"The debate on whether short multi postings or a long wall of post is good or not is like a debate on gun control--we would never understand each other and we have to make peace with it." -Realeo
I'm mabye a serious player, but I'm capable of joke. Ok?
-
LicketyQuickety Survivor
- Survivor
- Survivor
- Posts: 12785
- Joined: May 14, 2015
- Location: Where the moon and the sea meet.
In post 100, Realeo wrote: I do not compute the average score.
Why not?
I was anything worse than you! Anything worse than you was I!
You was doided teh aposit_tisopa het dedoid saw em.
-
Realeo Jack of All Trades
Because they are believed not to have converged yet, remember?
If you mean the average of the considered players, that will come in the next update. However, as you said, the ELO distribution curve looks like a bell curve.
-
Raskolnikov Jack of All Trades
-
-
Realeo Jack of All Trades
I am considering that after I have tabulated the Open games. From 2013 onward there are around 200 games, which bumps the tabulated game total to around 700.
The problem, though, is not how I calculate town/scum. It is how I calculate town/scum1/scum2/SK.
-
Realeo Jack of All Trades
Because town ELO is defined as scumhunting skill and scum ELO is defined as scumavoiding skill.
In Town v Mafia v Werewolf v SK, everyone is scumhunting the other factions. Which ELO is used for the SK?
-
LicketyQuickety Survivor
-
-
Realeo Jack of All Trades
You said in a thread that you would only mentor-hydra if you have at least 55 games. Educated guess.
Some early Open Games are lost due to TIGERS attack =/
-
Realeo Jack of All Trades
You are welcome to give ideas, especially about how to calculate ELO in multiball if the ELO is split town/scum.
-
Raskolnikov Jack of All Trades
-
-
LicketyQuickety Survivor
In post 107, Realeo wrote: You said in a thread that you would only mentor-hydra if you have at least 55 games. Educated guess. Some early Open Games are lost due to TIGERS attack =/
Ah, it was based on a misconception then. I said I would only want to get taught by someone who has 50 games played, not 55, iirc. I think I have 53 completed games, plus some ongoing and some I have subbed out of. I think you are a bit behind on the games I have played if you only have 8 games on record for calculating my ELO. I have 9 completed newbie games alone (which is not a lot). Naturally, 8/53 completed games is not exactly a good depiction of how good a player I am. Granted, I only have 21 (?) games completed on this site. So 8/21 is a low number.
-
LicketyQuickety Survivor
In post 108, Realeo wrote: You are welcome to give ideas, especially about how to calculate ELO in multiball if the ELO is split town/scum.
I would account for win rates of SK. Just a thought; I have no idea what you are actually doing with all this math stuff. I am quite rusty when it comes to maths.
-
LicketyQuickety Survivor
In post 111, LicketyQuickety wrote:
In post 108, Realeo wrote: You are welcome to give ideas, especially about how to calculate ELO in multiball if the ELO is split town/scum.
I would account for win rates of SK. Just a thought; I have no idea what you are actually doing with all this math stuff. I am quite rusty when it comes to maths.
I meant average win rates in standard games with SKs. *shrug*
-
mhsmith0 Balancing Act
- Balancing Act
- Balancing Act
- Posts: 10830
- Joined: March 7, 2016
- Location: Phoenix, AZ
For kicks, I did a quick logistic regression of the newbie games (1370-1765), rating only players with at least 8 games in a particular faction (anyone without at least 8 games in at least one alignment was dropped), and adding a dummy win and a dummy loss (for each player for each alignment) for smoothing purposes. Results:
Spoiler:
Interpretation:
A game with NONE of these players would be rated at the intercept, -0.422, for a probability of a town win of 39.6% (L = -0.422, exp(L) = 0.656, p(win) = exp(L) / (1 + exp(L)) = 39.6%, which compares to the actual town win rate of 44.2% in that set of games).
A game with any of these players would be rated at the intercept plus that player's rating (or plus several ratings if several of them are in the game). As a simple example, if there are eight "other" players and Thor, then the odds become:
If Thor is town: L = -0.422 + 0.530 = 0.108 -> p(town win) = 52.7% (Thor's actual town record: 12-12, 50%)
If Thor is scum: L = -0.422 + -0.369 = -0.791 -> p(town win) = 31.2% (Thor's actual scum record: 6-3, 67% which inverts to 33% town odds)
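For anyone who wants to check the arithmetic above, here is a minimal sketch in Python. The function name is made up for illustration; the intercept and Thor's two ratings are the numbers quoted in this post.

```python
import math

def town_win_probability(intercept, ratings=()):
    """Convert intercept + sum of player ratings (a logit) into p(town win)."""
    logit = intercept + sum(ratings)
    return math.exp(logit) / (1.0 + math.exp(logit))

INTERCEPT = -0.422   # intercept quoted in the post
THOR_TOWN = 0.530    # Thor's town rating quoted in the post
THOR_SCUM = -0.369   # Thor's scum rating quoted in the post

print(f"no rated players: {town_win_probability(INTERCEPT):.1%}")
print(f"Thor as town:     {town_win_probability(INTERCEPT, [THOR_TOWN]):.1%}")
print(f"Thor as scum:     {town_win_probability(INTERCEPT, [THOR_SCUM]):.1%}")
```

Passing more than one rating in the list covers the "plus multiple" case, since the ratings simply add on the logit scale.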
So basically, high positive numbers are good if town, high negative numbers are good if scum, and I combined them for a total rating (though most players don't have the data to really be rated for scum).
http://wiki.mafiascum.net/index.php?title=Mhsmith0
Conq: you, sir, are great at being town.
BATMAN: Only jugg was the only one we didn’t scum read at least not me
Quick: There is little to no chance this slot is Power-Wolfing.
SR: I want to give him a day
Life is simply unfair, don't you think?
-
Realeo Jack of All Trades
What is the train/test data split?
-
mhsmith0 Balancing Act
None, it's one big regression. There isn't NEARLY enough data to split out a validation sample, so I simply didn't. Instead, my methodology was to just chuck out data for individual/alignment pairings with too small a sample size for me to consider credible (and 8 is somewhat arbitrary, of course).
-
Realeo Jack of All Trades
Do you train per player? If you do, please be aware of overfitting.
And how exactly do you do that? Multivariate logistic regression?
I think I may as well split ELO into town/mafia without waiting to input the Open data. I don't think I'm capable of inputting data during college, but coding for it is definitely a go.
-
mhsmith0 Balancing Act
Yes, multivariate logistic regression (X = 0/1 indicators for each player; Y = 0 or 1, town loss or town win). There is a clear possibility, indeed likelihood, of overfitting. The main protection I imposed was simply cutting off data that didn't have enough of a sample size, but beyond that, I've pretty much accepted that there is substantial noise in the numbers, and that they're a loose but reasonable approximation of what they "should" be. I just don't see much of a gain from doing more complicated train/test mechanisms here; there's enough data to get a reasonable approximation of what it ought to say, and IMO a logit methodology is relatively more accurate than most others, and at least has some reasonable level of statistical support, compared to ones like the (IMO) fairly arbitrary ELO.
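The setup described above (0/1 indicators per player and alignment, 0/1 town result, one dummy win and one dummy loss per pairing) can be sketched roughly as follows. This is not the code behind the posted ratings: the toy games, the player names alice and bob, and the plain gradient-ascent fitter are all assumptions made for illustration, and treating each dummy result as a game containing only that one pairing is a guess at how the smoothing was done.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def build_rows(games, features):
    """Turn each game into (active weight indices, town_won). Index 0 is the intercept."""
    index = {name: i + 1 for i, name in enumerate(features)}
    rows = [([0] + [index[f] for f in active], won) for active, won in games]
    # Smoothing (assumed form): one dummy win and one dummy loss per pairing.
    for name in features:
        rows.append(([0, index[name]], 1))
        rows.append(([0, index[name]], 0))
    return rows

def fit_logit(rows, n_weights, steps=5000, lr=0.5):
    """Plain gradient ascent on the mean log-likelihood of a logistic model."""
    w = [0.0] * n_weights
    for _ in range(steps):
        grad = [0.0] * n_weights
        for active, won in rows:
            err = won - sigmoid(sum(w[j] for j in active))
            for j in active:
                grad[j] += err
        w = [wj + lr * g / len(rows) for wj, g in zip(w, grad)]
    return w

# Hypothetical toy data: (rated pairings present in the game, 1 if town won).
FEATURES = ["town!alice", "scum!bob"]
GAMES = (
    [(["town!alice"], 1)] * 3 + [(["town!alice"], 0)]
    + [(["scum!bob"], 1)] + [(["scum!bob"], 0)] * 3
    + [([], 1)] * 2 + [([], 0)] * 2
)

rows = build_rows(GAMES, FEATURES)
weights = fit_logit(rows, n_weights=len(FEATURES) + 1)
for name, w in zip(["intercept"] + FEATURES, weights):
    print(f"{name}: {w:+.3f}")
```

In this toy set the town pairing comes out positive and the scum pairing negative, matching the sign convention described earlier in the thread. With real data a statistics package would be the more sensible choice than a hand-rolled fitter.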
-
Realeo Jack of All Trades
In post 117, mhsmith0 wrote: arbitrary ELO.
Now that you mention that, I should probably fine-tune the K factor.
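The thread does not show the update rule Project ELO actually uses, so the following is only the textbook two-player Elo update, included to illustrate what the K factor does. The function names and the sample ratings are made up.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo expectation for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, score_a, k):
    """New rating for A after scoring score_a (1 = win, 0 = loss) against B."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))

# Same result, different K: a larger K moves the rating further per game.
for k in (16, 32, 64):
    print(k, update(1500, 1500, 1.0, k))
```

A larger K reacts faster to new results but is noisier; a smaller K is steadier but takes more games to converge. How this carries over to team games with town and scum ratings is left open here.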
-
Realeo Jack of All Trades
In post 117, mhsmith0 wrote: IMO a logit methodology is relatively more accurate than most others
Is the win rate distribution even a logit? I thought the win rate distribution was Gaussian?
-
mhsmith0 Balancing Act
Skimming the results, rated players' win rates as predicted by the logit (intercept + rating) are in general pretty close to their actual win rates. One interesting exception is hopkirk, who is 5-3 (63% win rate) but is predicted at a 48% win rate.
Spoiler: Looking into hopkirk's games in more detail
So in this case, it's not surprising that there was a notable gap between hopkirk's actual town win rate and the win rate suggested by the simple model of intercept + hopkirk's own rating. Of course, with a sample size of 8 there's very probably noise in the numbers too, but this is a result that overall makes sense on further review, which is what you hope to see in seemingly surprising model outputs.
Another interesting difference is fferyllt, who has a very high town rating (expected logit win rate: 59%) but not quite as high as you might expect from a 12-5 record (71%).
Spoiler: Looking into fferyllt's games in more detail
As with hopkirk, this suggests that ffery's win rate was to some degree boosted by playing with teammates who have been above-average successful as town, which is why her rating, while still very high, is lower than you might expect just from looking at her 12-5 record.
Last edited by mhsmith0 on Fri Jan 20, 2017 9:23 am, edited 3 times in total.
-
zoraster Disorganized Crime
- Disorganized Crime
- Disorganized Crime
- Posts: 21680
- Joined: June 10, 2008
- Pronoun: He/Him
- Location: Belmont, CA
In post 88, MagnaofIllusion wrote:
So you aren't going to consider what in all likelihood amounts to a majority of the games on site?
In post 86, Realeo wrote:
There are 90 games yet to analyze~a combined players list of at least 1 000 players.
We haven't considered Open Games.
We won't consider Theme Games.
Well thanks for the clarity on that.
In post 84, MagnaofIllusion wrote:
Feel free to disagree but throwing stats that aren't meaningful isn't very helpful.
Your 25 players listed may not even be in the real Top 100 of all players. Or Top 200. Or Top 500. There is no way to know given the small sample size you are using. As I said, until you get at least half the player-base included in your pool of eligible players, your Top is just a listing of the best ranked players in a small stratum of the site's games.
I mean I understand the fun in doing it for you.
What is your deal..
-
mhsmith0 Balancing Act
In post 119, Realeo wrote: Is the win rate distribution even a logit? I thought the win rate distribution was Gaussian?
Well, W/L for any given game is a binary 0 or 1 variable, with pretty explicit input variables (I've chosen to handwave away the setup here, though certainly 1/2/3/A/B/C is a valid variable affecting win rates, and a more fine-tuned distribution would adjust for it). A Gaussian variable, which can be looked at via the normal distribution, is a continuous variable (or, for a large enough sample size, it can be discrete and it's "good enough").
So you can pretty reasonably run a logistic regression on the combined data from each individual game (0 or 1 for the result), with whatever independent binary variables (0 or 1 for town!GIF, 0 or 1 for town!thor, 0 or 1 for scum!thor, etc.) you want to account for. See https://en.wikipedia.org/wiki/Logistic_regression for the curious (I'd guess you already know that link, but others reading along might be curious).
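To make the "0 or 1 for town!GIF, 0 or 1 for town!thor" encoding concrete, here is a small sketch of how one game could be turned into a row of indicators. The column names come from the post; the function, the sample game, and the unrated players alice and bob are hypothetical.

```python
def encode_game(town_players, scum_players, rated):
    """Return a 0/1 row with one column per rated alignment!player pairing."""
    present = {f"town!{p}" for p in town_players} | {f"scum!{p}" for p in scum_players}
    return [1 if column in present else 0 for column in rated]

RATED = ["town!GIF", "town!thor", "scum!thor"]  # columns named in the post

# Hypothetical game: thor is town, GIF is scum; alice and bob are unrated.
row = encode_game(town_players=["thor", "alice"], scum_players=["GIF", "bob"], rated=RATED)
print(row)  # [0, 1, 0]
```

GIF as scum has no column in this example, so that pairing is ignored, which mirrors dropping pairings with too few games. Each row is then paired with the game's 0 or 1 town result as the dependent variable.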
-
Realeo Jack of All Trades
The premise of logistic regression is promising.
I have an idea. Is it possible for the algorithm to merge two identical players? Apply a clustering algorithm that merges similar players so that each merged entry has more data.
-
Realeo Jack of All Trades
Since some players are similar, you would have bigger W/L data to process if you merged them into one.
Plot the W/L of players as both mafia and town. If the Pythagorean (Euclidean) distance between 2 players is very small, we can conclude that these players have similar skill. You can merge them and add up their W/L to create one super player.
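A rough sketch of the merging idea, assuming each player is plotted at (town win rate, scum win rate) and that every player has at least one game in each alignment. The records, the names, and the 0.05 cutoff are invented for illustration.

```python
import math

def win_rates(record):
    """record = (town wins, town losses, scum wins, scum losses)."""
    town_w, town_l, scum_w, scum_l = record
    return town_w / (town_w + town_l), scum_w / (scum_w + scum_l)

def distance(a, b):
    """Euclidean distance between two players in (town rate, scum rate) space."""
    (town_a, scum_a), (town_b, scum_b) = win_rates(a), win_rates(b)
    return math.hypot(town_a - town_b, scum_a - scum_b)

def merge(a, b):
    """Add up the W/L counts of two players into one 'super player'."""
    return tuple(x + y for x, y in zip(a, b))

# Hypothetical records.
players = {"alice": (6, 4, 3, 2), "bob": (3, 2, 6, 4), "carol": (1, 9, 0, 5)}
THRESHOLD = 0.05  # arbitrary cutoff for "very small" distance

if distance(players["alice"], players["bob"]) < THRESHOLD:
    super_player = merge(players["alice"], players["bob"])
    print(super_player)  # (9, 6, 9, 6)
```

One caveat: two players with few games can land close together by chance, so the distance says less about similar skill when the records are short.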