The Newbie Matrix6 stats thread (complete)

Posted: Wed Mar 05, 2014 6:59 am
by Toomai
Percentages may not add up to exactly 100% due to rounding and/or draws.
Spoiler: General/subsetup stats
Setup | Wins | Losses | Winrate | Games
1 (JK) | 40 | 35 | 53.3% | 75
2 (RB, Cop, Doc) | 45 | 38 | 54.2% | 83
3 (BP, Trk) | 34 | 48 | 41.5% | 82
A (JK, RB, BP) | 31 | 48 | 39.2% | 79
B (Cop) | 33 | 45 | 42.3% | 78
C (Doc, Trk) | 34 | 42 | 44.7% | 76
Total | 217 | 256 | 45.9% | 473

Spoiler: Result stats
Average player types per game: 5.5 Newbies, 2.5 SEs, 1.0 ICs
Player | Win % | Total slots
Town Newbies | 46.1% | 2029
Scum Newbies | 52.8% | 566
Total Newbies | 47.6% | 2595
Town SEs | 45.3% | 924
Scum SEs | 57.0% | 265
Total SEs | 47.9% | 1189
Town ICs | 45.8% | 358
Scum ICs | 53.9% | 115
Total ICs | 47.8% | 473
All town | 45.9% | 3311
All scum | 54.1% | 946
All players | 47.7% | 4257

Spoiler: Replacement stats
Notes:
If a slot is replaced twice, that counts as one replaced slot and two replaced players, so the player rate can never be lower than the slot rate. A slot rate of 100% means that all 9 slots were replaced at least once. A player rate of 100% means that 9 players (from any slots) were replaced, so the player rate can exceed 100%. I don't know which type of rate is better to use, though having both is useful: together they show whether the replacements all came from one slot or were spread across several. (Reminder: the Matrix6 setup has 9 players, which is why 9 appears above.)
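
A minimal sketch of the two rates for a single game, assuming the replacements are available as one count per slot (the function name and the input list are made up for illustration and are not taken from the spreadsheet):

```python
def replacement_rates(replacements_per_slot):
    """Return (slot_rate, player_rate) for one game.

    replacements_per_slot is a hypothetical list with one entry per slot,
    giving how many times that slot was replaced.
    """
    slots = len(replacements_per_slot)  # 9 in Matrix6
    replaced_slots = sum(1 for n in replacements_per_slot if n > 0)
    replaced_players = sum(replacements_per_slot)
    return replaced_slots / slots, replaced_players / slots


# One slot replaced twice, one slot replaced once, seven slots untouched.
slot_rate, player_rate = replacement_rates([2, 1, 0, 0, 0, 0, 0, 0, 0])
print(f"slot rate {slot_rate:.1%}, player rate {player_rate:.1%}")
```

In this example two slots account for three replaced players, so the player rate comes out higher than the slot rate.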

On average, games replace 2.9 slots with a standard deviation of 1.63, and 3.5 players with a standard deviation of 2.21.
Player | Slot replacement rate | Player replacement rate
Town Newbies | 38.9% | 46.8%
Scum Newbies | 43.5% | 56.3%
Total Newbies | 39.9% | 48.9%
Town SEs | 24.3% | 27.0%
Scum SEs | 26.7% | 29.3%
Total SEs | 24.8% | 27.5%
Town ICs | 10.3% | 11.1%
Scum ICs | 16.4% | 16.4%
Total ICs | 11.8% | 12.4%
All town | 31.7% | 37.4%
All scum | 35.5% | 43.9%
All players | 32.6% | 38.9%

Spoiler: Power role stats
The rate of each role living to endgame is as follows:
Role | Setups | Town win | Scum win | Total
Jailkeeper | 1 and A | 48% (71 games) | 12% (83 games) | 29% (154 games)
Cop | 2 and B | 51% (78 games) | 7% (83 games) | 29% (161 games)
Doctor | 2 and C | 51% (79 games) | 19% (80 games) | 35% (159 games)
Bulletproof | 3 and A | 86% (65 games) | 68% (96 games) | 75% (161 games)
Tracker | 3 and C | 49% (68 games) | 11% (90 games) | 27% (158 games)
Roleblocker | 2 and A | 0% (by definition) | 88% (86 games) | 47% (162 games)

Interesting notes (ask for more):
  • In setup 2, in scum wins, 8% of Cops get lynched Day 1, and 39% get killed Night 1, leaving only 53% alive for Day 2 (38 games)
  • In setup 2, in scum wins, 18% of Doctors get killed Night 1, and 21% get killed Night 2 (38 games)
  • In setup 2, in scum wins, the Roleblocker is alive at the end 92% of the time (38 games)
  • In setup 3, 35% of Trackers are killed Night 2 in town wins, while 42% are killed N2 in scum wins (82 games)
  • In setup 3, in scum wins, only 25% of Trackers are alive by Day 3, and 10% survive to endgame (48 games)
  • In setup B, in town wins, 24% of Cops get killed Night 1, and only 48% survive to endgame (33 games)
  • In setup B, in scum wins, 11% of Cops get lynched Day 1, 42% of Cops get killed Night 1, and only 16% of Cops are alive by Day 3 (45 games)
  • In setup C, in scum wins, 26% of Doctors get killed Night 1 while another 50% die Night 2, leaving 17% alive for Day 3 (42 games)

Spoiler: Day 1 stats
  • Town-scum-NL rate: 368-94-10 (random lynching would give 367-105-0; the difference is [+0.19%]-[-2.31%]-[+2.12%])
  • If town is lynched D1, they go 137-231 (37.2%)
  • If scum is lynched D1, they go 17-77 (18.1%)
  • If there is no lynch D1, town goes 2-8 (20.0%)
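
The random baseline in the first bullet can be reproduced with a few lines. This is only a sketch, and it assumes that "random" means picking uniformly among the 9 starting players (7 town, 2 scum) with no possibility of a no-lynch:

```python
# Day 1 lynch outcomes from the first bullet above.
town, scum, no_lynch = 368, 94, 10
total = town + scum + no_lynch

# Assumption: a random lynch picks uniformly among 7 town and 2 scum
# and never results in a no-lynch.
expected_town = total * 7 / 9
expected_scum = total * 2 / 9

# Differences are expressed as a percentage of all Day 1 outcomes.
diff_town = (town - expected_town) / total * 100
diff_scum = (scum - expected_scum) / total * 100
diff_no_lynch = no_lynch / total * 100

print(f"expected {expected_town:.0f}-{expected_scum:.0f}-0")
print(f"difference {diff_town:+.2f}% / {diff_scum:+.2f}% / {diff_no_lynch:+.2f}%")
```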

Spoiler: Lynch accuracy stats
In the following table, the first three columns are the raw percentages, while the last two are the difference between that and random (positive = more, negative = less). So when town does better than random, it has a negative in lynching town and a positive in lynching scum (random never no-lynches, so the difference is the data).
Note that, due to the way in which I store the stats, if a game has a no-lynch followed by a no-kill (two chances to lynch with the same ratio), the first one is ignored and the second one counts. As a result there's probably a slightly higher number of no-lynches than is being reported.
Ratio | Town | Scum | No lynch | Town (vs. random) | Scum (vs. random)
7:2 (474 samples) | 78.7% | 19.8% | 1.5% | +0.9% | -2.4%
7:1 (30 samples) | 53.3% | 46.7% | 0.0% | -34.2% | +34.2%
6:2 (83 samples) | 55.4% | 38.6% | 6.0% | -19.6% | +13.6%
6:1 (76 samples) | 57.9% | 40.8% | 1.3% | -27.8% | +26.5%
5:2 (308 samples) | 64.6% | 33.1% | 2.3% | -6.8% | +4.5%
5:1 (68 samples) | 52.9% | 45.6% | 1.5% | -30.4% | +28.9%
4:2 (78 samples) | 38.5% | 29.5% | 32.1% | -28.2% | -3.8%
4:1 (123 samples) | 57.7% | 41.5% | 0.8% | -22.3% | +21.5%
3:2 (194 samples) | 59.8% | 38.7% | 1.5% | -0.2% | -1.3%
3:1 (77 samples) | 53.2% | 32.5% | 14.3% | -21.8% | +7.5%
2:1 (132 samples) | 53.8% | 44.7% | 1.5% | -12.9% | +11.4%
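
The last two columns can be recomputed from the first three. A rough sketch, assuming a random lynch picks uniformly among the living players at the given town:scum ratio (the function name is made up for illustration):

```python
def diff_from_random(ratio, town_pct, scum_pct):
    """Difference between observed lynch rates and a uniformly random lynch.

    ratio is a "town:scum" string such as "7:2"; percentages are 0-100.
    """
    town_alive, scum_alive = (int(part) for part in ratio.split(":"))
    alive = town_alive + scum_alive
    random_town = 100 * town_alive / alive
    random_scum = 100 * scum_alive / alive
    return round(town_pct - random_town, 1), round(scum_pct - random_scum, 1)


# A few rows from the table above.
for ratio, town_pct, scum_pct in [("7:2", 78.7, 19.8), ("6:1", 57.9, 40.8), ("4:2", 38.5, 29.5)]:
    print(ratio, diff_from_random(ratio, town_pct, scum_pct))
```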

Spoiler: Game length stats
  • No games have ended with Day 1 lynch (only possible via modkills)
  • No games have ended with Night 1 kill (only possible via modkills)
  • Games that end with Day 2 lynch take 20.0 days with standard deviation of 5.43 (40 games)
  • No games have ended with Night 2 kill (only possible via modkills)
  • Games that end with Day 3 lynch take 34.6 days with standard deviation of 9.99 (196 games)
  • Games that end with Night 3 kill take 39.5 days with standard deviation of 9.10 (10 games)
  • Games that end with Day 4 lynch take 44.5 days with standard deviation of 12.86 (180 games)
  • Games that end with Night 4 kill take 46.7 days with standard deviation of 8.16 (4 games)
  • Games that end with Day 5 lynch take 49.9 days with standard deviation of 14.49 (39 games)
  • One game has ended with Night 5 kill; it took 62.5 days
  • Games that end with Day 6 lynch take 56.6 days with standard deviation of 10.88 (4 games)
  • No games have ended with Night 6 kill
  • One game has ended with Day 7 lynch; it took 84.0 days
  • Overall, games take 39.0 days with standard deviation of 13.92; the vast majority of games end with a Day 3 or Day 4 lynch (79.16%), with Day 2 and Day 5 lynches nearly tied for second-most common (16.63% combined)
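
The overall figures in the last bullet follow from the per-ending counts. A small sketch that recombines them; because the per-ending averages above are already rounded, the weighted mean may land slightly off the quoted 39.0:

```python
# (ending, games, mean length in days), copied from the list above.
endings = [
    ("Day 2 lynch", 40, 20.0),
    ("Day 3 lynch", 196, 34.6),
    ("Night 3 kill", 10, 39.5),
    ("Day 4 lynch", 180, 44.5),
    ("Night 4 kill", 4, 46.7),
    ("Day 5 lynch", 39, 49.9),
    ("Night 5 kill", 1, 62.5),
    ("Day 6 lynch", 4, 56.6),
    ("Day 7 lynch", 1, 84.0),
]

total_games = sum(games for _, games, _ in endings)
# Weighted mean: each ending's average counts once per game.
overall_mean = sum(games * days for _, games, days in endings) / total_games


def share(*names):
    """Percentage of all games that ended in one of the named ways."""
    return 100 * sum(games for name, games, _ in endings if name in names) / total_games


print(total_games, round(overall_mean, 1))
print(round(share("Day 3 lynch", "Day 4 lynch"), 2))
print(round(share("Day 2 lynch", "Day 5 lynch"), 2))
```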

Spoiler: Replacement vs. length stats
  • When 0 slots are replaced all game, games take 27.9 days with standard deviation of 11.86 (24 games)
  • When 1 slot is replaced all game, games take 34.3 days with standard deviation of 12.81 (76 games)
  • When 2 slots are replaced all game, games take 35.5 days with standard deviation of 12.23 (102 games)
  • When 3 slots are replaced all game, games take 38.4 days with standard deviation of 13.26 (103 games)
  • When 4 slots are replaced all game, games take 44.2 days with standard deviation of 14.43 (84 games)
  • When 5 slots are replaced all game, games take 44.2 days with standard deviation of 12.34 (58 games)
  • When 6 slots are replaced all game, games take 47.6 days with standard deviation of 12.35 (20 games)
  • When 7 slots are replaced all game, games take 50.1 days with standard deviation of 14.57 (7 games)
  • One game replaced 8 slots; it took 71.5 days
  • No games have had 9 slots replaced
  • When 0 players are replaced all game, games take 27.9 days with standard deviation of 11.86 (24 games)
  • When 1 player is replaced all game, games take 33.8 days with standard deviation of 12.87 (66 games)
  • When 2 players are replaced all game, games take 34.9 days with standard deviation of 12.62 (85 games)
  • When 3 players are replaced all game, games take 37.0 days with standard deviation of 11.83 (89 games)
  • When 4 players are replaced all game, games take 41.0 days with standard deviation of 14.41 (78 games)
  • When 5 players are replaced all game, games take 42.7 days with standard deviation of 12.99 (48 games)
  • When 6 players are replaced all game, games take 46.3 days with standard deviation of 12.54 (37 games)
  • When 7 players are replaced all game, games take 45.3 days with standard deviation of 11.13 (22 games)
  • When 8 players are replaced all game, games take 54.4 days with standard deviation of 9.85 (14 games)
  • When 9 players are replaced all game, games take 56.5 days with standard deviation of 16.49 (4 games)
  • When 10 players are replaced all game, games take 50.9 days with standard deviation of 14.19 (6 games)
  • When 11 players are replaced all game, games take 51.1 days with standard deviation of 28.88 (2 games)
  • No games have had more than 11 players replaced
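
The game counts in this list are enough to reconstruct the per-game averages quoted in the replacement stats (2.9 slots and 3.5 players). A sketch, assuming a sample standard deviation (n - 1), since the thread does not say which kind was used:

```python
from statistics import mean, stdev

# Number of games for each count of replaced slots / players,
# copied from the list above (index = number replaced).
games_by_slots = [24, 76, 102, 103, 84, 58, 20, 7, 1]
games_by_players = [24, 66, 85, 89, 78, 48, 37, 22, 14, 4, 6, 2]


def expand(games_by_count):
    """Turn a histogram into one value per game."""
    return [count for count, games in enumerate(games_by_count) for _ in range(games)]


slots = expand(games_by_slots)
players = expand(games_by_players)

# Assumption: sample standard deviation (n - 1).
print(f"slots:   mean {mean(slots):.1f}, sd {stdev(slots):.2f}")
print(f"players: mean {mean(players):.1f}, sd {stdev(players):.2f}")
```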

Spoiler: Replacement vs. winrate stats
For slots:
  • When 0 slots are replaced all game, town goes 13-11 (54.2%)
  • When 1 slot is replaced all game, town goes 40-36 (52.6%)
  • When 2 slots are replaced all game, town goes 48-54 (47.1%)
  • When 3 slots are replaced all game, town goes 51-52 (49.5%)
  • When 4 slots are replaced all game, town goes 41-43 (48.8%)
  • When 5 slots are replaced all game, town goes 21-36 (36.8%)
  • When 6 slots are replaced all game, town goes 3-16 (15.8%)
  • When 7 slots are replaced all game, town goes 0-7 (0.0%)
  • When 8 slots are replaced all game, town goes 0-1 (0.0%)
  • No games have had 9 slots replaced
For players:
  • When 0 players are replaced all game, town goes 13-11 (54.2%)
  • When 1 player is replaced all game, town goes 35-31 (53.0%)
  • When 2 players are replaced all game, town goes 39-46 (45.9%)
  • When 3 players are replaced all game, town goes 43-46 (48.3%)
  • When 4 players are replaced all game, town goes 34-44 (43.6%)
  • When 5 players are replaced all game, town goes 22-26 (45.8%)
  • When 6 players are replaced all game, town goes 17-20 (45.9%)
  • When 7 players are replaced all game, town goes 7-14 (33.3%)
  • When 8 players are replaced all game, town goes 4-9 (30.8%)
  • When 9 players are replaced all game, town goes 1-3 (25.0%)
  • When 10 players are replaced all game, town goes 2-4 (33.3%)
  • When 11 players are replaced all game, town goes 0-2 (0.0%)
  • No games have had more than 11 players replaced

Spoiler: Replacement vs. role stats
Role | Slots | Slots replaced | Players replaced
Vanilla Townie | 2528 | 33.0% | 39.2%
Jailkeeper | 156 | 31.4% | 33.3%
Cop | 161 | 20.5% | 24.2%
Doctor | 159 | 33.3% | 39.6%
1-Shot Bulletproof | 163 | 30.1% | 35.6%
Tracker | 158 | 22.8% | 25.9%
Mafia Goon | 786 | 34.7% | 44.4%
Mafia Roleblocker | 164 | 39.0% | 41.5%
The following is a z-score graph. 0 means "is exactly average". 1 means "is 1 standard deviation above average"; -1 means "is 1 standard deviation below average". In other words, high bars mean "more replacements", while low bars mean "fewer replacements".
Image
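
For anyone who wants to recompute the z-scores, here is a rough sketch using the slot replacement rates from the table above. It assumes an unweighted mean and a population standard deviation across the eight roles, which may not be exactly how the graph was produced:

```python
from statistics import mean, pstdev

# Slot replacement rates (%) from the role table above.
slot_rates = {
    "Vanilla Townie": 33.0,
    "Jailkeeper": 31.4,
    "Cop": 20.5,
    "Doctor": 33.3,
    "1-Shot Bulletproof": 30.1,
    "Tracker": 22.8,
    "Mafia Goon": 34.7,
    "Mafia Roleblocker": 39.0,
}

# Assumption: every role counts equally, regardless of its number of slots.
rates = list(slot_rates.values())
avg = mean(rates)
sd = pstdev(rates)
z_scores = {role: (rate - avg) / sd for role, rate in slot_rates.items()}

for role, z in sorted(z_scores.items(), key=lambda item: item[1]):
    print(f"{role:20s} {z:+.2f}")
```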

Spoiler: Perfect win stats
  • Out of 217 town wins, 40 wins (18.4%) involved only scum lynches.
    • 26 wins (12.0%) were semi-perfect (no town lynches).
    • 14 wins (6.5%) were perfect (no town deaths).
  • Out of 256 scum wins, 146 wins (57.0%) were perfect (no scum deaths).


Notes
  • For any stat that counts wins and losses, draws are ignored.
  • The replacement stats recorded here are likely lower than in reality, as replacements are not recorded if the player never confirmed, or the player was force-replaced due to a mod error or something similar that was not the replaced player's fault. If a player posts in any way (or has been acknowledged by the mod as having confirmed/read their role PM), or gets themselves force-replaced (such as getting banned), replacing them counts.
  • Slot-based stats might be a bit off due to cases where slots "level up" via replacement (e.g. an experienced player replaces into a newbie slot).
  • Players are assumed to have won regardless of modkills (unnecessary complication).
  • Only notable and requested PR stats are listed. Ask for one to get it added.
For more comprehensive graphs, please see the original spreadsheet (.xlsx, 3.57MB).

If you'd like to know something that isn't here, I'll see what I can do.

Posted: Wed Mar 05, 2014 12:33 pm
by Damon_Gant
Oh I am a sucker for stats! I'm going to have to play around with your spreadsheet - I think there is a lot to be learned from stat analysis, both about how successful the Matrix6 setup is and about mafia strategy in general.

Some interesting things I note from this at a glance are:
- The win rates overall are looking about right, but whether the individual setups are balanced remains to be seen.
- Experience does not seem to play a huge role with regards to win rates. It just means you're less likely to need to be replaced.
- Day 1 is crucial

Posted: Wed Mar 05, 2014 2:23 pm
by Cheery Dog
I'm confused how it was possible for games to have ended on night 3.

I thought modkills, but if a town is modkilled, then the game should be going into night.

Posted: Wed Mar 05, 2014 3:07 pm
by Toomai
In post 2, Cheery Dog wrote: I'm confused how it was possible for games to have ended on night 3.
Example would be 1376. No lynch occurred in D3 LyLo, so the N3 kill made the game 2:2 for the win. The JK was still alive so it was still possible for town to pull it out, which I guess is why the game wasn't ended on the spot.

Posted: Wed Mar 05, 2014 4:52 pm
by Mr. Flay
^

Posted: Mon Mar 10, 2014 11:06 am
by callforjudgement
I guess the big question here is "why do people no-lynch in lylo?". (EDIT: I read the game, the answer is "mass V/LA near deadline".)

Posted: Mon Mar 10, 2014 7:40 pm
by ~Jordan`
presenting data and then talking about it afterward seems like a good plan

Posted: Tue Mar 11, 2014 1:15 am
by Mr. Flay
In post 5, callforjudgement wrote: I guess the big question here is "why do people no-lynch in lylo?". (EDIT: I read the game, the answer is "mass V/LA near deadline".)
Decision paralysis? For scum the result of No Lynch is almost as good as a mislynch...

Posted: Tue Mar 11, 2014 4:15 am
by TierShift
Okay, I've been running some chi-square tests.
Scumteams aren't significant, alas. Setups, however, are.

Setup 1 is significantly better for town than setup C (p~0.02) and setup 3 (p~0.01)
the rest of the tests have come up as not statistically significant (at p<0.05).
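
A minimal sketch of this kind of comparison, using a plain Pearson chi-square test on a 2x2 win/loss table. It makes two assumptions: it uses the win-loss records currently listed in the opening post, which cover more games than the counts mentioned later in this thread (e.g. 18 games of setup 1), so it is not expected to reproduce the p-values above; and it applies no continuity correction, which may differ from the test actually run here.

```python
import math


def chi_square_2x2(wins_a, losses_a, wins_b, losses_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = wins_a + losses_a + wins_b + losses_b
    row_a, row_b = wins_a + losses_a, wins_b + losses_b
    col_w, col_l = wins_a + wins_b, losses_a + losses_b
    stat = n * (wins_a * losses_b - losses_a * wins_b) ** 2 / (row_a * row_b * col_w * col_l)
    # For 1 degree of freedom the chi-square tail equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Setup 1 (40-35) vs. setup 3 (34-48), records taken from the opening post.
stat, p = chi_square_2x2(40, 35, 34, 48)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```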

This means that a JK is better for town than a tracker+doc or tracker+1-shot BP.
Does this need balancing? I'm leaning towards no, as newbie games are more about learning the game and about the most used roles than about absolute balance, but if it is something that needs to be done, tracker should be replaced by a slightly stronger investigative role, such as (x-shot) watcher. It's possibly worth discussing.

Posted: Tue Mar 11, 2014 5:23 am
by Cogito Ergo Sum
Does that take into account that you're cherry-picking the set-ups that have the most extreme win-rates? The Jailkeeper set-up's win-rate specifically seems weirdly high. I also doubt Doc+Tracker is really that bad, with Town having 2 perfectly serviceable PRs.

Also, Watchers don't seem fun in Newbies (just consider an IC-newbie scum team; who do you think is going to make the kill?) and Watcher-Doctor specifically is very much undesirable.

Posted: Tue Mar 11, 2014 5:25 am
by Eden
I don't think so. You're getting significant p values, sure, but the number of attempts of each setup aren't significant (setup 1 has 18 iterations, setup 3 has 22, C has 15). I'd revisit this question once every setup has been tried 30 times and see if you're still getting significant differences

Posted: Tue Mar 11, 2014 6:57 am
by TierShift
Hmm...apparently my knowledge of statistical tests is not large enough.

It wouldn't surprise me at all to find significant results at 30 games and even more at 100 games each.

Posted: Tue Mar 11, 2014 8:58 am
by Eden
There's a hundred total games, not a hundred games with setup 1, C or 3. Like Toomai said in the OP, the results of each individual permutation aren't significant yet (they're all in italics which indicates statistically insignificant results).

Like I said, give it a few months and we'll probably pass the 30-game threshold for individual setups and be able to compare from there.

Posted: Tue Mar 11, 2014 9:00 am
by TierShift
I'm not dumb. When we get to 100 games each there's probably gonna be a lot of statistically significant differences out there, is what I'm saying.

Posted: Tue Mar 11, 2014 9:59 am
by Eden
Maybe? That's why I said I'd suggest revisiting this once we get more results. Like I said, the initial differences in these results aren't significant so we can't really extrapolate anything. In particular I'd expect the jailkeeper setup with such a high win % to regress toward whatever the eventual mean is going to be. Beyond that I don't think anyone can really guess.

Posted: Fri Mar 14, 2014 3:20 am
by GreyICE
As I recall the only thing we concluded is that newbie scum are significantly more likely to replace out.

Posted: Fri Mar 14, 2014 7:24 am
by Mr. Flay
All I remember is that newbies, period, are more likely to replace out. What data are you recalling?

Posted: Fri Mar 14, 2014 8:28 am
by GreyICE
Go look at the data at the top, Flay:

Town newbies: 28%-33%-39% (412 newbies)
Scum newbies: 25%-20%-54% (114 newbies)

That jumps from 39% to 54% based on whether they're town or scum.

Posted: Fri Mar 14, 2014 9:19 am
by ~Jordan`
Coooool.

Posted: Tue Apr 22, 2014 12:45 pm
by Quilford
Just bumping this so that everyone is aware of nuu data

Posted: Wed Apr 23, 2014 3:48 am
by Zachrulez
In post 9, Cogito Ergo Sum wrote: Does that take into account that you're cherry-picking the set-ups that have the most extreme win-rates? The Jailkeeper set-up's win-rate specifically seems weirdly high. I also doubt Doc+Tracker is really that bad, with Town having 2 perfectly serviceable PRs.

Also, Watchers don't seem fun in Newbies (just consider an IC-newbie scum team; who do you think is going to make the kill?) and Watcher-Doctor specifically is very much undesirable.
The ability to stop kills is probably quite a bit more powerful than we give it credit for in a 9p setup when the role has two chances to stop the kill. (Hitting the kill target or the source.) Notice the winrate when paired with a roleblocker is much lower.

It'll probably take a bit bigger sample size to solidify that thought process though.

Posted: Wed Apr 23, 2014 3:57 am
by GreyICE
In post 12, Eden wrote: There's a hundred total games, not a hundred games with setup 1, C or 3. Like Toomai said in the OP, the results of each individual permutation aren't significant yet (they're all in italics which indicates statistically insignificant results).

Like I said, give it a few months and we'll probably pass the 30-game threshold for individual setups and be able to compare from there.
Statistical significance doesn't quite work like that. Something is not magically insignificant at 29 games, and significant at 30. If a setup was 15-0, for instance, you wouldn't need to sit around waiting for 30 to see if it was a fluke.
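
To put a number on the 15-0 example, here is a sketch of an exact two-sided binomial test. The 50% null win rate is purely an assumption for illustration, and the function name is made up:

```python
from math import comb


def two_sided_binomial_p(wins, games, p_null=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed one under the assumed null win rate p_null.
    """
    def pmf(k):
        return comb(games, k) * p_null**k * (1 - p_null) ** (games - k)

    observed = pmf(wins)
    # Small tolerance so that equally likely outcomes are counted despite float error.
    return sum(pmf(k) for k in range(games + 1) if pmf(k) <= observed * (1 + 1e-9))


print(two_sided_binomial_p(15, 15))  # a 15-0 record
print(two_sided_binomial_p(8, 15))   # an 8-7 record
```

Under that assumption a 15-0 record is already very unlikely at 15 games, while an 8-7 record is entirely consistent with a coin flip, so the evidence depends on how lopsided the record is as well as on the number of games.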

That being said, they do appear to be close enough to be a fluke, but I do think you can draw some interesting observations.

P.S. Statistical significance must be the most misunderstood concept in human history.

Posted: Thu Apr 24, 2014 7:08 pm
by Burning_Earth
Yeah.

Posted: Fri Apr 25, 2014 7:23 am
by Majiffy
Lol'd at >50% scum newbie replace-rate.

Inb4 replacement PLs become a thing sitewide for a short while.

Posted: Fri Apr 25, 2014 7:50 am
by Mr. Flay
Yeah, but "on average, games replace 3.0 slots with a standard deviation of 1.72". How will you pick the right replacement to PL?

I know, I know, rhetorical....