Previously, the Matrix6 setup in newbie games had impressively balanced per-faction and per-player winrates despite its later-found mechanical shortcomings. 2d3 is next - let's see how it does.
Percentages may not add up to exactly 100% due to rounding and/or draws.
Spoiler: General/subsetup stats
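| Setup | Wins | Losses | Winrate | Games |
|---|---|---|---|---|
| A1 | 6 | 2 | 75.0% | 8 |
| A2 | 8 | 6 | 57.1% | 14 |
| A3 | 7 | 2 | 77.8% | 9 |
| B1 | 6 | 1 | 85.7% | 7 |
| B2 | 7 | 3 | 70.0% | 10 |
| B3 | 4 | 3 | 57.1% | 7 |
| C1 | 5 | 8 | 38.5% | 13 |
| C2 | 7 | 2 | 77.8% | 9 |
| C3 | 7 | 3 | 70.0% | 10 |
| A | 21 | 10 | 67.7% | 31 |
| B | 17 | 7 | 70.8% | 24 |
| C | 19 | 13 | 59.4% | 32 |
| Total | 57 | 30 | 65.5% | 87 |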
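The column and total rows are just sums of the subsetup rows, with draws left out of the count. A minimal sketch of that arithmetic, using the win/loss numbers from the table; the function and variable names are made up for illustration and are not taken from the actual workbook:

```python
def winrate(wins, losses):
    """Town winrate as a percentage; draws are simply not counted."""
    games = wins + losses
    return None if games == 0 else round(100 * wins / games, 1)

# (town wins, town losses) per subsetup, copied from the table
subsetups = {
    "A1": (6, 2), "A2": (8, 6), "A3": (7, 2),
    "B1": (6, 1), "B2": (7, 3), "B3": (4, 3),
    "C1": (5, 8), "C2": (7, 2), "C3": (7, 3),
}

def column_totals(column):
    """Sum wins and losses over every subsetup in one column (A, B or C)."""
    rows = [wl for name, wl in subsetups.items() if name.startswith(column)]
    return sum(w for w, _ in rows), sum(l for _, l in rows)

col_a = column_totals("A")
total = (sum(w for w, _ in subsetups.values()),
         sum(l for _, l in subsetups.values()))

print(col_a, winrate(*col_a))
print(total, winrate(*total))
```
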
Spoiler: Result stats
Average player types per game: 5.7 Newbies, 2.5 SEs, 0.8 ICs
| Type | Winrate | Total |
|---|---|---|
| Town Newbies | 64.6% | 379 |
| Scum Newbies | 28.2% | 117 |
| Total Newbies | 56.0% | 496 |
| Town SEs | 68.4% | 171 |
| Scum SEs | 53.3% | 45 |
| Total SEs | 65.3% | 216 |
| Town ICs | 62.7% | 59 |
| Scum ICs | 25.0% | 12 |
| Total ICs | 56.3% | 71 |
| All town | 65.5% | 609 |
| All scum | 34.5% | 174 |
| All players | 58.6% | 783 |
Spoiler: Replacement stats
Notes:
If a slot is replaced twice, that counts as one replaced slot and two replaced players. Therefore, the player rate cannot be any lower than the slot rate. A slot rate of 100% means that all 9 slots were replaced at least once. A player rate of 100% means that 9 players (from any slots) were replaced (so it can exceed 100%). I don't know which type of rate is better to use, though using both is useful (it tells us whether all the replacements came from one slot or were distributed amongst multiple slots). (Reminder: The 2d3 setup has 9 players, hence why 9 appears in the previous.)
On average, games replace 2.5 slots with a standard deviation of 1.43, and 2.8 players with a standard deviation of 1.63.
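To make the slot rate vs. player rate distinction concrete, here is a small sketch for a single 9-player game. The input format (one replacement count per slot) is an assumption for illustration, not how the stats are actually stored:

```python
def replacement_rates(replacements_per_slot):
    """Return (slot_rate, player_rate) for one game.

    replacements_per_slot holds, for each slot, how many times that
    slot was replaced (0 = the original player finished the game).
    """
    n_slots = len(replacements_per_slot)
    slots_replaced = sum(1 for r in replacements_per_slot if r > 0)
    players_replaced = sum(replacements_per_slot)
    return slots_replaced / n_slots, players_replaced / n_slots

# One slot replaced twice, another replaced once, seven untouched:
# 2 replaced slots but 3 replaced players.
slot_rate, player_rate = replacement_rates([0, 2, 0, 1, 0, 0, 0, 0, 0])

# Every slot replaced twice: slot rate caps at 100%, player rate does not.
all_twice = replacement_rates([2] * 9)
```

The player rate can never be below the slot rate because each replaced slot contributes at least one replaced player.
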
| Player | Slot replacement rate | Player replacement rate |
|---|---|---|
| Town Newbies | 30.3% | 34.5% |
| Scum Newbies | 37.8% | 44.5% |
| Total Newbies | 32.1% | 36.9% |
| Town SEs | 19.1% | 20.2% |
| Scum SEs | 42.2% | 48.9% |
| Total SEs | 23.9% | 26.1% |
| Town ICs | 5.0% | 5.0% |
| Scum ICs | 25.0% | 25.0% |
| Total ICs | 8.3% | 8.3% |
| All town | 24.7% | 27.6% |
| All scum | 38.1% | 44.3% |
| All players | 27.7% | 31.3% |
Spoiler: Power role stats
The rate of each role living to endgame is as follows:
| Role | Setups | Town win | Scum win | Total |
|---|---|---|---|---|
| Cop | A1, A3, B1, C1 | 50.0% (24 games) | 7.7% (13 games) | 35.1% (37 games) |
| Doctor | A2, A3, B3, C3 | 65.4% (26 games) | 0.0% (14 games) | 42.5% (40 games) |
| Jailkeeper | A2, B2, C2 | 54.5% (22 games) | 9.1% (11 games) | 39.4% (33 games) |
| Tracker | B1, B2, C3 | 60.0% (20 games) | 28.6% (7 games) | 51.9% (27 games) |
| Neapolitan | A1, B3 | 80.0% (10 games) | 20.0% (5 games) | 60.0% (15 games) |
| Roleblocker | A1, A2, A3 | 4.8% (21 games) | 70.0% (10 games) | 25.8% (31 games) |
| Rolecop | B1, B2, B3 | 0.0% (17 games) | 71.4% (7 games) | 20.8% (24 games) |
Interesting notes will become available once I notice any, though you can ask for specific ones.
Spoiler: Day 1 stats
Town-scum-NL rate: 60-26-1 (random lynching would give 68-19-0; the difference is [-8.81%]-[7.66%]-[1.15%])
If town is lynched D1, they go 34-26 (56.7%)
If scum is lynched D1, scum goes 4-22 (15.4%), i.e. town goes 22-4 (84.6%)
If there is no lynch D1, town goes 1-0 (100.0%)
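The "random lynching" comparison above can be reproduced from the 7:2 starting ratio and the 60-26-1 record. This is only a sketch of the arithmetic; the names are made up, and it assumes a random lynch picks uniformly among living players and never results in a no-lynch:

```python
def random_lynch_rates(town_alive, scum_alive):
    """Chance that a uniformly random lynch hits town or scum."""
    total = town_alive + scum_alive
    return town_alive / total, scum_alive / total

observed = {"town": 60, "scum": 26, "no_lynch": 1}
games = sum(observed.values())

p_town, p_scum = random_lynch_rates(7, 2)

# Expected Day 1 record under random lynching, rounded to whole games.
expected = (round(games * p_town), round(games * p_scum), 0)

# Observed rate minus random rate, in percentage points.
diffs = (
    round(100 * (observed["town"] / games - p_town), 2),
    round(100 * (observed["scum"] / games - p_scum), 2),
    round(100 * observed["no_lynch"] / games, 2),
)

print(expected)  # (68, 19, 0)
print(diffs)     # (-8.81, 7.66, 1.15)
```
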
Spoiler: Lynch accuracy stats
In the following table, the first three columns are the raw percentages, while the last two are the difference between that and random (positive = more, negative = less). So when town does better than random, it has a negative in lynching town and a positive in lynching scum (random never no-lynches, so the difference is the data).
(I changed how I store this from the Matrix6 stats, hopefully getting rid of a minor issue involving consecutive no-kills.)
| Ratio | Town | Scum | No lynch | Town vs. random | Scum vs. random |
|---|---|---|---|---|---|
| 7:2 (88 samples) | 68.2% | 30.7% | 1.1% | -9.6% | 8.5% |
| 7:1 (4 samples) | 50.0% | 50.0% | 0.0% | -37.5% | 37.5% |
| 6:2 (10 samples) | 50.0% | 40.0% | 10.0% | -25.0% | 15.0% |
| 6:1 (26 samples) | 53.8% | 46.2% | 0.0% | -31.9% | 31.9% |
| 5:2 (51 samples) | 37.3% | 58.8% | 3.9% | -34.2% | 30.3% |
| 5:1 (13 samples) | 46.2% | 53.8% | 0.0% | -37.2% | 37.2% |
| 4:2 (9 samples) | 44.4% | 55.6% | 0.0% | -22.2% | 22.2% |
| 4:1 (36 samples) | 52.8% | 47.2% | 0.0% | -27.2% | 27.2% |
| 3:2 (18 samples) | 44.4% | 55.6% | 0.0% | -15.6% | 15.6% |
| 3:1 (14 samples) | 57.1% | 42.9% | 0.0% | -17.9% | 17.9% |
| 2:1 (23 samples) | 47.8% | 47.8% | 4.3% | -18.8% | 14.5% |
Spoiler: Game length stats
No games have ended with Day 1 lynch (only possible via modkills)
No games have ended with Night 1 kill (only possible via modkills)
Games that end with Day 2 lynch take 15.0 days with standard deviation of 5.40 (15 games)
No games have ended with Night 2 kill (only possible via modkills)
Games that end with Day 3 lynch take 24.0 days with standard deviation of 8.81 (32 games)
Games that end with Night 3 kill take 27.4 days with standard deviation of 6.54 (2 games)
Games that end with Day 4 lynch take 29.9 days with standard deviation of 10.98 (35 games)
No games have ended with Night 4 kill
Games that end with Day 5 lynch take 38.9 days with standard deviation of 3.36 (3 games)
One game has ended with Night 5 kill; it took 18.7 days
No games have ended with Day 6 lynch
No games have ended with Night 6 kill
No games have ended with Day 7 lynch
Overall, games take 24.5 days with standard deviation of 11.29; the vast majority of games end with a Day 3 or Day 4 lynch (76.14%), with Day 2 and Day 5 lynches together making up the next-largest share (20.45%)
Spoiler: Replacement vs. length stats
When 0 slots are replaced all game, games take 14.5 days with standard deviation of 2.40 (6 games)
When 1 slot is replaced all game, games take 24.7 days with standard deviation of 12.93 (14 games)
When 2 slots are replaced all game, games take 22.8 days with standard deviation of 8.79 (29 games)
When 3 slots are replaced all game, games take 29.0 days with standard deviation of 10.05 (20 games)
When 4 slots are replaced all game, games take 27.0 days with standard deviation of 9.17 (12 games)
When 5 slots are replaced all game, games take 36.0 days with standard deviation of 10.09 (5 games)
No games have had 6 slots replaced
When 7 slots are replaced all game, games take 25.3 days with standard deviation of 12.43 (2 games)
No games have had 8 slots replaced
No games have had 9 slots replaced
When 0 players are replaced all game, games take 14.5 days with standard deviation of 2.40 (6 games)
When 1 player is replaced all game, games take 20.3 days with standard deviation of 8.48 (11 games)
When 2 players are replaced all game, games take 23.3 days with standard deviation of 9.21 (25 games)
When 3 players are replaced all game, games take 29.9 days with standard deviation of 12.98 (17 games)
When 4 players are replaced all game, games take 26.9 days with standard deviation of 8.98 (18 games)
When 5 players are replaced all game, games take 33.3 days with standard deviation of 11.39 (5 games)
When 6 players are replaced all game, games take 31.3 days with standard deviation of 4.27 (4 games)
One game replaced 7 players; it took 12.9 days
One game replaced 8 players; it took 37.7 days
No games have had 9 players replaced
Spoiler: Replacement vs. winrate stats
For slots:
When 0 slots are replaced all game, town goes 4-2 (66.7%)
When 1 slot is replaced all game, town goes 12-2 (85.7%)
When 2 slots are replaced all game, town goes 18-10 (64.3%)
When 3 slots are replaced all game, town goes 12-8 (60.0%)
When 4 slots are replaced all game, town goes 7-5 (58.3%)
When 5 slots are replaced all game, town goes 2-3 (40.0%)
No games have had 6 slots replaced
When 7 slots are replaced all game, town goes 2-0 (100.0%)
No games have had 8 slots replaced
No games have had 9 slots replaced
For players:
When 0 players are replaced all game, town goes 4-2 (66.7%)
When 1 player is replaced all game, town goes 9-2 (81.8%)
When 2 players are replaced all game, town goes 16-8 (66.7%)
When 3 players are replaced all game, town goes 13-4 (76.5%)
When 4 players are replaced all game, town goes 10-8 (55.6%)
When 5 players are replaced all game, town goes 1-4 (20.0%)
When 6 players are replaced all game, town goes 2-2 (50.0%)
When 7 players are replaced all game, town goes 1-0 (100.0%)
When 8 players are replaced all game, town goes 1-0 (100.0%)
No games have had 9 players replaced
Spoiler: Replacement vs. role stats
| Role | Slots | Slots replaced | Players replaced |
|---|---|---|---|
| Vanilla Townie | 463 | 14.9% | 26.1% |
| Cop | 38 | 18.4% | 18.4% |
| Doctor | 41 | 22.0% | 22.0% |
| Jailkeeper | 32 | 21.9% | 21.9% |
| Tracker | 27 | 11.1% | 11.1% |
| Neapolitan | 15 | 33.3% | 33.3% |
| Mafia Goon | 120 | 33.3% | 40.0% |
| Roleblocker | 32 | 31.3% | 31.3% |
| Rolecop | 24 | 37.5% | 37.5% |
The following is a z-score graph. 0 means "is exactly average". 1 means "is 1 standard deviation above average"; -1 means "is 1 standard deviation below average". In other words, high bars mean "more replacements", while low bars mean "fewer replacements".
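For anyone wanting to recompute z-scores like these, here is a rough sketch. It is not necessarily how the graph was produced: it assumes the "Slots replaced" column is the input, that every role is weighted equally regardless of how many slots it has, and that the population standard deviation is used.

```python
from statistics import mean, pstdev

# "Slots replaced" percentages per role, copied from the table
slot_replacement_pct = {
    "Vanilla Townie": 14.9, "Cop": 18.4, "Doctor": 22.0,
    "Jailkeeper": 21.9, "Tracker": 11.1, "Neapolitan": 33.3,
    "Mafia Goon": 33.3, "Roleblocker": 31.3, "Rolecop": 37.5,
}

mu = mean(slot_replacement_pct.values())
sigma = pstdev(slot_replacement_pct.values())

# z-score: how many standard deviations a role sits above or below the mean
z_scores = {role: (pct - mu) / sigma for role, pct in slot_replacement_pct.items()}

for role, z in sorted(z_scores.items(), key=lambda kv: kv[1]):
    print(f"{role:15s} {z:+.2f}")
```

Weighting by slot count instead would pull the average toward the Vanilla Townie rate and shift every bar, so the choice matters.
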
Spoiler: Team type stats
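| Town Nb | Town SE | Town IC | Scum Nb | Scum SE | Scum IC | Town winrate | Games |
|---|---|---|---|---|---|---|---|
| 3 | 3 | 1 | 2 | 0 | 0 | 100.0% | 6 |
| 4 | 2 | 1 | 1 | 1 | 0 | 20.0% | 10 |
| 4 | 3 | 0 | 1 | 0 | 1 | 50.0% | 2 |
| 5 | 1 | 1 | 0 | 2 | 0 | 0.0% | 2 |
| 5 | 2 | 0 | 0 | 1 | 1 | 0.0% | 2 |
| 4 | 2 | 1 | 2 | 0 | 0 | 73.9% | 23 |
| 5 | 1 | 1 | 1 | 1 | 0 | 68.8% | 16 |
| 5 | 2 | 0 | 1 | 0 | 1 | 100.0% | 6 |
| 6 | 0 | 1 | 0 | 2 | 0 | 50.0% | 2 |
| 6 | 1 | 0 | 0 | 1 | 1 | 100.0% | 2 |
| 3 | 4 | 0 | 2 | 0 | 0 | 100.0% | 2 |
| 4 | 3 | 0 | 1 | 1 | 0 | 100.0% | 2 |
| 5 | 2 | 0 | 0 | 2 | 0 | - | 0 |
| 4 | 3 | 0 | 2 | 0 | 0 | 71.4% | 7 |
| 5 | 2 | 0 | 1 | 1 | 0 | 40.0% | 5 |
| 6 | 1 | 0 | 0 | 2 | 0 | - | 0 |

| Game composition | Town winrate | Games |
|---|---|---|
| 5Nb-3SE-1IC | 40.9% | 22 |
| 6Nb-2SE-1IC | 75.5% | 49 |
| 5Nb-4SE-0IC | 100.0% | 4 |
| 6Nb-3SE-0IC | 58.3% | 12 |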
Notes
For any stat that counts wins and losses, draws are ignored.
The replacement stats recorded here are likely lower than in reality, as replacements are not recorded if the player never confirmed, or the player was force-replaced due to a mod error or something similar that was not the replaced player's fault. If a player posts in any way (or has been acknowledged by the mod as having confirmed/read their role PM), or gets themselves force-replaced (such as getting banned), replacing them counts.
The general rule is: a replacement counts if it's the player's fault (intent is irrelevant); a replacement doesn't count if it's the fault of someone else.
The idea is so the replacement stats are more of a "players who quit the game" than a "players who need to be replaced".
Slot-based stats might be a bit off due to cases where slots "level up" via replacement (e.g. an experienced player replaces into a newbie slot).
Players are assumed to have won regardless of modkills (unnecessary complication). They are however still dead. So far there have been no complications involving multiple scum/PRs dying in the same phase.
If the last scum concedes, it's recorded as a lynch on them. If both scum are alive when they concede, it's recorded as no lynch; the game simply ends in a town win with both scum alive. (This is how a mafia PR might still be alive at endgame in a town win.)
Only notable and requested PR stats are listed. Ask for one to get it added.
Apparently Newbie 1859 was given up by scum while they were both still alive. For now I'll just say the game ended Day 2 with no lynch and let the stats sort themselves out.
This should be required reading for...everyone for anything, really.
I would have to say that, while only 12 completed games in and thus not yet worthy of action, the new deadline rules do not look promising so far for game balance.
(shorthand: [days for D1] : [days for D2+] - [hours for N])
Last edited by Toomai on Sun Oct 21, 2018 10:12 am, edited 1 time in total.
In post 22, 2 718281828459 wrote:I saw some 1's and think that I was in some of those games. (The one where town lost from a 6v1, and the one where the IC replaced as town. Those were actually both the same game.)
In post 25, NotAJumbleOfNumbers wrote:I’d like to know the win rate of certain scumteam pairings (newbie/newbie, newbie/SE, newbie/IC, SE/SE, SE/IC).
This is available on the "Result-Team Stats" sheet of the Excel workbook. I'll add it to the first post the next time I update it. (Which I'd like to be today but things happen.)
After a fair amount of technical difficulties on my end, I have an update.
And it's not good: after 68 games, town winrate is 63.2%. That's a lot closer to "scum only wins a third of the time" than "both teams have a 50/50 shot". Concurrently, scum newbies have a paltry 29.9% winrate (though this is probably just because newbies are the most common playertype, rather than newbies being that much worse).
And this is even worse: Most games recently are of the form [6 newbies, 2 SEs, 1 IC], and town flat-out dominates them to the tune of 73.9%. (This is new in the first post under "Team type stats".) Having the 6 newbies makes it that much more likely to get a scumteam of 2 newbies, and they appear to git rekt most of the time. Matrix6 had a similar pattern but it wasn't nearly as extreme (the winrate difference between 5-3-1 and 6-2-1 was only 10.6%).
It's still early; 68 games doesn't leave a lot of room for the 9 subsetups to stabilize. But it's worrying.
Cop and Doctor are the most fundamental roles in the game across all sources, not just this site, so including them in the newbie setup in some fashion is a very strong consideration (natural familiarity for any new players). The rest of the setup is pretty much a slow and constant evolution to deal with discovered problems.
I said this a while ago when the Matrix6 BP discussion was alive and people were responding with a huge variety of ideas:
In post 198, Toomai wrote:May I remind everyone that the newbie setup started with the most fundamental, well-known roles and has evolved very slowly with intent to keep things fair and reasonable without sacrificing those basic roles.
Original: 5v2, Cop + Doctor vs 2 goons. Follow the Cop was developed within 100 games.
C9: Same as Original, but to break up FtC, it's a 50/50 chance that there's no Cop, and also a 50/50 chance that there's no Doctor. Obviously ended up very scumsided.
Pie E7: Same as Original, but to break up FtC, gives scum a Roleblocker. The Matrix6 version of this has 2 more players and is pretty balanced, meaning this 7-player version must have been scumsided.
California: Same as Original, but without the Doctor altogether. Scumsided.
F11: Moved to 9 players. Four subsetups: you can have a Cop, or a Doctor, or a Roleblocker, or all three. The "all three" one is balanced as we now know, but the rest are badly scumsided.
2of4: Gives scum a Rolecop and town two of the following: Cop, Doctor, Jailkeeper, or nothing (VT). Worked pretty well, but the Doctor+Jailkeeper subsetup was too good for town.
Matrix6: What we have now. Was almost certainly built around the Cop+Doctor+RB trio, with the JK, Tracker, and BP added to even out the matrix.
We don't have any real reason to suddenly blow everything up and add all these new roles. Just keep it simple.
All that being said, there's no reason to seriously consider trashing 2d3 yet. It's worrying, and possibly worth discussing options, but the setup as a whole isn't yet proven awful.
(Besides, if absolutely necessary we can go back to Matrix6. It's not so unbalanced that we couldn't return to it if people get sick of 2d3 faster than something new can be developed.)
(And yes, any change at all is a new setup. Don't confuse newbies who look up "2d3" and get two different matrices.)
In post 44, Not Known 15 wrote:It is complicated. Newbie games should not be overly complicated.
Setups naturally become more complicated in order to plug the holes discovered in simpler setups. And as long as the matrix can still be expressed in a list-of-possible-setups form (which it can), it can't be
Good balance is certainly desirable, but logically a queue full of inexperienced players will have games of larger swing. And for newbie games, balance is arguably less important than player retention.
In post 44, Not Known 15 wrote:It invites specific role-based strategies that are extremely rare outside of semi - open games which are indeed rare.
This is fundamentally inevitable. Newbie games must be open or semi-open for a more consistent experience, and meta role play always follows from that.
In post 44, Not Known 15 wrote:The Newbie queue should discard 2d3 because it is inferior to the old setup.
The numbers are scratching their chin on this, but it's not conclusive yet. And with the removal of the IC role, things might change in the future simply from that (such as no more "the IC was killed night 1, might be a newbie scum team").
In post 48, northsidegal wrote:i think the argument that people make (and that lycanfire is making specifically) is that the newbie setup doesn't necessarily need to be some "system" like matrix6 or 2d3, it can just be a small number of setups that are known to be balanced and are randomized from
Practically, how is that different than having one setup comprised of multiple subsetups?
I think this ^ (making scum abilities factional rather than assigning them to a slot) would be a good simple change if we're looking to keep things relatively similar to the way they are but help scum out a bit.
If there's consensus the setup has to change to help scum not git rekt, I think this is a solid answer that's ultimately minor but may very well have a good effect.
Spoiler: 2d3 modified (changes in bold)
| | A | B | C |
|---|---|---|---|
| Mafia | Mafia Roleblocker | Mafia Rolecop | (none) |
| Row 1 | Town Cop and Town Neapolitan | Town Cop and Town Tracker | Town Cop and Vanilla Townie |
| Row 2 | Town Jailkeeper and Town Doctor | Town Jailkeeper and Town Tracker | Town Jailkeeper and Vanilla Townie |
| Row 3 | Town Cop and Town Doctor | Town Neapolitan and Town Doctor | Town Tracker and Town Doctor |
Each Newbie Game will be given a setup that incorporates one mafia factional ability from the top of a column, and then two town roles from a row below the selected mafia role.
The mafia is composed of two mafia goons, but either of them may use their factional ability each night, in addition to the mafia factional kill.
Mafia are able to communicate in their Private Topic at all times (daytalk).
Note that both goons would flip as "Mafia Goon", so town would never know for certain what PR scum has.
A weaker alternative would be to make the second goon a Universal Mafia Backup, which would still mean scum can never be without their PR, but give town information if they die first.
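For mods or stat-keepers who want to see the modified matrix as data, here is a rough sketch of turning a column/row pick into a full role list. The 7 town / 2 scum split comes from the 9-player 7:2 start; picking the column and row uniformly at random is an assumption, and all names here are invented for illustration:

```python
import random

# Factional ability at the top of each column (None = no ability)
MAFIA_ABILITY = {"A": "Roleblocker", "B": "Rolecop", "C": None}

# Two town roles per (row, column) cell, as in the modified matrix
TOWN_ROLES = {
    1: {"A": ("Cop", "Neapolitan"), "B": ("Cop", "Tracker"), "C": ("Cop", "Vanilla Townie")},
    2: {"A": ("Jailkeeper", "Doctor"), "B": ("Jailkeeper", "Tracker"), "C": ("Jailkeeper", "Vanilla Townie")},
    3: {"A": ("Cop", "Doctor"), "B": ("Neapolitan", "Doctor"), "C": ("Tracker", "Doctor")},
}

TOWN_SIZE = 7  # 9 players minus 2 mafia goons

def build_setup(column, row):
    """Return the full role list for one subsetup, e.g. build_setup('B', 2)."""
    power_roles = [r for r in TOWN_ROLES[row][column] if r != "Vanilla Townie"]
    town = ["Town " + r for r in power_roles]
    town += ["Vanilla Townie"] * (TOWN_SIZE - len(town))
    return {
        "name": f"{column}{row}",
        "mafia": ["Mafia Goon", "Mafia Goon"],   # both flip as Mafia Goon
        "factional_ability": MAFIA_ABILITY[column],
        "town": town,
    }

def random_setup(rng=random):
    """Pick a column and a row uniformly at random (assumed, not confirmed)."""
    return build_setup(rng.choice("ABC"), rng.choice([1, 2, 3]))

print(random_setup())
```

Because the ability lives on the faction rather than on a slot, the role list itself never reveals which goon is using it.
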
In post 130, Not Known 15 wrote:I think that's not effective enough. Currently, the win rates(and those are by 33% still from the more favourable long deadlines) are at about 60% for column C. And Column C would not change much!
I think it could be more effective than it seems on the surface. Making the scum PR factional would:
1. make the scum PR unkillable
2. deny town information used to narrow the setup
3. make both scum equal; one of them doesn't have to fight harder to live and so might not resort to riskier fakeclaims
I don't know if it's something that happens a lot but #3 here could be important.
In post 133, popsofctown wrote:Here's a question I have though: Are lynches made at or near the deadline to avoid a no lynch hitting with a different accuracy than lynches made because a town is satisfied with their day? And how do those two outcomes impact subsequent days?
My stats aren't able to capture this question but I'm sure another interested individual might be willing to investigate.
Unusual idea to stack on top of the 2d3 modified setup: If scum have a Rolecop, the mod gives them a head start of a night 0 result on a random VT. (As opposed to a selectable or fully random result, which might hit a town PR and that'd be huge swing.) Combined with 2d3M mostly giving Roleblockers more power, that'd do a fair bit of scum tilting in the A- and B-groups. (C-group isn't as wacked, I think it's fine to leave alone if we're poking the other two.)
The reason every newbie setup thus far has started with Cop and Doctor as possibilities is how universal those roles are in the game across all its variants and communities; they provide a stable foundation to build a setup on that newbies are likely to at least partially understand. I think it is more important to keep them around than to go off somewhere else.
Similarly I also dislike Neapolitan. It's the only town role with more replace-outs than average (higher than the scum Roleblocker even) (admittedly with low sample size), so it's clear that newbies don't like playing it, probably because they don't quite understand how to use it well.