Personality and Popularity in the Speakeasy (A Study!)
- Plotinus
/in
- PJ.
- SleepyKrew
- Quilford
- Egg
- Mina
Wow, Psyche, this is an incredible amount of work. Thank you for putting all that thought and effort into your analysis. This was far more than I was expecting. TBH, I'm a bit sceptical of the IBM test; there are probably some markers in individuals' posting styles or in the threads that they usually participate in that trip up the algorithm--for example, there's probably something causing Drench to rank near the top in so many traits and N to rank at the bottom (maybe N's Normal queue posts that consist of lists of made-up words confuse the program). But like you said, overall averages, and that graph of distinctiveness vs. popularity is really compelling.
While looking at the results, I thought of a hypothesis that might be completely wrong because I'm not very statistically literate, so I apologize in advance if the actual scientists laugh at me for this. Couldn't the correlation between distinctiveness and popularity be partially caused by certain traits having such a skewed average on MS that outliers are representative rather than distinctive? For example, if the site average in a trait like Depression is 0.9, then someone who has a score of 0.5 would be "distinctive" even if that person is typical for the general population. Likewise, someone with an artsiness of 0.5 would appear unusually artsy compared to an average of 0.1.
The pattern that jumped out at me from the completely unscientific process of skimming results for my scores was that the distinctiveness vs. popularity correlation is strongest for traits like depression/artsiness that seem to have lopsided MS averages rather than those hovering around 0.4-0.6. So if the average score for every trait is 0.5, it's possible that we're actually biased towards representative people...only these people are representative of the entire world rather than the subset of the world that posts on mafiascum.net.
Could you plot popularity vs. the distinctiveness of a trait relative to the general population rather than to MS's community? (E.g., find the difference of a user's trait from 0.5 instead of from the MS average.) I'm curious to see how the two graphs compare.
- Brandi
Other factors that might be hard to gauge but could affect how liked or disliked someone is: how many scummers they've met in person, when the last meet was, how recently they posted, and such.
Quadz was #1 the first time but now he's been very very busy being an awesome dad, so he hasn't had many interactions on here this year. I noticed he scored a lot lower. But I don't think people like him less, he's just not fresh in many people's minds.
- Oman
- Untrod Tripod
- Psyche
The graph below relates a trait's "skew of representation" within our community with its popularity. When skew of representation is near 0, that means the trait's representation in our community is close to its representation in the general population. When skew of representation is near .5, then the trait is either overrepresented or underrepresented in our community. What we found is that when a trait's representation isn't skewed in the community, it doesn't contribute much to popularity. When representation is skewed, it does contribute to popularity. (Correlation was .293 with p=.035).
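For anyone who wants to poke at this, here is a minimal sketch of the calculation described above. It is not the code behind the graph. It assumes the general population averages .5 on every trait, takes a trait's "popularity" to be the absolute strength of its relationship with user popularity (my reading, not something stated above), and uses made-up trait names and numbers throughout.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def skew_of_representation(site_mean, population_mean=0.5):
    """Distance of the site average from the assumed general-population average.
    Near 0: the community looks like the general population on this trait.
    Near .5: the trait is strongly over- or underrepresented."""
    return abs(site_mean - population_mean)

# Hypothetical site averages per trait (illustrative only, not real results).
trait_site_means = {"trait_a": 0.9, "trait_b": 0.1, "trait_c": 0.5, "trait_d": 0.6}

# Hypothetical strength of each trait's relationship with popularity.
trait_popularity_effect = {"trait_a": 0.30, "trait_b": 0.25, "trait_c": 0.02, "trait_d": 0.08}

traits = sorted(trait_site_means)
skews = [skew_of_representation(trait_site_means[t]) for t in traits]
effects = [abs(trait_popularity_effect[t]) for t in traits]

r = pearson(skews, effects)
print(f"correlation between skew and popularity effect: {r:.3f}")
```

With real data, `trait_site_means` would come from the profiled users and `trait_popularity_effect` from whatever measure of trait popularity the study uses.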
I made this graph because I thought it spoke to Mina's question, but it doesn't, so I'm throwing it out without throwing it out. What we really want is a measure of every single user's (not every single trait's) "averageness" relative to the general population, related with every single user's popularity. Let's do that.
- Psyche
a more detailed note about interpreting results:
one thing to keep in mind is what it means to score high or low in a trait according to the service
for example self-expression as measured by the service is part of its Needs model
when it scores you high in self-expression it's not saying you express yourself a lot
it's saying that you "Enjoy discovering and asserting your own identities."
which is different from what one might expect
on my individual profiles post i included a link to a page that will help you interpret the meaning of the different traits the service has measured for you;
maybe skim it if you're interested
but the most important thing is that the service's measures aren't that strongly correlated with the scores you'd get on a more rigorous personality test
IBM conducted a validation study to understand the effect of media on inferred characteristics. To determine the accuracy of the service's approach to estimating characteristics, IBM compared scores that were derived by its models with survey-based scores for Twitter users (for instance, approximately 500 users for English and more than 600 for Spanish).
To establish ground truth, participants took three sets of standard psychometric tests: 50-item Big Five (derived from the International Personality Item Pool, or IPIP), 52-item fundamental Needs (developed by IBM), and 26-item basic Values (developed by Schwartz). IBM conducted the study in two phases:
For the first study, conducted in 2013, IBM recruited 256 Twitter users (Gou et al., 2014). Although the models were learned from different sources, IBM found that for more than 80 percent of the Twitter users, scores for characteristics that were inferred for all three models correlated significantly with survey-based scores (p-value < 0.05 and correlation coefficient between 0.05 and 0.80). Specifically, scores that were derived by the service correlated with survey-based scores as follows:
For 80.8 percent of participants' Big Five scores (p-value < 0.05 and correlation coefficients between 0.05 and 0.75)
For 86.6 percent of participants' Needs scores (p-value < 0.05 and correlation coefficient between 0.05 and 0.80)
For 98.21 percent of participants' Values scores (p-value < 0.05 and correlation coefficients between 0.05 and 0.55)
For the second study, conducted in 2015, IBM recruited another set of 237 Twitter users. The study found that for Big Five and Values, scores for inferred characteristics correlated significantly with survey-based scores (p-value < 0.05 and correlation coefficient between 0.07 and 0.21) for every Twitter user. For needs, such significant correlation was observed for 90 percent of the users (p value < 0.05 and correlation coefficient between 0.01 and 0.20).
In both studies, participants also rated on a five-point scale how well each derived characteristic matched their perceptions of themselves. Their ratings suggest that the inferred characteristics largely matched their self-perceptions. Specifically, means of all ratings were above 3 ("somewhat") out of 5 ("perfect").
For the 256 Twitter users of the first study, means were 3.4 (with a standard deviation of 1.14) for Big Five, 3.39 (with a standard deviation of 1.34) for Needs, and 3.13 (with a standard deviation of 1.17) for Values.
For the 237 Twitter users of the second study, means were 3.31 (with a standard deviation of 1.18) for Big Five, 3.37 (with a standard deviation of 1.22) for Needs, and 3.36 (with a standard deviation of 1.18) for Values.
While the correlation between inferred and survey-based scores of approximately 80 percent is both positive and significant, the results imply that inferred scores might not always correlate with survey-based results. Researchers from outside of IBM have also done experiments to compare how well inferred scores match those obtained from surveys, and none reported a fully consistent match:
Golbeck et al. (2011) reported an error rate of 10 to 18 percent when matching inferred scores with survey-based scores.
Sumner et al. (2012) reported approximately 65-percent accuracy for personality prediction.
Mairesse and Walker (2006) reported 60- to 70-percent accuracy for Big Five personality prediction.
In general, it is widely accepted in research literature that self-reported scores from personality surveys do not always fully match scores that are inferred from text. What is more important, however, is that IBM found that characteristics inferred from text can reliably predict a variety of real-world behavior.
The key information from this quote is that even while the service does a decisively better job than random at profiling people, it's not the gold standard. Further, a little under half of people are less than even somewhat satisfied with their measured characteristics in the survey. So, yeah. If you really want to know your personality profile, you'll probably want to take a test. Still, I think there's a lot of reason to be excited about the service and its usefulness for research projects like these.
- Psyche
In post 9, Cheery Dog wrote: I can have data?
can you be clearer about what you're asking here?
In post 17, SleepyKrew wrote: where are the results
guess i'll make a quick table of contents in the op
In post 30, Mina wrote: Wow, Psyche, this is an incredible amount of work. Thank you for putting all that thought and effort into your analysis. This was far more than I was expecting. TBH, I'm a bit sceptical of the IBM test; there are probably some markers in individuals' posting styles or in the threads that they usually participate in that trip up the algorithm--for example, there's probably something causing Drench to rank near the top in so many traits and N to rank at the bottom (maybe N's Normal queue posts that consist of lists of made-up words confuse the program). But like you said, overall averages, and that graph of distinctiveness vs. popularity is really compelling.
most recent post covers this, but i'm gonna try to more carefully grapple with these issues in future studies and make a FAQ for people in this one
Okay.
In post 30, Mina wrote: While looking at the results, I thought of a hypothesis that might be completely wrong because I'm not very statistically literate, so I apologize in advance if the actual scientists laugh at me for this. Couldn't the correlation between distinctiveness and popularity be partially caused by certain traits having such a skewed average on MS that outliers are representative rather than distinctive? For example, if the site average in a trait like Depression is 0.9, then someone who has a score of 0.5 would be "distinctive" even if that person is typical for the general population. Likewise, someone with an artsiness of 0.5 would appear unusually artsy compared to an average of 0.1.
The pattern that jumped out at me from the completely unscientific process of skimming results for my scores was that the distinctiveness vs. popularity correlation is strongest for traits like depression/artsiness that seem to have lopsided MS averages rather than those hovering around 0.4-0.6. So if the average score for every trait is 0.5, it's possible that we're actually biased towards representative people...only these people are representative of the entire world rather than the subset of the world that posts on mafiascum.net. :P
Could you plot popularity vs. the distinctiveness of a trait relative to the general population rather than to MS's community? (E.g., find the difference of a user's trait from 0.5 instead of from the MS average.) I'm curious to see how the two graphs compare.
Thanks. You've really given me a strategy for further research on this.
Let's focus on your particular recommendation. I focused on the popularity of traits, but I definitely want to pay attention to users. You posed one way to do this. We relate each user's popularity score with each user's deviation from the general population's average score (.5 on everything!). The low end of the spectrum consists of users who are really representative of the general population, and the high end is users who are really distinctive within the population. There are so so so many other ways to calculate distinctiveness compared to the general population, for sure (and man is that annoying), but that's one of them.
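To make the comparison concrete, here is a rough sketch of one way to set it up, using made-up users and scores. It measures each user's distinctiveness as the mean absolute deviation of their trait scores from a baseline (just one of the many possible choices mentioned above), once against the assumed general-population average of .5 and once against the site average, then correlates each with popularity. The user names, trait names, and popularity values are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def distinctiveness(profile, baseline):
    """Mean absolute deviation of a user's trait scores from a baseline profile."""
    return sum(abs(profile[t] - baseline[t]) for t in profile) / len(profile)

# Hypothetical trait scores (0 to 1) and popularity scores; not real data.
users = {
    "user_a": {"Depression": 0.9, "Artsiness": 0.1},
    "user_b": {"Depression": 0.5, "Artsiness": 0.5},
    "user_c": {"Depression": 0.8, "Artsiness": 0.2},
}
popularity = {"user_a": 1.0, "user_b": 4.0, "user_c": 2.0}

traits = ["Depression", "Artsiness"]
names = sorted(users)

# Baseline 1: assumed general population, .5 on everything.
general_baseline = {t: 0.5 for t in traits}
# Baseline 2: the community's own average on each trait.
site_baseline = {t: sum(users[n][t] for n in names) / len(names) for t in traits}

vs_general = {n: distinctiveness(users[n], general_baseline) for n in names}
vs_site = {n: distinctiveness(users[n], site_baseline) for n in names}

pop = [popularity[n] for n in names]
r_general = pearson([vs_general[n] for n in names], pop)
r_site = pearson([vs_site[n] for n in names], pop)

print(f"popularity vs distinctiveness from general population: {r_general:.3f}")
print(f"popularity vs distinctiveness from site average:       {r_site:.3f}")
```

In this toy data, user_b sits exactly at .5 on both traits, so they count as distinctive relative to the lopsided site averages but perfectly ordinary relative to the general population, and the two correlations come out with opposite signs. That is the situation Mina's hypothesis describes; whether the real data looks like this is exactly what the comparison would test.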
Another thing I'll do is compare that chart with what my interpretation actually hypothesizes - that the most distinctive people in their communities are the most popular. So far I've only concluded that distinctive traits make people popular, which is similar and suggests the broader distinctiveness hypothesis, but isn't the same thing.
If I somehow don't confirm that result, that'll be sort of amazing. It'll mean that people somehow compensate for their distinctive traits with traits that are ordinary, or something similar to that.
I feel like there's lurking somewhere a grave mistake I've made interpreting what I've collected. This'll help clear things up.
So I'll get to that shortly.
In post 31, Brandi wrote: Other factors that might be hard to gauge but could affect how liked or disliked someone is: how many scummers they've met in person, when the last meet was, how recently they posted, and such.
Quadz was #1 the first time but now he's been very very busy being an awesome dad, so he hasn't had many interactions on here this year. I noticed he scored a lot lower. But I don't think people like him less, he's just not fresh in many people's minds.
for sure! personality definitely isn't the only factor behind people's popularity
now that i think of it, it would be cool to do an analysis of recent contests too, shifting the selection of text I profile back the requisite years!
would also give me a spot to test prediction code
when I first heard of psychology, i imagined that by mastering the field i would acquire some set of skills that might distinguish me from normal people. i'd be able to tell what others were thinking, quickly discern their personalities, even manipulate them with simple strategies or fix their problems. besides that, i'd know myself - how my thoughts work, the key to my own happiness or sadness, the quickest way for me to learn or even become an expert at something. the kid version of me thought that studying psychology was a way to get real-life superpowers! And that was cool to think about.
if i could actually predict the outcomes of future popularity contests with this service...
well it'd be something kid-me could respect a little i guess
...
All of the updates I'll make before my next post:
Table of contents
Better individual profiles (you get a figure you get a figure everyone gets a figure; new measure: how "distinctive" are you?; instead of reporting your ranking relative to everyone else, i'll report your percentile)
More people profiled
More distinctiveness vs representativeness research (Mina's Track)
gonna race to finish by noon
once the dust settles i think i will follow the track suggested by brandi
examining (and 'predicting' the results of) contests past
- Psyche
- N
- Psyche
- Cheery Dog
- Psyche
another idea:
if i reverse score the unpopular traits (ie take 1-DepressionScore to get ~Depression) i can generate for every user a single, one-dimensional value measuring how "likeable" their personalities are (in the Speakeasy)
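A minimal sketch of that transformation, under some loud assumptions: the set of "unpopular" traits here is hypothetical, and combining the adjusted scores with a plain average is my own simplification (the idea above only specifies the reverse scoring and a single resulting value).

```python
# Hypothetical set of traits treated as unpopular; the real set would come
# from whichever traits turn out to be negatively related to popularity.
UNPOPULAR_TRAITS = {"Depression"}

def likeability(profile, unpopular=UNPOPULAR_TRAITS):
    """Collapse a trait profile (scores in 0..1) into one value.
    Unpopular traits are reverse scored as 1 - score, then all adjusted
    scores are averaged. Averaging is an assumption, not part of the idea."""
    adjusted = [
        1.0 - score if trait in unpopular else score
        for trait, score in profile.items()
    ]
    return sum(adjusted) / len(adjusted)

# Made-up example profile, not a real user's scores.
example = {"Depression": 0.9, "Artsiness": 0.7}
print(likeability(example))
```

A weighted average (for instance, weighting each trait by how strongly it relates to popularity) would be another reasonable way to combine the scores.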
seems like a stretch of interpretation to do that but the transformation should make a lot of analysis easier anyway
- xRECKONERx
- zoraster
- Majiffy
Surprised I ranked so low on depression considering my constant state of self-loathing.
- Cephrir
- wgeurts
In
- Mina
In post 43, zoraster wrote: I don't know that I think the popularity to personality scores is going to become very valuable until you get a wider range of popularity, including those at the bottom.
Psyche ranked all 146 people in SUPP--he's just only releasing the individual profiles of the people who requested them.