Nate, since I’m standing at your desk today, I figured it’s only fair that I finally get back to you.
The first thing that comes to mind is social identity theory, which focuses on how comparisons with out-groups are central to defining and valuing group identities. Tajfel and Turner (1986) is the canonical piece; Hogg and Terry (2000) extend it into organizational contexts. Brubaker and Cooper (2000) place social identity theory in the context of other ways people talk about identity, and their piece is a helpful companion to any dive into identity theories. This would, though, require specifying why intensified monitoring and enforcement lead to increased attention to social identities at all. Perhaps social identity is a tool supporting the monitoring and enforcement regimes you have in mind?
Another potential mechanism is that intensifying monitoring and enforcement leads to a greater need to coordinate decisions with others. For instance, enforcement actions that carry heavier sanctions may make it more important to any given enforcer that their peers approve of their enforcement decisions. A moderator may not worry much about their peers’ perceptions when enforcement is limited to rolling back changes, but much more so when banning someone. Correll et al. (2017) argue that the need to coordinate decisions leads people to rely on conventional indicators of quality, as these give the best guess at what their peers are likely to believe. Whereas they focus on status symbols becoming more important, I could see an argument for in-group/out-group distinctions becoming more relevant under such conditions. That is: I have stronger enforcement options, so I worry more about what my peers think, so I am more likely to incorporate obvious signals, like group membership, into my decisions because I believe my peers will, too.
Finally, maybe group membership becomes more important because increased monitoring/enforcement gives the impression that group members have already been vetted by trusted others? Group membership under more stringent regimes might act as a specific status characteristic that signals quality to monitors. Under this set of mechanisms, it’s not that enforcers become biased against out-groups but that they become biased towards in-group members. I’m not sure how you would distinguish this empirically. If you can get a copy, Cecilia Ridgeway’s new book Status is the best statement of the status theory underlying the argument. Otherwise, the canonical statement of expectation states theory, and of the role played by diffuse and specific status characteristics, is Berger et al.’s (1980) review piece.
Hope that helps! Happy to chat more if it’d be helpful.
Cheers! -Clark
Berger, Joseph, Susan J. Rosenholtz, and Morris Zelditch. 1980. “Status Organizing Processes.” Annual Review of Sociology 6: 479–508.
Brubaker, Rogers, and Frederick Cooper. 2000. “Beyond ‘Identity.’” Theory and Society 29 (1): 1–47.
Correll, Shelley J., Cecilia L. Ridgeway, Ezra W. Zuckerman, Sharon Jank, Sara Jordan-Bloch, and Sandra Nakagawa. 2017. “It’s the Conventional Thought That Counts: How Third-Order Inference Produces Status Advantage.” American Sociological Review, OnlineFirst: https://doi.org/10.1177/0003122417691503.
Hogg, Michael A., and Deborah J. Terry. 2000. “Social Identity and Self-Categorization Processes in Organizational Contexts.” The Academy of Management Review 25 (1): 121–40. https://doi.org/10.2307/259266.
Tajfel, Henri, and John C. Turner. 1986. “The Social Identity Theory of Intergroup Behavior.” In Psychology of Intergroup Relations, edited by Stephen Worchel and William G. Austin. Chicago: Nelson-Hall.
look at Morey’s paper on the fallacy of confidence intervals
** TODO Report the sample size by Wiki for each of our models.
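A one-liner could cover the sample-size TODO; `model.data` and its `wiki` column are placeholder names for whatever our fitting data is actually called.

#+BEGIN_SRC R
## Hypothetical: count observations per wiki in the data each model sees.
library(dplyr)

model.data %>%
  count(wiki, name = "n") %>%
  arrange(desc(n))
#+END_SRC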
use social identity theory to argue that measures like anonymity and having a user page constitute “identity-based signals”
- is there another measure we can use for controversial sanctioning?
- is there another measure we can use for has user page?
- maybe raising the 20k strata limit will help.
- also double check for bugs.
use social identity theory to argue for taste-based discrimination on the basis of group membership as a plausible intuition for null findings
Increase the sample size so we have more non-reverted edits around the “very likely damaging” threshold.
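A rough sketch of one way to do this, assuming a `scores` data frame with `prob.damaging` and `reverted` columns; the names, the 0.9 cutoff, and the window width are all stand-ins for the real per-wiki thresholds.

#+BEGIN_SRC R
## Hypothetical oversampling of non-reverted edits near the cutoff.
library(dplyr)

cutoff <- 0.9    # stand-in for the wiki's "very likely damaging" threshold
window <- 0.05

extra <- scores %>%
  filter(!reverted, abs(prob.damaging - cutoff) < window) %>%
  slice_sample(n = 5000)   # bump this up as needed; silently truncates if fewer rows
#+END_SRC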
WONT DO Use the same date range for all wikis and exclude those without data for the entire range?
Ask Halfak if we can log the scores DB automatically, since this lets us stratify by score, which is more convenient.
Cite the papers on the importance of studying Wikipedia in many languages. Then maybe we can also cite the reading time paper.
Build the argument that moderation is fast-paced and stressful to help with the above; it’s as easy as citing Sarah Roberts and Seering more.
Emphasize visibility and monitoring as useful concepts for thinking about governance. Visibility and salient signals are two different mechanisms that our two hypotheses try to tease apart.
Get scores from https://quarry.wmflabs.org/query/40712 if the missing data is bad.
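If we do need to fall back on Quarry, something like this could pull the results down; the `/result/latest/0/csv` suffix is my best guess at Quarry’s CSV export URL, so verify it against the download link on the query page.

#+BEGIN_SRC R
## Hypothetical fetch of the Quarry results as CSV; check the exact
## download URL on the query page before relying on this.
library(readr)

quarry.scores <- read_csv(
  "https://quarry.wmflabs.org/query/40712/result/latest/0/csv"
)
#+END_SRC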
Try a less restricted time series model: see if a long-run spline plus a short-run spline (or a lagged DV) leaves no residual serial correlation according to the Breusch-Godfrey test; see the sketch below. (Do this after I’m done with other things I can do first while the Stan models run.)
Control for seasonality (week of year) with fixed effects for month instead of fixed effects for week.
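A sketch tying the two notes above together, assuming a weekly data frame `ts.data` with columns `n.reverted`, `week`, and `month` (all placeholder names). Note that Breusch-Godfrey checks for residual serial correlation rather than stationarity as such.

#+BEGIN_SRC R
## Hypothetical: long-run spline + lagged DV + month fixed effects,
## then check whether any serial correlation remains.
library(splines)
library(lmtest)

## lagged DV as the short-run alternative to a second spline
ts.data$lag.dv <- c(NA, head(ts.data$n.reverted, -1))

m <- lm(n.reverted ~ ns(week, df = 4)   # long-run spline
          + lag.dv                      # short-run dynamics
          + factor(month),              # month FE instead of week FE
        data = ts.data)

bgtest(m, order = 4)  # want a non-significant result here
#+END_SRC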
This actually isn’t that important, and we probably don’t have to do it unless reviewers ask. It’s probably enough to keep it up to date with the new wikis. Also, it’s a bit of a hassle. This is somewhat fraught: between-wiki heterogeneity seems to make it difficult to estimate a pooling effect. So let’s hold off on that and either present an average-edit model or separate models for each wiki. But which? What’s the right way to do this? Have equal-sized samples from each wiki and don’t weight.
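One reading of “equal-sized samples from each wiki and don’t weight,” as a sketch; `edits`, `wiki`, and the 20k figure (echoing the strata limit mentioned above) are assumptions.

#+BEGIN_SRC R
## Hypothetical: draw the same number of edits from every wiki so the
## per-wiki models (or the unweighted average-edit model) see balanced data.
library(dplyr)

balanced <- edits %>%
  group_by(wiki) %>%
  slice_sample(n = 20000) %>%   # matches the strata limit above
  ungroup()
#+END_SRC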
I didn’t find a good explanation, but I noticed that I wasn’t removing bots. Also, we should model p.reverted instead of n.reverted. I’ll try again later. We don’t need to do this, since we’ll want to compare estimates and so have a need for Bayes.
Probably should be fitting binomial models predicting proportion reverted instead. It fits OK when we don’t do QR decomposition. Maybe it’s one cutoff per model, but we exclude data on the other sides of the other cutoffs. Or we don’t. Mako might be helpful with that.
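A sketch of the binomial version, assuming per-wiki-week counts `n.reverted` and `n.edits` in a data frame `wiki.weeks` (placeholder names); with rstanarm, `QR = FALSE` skips the decomposition that seemed to cause the fitting trouble.

#+BEGIN_SRC R
## Hypothetical binomial model of the proportion reverted, without QR.
library(rstanarm)

m <- stan_glm(
  cbind(n.reverted, n.edits - n.reverted) ~ wiki + week,
  family = binomial(link = "logit"),
  QR = FALSE,               # QR decomposition was giving us fitting trouble
  data = wiki.weeks
)
#+END_SRC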
Git-annex isn’t installed on WMF machines, so I need to ask about it. Two different extreme assumptions could be: 1. The same damage gets reverted; it just takes more work. 2. Stuff doesn’t get reverted at all; the cost of debiasing is more damage getting through.