We introduce a novel test for self-recognition in language models and show that current frontier LMs do not consistently recognize their own outputs. Models instead prefer answers they perceive as best, regardless of source.
Nov 1, 2024
A causal study on Reddit shows that interactions between fringe and mainstream users drive fringe community growth. Toxic language and political context amplify this effect. The findings suggest that moderation should focus on interactions between fringe and mainstream users.
May 28, 2024
This paper shows that active and passive bill cosponsorship in the U.S. Congress reflect different motivations: loyalty to political allies versus content-based support. An Encoder+RGCN model predicts cosponsorship and generalizes to voting behavior.
Jul 1, 2023
Online platforms face pressure to keep their communities civil and respectful. Thus, banning problematic online communities from mainstream platforms is often met with enthusiastic public reactions. However, this policy can lead users to migrate to alternative fringe platforms with lower moderation standards and may reinforce antisocial behaviors. As users of these communities often remain co-active across mainstream and fringe platforms, antisocial behaviors may spill over onto the mainstream platform. We study this possible spillover by analyzing 70,000 users from three banned communities that migrated to fringe platforms: r/The_Donald, r/GenderCritical, and r/Incels. Using a difference-in-differences design, we contrast co-active users with matched counterparts to estimate the causal effect of fringe platform participation on users' antisocial behavior on Reddit. Our results show that participating in the fringe communities increases users' toxicity on Reddit (as measured by Perspective API) and involvement with subreddits similar to the banned community, which often also breach platform norms. The effect intensifies with time and exposure to the fringe platform. In short, we find evidence for a spillover of antisocial behavior from fringe platforms onto Reddit via co-participation.
Jun 2, 2023
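For readers unfamiliar with the difference-in-differences design mentioned in the spillover study above, the basic estimator can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the function name `did_estimate` and all scores are hypothetical, and the matching step and regression controls that a real analysis would use are omitted.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences.

    Returns the change in the treated group's mean outcome minus
    the change in the control group's mean outcome.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up toxicity scores in [0, 1], for illustration only.
co_active_pre = [0.10, 0.12, 0.11]   # co-active users, before joining the fringe platform
co_active_post = [0.16, 0.18, 0.17]  # co-active users, after joining
matched_pre = [0.10, 0.11, 0.12]     # matched counterparts, same "before" window
matched_post = [0.11, 0.12, 0.13]    # matched counterparts, same "after" window

effect = did_estimate(co_active_pre, co_active_post, matched_pre, matched_post)
print(round(effect, 3))
```

The estimate only has a causal reading under the parallel-trends assumption, that is, that both groups would have changed by the same amount had the co-active users not joined the fringe platform.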
We study the migration decisions of users from banned radical communities and their continued presence on mainstream platforms. Using the RECRO framework, we show that user-level behavior predicts who migrates to fringe platforms and who stays active across both ecosystems.
Apr 30, 2023
CGA is a conditional VAE model that enables high-quality, multi-attribute text generation. It improves downstream NLP tasks through controlled augmentation, often matching the performance obtained with real data.
Nov 1, 2020