With all the resources, power and influence they possess, social media platforms could and should be doing more to detect hate speech, says a University of Michigan researcher.
In a report by the Anti-Defamation League, Libby Hemphill, associate research professor at UM’s Institute for Social Research and an ADL Belfer Fellow, explores the shortcomings of social media platforms when it comes to white supremacist speech and how it differs from general or non-extremist speech, and recommends ways to improve automated methods of identifying hate speech.
“We also sought to determine if and how white supremacists adapt their speech to avoid detection,” said Hemphill, who is also a professor at UM’s School of Information. “We found platforms often miss discussions of conspiracy theories about white genocide and Jewish power and malicious grievances against Jews and people of color. The platforms also allow polite but defamatory speech to persist.”
How platforms can do better
White supremacist speech is readily detectable, Hemphill says, detailing the ways it differs from commonplace speech on social media, including:
- Frequently referring to racial and ethnic groups using plural noun forms (whites, etc.)
- Adding “white” to otherwise unmarked terms (e.g. power)
- Using less profanity than is common on social media to evade detection based on “offensive” language
- Being consistent across extremist and mainstream platforms
- Keeping complaints and messaging consistent from year to year
- Describing Jews in racial rather than religious terms
“Given identifiable linguistic markers and consistency across platforms, social media companies should be able to recognize white supremacist speech and distinguish it from general non-toxic speech,” Hemphill said.
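To illustrate what looking for such markers might involve, here is a minimal Python sketch that counts three of the surface features listed above: plural group nouns, a modifier placed in front of otherwise unmarked terms, and profanity. It is not the method used in the report, which relied on machine learning and topic modeling. The function name `count_markers`, its parameters and the small word lists are hypothetical placeholders chosen for this example, and the counts alone do not amount to a classifier.

```python
import re
from dataclasses import dataclass


@dataclass
class MarkerCounts:
    plural_group_nouns: int  # e.g. "whites"
    marked_terms: int        # e.g. "white" directly before "power"
    profanity: int           # hits against a caller-supplied lexicon
    tokens: int              # total word tokens, for normalizing the counts


def count_markers(text, group_plurals, modifier, unmarked_terms, profanity_lexicon):
    """Count simple surface markers in a post (illustrative assumption only)."""
    # Lowercase the text and split it into alphabetic word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    plural = sum(1 for t in tokens if t in group_plurals)
    # A "marked term" here is the modifier immediately followed by an unmarked term.
    marked = sum(
        1
        for first, second in zip(tokens, tokens[1:])
        if first == modifier and second in unmarked_terms
    )
    profane = sum(1 for t in tokens if t in profanity_lexicon)
    return MarkerCounts(plural, marked, profane, len(tokens))


# Hypothetical lexicons; a real system would need far larger, curated lists.
sample = "The report notes phrases such as white power and plural forms such as whites."
counts = count_markers(
    sample,
    group_plurals={"whites"},
    modifier="white",
    unmarked_terms={"power"},
    profanity_lexicon={"darn"},
)
print(counts)
```

Counts like these would at most serve as input features for a trained model. The profanity count is included because, according to the report, a low value does not indicate that a post is harmless.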
The research team used commonly available computing resources, existing machine learning algorithms, and dynamic topic modeling to conduct the study.
“We needed data from both extremist and mainstream platforms,” Hemphill said, noting that the mainstream user data comes from Reddit and the extremist website user data comes from Stormfront.
What should happen next?
Although the research team found that white supremacist speech is identifiable and consistent, social media platforms, despite having more sophisticated computing capabilities and additional data, still miss a lot and struggle to distinguish non-profane, hateful speech from profane, harmless speech.
“Leveraging more specific training datasets and reducing their emphasis on profanity can improve platform performance,” Hemphill said.
The report recommends that social media platforms: 1) enforce their own rules; 2) use data from extremist sites to create detection models; 3) look for specific linguistic markers; 4) de-emphasize profanity in toxicity detection; and 5) train moderators and algorithms to recognize that white supremacist conversations are dangerous and hateful.
“Social media platforms can enable social support, political dialogue and productive collective action. But the companies behind them have civic responsibilities to fight abuse and stop hateful users and groups from harming others,” Hemphill said. “We hope these findings and recommendations will help platforms fulfill these responsibilities now and in the future.”