Putting Out the Flame War: Algorithm Identifies Rude Comments

Lately, we've been worried about the mask of anonymity that many Internet users don before they hop onto a comment thread and start spewing vile sentiments at their fellow users. Unlike day-to-day speech, with its ingrained mores and codes of civility, the Internet is a medium where you don't need to look someone in the eye as you tell them that future generations would benefit from their sterility. (Or the old standby, "KILL URSELF HAHA.") Cyberbullying is such a problem that governments are enacting legislation to combat it, and some lone individuals are upset, or far enough off their rockers, to hunt down their online adversaries with a loaded gun.

So now we look to the statistically oriented eye of the computer to figure out a solution. Elizabeth Churchill, a cognitive scientist working at Yahoo! Research, and Sara Owsley Sood of Pomona College have been analyzing Internet forum threads for trends in commenting, trying to develop an automated method that actively responds to the ugly side of Web commentary. Churchill's hope is that a computer system might be able to identify angry commenters who stray off-topic, and figure out a way to steer them back into the conversation without banning them outright, according to Wired. "We might want mechanisms where you can ask people to tone it down, or 'take it outside' to not disrupt others, or use humor to defuse situations," she said.

Looking at 168,095 threads taken from October 2009 articles on Yahoo! Buzz, the researchers used search-engine-like algorithms to determine whether the 782,934 comments were on- or off-topic, according to Wired. (Comments that more frequently used words appearing in the source stories were deemed relevant.) Using LiveJournal posts as a sort of emotional cipher, the researchers' software then determined which kinds of words and phrases correlated with a given mood. (LiveJournal users can tag their entries with "moods" like "happy," "sad," "calm," and "angry.") The algorithms apparently got it right most of the time: 65 to 80 percent of the comments automatically tagged as happy, sad, or angry matched the researchers' own assessments.
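Wired's description suggests a simple bag-of-words overlap test for topicality. As a rough illustration -- not the researchers' actual code -- a minimal relevance check along those lines might look like the Python sketch below, where the tokenizer and the 0.1 cutoff are invented for the example:

```python
# Toy sketch of the word-overlap relevance test described above.
# The threshold and tokenizer are illustrative guesses, not values
# from the Yahoo!/Pomona study.
import re

def tokens(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def relevance(comment, article):
    """Fraction of comment words that also appear in the source article."""
    article_vocab = set(tokens(article))
    comment_words = tokens(comment)
    if not comment_words:
        return 0.0
    hits = sum(1 for w in comment_words if w in article_vocab)
    return hits / len(comment_words)

def is_on_topic(comment, article, threshold=0.1):
    return relevance(comment, article) >= threshold

article = "Facebook rolled out a privacy update that changes default sharing."
print(is_on_topic("The new privacy defaults share too much.", article))  # True
print(is_on_topic("WAKE UP SHEEPLE the election was rigged", article))   # False
```

A real system would need stemming (so "sharing" matches "share") and weighting for rare words, but the core intuition -- off-topic rants reuse almost none of the article's vocabulary -- survives even in this crude form.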

There are still lexical quirks to be worked out, though: sarcasm is difficult to identify without understanding context (a problem many AI systems face when learning language), as are contrary phrases like "terribly good." But Churchill noted at the 2010 Grace Hopper Celebration of Women in Computing in Atlanta, where she presented her findings, that the system could learn to read emoticons to give emotional context to ambiguous phrasing.
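To make the emoticon idea concrete, here's a toy mood tagger -- purely our illustration, not the study's model, with made-up word and emoticon lists -- that falls back on emoticons only when the wording itself is ambiguous:

```python
# Illustrative sketch: emoticons disambiguate mood when word cues tie.
# All word and emoticon lists here are invented for the example.
ANGRY_WORDS = {"hate", "stupid", "idiot", "worst"}
HAPPY_WORDS = {"love", "great", "thanks", "awesome"}
EMOTICONS = {":)": "happy", ":-)": "happy", ":(": "sad",
             ">:(": "angry", ";)": "happy"}

def guess_mood(comment):
    """Guess a coarse mood; emoticons break ties for ambiguous wording."""
    words = set(comment.lower().split())
    angry = len(words & ANGRY_WORDS)
    happy = len(words & HAPPY_WORDS)
    if angry > happy:
        return "angry"
    if happy > angry:
        return "happy"
    # Ambiguous text ("terribly good") -- let an emoticon decide.
    # Check longer emoticons first so ">:(" isn't misread as ":(".
    for emo in sorted(EMOTICONS, key=len, reverse=True):
        if emo in comment:
            return EMOTICONS[emo]
    return "unknown"

print(guess_mood("This is terribly good :)"))    # happy (via emoticon)
print(guess_mood("I hate this stupid article"))  # angry
```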

But is auto-moderating Internet forums a form of language despotism? You could argue that a computer algorithm designed to identify "happy" comments could just as logically be applied to moderating discussions to keep them positive -- and that would be a disservice to the concept of debate, which requires differing opinions, dissent and, sometimes, anger.

The key to the researchers' work, though, is that it looks for a comment's relevance in relation to the original article. On many of our own posts, say, one involving a Facebook privacy update, we'll find one or more people who feel the need to throw "BARRY SOETORO AKA BARACK OBAMA IS A KENYAN COMMUNIST" into a debate that is completely apolitical. Do we censor these passionate pundits, or allow a computer to make that judgment? And then you have the militant grammarians, who are sometimes difficult to condemn, even if they do turn the conversation into a flame war over the misuse of an apostrophe. How do you make an algorithm tell people, "Yes, I understand that you're enraged by SEXXXYGAL43's use of the non-word 'irregardless,' but this is a story about refrigerators"?

The problem is that we humans are still struggling to keep online conversations lively, engaging and on-topic, not least because we may sometimes even agree with the rogues' non sequiturs. An algorithm may help identify abusive comments and offer insight into why they appear, but the macro-scale problem is one that needs to be addressed on the human level: what makes people want to be civil?
