How to automate your moderation policy

A moderation policy can make or break a website’s comments section. The good news is that Disqus offers several features that allow publishers to automate components of their moderation policies to save them time and effort.

Automate your moderation policy with Disqus tools

What moderation rules entail

Moderation rules allow you to define your automated moderation practices by assigning specific actions to filters. This feature enables your team to work more efficiently by freeing your time to focus on non-repetitive moderation tasks.

All sites can add automated moderation rules based on comment toxicity and their Restricted Words list. Sites on the Pro and Business plans will have access to additional moderation rules to target objectionable content, specifically by category and severity. We’ll get into more detail on advanced moderation rules further down.

The Toxicity Mod Filter helps moderators prioritize toxic content for review, which lowers its negative impact on the community and reduces reliance on users flagging comments. Toxicity is a label Disqus applies to comments in the moderation panel, where publishers can see which comments are labeled toxic and sort by “toxic” to view them all. Comments labeled “toxic” are not moderated or pre-moderated by default; the label is simply another piece of information to help publishers moderate more effectively.

Regarding Restricted Words, a publisher could establish a rule so comments containing these terms are deleted automatically, without any action from moderators. Clicking on each rule will expose analytics, showing how many comments from the last 30 days fit the conditions in your rule and how you moderated them. You can use this to forecast how effective each rule will be.
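To make the analytics idea concrete, here is a minimal Python sketch of how such a report could be computed. It is not Disqus code: the word list, the comment fields (`text`, `posted_at`, `outcome`), and the word-matching approach are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical Restricted Words list, for illustration only.
RESTRICTED_WORDS = {"scam", "idiot"}

def contains_restricted_word(text):
    """True if any whitespace-separated word (lowercased, punctuation stripped) is restricted."""
    words = (w.strip(".,!?;:\"'").lower() for w in text.split())
    return any(w in RESTRICTED_WORDS for w in words)

def rule_report(comments, now, window_days=30):
    """Count recent comments matching the rule, grouped by how they were moderated."""
    cutoff = now - timedelta(days=window_days)
    matched = [
        c for c in comments
        if c["posted_at"] >= cutoff and contains_restricted_word(c["text"])
    ]
    return len(matched), Counter(c["outcome"] for c in matched)

# Made-up sample data; "outcome" stands in for how a moderator handled the comment.
now = datetime(2024, 6, 30)
comments = [
    {"text": "This is a scam!", "posted_at": datetime(2024, 6, 20), "outcome": "deleted"},
    {"text": "You idiot.", "posted_at": datetime(2024, 6, 25), "outcome": "approved"},
    {"text": "Total scam", "posted_at": datetime(2024, 4, 1), "outcome": "deleted"},
    {"text": "Great article", "posted_at": datetime(2024, 6, 28), "outcome": "approved"},
]

total, by_outcome = rule_report(comments, now)
print(total, dict(by_outcome))  # 2 {'deleted': 1, 'approved': 1}
```

In this made-up sample, half of the matching comments were approved by hand, which would suggest that an automatic delete rule on this word list might be too aggressive.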

How to set your moderation rules

You can set up moderation rules from your Moderation Settings page.

Start by clicking the button labeled + Add Rule to add your first rule.

You can choose any combination of the following:

If the comment matches:

Contains link
Flagged at least five times

If the user matches:

Profile flagged at least five times

Then:

Send to “Pending”
Delete
Mark as spam

To enable the rule, switch its toggle from “Off” to “On.” You can give individual rules different priorities by using the up and down arrows in each rule's left section. If a comment matches multiple rules, the topmost matching rule takes priority.

Click Save to save your current set of moderation rules. Comments affected by a moderation rule will be marked with a reason such as “In Pending because Toxic.”
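The priority behavior can be pictured as a first-match scan over an ordered list of rules. The Python sketch below is only a mental model, not the Disqus implementation: the `Comment` and `Rule` classes, their field names, and the action strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class Comment:
    text: str
    has_link: bool = False
    flag_count: int = 0

@dataclass
class Rule:
    name: str
    matches: Callable[[Comment], bool]
    action: str  # assumed values: "pending", "delete", or "spam"
    enabled: bool = True

def apply_rules(comment: Comment, rules: List[Rule]) -> Optional[Tuple[str, str]]:
    """Return (action, rule name) for the topmost enabled rule that matches, else None."""
    for rule in rules:  # list order stands in for the up/down arrow ordering
        if rule.enabled and rule.matches(comment):
            return rule.action, rule.name
    return None

rules = [
    Rule("Flagged at least five times", lambda c: c.flag_count >= 5, "delete"),
    Rule("Contains link", lambda c: c.has_link, "pending"),
]

# This comment matches both rules; the topmost one decides the outcome.
comment = Comment("Visit my site", has_link=True, flag_count=6)
print(apply_rules(comment, rules))  # ('delete', 'Flagged at least five times')
```

Swapping the two rules in the list would send the same comment to “Pending” instead, which is why rule order deserves a quick check before saving.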

Establishing advanced moderation rules

Available with our Pro or Business plans, our AI-informed Advanced Moderation tooling provides more specific categorization of, and controls over, your site’s comments, allowing you to remove objectionable content with heightened precision and automation. Please see our Moderation Settings and Toxicity Filter documentation for additional moderation tools available to all sites.

Categories

Within Disqus’ Advanced Moderation, there are several different categories of objectionable content. Each category can be restricted or allowed independently, and a single comment can be tagged with more than one category.

The categories are as follows:

Hate Speech
Violent Content
Sexual Content
Bullying
Promotion

Severity

Within each category, comments are also graded based on severity. Comments with a grade of “3” will be the most explicit or extreme content for that category. Comments with a grade of “1” will be the least extreme content that still fits the content category.

Below is a breakdown of the severity ratings within each category:

Hate Speech

3 - Hate Speech: Slurs, promotion of hateful ideology
2 - Slurs: Negative stereotypes or jokes, degrading comments, denouncing slurs, challenging a protected group's morality or identity, violence against religion
1 - Informational: Positive stereotypes, informational statements, reclaimed slurs, references to hateful ideology, immorality of protected group's rights

Violent Content

3 - Intimidation: Serious and realistic threats, mentions of past violence
2 - Instigation: Calls for violence, destruction of property, calls for military action, calls for the death penalty outside a legal setting, mentions of self-harm/suicide
1 - Description: Denouncing acts of violence, soft threats (kicking, punching, etc.), violence against non-human subjects, descriptions of violence, gun usage, abortion, self-defense, calls for capital punishment in a legal setting, destruction of small personal belongings, violent jokes

Sexual Content

3 - Explicit: Intercourse, masturbation, porn, sex toys, and genitalia
2 - Intent & nudity: Sexual intent, nudity, and lingerie
1 - Statements: Informational statements that are sexual, affectionate activities (kissing, hugging, etc.), flirting, pet names, relationship status, sexual insults, and rejecting sexual advances

Bullying

3 - Brutalizing: Slurs or profane descriptors toward specific individuals, encouraging suicide or severe self-harm, severe violent threats toward specific individuals
2 - Profane: Non-profane insults toward specific individuals, encouraging non-severe self-harm, non-severe violent threats toward particular individuals, silencing or exclusion
1 - Insults: Profanity in a non-bullying context, playful teasing, self-deprecation, reclaimed slurs, degrading a person's belongings, bullying toward organizations, denouncing bullying

Promotion (there is only one severity rating for Promotion)

Promotion: Asking for likes/follows/shares, advertising monthly newsletters/special promotions, asking for donations/payments, advertising products, selling pornography, giveaways

The severity descriptions above are also visible in your Moderation Rules section. Simply click into the white space of the rule to view the breakdown for that content category.

Setting up Rules

When setting up moderation rules on the content categories, note that a rule for a specific severity level also applies to all severity levels above it. For example, if you set a rule to delete all comments that match the lowest tier of Bullying (1 - Insults), the rule will also delete comments labeled Bullying 2 (Profane) and Bullying 3 (Brutalizing). If you instead set a rule to delete Bullying 3 comments, only comments matching Bullying 3 are deleted; comments matching Bullying 2 and Bullying 1 are not removed.

You can set a severity level for each category and determine what happens to comments matching that severity and above: they can be automatically deleted, set to “Pending” for moderator review, or marked as spam. For step-by-step instructions, see “How to set your moderation rules” above.
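The “this severity and above” behavior amounts to a simple threshold comparison. The following Python sketch shows that logic under stated assumptions: the rule table, the function name, and the action strings are hypothetical and do not reflect a Disqus API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical rule table: category -> (minimum severity, action).
rules: Dict[str, Tuple[int, str]] = {
    "Bullying": (1, "delete"),
    "Hate Speech": (3, "pending"),
}

def action_for(category: str, severity: int) -> Optional[str]:
    """Return the action if the comment's severity meets or exceeds the rule's level."""
    rule = rules.get(category)
    if rule is None:
        return None  # no rule configured for this category
    min_severity, action = rule
    return action if severity >= min_severity else None

print(action_for("Bullying", 2))     # delete
print(action_for("Hate Speech", 2))  # None
print(action_for("Hate Speech", 3))  # pending
```

A rule set at severity 1 therefore catches everything in its category, while a rule set at severity 3 catches only the most extreme content.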

Monitoring and reviewing comments

Comments will show the category and severity ratings regardless of whether an automated rule removes them. These gradings will appear in tags on the comments in the comments stream of your moderation panel. Viewing the tags on your existing comments can help you understand what automated rules to implement.

Additionally, you can apply moderation filters to view only comments matching one or more of the content categories. For example, if you’d like to view or moderate only comments that contain Bullying content, select the Bullying filter in your moderation panel.
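Conceptually, a category filter keeps every comment whose tags include at least one of the selected categories. Here is a small Python sketch of that idea; the `tags` field (category name mapped to severity) and the function name are assumptions for illustration, not how the moderation panel stores its data.

```python
from typing import Iterable, List

def filter_by_categories(comments: List[dict], categories: Iterable[str]) -> List[dict]:
    """Keep comments tagged with at least one of the selected categories."""
    wanted = set(categories)
    # set(c["tags"]) yields the category names (the dict's keys).
    return [c for c in comments if wanted & set(c["tags"])]

# Made-up comments; each "tags" dict maps a category to its severity rating.
comments = [
    {"id": 1, "tags": {"Bullying": 2}},
    {"id": 2, "tags": {"Sexual Content": 1}},
    {"id": 3, "tags": {"Bullying": 1, "Hate Speech": 3}},
    {"id": 4, "tags": {}},
]

print([c["id"] for c in filter_by_categories(comments, ["Bullying"])])  # [1, 3]
```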

Once your rules are active, clicking into a rule shows the number of comments it has acted on, along with a link to view past comments that fit the category. This can help you determine which rules function as desired and which should be adjusted.