Dealing With Comment Spam

A popular feature on Petition and Signup pages shows recent signers in a scrolling view. This communicates activity, but it also attracts spammers to your Pages. Spammers use your forms to get their content and links onto the Page, even if the displayed content is not rendered as HTML. The ultimate targets are search engine indexers or users careless enough to click the links.

Comment spam presents a real risk to your email reputation. Spammers submit non-deliverable emails at real (and usually large) domains, for example gmail.com and yahoo.com. The emails will bounce the first time you send a Mailing to them, but there is a potential reputation cost to trying to email a large number of bad addresses.

Note

Depending on how and when your ActionKit instance was initially set up, you may have spam checking completely disabled, or you might have it enabled but set to not actually suppress actions that are flagged as spam.

Quickstart

If you're having trouble with comment spam, you should:

  • Verify the Honeypot Check is enabled.
  • Check the Spam Check Log and adjust the Threat Score up or down as needed.
  • Enable Suppress Actions and Unsubscribe Users.
  • Check the Spam Check Log daily at first, then weekly, to make sure you aren't catching legitimate users.
  • Report any bugs!

"Trial Mode"

If you notice a spam problem and want to start using this feature, consider running it in a "trial" mode first. This lets you confirm that it is configured properly: that it catches spam actions and doesn't catch legitimate ones.

  • Enable spam checking, but don’t enable suppression/unsubscribes.
  • Let that run for a few days during which you regularly review the spam check log.
  • Check whether the spam check is too aggressive or not aggressive enough (e.g., is it incorrectly flagging non-spam actions, or missing obvious spam actions?) and tune the spam-check settings to be looser or stricter as needed.
  • Manually suppress and/or unsubscribe actions that you find flagged as spam during this time.

When you feel good about the spam check settings, turn on the site-wide automatic suppression and unsubscribe options. Then come back to the spam check log occasionally to check to see if there are items that got classified as spam that shouldn’t have been, and un-suppress them if needed.

More Details

By default we log actions that match the Honeypot Check with a threat score of 30. You can set the spam checks to Suppress Actions, Unsubscribe Users or both.

The tools include four checks: Honeypot, Content, Email Blacklist, and IP. You can manually enable each check on the Configuration screen.

Spam checking does not create blocks or bans on IPs or email addresses, because spammers rarely use the same email address or IP multiple times. You may manually add an IP to the IP Check Blacklist if you are sure it is a source of spam.

It may occasionally be useful to block specific IPs or set up a narrow Content Check for a short period of time.

Warning

By default, the checks are configured to only log actions to the spam_spamchecklog table. You'll need to enable the Suppress Actions and Unsubscribe Users settings under Spam Check Settings if you want to automatically mark actions as spam.

How Spam Checking Works

Spam checking works by looking at Actions after they have been validated and saved, but before we send out confirmation emails or petition signatures. Any actions that the checks find suspect will be saved to the Spam Check Log. If you've enabled Suppress Actions, actions will be marked as spam, and confirmation, tell-a-friend and signature delivery emails will be suppressed. If you've enabled Unsubscribe Users, the user will be immediately unsubscribed.
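The flow described above can be modeled roughly as follows. This is an illustrative sketch only, not ActionKit's code: the Settings and Action classes, the process_action function, and the link_check example check are hypothetical names, and the 'complete' and 'spam' status values are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Settings:
    suppress_actions: bool = False   # models the "Suppress Actions" option
    unsubscribe_users: bool = False  # models the "Unsubscribe Users" option


@dataclass
class Action:
    email: str
    comment: str = ""
    status: str = "complete"         # assumed status of a normal saved action
    subscribed: bool = True


# A check returns a reason string when it flags the action, else None.
Check = Callable[[Action], Optional[str]]


def process_action(action: Action, checks: List[Tuple[str, Check]],
                   settings: Settings, log: list) -> bool:
    """Run checks on a saved action; return True if follow-up emails may go out."""
    for name, check in checks:
        why = check(action)
        if why is None:
            continue
        # A flagged action is always logged, whatever the settings are.
        log.append({"check": name, "email": action.email, "why": why})
        if settings.suppress_actions:
            action.status = "spam"       # assumed status value
        if settings.unsubscribe_users:
            action.subscribed = False
        # Emails are held back only when suppression is enabled.
        return not settings.suppress_actions
    return True


def link_check(action: Action) -> Optional[str]:
    """Toy content check: flag any comment that contains a link."""
    return "contains a link" if "http" in action.comment else None
```

With the default Settings(), a flagged action is only logged. With Settings(suppress_actions=True, unsubscribe_users=True), the same action is also marked as spam and the user is unsubscribed.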

Spam checking has a time limit when run during action processing. A second pass - with no timeout - occurs offline every few minutes.

You can review spam-checked actions in the Spam Check Log.

The columns listed are:
  • timestamp - the datetime the action was submitted
  • check - the name of the spam check that "caught" this action
  • email - the email of the user who submitted this action, linked to the user's record
  • user comment - the value for the comment custom action field
  • outcome - whether this action was automatically suppressed or not, whitelisted, or marked not spam by staff
If you click on any of the rows in that table, they'll toggle open, revealing some additional information:
  • page - the title of the page, linked to the page editor
  • whitelisted - you'll see a green checkmark if a spam check's whitelist matched this action. Most entries will have a little red icon, meaning they were not whitelisted.
  • why - what the check found that triggered a match
  • reversed - if you've overridden the spam checking, this will be true and you'll see a little green checkmark.
  • reversed_at - the datetime when an admin restored this action and resubscribed the user
  • actions - use the "Not spam" button to mark the action as complete and to resubscribe the user. "This action matched spam filters, but was not suppressed" means the action looks like spam, but the Suppress Actions configuration option was not enabled when this log entry was added. In that case, use the "Delete this action, it's spam" button to delete that spam action.

The log entries are sorted from most recent to least by default.

The user page will also display some information for users who have been marked as comment spammers. Their subscription status will be:

Unsubscribed for Comment Spamming since Wed Mar 11, 2015 [Resubscribe] [Resubscribe and restore actions]

Select Resubscribe to resubscribe the user, but leave their actions as spam. Use Resubscribe and restore actions to resubscribe the user and restore all of their actions.

Spam Checks

You can configure spam checking on the Config page. Each spam check can be enabled and configured independently. By default, spam checking will not modify any actions or users.

Spam Check Settings

Page Types To Check

  • Default: Petition, Letter, Signup, Survey

This configuration option controls which Page Types will be considered for spam checking. Only actions on these types of pages will be checked. If you are having problems on pages not listed, you can add the page types to this list. You should use the values in the 'type' column of the core_page table.

Suppress Actions

  • Default: Off

Enable this option if you want spam checking to mark actions as "spam", to prevent them from being used by the recent actions tag or in petition deliveries.

Note

If you have custom code pulling actions in a report or custom SQL, you can limit your query to core_action rows with status = 'complete' to avoid including spammy actions.
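As a rough illustration of that filter, the sketch below runs the query against an in-memory SQLite stand-in for core_action. The stand-in table has only a few columns and made-up rows, the count_signatures helper is hypothetical, and the 'spam' status value is an assumption. Check the status values in your own instance before relying on it.

```python
import sqlite3

# In-memory stand-in for core_action; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE core_action ("
    "id INTEGER PRIMARY KEY, page_id INTEGER, user_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO core_action (page_id, user_id, status) VALUES (?, ?, ?)",
    [(10, 1, "complete"), (10, 2, "spam"), (10, 3, "complete")],  # sample rows
)


def count_signatures(conn, page_id):
    """Count actions on a page, skipping anything not marked complete."""
    row = conn.execute(
        "SELECT COUNT(*) FROM core_action "
        "WHERE page_id = ? AND status = 'complete'",
        (page_id,),
    ).fetchone()
    return row[0]


print(count_signatures(conn, 10))  # 2: the row with status 'spam' is excluded
```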

Unsubscribe Users

  • Default: Off

Enable this option if you want spam checking to automatically unsubscribe users.

The user subscription history will track these unsubscribes as unsubscribe_spamcheck and you'll be able to re-subscribe users from the user admin page.

Honeypot Spam Checking

Enabled

  • Default: On

Check the submitter IP address against the Honeypot project.

Max Threat Score

  • Default: 30

The Honeypot Project provides a service that rates IP addresses as threats for spam. It returns a value from 0 to 100, roughly equal to the probability the IP is a spammer. A value of 20 or 30 seems to be safe, but we recommend you check the log regularly until you find the right setting.
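When tuning this value, it can help to see how many submissions each candidate threshold would flag. The sketch below is a hypothetical helper, not part of ActionKit: the scores are made up, and it assumes that a score at or above the configured value counts as a match, which may not be how the check compares them.

```python
def is_flagged(threat_score, max_threat_score=30):
    # Assumption: a score at or above the configured value counts as a match.
    return threat_score >= max_threat_score


def flagged_counts(scores, thresholds):
    """For each candidate threshold, count how many scores would be flagged."""
    return {t: sum(is_flagged(s, t) for s in scores) for t in thresholds}


# Made-up threat scores standing in for a review of recent submissions.
sample_scores = [0, 5, 12, 22, 31, 45, 60, 88]
print(flagged_counts(sample_scores, [20, 30, 40]))  # {20: 5, 30: 4, 40: 3}
```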

Content Spam Checking

Enabled

  • Default: Off

Check for actions that contain a word from the banned wordlist.

Words Banned

  • Default: "http https"

This option allows you to provide a list of banned words. We recommend using an initial value of "http https" to filter out any users submitting links. Based on our not very scientific analysis, blocking content with "http" should be very effective against spammers - but will obviously prevent your users from submitting links.

We don't think this is a very effective option for avoiding hateful messages - it's simply too easy to be hateful with bad spelling.

The words are matched as whole words. The following fields are checked:

  • first_name, last_name, address1, address2, city, state, region, country, postal
  • all custom action fields
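Whole-word matching of this kind can be approximated with word-boundary anchors in a Python regular expression. This is only a sketch of the idea: the helper names and the action_comment field name are hypothetical, and case-insensitive matching is an assumption rather than documented behavior.

```python
import re


def build_banned_pattern(words_banned):
    """Compile a whole-word pattern from a space-separated word list."""
    words = [re.escape(w) for w in words_banned.split()]
    # \b anchors each alternative at word boundaries. Case-insensitive
    # matching is an assumption, not documented behavior.
    return re.compile(r"\b(?:" + "|".join(words) + r")\b", re.IGNORECASE)


def find_banned(fields, pattern):
    """Return the names of fields whose value contains a banned word."""
    return [name for name, value in fields.items()
            if value and pattern.search(value)]


pattern = build_banned_pattern("http https")
fields = {
    "first_name": "Pat",
    "city": "Httpville",  # "http" is only part of a longer word: no match
    "action_comment": "Visit http://spam.example now",
}
print(find_banned(fields, pattern))  # ['action_comment']
```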

Email Blacklist Spam Checking

Enabled

  • Default: Off

Check for email addresses that match patterns you specify.

Email Patterns

  • Default: None

Sometimes a lot of spam actions come in with very similar addresses, e.g. all from the same domain, or only varying by some numbers in the address. If you're certain no legitimate actions will come from a group of addresses, you can block them with an email blacklist pattern.

The patterns are Python regular expressions, so you can use the syntax of Python's re module in them. To simply block a whole domain, it's enough to just use "@example.com". Be sure to test any patterns you use here to make sure you aren't blocking more than you intended.
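One way to test a pattern before saving it is to run it against a handful of sample addresses. The sketch below assumes patterns are applied with search semantics (a match anywhere in the address), which is consistent with "@example.com" blocking a whole domain. The blocked helper and the sample addresses are made up for illustration, and case sensitivity may differ in the real check.

```python
import re


def blocked(pattern, emails):
    """Return the addresses that a pattern matches anywhere in the string."""
    # Assumption: search semantics, i.e. the pattern may match at any position.
    regex = re.compile(pattern)
    return [e for e in emails if regex.search(e)]


sample = [
    "alice@example.com",
    "bob@example.com.evil.test",
    "carol@examplexcom.test",
    "spam123@mail.test",
    "spam9@mail.test",
    "dave@mail.test",
]

# An unescaped "." matches any character and nothing anchors the end,
# so this catches more than the one domain.
loose = blocked(r"@example.com", sample)

# Escaping the dot and anchoring with "$" limits the match to example.com.
strict = blocked(r"@example\.com$", sample)

# Addresses that differ only by a run of digits.
numbered = blocked(r"^spam\d+@mail\.test$", sample)
```

Here loose matches the first three addresses, strict matches only alice@example.com, and numbered matches the two spam addresses while leaving dave@mail.test alone.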

IP Spam Checking

Enabled

  • Default: Off

Enable whitelisting and blacklisting of IPs.

IP Whitelist

  • Default: None

Add your own IPs here if you find that you are getting caught mistakenly. If you are using the APIs and have enabled Suppress Actions or Unsubscribe Users, you should add any IPs that host your applications here.

IP Blacklist

  • Default: None

We don't believe managing a list of IPs directly is going to be an effective tool over the long term. However, it may be useful for short periods of time.