HOWTO: How do I block Spam based on URLs in the message body?



This article applies to:

  • MailMarshal SMTP (SEG)

Question:

  • How do I block Spam based on URLs in the message body?
  • What is the MailMarshal URLCensor?
  • How do I pass URL domains to a domain Realtime Blackhole List (RBL)?
  • How do I search a mail message for a certain URL and then block that message?
  • Do I need to pay for any URL domain lookups?

Procedure:

One very effective way of blocking Spam is to extract domain information from URLs (Uniform Resource Locators - basically Weblinks or hyperlinks) in the body of the Spam message.  Typically Spammers quickly move from one IP address to the next when sending bulk mail, making IP blacklists very difficult to maintain.  However, Spammers usually advertise a Website in their messages and they tend to reuse the domains over longer periods.  This allows the URL blacklists to be very accurate.
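To make the idea concrete, here is a minimal Python sketch of the extraction step. It is an illustration only and not MailMarshal's actual code: the function name extract_url_hosts and the regular expression are assumptions made for this example, and only http/https links are handled.

```python
import re
from urllib.parse import urlsplit

# Assumption: only http/https links are considered; real mail may use other schemes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_url_hosts(body):
    """Return the sorted, de-duplicated host names of all URLs found in body."""
    hosts = set()
    for match in URL_PATTERN.finditer(body):
        # Drop punctuation that commonly trails a URL in running text.
        url = match.group(0).rstrip(".,;:!?)")
        try:
            host = urlsplit(url).hostname  # lower-cased, without port or user info
        except ValueError:
            continue  # malformed URL, for example a broken IPv6 literal
        if host:
            hosts.add(host)
    return sorted(hosts)
```

Each host returned this way is a candidate for a blacklist query. A production extractor would also need to handle HTML entities, encoded message parts, and links without a scheme.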

MailMarshal's implementation of URL blacklisting is called URLCensor, and it contains the following advanced features:

  • Local caching of RBL results, to increase speed and to reduce the load on your DNS server.
  • Handling of obfuscated URLs, including decimal, octal, and hexadecimal notations of IP addresses.  Deobfuscated (decoded) URLs are passed to the blacklist.
  • Ability to use any number of RBL databases.
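As an illustration of what decoding an obfuscated IP address involves, the following Python sketch converts decimal, octal, and hexadecimal host notations to dotted-decimal form. It is a hedged example, not the URLCensor algorithm: the function name decode_ip_host is hypothetical, and the parsing rules follow the common inet_aton convention (a leading "0x" means hexadecimal, a leading "0" means octal, and the last part fills the remaining bytes).

```python
def decode_ip_host(host):
    """Return dotted-decimal IPv4 for a numeric host, or None if it is not numeric."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    values = []
    for part in parts:
        if not (part.isascii() and part.isalnum()):
            return None
        try:
            if part.lower().startswith("0x"):
                values.append(int(part, 16))   # hexadecimal, e.g. 0xC0
            elif part.startswith("0") and len(part) > 1:
                values.append(int(part, 8))    # octal, e.g. 0300
            else:
                values.append(int(part, 10))   # plain decimal
        except ValueError:
            return None                        # an ordinary host name
    *leading, last = values
    if any(value > 255 for value in leading):
        return None
    remaining = 4 - len(leading)               # bytes covered by the last part
    if last >= 256 ** remaining:
        return None
    total = 0
    for value in leading:
        total = (total << 8) | value
    total = (total << (8 * remaining)) | last
    return ".".join(str((total >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

For example, the hosts 3232235777, 0xC0.0xA8.0x01.0x01, and 0300.0250.01.01 all decode to 192.168.1.1, while an ordinary host name returns None.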

MailMarshal's native URLCensor is able to extract domain information from URLs in the message body and then pass these domains as a query against any RBL.  The actual RBL Lookup itself is nothing more than a query to your own DNS server.  URL blacklisting is implemented as a simple Category rule. If this rule does not exist, you need to create a rule, similar to the one below, to enable the URLCensor: 

Standard Rule: Block Spam - URLCensor
When a message arrives
Where message is incoming
Where message is categorized as 'URLCensor Blacklisted'
Move the message to 'Spam - URLCensor'

NOTE 1:  For this rule to work correctly, certain files need to be in place. These should be delivered automatically via your SpamCensor update process.  If using MailMarshal 6.x, the following two files should be located in the {install}\config folder:

  • SpamSurbl.dll
  • URLCensor.xml

If using MailMarshal 5.5, the corresponding files will be named:

  • SpamSurbl55.dll
  • URLCensor55.xml

For the purposes of this article we will refer to the file names for MailMarshal 6.0 and above. If you are using MailMarshal 5.5 please keep in mind the different file names when reading the instructions below.

NOTE 2:  A valid DNS server is also required in MailMarshal to perform the RBL queries.  (See Tools | Server & Array Properties | Delivery | Primary DNS server.)

IMPORTANT NOTE:  For most users, the default URLCensor rule should work fine "as-is", without any further configuration.  If you need more advanced or granular control, the following optional settings are available.

Advanced settings

How to configure a different RBL (Blacklist)

The URLCensor's blacklist is configured via the appropriate XML category file in the {install}\config folder.  The default XML file is URLCensor.xml.  Because URLCensor.xml is automatically updated by the SpamCensor updates, we recommend that you do not modify the original file.  Instead, use it as a template for custom XML configuration files.  View URLCensor.xml (or a duplicate of that file created for customization purposes) and note the following Eval element:

<Eval Name="SURBL" Enabled="true" Score="60" Type="SURBL" Description="URLCensor SURBL Blacklisted" Library="SpamSurbl.dll" Domain="multi.surbl.org" />

Note the Domain entry "multi.surbl.org".  This can be any valid URL blacklist.  Other working examples include:

  • block.rhs.mailpolice.com
  • zebl.zoneedit.com
  • in.dnsbl.org

If you wish to use extra blacklists in MailMarshal, you have two different means of deploying them:

  • Multiple RBLs in one MailMarshal rule:  Although simpler to implement, this does not allow discrete reporting on individual blacklists.  Multiple blacklists are activated by adding one extra Eval element for each blacklist required -- these are added within the Evaluations section of a single XML file.
  • One MailMarshal rule for each Blacklist used:  This is slightly more complex to configure, but it allows you to report on and clearly see which blacklists are more effective and which are less effective.  You would need multiple Standard Rules and multiple XML files, using the default URL Blacklist XML file as a template.  If you make a duplicate of this file, you need to modify the following data:
    • In the SpamConfig tag, change:  Name="your new name" Description="your new description"
    • In the Eval element change:  Domain="your new RBL domain name"
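If you prefer to script the duplication rather than edit the copy by hand, the changes listed above could be applied along the lines of this Python sketch. It is only an illustration: the function name customize_template is hypothetical, and the code assumes nothing about the file beyond the SpamConfig and Eval element names and the Name, Description, and Domain attributes mentioned in this article. Note that Python's ElementTree drops XML comments and may change formatting, so compare the result with the template before deploying it.

```python
import xml.etree.ElementTree as ET


def customize_template(xml_text, name, description, rbl_domain):
    """Return a copy of the template XML with a new name, description, and RBL domain."""
    root = ET.fromstring(xml_text)
    # iter() also visits the root element, so this works wherever SpamConfig sits.
    for config in root.iter("SpamConfig"):
        config.set("Name", name)
        config.set("Description", description)
    # Assumption: every Eval in the copy should point at the same new blacklist.
    for evaluation in root.iter("Eval"):
        evaluation.set("Domain", rbl_domain)
    return ET.tostring(root, encoding="unicode")
```

Read the template from the copy of URLCensor.xml, pass its text to the function, and save the result under a new file name in the same folder. All other attributes of the Eval element are left untouched.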

 

How to configure the Timeout

If there are delays in getting a response from the RBL, you may want to lower the timeout on the actual lookup.  Again, this is configured via the Eval element in the XML file, using the LookupRetry parameter.  For more information on using LookupRetry, please refer to Trustwave Knowledgebase article Q10789:

https://www.trustwave.com/support/kb/article.aspx?id=10789
"Unresponsive or slow Real-Time Block List (RBL) causing mail flow problems"

Note:  URL blacklist evaluations and IP RBL lookups behave differently in the event of a timeout.  For URL Blacklists, if a timeout occurs, MailMarshal records in the cache that the domain is not listed.  That domain is not rechecked until its entry is flushed from the cache.
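The timeout behavior described in the note can be pictured with the following Python sketch. It is a simplified model, not MailMarshal code: the function check_domain, the plain dictionary used as a cache, and the lookup callable are all assumptions made for the example.

```python
def check_domain(domain, cache, lookup):
    """Return True if domain is listed, consulting cache before calling lookup."""
    if domain in cache:
        return cache[domain]  # served from cache, no DNS query is made
    try:
        listed = lookup(domain)
    except TimeoutError:
        # As described in the note: a timed-out lookup is cached as "not listed".
        listed = False
    cache[domain] = listed
    return listed
```

The practical consequence is that a domain whose lookup timed out is treated as clean until its cache entry is flushed, so a slow RBL can let messages through for the duration of the cache.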

 

How to configure the Cache

The following Cache settings are configurable via the Eval entry in the XML file. The values listed are defaults and will be applied if these settings do not exist in the Eval element.  Thus, you would only need these to override the MailMarshal defaults:

  • CacheDuration - the length of time entries are kept, in seconds (default is 3600 seconds, or 1 hour).
  • CacheHighWaterMark - the number of Cache entries above which we start to clear the cache (default 2500).
  • CacheLowWaterMark - when clearing the cache, entries are removed until the low watermark threshold has been met (default 2000).

An example custom Eval element in your XML might look like this (note that Cache duration is now set to 7200 seconds, or 2 hours):

<Eval Name="SURBL" Enabled="true" Score="60" Type="SURBL" Description="URLCensor SURBL Blacklisted" CacheDuration="7200" Library="SpamSurbl.dll" Domain="multi.surbl.org" />

Each Eval will have its own results cache, thus cached results are not shared across multiple RBL evaluations.
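The interaction of the three cache settings can be modelled with a short Python sketch. This is a hedged illustration using the default values from this article. The class name ResultCache is hypothetical, and the choice to evict the oldest entries first is an assumption, since the article does not say which entries are removed.

```python
import time
from collections import OrderedDict


class ResultCache:
    """Toy model of a per-Eval results cache with duration and watermarks."""

    def __init__(self, duration=3600, high_water_mark=2500,
                 low_water_mark=2000, clock=time.monotonic):
        self.duration = duration        # CacheDuration, in seconds
        self.high = high_water_mark     # CacheHighWaterMark
        self.low = low_water_mark       # CacheLowWaterMark
        self.clock = clock
        self._entries = OrderedDict()   # domain -> (listed, stored_at)

    def get(self, domain):
        """Return the cached result, or None if it is missing or expired."""
        entry = self._entries.get(domain)
        if entry is None:
            return None
        listed, stored_at = entry
        if self.clock() - stored_at >= self.duration:
            del self._entries[domain]
            return None
        return listed

    def put(self, domain, listed):
        self._entries.pop(domain, None)
        self._entries[domain] = (listed, self.clock())
        if len(self._entries) > self.high:
            # Assumption: the oldest entries are removed first.
            while len(self._entries) > self.low:
                self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)
```

With the defaults, storing the 2501st entry trims the cache back to 2000 entries. Creating one ResultCache per blacklist mirrors the statement that results are not shared across RBL evaluations.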

 

How to customize the expected return code from your RBL list

Most RBLs will give a DNS result of '127.0.0.2' if a domain is blacklisted.  If the RBL list you choose gives a different result, you can customize the Eval to expect the alternative result. The parameter used is Expect, and the default of '127.0.0.2' can be overridden as shown in this example:

<Eval Name="SURBL" Enabled="true" Score="60" Type="SURBL" Description="URLCensor SURBL Blacklisted" Expect="127.0.0.4" Library="SpamSurbl.dll" Domain="multi.surbl.org" />

Note: In MailMarshal SMTP 6.2 and above the Expect parameter has been expanded to allow ranges of responses, as follows:
Expect="x.x.x.x" - a specific IP Address
Expect="x.x.x.x-x.x.x.x" - a range of IP Addresses
Expect="255.255.255.255" - all IP Addresses
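The three forms of the Expect parameter can be read as the following matching logic, shown here as a Python sketch. It is an interpretation rather than the product's implementation: the function name matches_expect is hypothetical, and treating the range as inclusive at both ends is an assumption.

```python
import ipaddress


def matches_expect(answer, expect="127.0.0.2"):
    """Return True if the DNS answer satisfies the Expect setting."""
    ip = ipaddress.IPv4Address(answer)
    if expect == "255.255.255.255":
        return True  # any answer counts as listed
    if "-" in expect:
        low, high = (ipaddress.IPv4Address(part.strip())
                     for part in expect.split("-", 1))
        return low <= ip <= high  # assumption: both ends are inclusive
    return ip == ipaddress.IPv4Address(expect)
```

For instance, an answer of 127.0.0.4 does not match the default of 127.0.0.2, but it does match Expect="127.0.0.2-127.0.0.10" and Expect="255.255.255.255".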

 

How to combine URL blacklisting as part of your SpamCensor

You may prefer to include the URL blacklist check in your SpamCensor total score.  If so, you do NOT need the Standard MailMarshal rule listed above -- the RBL check is called as part of your main SpamCensor rule.  To implement this, back up {install}\Config\UserDefined.xml, then edit it and note how the sample SpamCop evaluation is configured.  All you need to do is add the URL Eval element to the Evaluations section of UserDefined.xml.

 

How to manually confirm your DNS server is able to perform RBL lookups

For more information on using NSLookup to verify that an entry is listed on the RBL, and to determine if your DNS server is actually performing the checks correctly (some ISPs block RBL lookups on their server to reduce load on their DNS servers), please review the following Trustwave Knowledgebase Article:

https://www.trustwave.com/support/kb/article.aspx?id=10737
"SpamCop or SpamHaus is not blocking any Spam."

For example, if you wished to check whether spammer.com is listed on multi.surbl.org, you would use NSLookup to perform a forward lookup on spammer.com.multi.surbl.org.
The surbl.org site also provides the following test points for verifying your configuration:

  • Name:     SURBL-org-permanent-test-point.com.sc.surbl.org
    • Address:  127.0.0.2
  • Name:     2.0.0.127.sc.surbl.org
    • Address:  127.0.0.2
  • Name:     test.sc.surbl.org.sc.surbl.org
    • Address:  127.0.0.2
  • Name:     test.surbl.org.sc.surbl.org
    • Address:  127.0.0.2
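The same check can be scripted. The Python sketch below builds the query name in the way the examples above suggest (the domain followed by the blacklist zone, with the octets reversed for an IP address) and resolves it with the system resolver. The function names rbl_query_name and is_listed are made up for this example, and the default expected answer of 127.0.0.2 is taken from this article. Some lists return other values.

```python
import ipaddress
import socket


def rbl_query_name(host, zone):
    """Build the DNS name to look up for host in the given blacklist zone."""
    try:
        ip = ipaddress.IPv4Address(host)
    except ValueError:
        return f"{host.rstrip('.')}.{zone}"          # ordinary domain name
    # IP addresses are queried with their octets reversed.
    return ".".join(reversed(str(ip).split("."))) + "." + zone


def is_listed(host, zone, expect="127.0.0.2", resolve=socket.gethostbyname):
    """Return True if the lookup answers with the expected address."""
    try:
        return resolve(rbl_query_name(host, zone)) == expect
    except OSError:
        return False  # no such name (not listed) or the lookup failed
```

Calling is_listed("test.surbl.org", "sc.surbl.org") queries the last test point shown above. A result of False for a known test point suggests that your DNS server is not performing RBL lookups correctly.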

Note:

Most Blacklist providers request that, if you process a high volume of email, you should set up a local caching DNS server.  Providers may also require a subscription fee. For example, surbl.org requests that you use a local DNS server and subscribe to a feed if you generate more than a few hundred thousand requests per day.  See the following for more information:

Marshal provides the technology to query the SURBL information. Any licensing required to use this information should be arranged between the MailMarshal customer and the list provider.

It is possible to significantly reduce the number of DNS Blacklist queries by running the URLCensor rule after a majority of junk has been removed by other rules, such as your SpamCensor rule and your rule to allow only valid recipients (if you have one).

This article was previously published as:
NETIQKB44382


Details
Article ID: 10236
Last Modified: 3/31/2009
Type: HOWTO