Trustwave Knowledge Base

HOWTO: How do I block Spam based on URLs in the message body?

This article applies to:

Trustwave MailMarshal (SEG)

Question:

How do I block Spam based on URLs in the message body?
What is the MailMarshal URLCensor?
How do I pass URL domains to a DNS based URL Blocklist?
How do I search a mail message for a certain URL and then block that message?
Do I need to pay for any URL domain lookups?

Procedure:

One very effective way of blocking Spam is to extract domain information from URLs (Uniform Resource Locators, that is, web links or hyperlinks) in the body of the Spam message. Spammers typically move quickly from one IP address to the next when sending bulk mail, which makes blocking spam by source IP very difficult. However, spammers usually advertise a website in their messages, and they tend to reuse the same domains over longer periods. This allows URL blocklists to be very accurate.

MailMarshal's implementation of URL blocklist checking is called URLCensor, and it contains the following advanced features:

Local caching of RBL results, to increase speed and to reduce the load on your DNS server.
Handling of obfuscated URLs, including decimal, octal, and hexadecimal notations of IP addresses. Deobfuscated (decoded) URLs are passed to the blocklist.
Ability to use any number of RBL databases.
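
As an illustration of the obfuscated notations mentioned above, the following Python sketch decodes an IPv4 host written in decimal, octal, or hexadecimal form into the usual dotted notation. It is not the URLCensor implementation, which is not documented here. The function names are hypothetical, and the parsing rules (a leading 0 means octal, a leading 0x means hexadecimal, and the last component fills the remaining bytes) are an assumption based on common URL parsing behavior.

```python
import ipaddress
from typing import Optional


def parse_component(part: str) -> int:
    """Parse one host component as decimal, octal (leading 0), or hex (0x)."""
    if not (part.isascii() and part.isalnum()):
        raise ValueError(f"not a numeric component: {part!r}")
    lowered = part.lower()
    if lowered.startswith("0x"):
        return int(lowered, 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)
    return int(lowered, 10)


def deobfuscate_host(host: str) -> Optional[str]:
    """Return the dotted form of a numeric IPv4 host, or None for a hostname."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    try:
        values = [parse_component(p) for p in parts]
    except ValueError:
        return None  # ordinary hostname such as example.com
    *head, last = values
    # Assumption: leading components are single bytes and the final
    # component covers all remaining bytes of the 32-bit address.
    if any(v > 255 for v in head):
        return None
    if last >= 256 ** (4 - len(head)):
        return None
    number = last
    for index, value in enumerate(head):
        number += value << (8 * (3 - index))
    return str(ipaddress.IPv4Address(number))
```

With these assumptions, the hosts 2130706433, 0x7f.0.0.1, and 0177.0.0.01 all decode to 127.0.0.1, while a normal hostname returns None and would be handled as a domain.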

MailMarshal's native URLCensor is able to extract domain information from URLs in the message body and then pass these domains as a query against any RBL. The actual RBL Lookup is a query to your own DNS server. Use of the URL blocklist is implemented as a simple Category rule. If this rule does not exist, you need to create a rule, similar to the one below, to enable the URLCensor:

Standard Rule: Block Spam - URLCensor
When a message arrives
Where message is incoming
Where message is categorized as 'URLCensor Blacklisted'
Move the message to 'Spam - URLCensor'

NOTE 1: For this rule to work correctly, certain files need to be in place. These should be delivered automatically via your SpamCensor update process. The following two files should be located in the {install}\config folder:


    SpamSurbl.dll 
    URLCensor.xml

NOTE 2: A responsive and well connected DNS server is essential to perform the RBL queries. Ensure the DNS server configured for delivery has good access for external queries.

Advanced settings

IMPORTANT NOTE: For most users, the default URLCensor rule should work fine "as-is", without any further configuration. If you need more advanced or granular control, additional optional settings are available.

For additional settings information see the anti-spam whitepapers available on the SEG documentation page (requires login).

How to configure a different RBL (Blocklist)

The URLCensor's blocklist is configured via the appropriate XML category file in the {install}\config folder. The default XML file is URLCensor.xml. Because URLCensor.xml is automatically updated by the SpamCensor updates, we recommend that you do not modify the original file. Instead, use it as a template for custom XML configuration files. View URLCensor.xml (or a duplicate of that file created for customization purposes) and note the following type of Eval element:

<Eval Name="SURBL" Enabled="true" Score="60" Type="SURBL" Description="URLCensor SURBL Blacklisted" Library="SpamSurbl.dll" Domain="multi.surbl.org" />

Note the Domain entry "multi.surbl.org". This can be any valid URL blocklist.

If you wish to use extra blocklists in MailMarshal, you have two different means of deploying them:

Multiple RBLs in one MailMarshal rule: Although simpler to implement, this does not allow discrete reporting on individual blocklists. Multiple blocklists are activated by adding one extra Eval element for each blocklist required -- these are added within the Evaluations section of a single XML file.
One MailMarshal rule for each Blocklist used: This is slightly more complex to configure, but it allows you to report on and clearly see which blocklists are more effective and which are less effective. You would need multiple Standard Rules and multiple XML files, using the default URL Blocklist XML file as a template. If you make a duplicate of this file, you need to modify the following data:
- In the SpamConfig tag, change: Name="your new name" Description="your new description"
- In the Eval element change: Domain="your new RBL domain name"
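
The two changes above could also be scripted when several copies are needed. The Python sketch below is only an illustration. It assumes that SpamConfig is the root element and that the Eval elements sit inside an Evaluations element; the real URLCensor.xml may contain more than this, so check the output against your own file. The function name, the sample XML, and the rbl.example.org domain are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed minimal layout of a category file; the real file may differ.
SAMPLE_XML = """<SpamConfig Name="URLCensor" Description="URLCensor Blacklisted">
  <Evaluations>
    <Eval Name="SURBL" Enabled="true" Score="60" Type="SURBL" Description="URLCensor SURBL Blacklisted" Library="SpamSurbl.dll" Domain="multi.surbl.org" />
  </Evaluations>
</SpamConfig>"""


def customize_category_xml(xml_text, name, description, rbl_domain):
    """Return a copy of the category XML with a new name and RBL domain."""
    root = ET.fromstring(xml_text)
    if root.tag != "SpamConfig":
        raise ValueError("expected a SpamConfig root element")
    # Change the category name and description in the SpamConfig tag.
    root.set("Name", name)
    root.set("Description", description)
    evals = list(root.iter("Eval"))
    if not evals:
        raise ValueError("no Eval element found")
    # Point every Eval at the new blocklist domain.
    for element in evals:
        element.set("Domain", rbl_domain)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(customize_category_xml(
        SAMPLE_XML, "URLCensor Custom", "Custom URL blocklist", "rbl.example.org"))
```

Note that ElementTree discards XML comments when parsing, so copying the file by hand may be preferable if the template contains comments you want to keep.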

How to configure the Timeout

If there are delays in getting a response from the RBL, you may want to lower the timeout on the actual lookup. Again this is configured via the Eval in the XML file using the LookupRetry data parameter. For more information on using LookupRetry, please refer to Trustwave Knowledgebase article Q10789: "Unresponsive or slow Real-Time Block List (RBL) causing mail flow problems"

Note: URL blocklist evaluations and IP RBL lookups behave differently in the event of a timeout. For URL Blocklists, if a timeout occurs, MailMarshal records in the cache that the domain is not listed. That domain is not rechecked until its entry is flushed from the cache.
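
The practical effect of this behavior can be seen in a small Python sketch. This is not MailMarshal code. The check_domain function and the lookup callback are hypothetical, and a plain dictionary stands in for the results cache.

```python
def check_domain(domain, cache, lookup):
    """Return True if the domain is listed, using cached results when present.

    lookup is a hypothetical callable that returns True or False, or raises
    TimeoutError when the blocklist does not answer in time.
    """
    if domain in cache:
        return cache[domain]
    try:
        listed = lookup(domain)
    except TimeoutError:
        # As described above: a timeout is cached as "not listed".
        listed = False
    cache[domain] = listed
    return listed
```

In this sketch, a listed domain whose first lookup times out is treated as clean on every later check until its cache entry is removed, which is one reason a responsive DNS server matters.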

How to configure the Cache

The following Cache settings are configurable via the Eval entry in the XML file. The values listed are defaults and will be applied if these settings do not exist in the Eval element. Thus, you would only need these to override the MailMarshal defaults:

CacheDuration - the length of time entries are kept, in seconds (default is 3600 seconds, or 1 hour).
CacheHighWaterMark - the number of Cache entries above which we start to clear the cache (default 2500).
CacheLowWaterMark - when clearing the cache, entries are removed until the low watermark threshold has been met (default 2000).

An example custom Eval element in your XML might look like this (note that Cache duration is now set to 7200 seconds, or 2 hours):

<Eval Name="SURBL" Enabled="true" Score="60" Type="SURBL" Description="URLCensor SURBL Blacklisted" CacheDuration="7200" Library="SpamSurbl.dll" Domain="multi.surbl.org" />

Each Eval will have its own results cache, thus cached results are not shared across multiple RBL evaluations.
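
To show how the three cache settings interact, here is a minimal Python sketch of a cache with an expiry time and high and low water marks. It is an illustration only. The class name is hypothetical, and removing the oldest entries first is an assumption, since the article does not say which entries are cleared.

```python
import time


class LookupCache:
    """Toy results cache with CacheDuration and water mark behavior."""

    def __init__(self, duration=3600, high_water_mark=2500,
                 low_water_mark=2000, clock=time.monotonic):
        self.duration = duration
        self.high_water_mark = high_water_mark
        self.low_water_mark = low_water_mark
        self._clock = clock
        self._entries = {}  # domain -> (listed, stored_at), insertion ordered

    def __len__(self):
        return len(self._entries)

    def get(self, domain):
        """Return the cached result, or None if missing or expired."""
        entry = self._entries.get(domain)
        if entry is None:
            return None
        listed, stored_at = entry
        if self._clock() - stored_at >= self.duration:
            del self._entries[domain]
            return None
        return listed

    def put(self, domain, listed):
        """Store a result and trim the cache if it passes the high water mark."""
        self._entries.pop(domain, None)
        self._entries[domain] = (listed, self._clock())
        if len(self._entries) > self.high_water_mark:
            # Assumption: the oldest entries are removed first.
            while len(self._entries) > self.low_water_mark:
                oldest = next(iter(self._entries))
                del self._entries[oldest]
```

With the defaults, the cache grows to 2500 entries, and the next insertion trims it back to 2000. Creating one LookupCache per Eval mirrors the point that results are not shared between RBL evaluations.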

How to customize the expected return code from your RBL list

Most RBLs will give a DNS result of '127.0.0.2' if a domain is listed. If the RBL list you choose gives a different result, you can customize the Eval to expect the alternative result. The parameter used is Expect, and the default of '127.0.0.2' can be overridden as shown in this example:

<Eval Name="SURBL" Enabled="true" Score="60" Type="SURBL" Description="URLCensor SURBL Blacklisted" Expect="127.0.0.4" Library="SpamSurbl.dll" Domain="multi.surbl.org" />
Note: The Expect parameter allows ranges of responses, as follows:
Expect="x.x.x.x" - a specific IP Address
Expect="x.x.x.x-x.x.x.x" - a range of IP Addresses
Expect="255.255.255.255" - all IP Addresses
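
The three forms of the Expect parameter can be read as a simple matching rule. The Python sketch below shows one plausible interpretation. The function name is hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
import ipaddress


def matches_expect(answer, expect="127.0.0.2"):
    """Return True if a DNS answer satisfies an Expect value."""
    address = ipaddress.IPv4Address(answer)
    if expect == "255.255.255.255":
        return True  # any returned address counts as listed
    if "-" in expect:
        low_text, high_text = expect.split("-", 1)
        low = ipaddress.IPv4Address(low_text.strip())
        high = ipaddress.IPv4Address(high_text.strip())
        # Assumption: the range includes both end addresses.
        return low <= address <= high
    return address == ipaddress.IPv4Address(expect)
```

For example, under this interpretation an answer of 127.0.0.4 matches Expect="127.0.0.2-127.0.0.8" but does not match the default of 127.0.0.2.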

How to use URL blocklist results as part of your SpamCensor

You may prefer to include the URL blocklist check as part of your SpamCensor total score. If so, you would NOT need the Standard MailMarshal rule listed above -- the RBL check is called as part of your main SpamCensor rule. To implement this, back up {install}\Config\UserDefined.xml and then add the Eval entry for the RBL blocklist to that file. Edit UserDefined.xml and note how the sample SpamCop evaluation is configured. All you need to do is add the URL Eval element to the Evaluations section of UserDefined.xml.

How to manually confirm your DNS server is able to perform RBL lookups

For more information on using NSLookup to verify that an entry is listed on the RBL, and to determine if your DNS server is actually performing the checks correctly (some ISPs block RBL lookups on their server to reduce load on their DNS servers), please review Trustwave Knowledgebase Article Q10737: "Reputation service not blocking any email."

For example, if you wished to check whether spammer.com is listed on multi.surbl.org, you would use NSLookup to perform a forward lookup on spammer.com.multi.surbl.org.
The surbl.org site also provides the following test points for verifying your configuration:

Name: SURBL-org-permanent-test-point.com.sc.surbl.org
- Address: 127.0.0.2
Name: 2.0.0.127.sc.surbl.org
- Address: 127.0.0.2
Name: test.sc.surbl.org.sc.surbl.org
- Address: 127.0.0.2
Name: test.surbl.org.sc.surbl.org
- Address: 127.0.0.2
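
The same check can be scripted. The Python sketch below builds the query name in the way described above and resolves it through the system resolver. It is an illustration rather than a replacement for the NSLookup procedure. The function names are hypothetical, and the reversed octet order for IP addresses follows the usual DNS blocklist convention seen in the 2.0.0.127 test point.

```python
import ipaddress
import socket


def blocklist_query_name(host, zone="multi.surbl.org"):
    """Build the DNS name to look up for a domain or IPv4 address."""
    host = host.strip().rstrip(".")
    try:
        address = ipaddress.IPv4Address(host)
    except ValueError:
        # A domain name is simply prefixed to the blocklist zone.
        return f"{host}.{zone}"
    # An IPv4 address is queried with its octets reversed.
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def blocklist_answer(host, zone="multi.surbl.org"):
    """Return the address the blocklist answers with, or None if not listed."""
    try:
        return socket.gethostbyname(blocklist_query_name(host, zone))
    except socket.gaierror:
        return None


if __name__ == "__main__":
    # Uses a test point listed above; requires working DNS access.
    print(blocklist_answer("test.surbl.org", zone="sc.surbl.org"))
```

A result of None means only that no answer was returned. That can also happen when the DNS server blocks or fails RBL lookups, so compare against a known test point before concluding that a domain is not listed.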

Note:

Most Blocklist providers request that, if you process a high volume of email, you should set up a local caching DNS server. Providers may also require a subscription fee. For example, surbl.org requests that you use a local DNS server and subscribe to a feed if you generate more than a few hundred thousand requests per day. See the following for more information:

http://www.surbl.org/
- In particular, note the usage policy with regard to high volume sites.

Trustwave provides the technology to query the SURBL information. Any licensing required to use this information should be arranged between the MailMarshal customer and the list provider.

It is possible to significantly reduce the number of DNS Blocklist queries by running the URLCensor rule after a majority of junk has been removed by other rules, such as your SpamCensor rule and your rule to allow only valid recipients (if you have one).

Also, SEG caches DNS responses, which helps to minimize queries to external servers.