Lead Image © Konstantin Inozemtcev, 123RF.com

Spam protection using SpamAssassin

Well Filtered

Article from ADMIN 28/2015
By
The intelligent, modular SpamAssassin email filter provides a variety of advanced tests for detecting unwanted junk email.

Spam in your Inbox at home is a nuisance you can hardly avoid, but what is merely irritating at home is a genuine problem in the business world. The proportion of email advertising messages can be greater than 50 percent, forcing employees to check every piece of email and manually dump at least every second message in the Trash. Spam is a dynamic, not a static, problem, and spammers usually respond very quickly and cleverly to countermeasures of any kind.

You can get the flood of advertising email under control to a certain degree with the use of spam filters. Much like antivirus programs, spam protection needs to be updated continually if it is to provide protection. Ideally, the filter should be located in the enterprise at the central node through which incoming email traffic runs and where the most efficient filtering is possible.

Combination of Techniques

Intelligent spam filters like SpamAssassin [1] employ various solutions (Figure 1). Black and white lists explicitly exclude or include email addresses. A content filter checks the header and body content. Statistical tests and URL block lists are also used for spam detection and subsequent processing.

Figure 1: SpamAssassin uses a variety of databases for analyzing and assessing email. You can come to grips with unwanted advertising email using a complex set of rules.

Sophisticated solutions like SpamAssassin use Bayesian filters, which are self-learning text filters that cull junk email on the basis of content – in theory, at least. In practice, however, filters suffer from significant error rates, particularly false positives, which relegate legitimate email to the Junk mailbox.

In principle, email filtering can take place on the client side or the server side, and often the two approaches are combined. The best option to protect as many users as possible from unwanted email is to implement server-side spam filtering. The Mail Transfer Agent (MTA) receives the incoming email and passes it to the spam filter, which returns it to the MTA. Depending on the result of the test, the email is either sorted into the user's mailbox or moved to a special Spam folder. The client then retrieves the filtered mail but can also access the rejected mail if required.

SpamAssassin filters email in two phases: Phase 1 detects spam, and phase 2 processes the email classified as spam. The SpamAssassin spam detector adds a corresponding note to the message headers, and the MTA then acts on this information.
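The second phase can be sketched in a few lines of shell. The sketch assumes that the filter has tagged spam with an X-Spam-Flag: YES header, which SpamAssassin adds by default to messages above the threshold. The function name classify_message and the folder names are hypothetical, and a real setup would leave this decision to the MTA or delivery agent.

```shell
#!/bin/sh
# Hypothetical helper: read one already-filtered message on stdin and
# print the name of the folder it should be delivered to.
classify_message() {
    # Inspect only the header block, which ends at the first empty line,
    # so that body text cannot fake the flag.
    if sed '/^$/q' | grep -qi '^X-Spam-Flag:[[:space:]]*YES'; then
        echo "Spam"
    else
        echo "Inbox"
    fi
}

# Usage example with an inline test message.
printf 'From: a@example.com\nX-Spam-Flag: YES\n\nBuy now\n' | classify_message
```

Because only the headers are checked, a message that merely quotes the flag in its body still lands in the Inbox.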

The core of SpamAssassin is a rules engine that applies previously established rules, so you can determine which detection methods are used (e.g., Bayesian filtering, the network test, or the whitelist). SpamAssassin comes with simple text files containing a standard set of rules. Both users and administrators can modify these rules. The Bayes filter – a key component of SpamAssassin – uses its own database with statistical data from previously processed spam and legitimate email. The auto blacklist/whitelist in turn creates its own database.
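The principle behind such a rules engine can be illustrated with a toy version: Each rule that matches adds its score to a running total, and the total is compared with a threshold. The rule names, patterns, and scores below are invented for illustration and are not part of SpamAssassin's rule set; only the threshold of 5.0 corresponds to SpamAssassin's default required score.

```shell
#!/bin/sh
# Toy rules engine: every rule has a name, a score, and a regular expression.
# Rule names, scores, and patterns are invented for illustration only.
score_message() {
    # $1 = spam threshold; the message is read from stdin.
    awk -v threshold="$1" '
        BEGIN {
            pattern["TOY_FREE_OFFER"] = "free offer"; score["TOY_FREE_OFFER"] = 2.5
            pattern["TOY_CLICK_HERE"] = "click here"; score["TOY_CLICK_HERE"] = 1.5
            pattern["TOY_WINNER"]     = "winner";     score["TOY_WINNER"]     = 2.0
        }
        {
            line = tolower($0)
            # Each rule counts at most once per message.
            for (name in pattern)
                if (!(name in hit) && line ~ pattern[name]) {
                    hit[name] = 1
                    total += score[name]
                }
        }
        END {
            verdict = (total >= threshold) ? "spam" : "ham"
            printf "%.1f %s\n", total, verdict
        }
    '
}

# Usage example: all three toy rules match, so the total reaches 6.0.
printf 'Subject: You are a WINNER\n\nClick here for your free offer\n' |
    score_message 5.0
```

The real engine works with far more rules, including header, body, and network tests, but the additive scoring idea is the same.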

Commissioning SpamAssassin

The SpamAssassin spam filter is available via the package manager of any common Linux distribution – Debian, openSUSE, or another platform. The installation is simple. To install SpamAssassin manually, download the current archive and copy it to $HOME/src as in Listing 1.

Listing 1

Installing SpamAssassin

mkdir -p "$HOME/src"
cd "$HOME/src"
wget http://www.apache.org/dist/spamassassin/Mail-SpamAssassin-3.4.0.tar.gz
tar xvzf Mail-SpamAssassin-3.4.0.tar.gz
cd Mail-SpamAssassin-3.4.0
perl Makefile.PL PREFIX=$HOME && make && make install

Confirm four times by pressing Enter, and make sure you are using the version just installed, which you should find in /home/user_name/bin/spamassassin.

Next, you should perform a test to see whether the spam filter was installed correctly:

spamassassin < \
  $HOME/src/Mail-SpamAssassin-3.4.0/sample-spam.txt

Output appears on the console telling you that SpamAssassin is creating the user settings file and ensuring that the environment is functional. As with many other environments, no separate configuration is necessary.

After the installation, you can turn to the central configuration file local.cf, which you will usually find in the directory /etc/mail/spamassassin. The file looks roughly as shown in Figure 2.

Figure 2: The central configuration of the SpamAssassin environment takes place in local.cf but does not usually require any changes.
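As a rough illustration of the kind of settings found in such a file, the following sketch writes a handful of commonly used options to a scratch file and, if SpamAssassin is installed, runs its syntax check against them. The option names are standard SpamAssassin settings, but the values and the use of a scratch file are assumptions for demonstration and not the contents of Figure 2.

```shell
#!/bin/sh
# Write an illustrative configuration fragment to a scratch file.
# The values are assumptions for demonstration, not recommended settings.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Score at which a message is treated as spam (5.0 is the shipped default)
required_score 5.0
# Prefix the subject line of detected spam
rewrite_header Subject *****SPAM*****
# Attach the original message to a report (1) instead of only tagging it (0)
report_safe 1
# Enable the Bayes classifier and let it learn automatically
use_bayes 1
bayes_auto_learn 1
EOF

# If SpamAssassin is installed, let it check the syntax of the fragment.
if command -v spamassassin >/dev/null 2>&1; then
    if spamassassin --prefspath="$conf" --lint; then
        echo "syntax OK"
    else
        echo "lint reported problems"
    fi
fi
```

Testing against a scratch file keeps experiments away from the live configuration in /etc/mail/spamassassin.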

A server-side mailbox for spam lets users view suspicious messages on the server before deciding whether to download them. You can also use the SpamAssassin Configuration Generator [2] to create your own configuration. It provides a web form in which you define the key settings of the SpamAssassin configuration and then export the configuration file.

Optimizing the Spam Filter

Once you have set up a functional filter system, the next step is to optimize the environment. The main problem with using SpamAssassin is how to prevent or minimize false positives. Spammers have also learned over the years and camouflage their advertising messages increasingly well. You need to consider several aspects to reduce the number of messages that are incorrectly identified as spam. First, when you send mail, make sure not to use suspicious subject lines or content. Receivers can work with whitelists or change the scores that SpamAssassin assigns. Administrators should optimize the use of the Bayes filter in particular.
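One common way to tune the Bayes filter is to train it with hand-sorted mail using the sa-learn tool that ships with SpamAssassin. The following sketch wraps the two training calls in a function; the function name train_bayes, the SA_LEARN override variable, and the Maildir paths are hypothetical and need to be adapted to the local mail layout.

```shell
#!/bin/sh
# Train the Bayes database from hand-sorted mail.
# SA_LEARN is a hypothetical override so the calls can be dry-run with echo.
SA_LEARN=${SA_LEARN:-sa-learn}

train_bayes() {
    spam_dir=$1   # directory with messages known to be spam
    ham_dir=$2    # directory with messages known to be legitimate
    "$SA_LEARN" --spam "$spam_dir" &&
    "$SA_LEARN" --ham "$ham_dir"
}

# Dry run in a subshell: echo stands in for sa-learn and prints the arguments.
( SA_LEARN=echo; train_bayes "$HOME/Maildir/.Spam/cur" "$HOME/Maildir/cur" )
```

Feeding the filter both spam and legitimate mail matters, because a database trained on spam alone tends to push borderline messages toward the wrong side.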

One goal of the SpamAssassin developers is to make static whitelisting redundant. SpamAssassin has included the TxRep plugin (reputation plugin) since spring 2014; thanks to its advanced functions, it replaces the auto-whitelist (AWL) plugin.

Like its predecessor, TxRep tracks the scores of previously received messages and adjusts them as necessary. The status can, however, change for senders who were previously regarded as harmless: Once their rating passes a certain level, they are treated as spam distributors. In contrast to AWL, the TxRep plugin is capable of learning. AWL is already disabled in current versions. You can switch it on with

use_auto_whitelist 1

if you do still want to use it.
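The basic idea behind this kind of reputation tracking, pulling the score of a new message toward the sender's historical average, can be sketched as follows. The formula follows AWL's documented behavior, in which a factor of 0.5 moves the score halfway toward the mean; TxRep's weighting is more elaborate and is not reproduced here. The function name adjust_score is hypothetical.

```shell
#!/bin/sh
# Sketch of an AWL-style adjustment: move the current score toward the
# sender's long-term mean. adjust_score is a hypothetical helper name.
adjust_score() {
    # $1 = score of the current message
    # $2 = sender's historical mean score
    # $3 = factor between 0 and 1 (defaults to 0.5)
    awk -v score="$1" -v mean="$2" -v factor="${3:-0.5}" \
        'BEGIN { printf "%.2f\n", score + (mean - score) * factor }'
}

# A normally harmless sender (mean 1.0) sends a message that scores 8.0.
adjust_score 8.0 1.0    # prints 4.50
```

The same mechanism works in the other direction: A sender with a poor history sees even innocuous-looking messages pushed closer to the spam threshold.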
