Greylisting with Exim 4 and Spamassassin
Like many people, I have a huge problem with junk email. The spam keeps flooding in, and while spamassassin running under exiscan on Exim keeps the worst of it (some 500-600 messages a day) at bay, a number still slip through.
The approach spamassassin takes to scoring spam is to add up a number of clues - does it contain known spam phrases? does it look like it's trying to obfuscate text? has the message come from a spam-happy ISP?
These clues are great since they allow you to set a threshold at which you want to block messages, but that's also the approach's downfall - where do you draw the line? Set the spam threshold too high and a lot of junk messages get through. Too low, and you risk blocking legitimate email.
I have the threshold set at 6, and this seems a reasonable compromise, but I also have some additional spamassassin rules to bump up the score of some dodgy email, which helps. Unfortunately, a lot of the spam that gets through is scored in the 2-5 points range - it's a bit spammy, but not spammy enough.
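To make the scoring idea concrete, here is a minimal sketch of "add up the clues and compare against a threshold". The rule names and weights are made up for illustration and are not real spamassassin rules; only the threshold of 6 comes from my own setup.

```python
# Hypothetical rule names and weights, for illustration only;
# real spamassassin rules and scores differ.
RULE_SCORES = {
    "KNOWN_SPAM_PHRASE": 3.0,
    "OBFUSCATED_TEXT": 2.0,
    "SPAM_HAPPY_ISP": 1.5,
}


def spam_score(matched_rules):
    """Sum the weights of every rule that matched the message."""
    return sum(RULE_SCORES[name] for name in matched_rules)


def is_spam(matched_rules, threshold=6.0):
    """A message counts as spam once its total reaches the threshold."""
    return spam_score(matched_rules) >= threshold


# A "bit spammy, but not spammy enough" message: two clues, 4.5 points.
borderline = ["KNOWN_SPAM_PHRASE", "SPAM_HAPPY_ISP"]
print(spam_score(borderline), is_spam(borderline))
```

The borderline message lands in exactly the awkward 2-5 range: clearly suspicious, yet below the blocking threshold.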
So what's greylisting got to do with it?
Greylisting is an interesting new technique to help foil the spammers - it simply delays messages by answering the first delivery attempt with an SMTP temporary failure ("try again later"), and if the sending server retries after a delay of an hour or so, the message is let through. The theory is that most spam-sending engines aren't really very good MTAs - if they're denied the first time, they won't bother to retry. While this will inevitably change, the additional hour also means that a previously unknown spammer IP might have been added to the DNS blacklists by then - another clue for spamassassin.
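The mechanism is small enough to sketch in a few lines. This is only an illustration of the idea, not how greylistd is implemented: the `Greylist` class is hypothetical, it keeps its state in memory, and it keys on the sending IP alone (as my ACL further down does) rather than on anything richer.

```python
import time


class Greylist:
    """Minimal in-memory greylist keyed by sending host IP (illustrative)."""

    def __init__(self, delay_seconds=3600):
        self.delay_seconds = delay_seconds
        self.first_seen = {}  # ip -> timestamp of the first delivery attempt

    def check(self, ip, now=None):
        """Return 'defer' (temporary failure, try later) or 'accept'."""
        now = time.time() if now is None else now
        # Remember when this host first knocked; later calls reuse that time.
        first = self.first_seen.setdefault(ip, now)
        if now - first < self.delay_seconds:
            return "defer"
        return "accept"


greylist = Greylist(delay_seconds=3600)
print(greylist.check("192.0.2.10", now=0))     # first attempt: defer
print(greylist.check("192.0.2.10", now=600))   # retried too soon: defer
print(greylist.check("192.0.2.10", now=3600))  # retried after the delay: accept
```

A sender that never retries simply never gets past the first "defer", which is the whole point.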
The downside with greylisting is the delay - you don't really want legitimate email being delayed by an hour, or even longer depending on how the sending mail server has been configured. The answer seemed obvious - I really only wanted to greylist the "questionable" emails - those scoring 2+ on spamassassin. Any legitimate mail that scores over 2 would be delivered after the delay, and any spam scoring more than 6 that does retry will be caught by spamassassin anyway.
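Put as a decision rule, the plan looks roughly like the sketch below. The function and its return values are names I've invented for illustration; the 2 and 6 are the thresholds mentioned above, and `greylist_status` stands in for whatever the greylisting daemon reports ("grey" while the host is still waiting out its delay, "white" once it has retried successfully).

```python
def handle_message(score, greylist_status, greylist_above=2.0, spam_threshold=6.0):
    """Decide what to do with one message (illustrative sketch).

    score           -- the spamassassin score of the message
    greylist_status -- 'grey' while the sending host is inside its delay
                       window, 'white' once it has retried after the delay
    """
    # Questionable mail from a host that hasn't proven itself yet is delayed.
    if score > greylist_above and greylist_status == "grey":
        return "defer"
    # Anything that retries but still scores too high is handled as spam.
    if score >= spam_threshold:
        return "treat_as_spam"
    return "accept"


print(handle_message(0.5, "grey"))   # clean mail is never delayed
print(handle_message(3.5, "grey"))   # questionable mail waits
print(handle_message(3.5, "white"))  # ...and gets through on retry
print(handle_message(8.0, "white"))  # obvious spam is still caught
```

Clean mail never sees a delay, which is what makes the hour's wait tolerable.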
Implementation
I used greylistd under Debian, which was as simple as running:
    apt-get install greylistd
I didn't want to use the Debian-recommended configuration for greylistd, as this greylists all incoming email, so I added the necessary bits to my data ACL configuration, which currently looks like this:
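    acl_check_data:

      # Reject anything the virus scanner flags.
      deny    malware     = *
              message     = This message contains a virus ($malware_name).

      # Greylist only "questionable" mail: the host must not be whitelisted,
      # greylistd must still report it as grey, and the message must score
      # above 2. Note that $spam_score_int is the score multiplied by ten,
      # so 20 here means a score of 2.0.
      defer   message     = $sender_host_address is not currently authorised to deliver mail to this server. Please try later.
              log_message = greylisted.
              spam        = nobody:true
              !hosts      = : ${if exists {/etc/greylistd/whitelist-hosts}\
                                          {/etc/greylistd/whitelist-hosts}{}} : \
                              ${if exists {/var/lib/greylistd/whitelist-hosts}\
                                          {/var/lib/greylistd/whitelist-hosts}{}}
              set acl_m9  = $sender_host_address
              set acl_m9  = ${readsocket{/var/run/greylistd/socket}{$acl_m9}{5s}{}{}}
              condition   = ${if and\
                              {\
                                {>{$spam_score_int}{20}}\
                                {eq {$acl_m9}{grey}}\
                              }\
                            }

      # Always record the score in a header.
      warn    spam        = nobody:true
              message     = X-Spam-Score: $spam_score

      # Add the full report only when spamassassin classes the message as spam.
      warn    spam        = nobody
              message     = X-Spam-Flag: YES\n\
                            X-Spam-Report: $spam_report\n\
                            X-Spam-Bar: $spam_bar

      accept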
Other problems?
Not all legitimate mail servers cope well with greylisting - groups.yahoo.com is one of the better-known examples. A list of known offenders is published, and it is already included with greylistd by default, so these hosts are automatically whitelisted.
Obviously, you need to decide if it's acceptable to introduce an hour delay for some messages. I decided it was for me.
The other issue with this implementation is that it doesn't scale well to multiple mail servers, as each server maintains its own greylist. Other people have come up with solutions that scale better - see greylisting.org for more information.
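To see why per-server state is a problem, here's a small sketch. It is purely illustrative: the `check` function and the plain dicts are stand-ins I've made up for each server's greylist database, and a real shared setup would need a proper shared store rather than a Python dict.

```python
def check(store, ip, now, delay=3600):
    """Greylist check against a given state store (a plain dict here)."""
    first = store.setdefault(ip, now)
    return "accept" if now - first >= delay else "defer"


# Two mail servers, each with its own private greylist.
mx1_store, mx2_store = {}, {}
first_try = check(mx1_store, "192.0.2.10", now=0)
# The sender retries after the delay, but happens to hit the other server,
# which has never seen it before and so starts the clock again.
retry_other_mx = check(mx2_store, "192.0.2.10", now=4000)

# The same retry when both servers consult one shared store.
shared = {}
check(shared, "192.0.2.10", now=0)
retry_shared = check(shared, "192.0.2.10", now=4000)

print(first_try, retry_other_mx, retry_shared)  # defer defer accept
```

With separate lists, a sender that alternates between servers can be delayed more than once before anything gets through.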
More info
- What happens when the spammers adapt?
- greylisting.org
- Greylisting whitepaper by Evan Harris
- Evan's greylisting pages
- exim.org
- spamassassin