My first discovery (via spamassassin -D --lint) was that RBL checks were not being run at all, because we didn't have the Perl DNS lookup packages (Net::DNS::Resolver) installed. These are easy to get through CPAN:
perl -MCPAN -e shell [as root]
o conf prerequisites_policy ask
install Net::DNS::Resolver::UNIX
quit
Then I modified the local.cf file as follows, to add some additional RBL services (gleaned from Google):
# This is the right place to customize your installation of SpamAssassin.
# See 'perldoc Mail::SpamAssassin::Conf' for details of what can be
# tweaked.
#
###########################################################################
#
#rewrite_subject 0
#report_header 1
#defang_mime 0
use_terse_report 1
use_bayes 1
bayes_path /usr/local/spamassassin/bayes
bayes_file_mode 0666
bayes_auto_learn 1
bayes_auto_learn_threshold_spam 10
bayes_min_spam_num 100
header RCVD_IN_BNBL eval:check_rbl('bl', 'bl.blueshore.net.')
describe RCVD_IN_BNBL Received via a relay listed by BNBL
tflags RCVD_IN_BNBL net
score RCVD_IN_BNBL 2.0
header RCVD_IN_RFC_PM eval:check_rbl('relay','postmaster.rfc-ignorant.org.')
describe RCVD_IN_RFC_PM Received via a relay in postmaster.rfc-ignorant.org
score RCVD_IN_RFC_PM 2.0
header X_CHINESE_RELAY eval:check_rbl('relay', 'cn.rbl.cluecentral.net.')
describe X_CHINESE_RELAY Received via a relay in China
score X_CHINESE_RELAY 1.5
header X_KOREAN_RELAY eval:check_rbl('relay','korea.services.net.')
describe X_KOREAN_RELAY Received via a relay in Korea
score X_KOREAN_RELAY 1.5
header X_SPAMHAUS eval:check_rbl('relay','spamhaus.relays.osirusoft.com.')
describe X_SPAMHAUS Received via relay in Spamhaus Blacklist
score X_SPAMHAUS 1.5
header RCVD_IN_NJABL eval:check_rbl('relay', 'dnsbl.njabl.org')
describe RCVD_IN_NJABL Received via a relay in NJABL
score RCVD_IN_NJABL 2.0
(Yes, we have a shared Bayes database, because we use spamd.)
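As background to the check_rbl rules above: a DNS blacklist lookup for an IPv4 relay is an ordinary A-record query on a name built by reversing the address's octets and appending the list's zone. The sketch below shows only that name construction. dnsbl_query_name is a hypothetical helper written for illustration, not a SpamAssassin function, and 192.0.2.1 is a documentation address rather than a real relay.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): build the DNS name that
# a DNSBL lookup would query for an IPv4 address and a blacklist zone.
sub dnsbl_query_name {
    my ($ip, $zone) = @_;
    my @octets = split /\./, $ip;

    # Reject anything that is not four decimal octets in the range 0..255.
    return undef
        unless @octets == 4 && !grep { !/^\d+$/ || $_ > 255 } @octets;

    $zone =~ s/\.$//;    # tolerate a trailing dot, as written in local.cf
    return join('.', reverse(@octets), $zone);
}

# Example: the name queried for relay 192.0.2.1 in the NJABL zone.
print dnsbl_query_name('192.0.2.1', 'dnsbl.njabl.org.'), "\n";
```

If that name resolves to an A record, the relay is listed; if the name does not exist, it is not. In SpamAssassin the query itself goes through Net::DNS, which is why the rules silently did nothing while that module was missing.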
Received: from some.host.or.other by dirf.bris.ac.uk
with SMTP-SLOPPY; Tue, 6 Jan 2004 05:42:43 +0000
If the HELO string resolves to the originating IP address, then only
the HELO string is reported. If they differ, the header takes a
different form:
Received: from bogus.org (actually host some.host.or.other) by dirg.bris.ac.uk
with SMTP-SLOPPY with ESMTP; Tue, 6 Jan 2004 12:44:49 +0000
SpamAssassin cannot cope with the first form, because it doesn't know
that it should trust the upstream mail server to get the name right.
In fact, in Received.pm, this form is explicitly disallowed:
# Received: from virtual-access.org by bolero.conactive.com ; Thu, 20 Feb 2003
# 23:32:58 +0100
if (/^from (\S+) by (\S+) *\;/) {
  return; # can't trust this
}
In practice, SpamAssassin doesn't know what to do with the second
form, either. So I patched Received.pm to try the Bristol forms first:
if (/^from /) {
  # First try to parse out Bristol headers
  if (/^from (\S+) \(actually host (${IP_ADDRESS})\) by (dir\S+\.bris\.ac\.uk)/) {
    dbg("received_header: bristol w helo, ip $1 $2 $3");
    $helo=$1;
    $ip=$2;
    $by=$3;
    goto enough;
  }
  if (/^from (\S+) \(actually host (\S+)\) by (dir\S+\.bris\.ac\.uk)/) {
    dbg("received_header: bristol w helo $1 $2 $3");
    $helo=$1;
    $rdns=$2;
    $ip=$self->lookup_a($rdns);
    $by=$3;
    goto enough;
  }
  if (/^from (\S+) by (dir\S+\.bris\.ac\.uk)/) {
    dbg("received_header: bristol $1 $2");
    $rdns=$1;
    $helo=$1;
    $by=$2;
    $ip=$self->lookup_a($rdns);
    goto enough;
  }
First we deal with the form where the IP address is present; then we
deal with the much more problematic forms where it isn't, by looking up
the A record for the host name that the upstream server wrote into the
header. Of course, we need to implement lookup_a,
which goes in Dns.pm, and is based on lookup_ptr:
sub lookup_a {
  my ($self, $dom) = @_;
  return undef unless $self->load_resolver();
  if ($self->{main}->{local_tests_only}) {
    dbg ("local tests only, not looking up A");
    return undef;
  }
  dbg ("looking up A record for '$dom'");
  my $name = '';
  eval {
    my $query = $self->{res}->search($dom);
    if ($query) {
      foreach my $rr ($query->answer) {
        if ($rr->type eq "A") {
          $name = $rr->address; last;
        }
      }
    }
  };
  if ($@) {
    dbg ("A lookup failed horribly, perhaps bad resolv.conf setting?");
    return undef;
  }
  dbg ("A for '$dom': '$name'");
  # note: undef is never returned, unless DNS is unavailable.
  return $name;
}
This will probably be handled completely differently in 2.7.
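For anyone wanting to try the three Bristol patterns outside SpamAssassin, here is a minimal stand-alone sketch. It makes several assumptions: parse_bristol is a hypothetical helper, the $IP_ADDRESS pattern is a simplified stand-in for the one SpamAssassin defines, and the 192.0.2.7 sample is invented for illustration (the other two samples are the headers quoted earlier). It does no DNS lookup, so where the patch would call lookup_a it only reports the host name.

```perl
use strict;
use warnings;

# Simplified dotted-quad pattern, standing in for SpamAssassin's $IP_ADDRESS.
my $IP_ADDRESS = qr/\d{1,3}(?:\.\d{1,3}){3}/;

# Hypothetical helper: return a hash ref of the fields found in a
# Bristol-style Received header, or nothing if no pattern matches.
sub parse_bristol {
    my ($hdr) = @_;
    $hdr =~ s/^Received:\s*//i;    # drop the header name, if present
    $hdr =~ s/\s+/ /g;             # unfold continuation lines

    # HELO string plus a literal IP address
    if ($hdr =~ /^from (\S+) \(actually host ($IP_ADDRESS)\) by (dir\S+\.bris\.ac\.uk)/) {
        return { helo => $1, ip => $2, by => $3 };
    }
    # HELO string plus a host name; the IP would need an A lookup
    if ($hdr =~ /^from (\S+) \(actually host (\S+)\) by (dir\S+\.bris\.ac\.uk)/) {
        return { helo => $1, rdns => $2, by => $3 };
    }
    # Host name only; it serves as both HELO and rDNS
    if ($hdr =~ /^from (\S+) by (dir\S+\.bris\.ac\.uk)/) {
        return { helo => $1, rdns => $1, by => $2 };
    }
    return;
}

my @samples = (
    'Received: from some.host.or.other by dirf.bris.ac.uk with SMTP-SLOPPY; Tue, 6 Jan 2004 05:42:43 +0000',
    'Received: from bogus.org (actually host some.host.or.other) by dirg.bris.ac.uk with SMTP-SLOPPY with ESMTP; Tue, 6 Jan 2004 12:44:49 +0000',
    'Received: from bogus.org (actually host 192.0.2.7) by dirg.bris.ac.uk with SMTP-SLOPPY',
);

for my $sample (@samples) {
    my $fields = parse_bristol($sample) or next;
    print join(', ', map { "$_=$fields->{$_}" } sort keys %$fields), "\n";
}
```

The order of the tests matters in the same way as in the patch: the literal-IP pattern has to come before the host-name pattern, because \S+ would otherwise swallow the IP address as if it were a name.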