Spammers have attempted to fight statistical filtering by inserting
many random but valid "noise" words or sentences into their messages
while attempting to hide them from view, making it more likely that
the filter will classify the message as neutral. (See Word salad
(computer science).) Attempts to hide the noise words include setting
them in tiny font or the same colour as the background. However, these
noise countermeasures seem to have been largely ineffective.[citation
needed]
Software programs that implement statistical filtering include
Bogofilter, the e-mail programs Mozilla and Mozilla Thunderbird, and
later revisions of SpamAssassin. Another notable project is CRM114,
which hashes phrases and performs Bayesian classification on them.
There is also the free mail filter POPFile [2], which uses Bayesian
filtering to sort mail into as many categories as the user wants
(family, friends, co-workers, spam, and so on).
Checksum-based filtering
A checksum-based filter takes advantage of the fact that, for any
individual spammer, the messages he or she sends out are often mostly
identical; the only differences are web bugs and places where the
text of the message contains the recipient's name or email address.
Checksum-based filters strip out everything that might vary between
messages, reduce what remains to a checksum, and look that checksum up
in a database that collects the checksums of messages that email
recipients consider to be spam (some email clients have a button that
users can click to nominate a message as spam). If the checksum is in
the database, the message is likely to be spam.
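The strip, checksum, and look-up steps described above can be sketched
as follows. This is only an illustration under stated assumptions: the
normalization rules, the choice of SHA-256, and the names normalize,
checksum, and ChecksumDatabase are invented for the example, and real
systems use their own normalization rules and checksum algorithms.

```python
import hashlib
import re


def normalize(message):
    """Strip parts assumed to vary per recipient, then collapse whitespace."""
    text = message.lower()
    text = re.sub(r"[\w.+-]+@[\w-]+(\.[\w-]+)+", "", text)   # email addresses
    text = re.sub(r"https?://\S+", "", text)                 # URLs, including web bugs
    text = re.sub(r"^(dear|hi|hello)\s+\w+[,!]?", "", text)  # greeting with a name
    return " ".join(text.split())


def checksum(message):
    """Reduce the normalized message to a hex digest."""
    return hashlib.sha256(normalize(message).encode("utf-8")).hexdigest()


class ChecksumDatabase:
    """In-memory stand-in for a shared database of reported checksums."""

    def __init__(self):
        self.reports = {}  # checksum -> number of spam reports

    def report_spam(self, message):
        key = checksum(message)
        self.reports[key] = self.reports.get(key, 0) + 1

    def is_likely_spam(self, message, threshold=1):
        return self.reports.get(checksum(message), 0) >= threshold


db = ChecksumDatabase()
# One recipient clicks the "this is spam" button.
db.report_spam(
    "Dear Alice, get rich quick! Visit http://example.com/t?id=111 alice@example.org"
)

# Another recipient receives the same message with different variable parts.
incoming = "Dear Bob, get rich quick! Visit http://example.com/t?id=222 bob@example.net"
print(db.is_likely_spam(incoming))                   # True
print(db.is_likely_spam("Lunch at noon tomorrow?"))  # False
```

The threshold parameter is a hypothetical knob: requiring several
independent reports before treating a checksum as spam would reduce the
effect of a single mistaken nomination.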
The advantage of this type of filtering is that it lets ordinary users
help identify spam, and not just administrators, thus vastly
increasing the pool of spam fighters. The disadvantage is that
spammers can insert unique invisible gibberish -- known as hashbusters
-- into the middle of each of their messages, so that each message is
unique and has a different checksum. This leads to an arms race
between the developers of the checksum software and the developers of
the spam-generating software.
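The effect of a hashbuster on an exact checksum can be shown with a short
sketch. The function names, the 12-letter random token, and the use of
SHA-256 are assumptions made for the illustration, not a description of
any particular spam tool or filter.

```python
import hashlib
import random
import string


def checksum(text):
    """Exact checksum of the text, with no normalization."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def add_hashbuster(text, rng):
    """Insert a random 12-letter token in the middle of the message."""
    junk = "".join(rng.choice(string.ascii_lowercase) for _ in range(12))
    words = text.split()
    middle = len(words) // 2
    return " ".join(words[:middle] + [junk] + words[middle:])


rng = random.Random(0)  # seeded only to make the demonstration repeatable
template = "Buy cheap watches from our online store today"
copies = [add_hashbuster(template, rng) for _ in range(3)]

# Each copy carries different gibberish, so each has its own checksum
# and none matches the checksum of the original template.
digests = {checksum(copy) for copy in copies}
print(len(digests))
print(checksum(template) in digests)
```

A filter can only counter this if its normalization step manages to
remove the inserted token before the checksum is computed.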
Checksum-based filtering methods include:
Distributed Checksum Clearinghouse
Vipul's Razor
Authentication and Reputation (A&R)
A number of systems have been proposed to allow acceptance of email
from servers which have authenticated in some fashion as senders of
only legitimate email. Many of these systems use the DNS, as do DNSBLs;
but rather than being used to list nonconformant sites, the DNS is
used to list sites authorized to send email, and (sometimes) to
determine the reputation of those sites. Other methods of identifying
ham and spam are still used. A&R allows much ham to be more
reliably identified, which allows spam detectors to be made more
sensitive without causing more false positive results. The increased
sensitivity allows more spam to be identified as such. Also, A&R
methods tend to be less resource-intensive than other filtering
methods, which can be skipped for messages identified by A&R as ham.
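The way A&R can sit in front of other filters can be sketched as below.
Everything here is hypothetical: the dictionaries AUTHORIZED_SENDERS and
REPUTATION stand in for records that a real system would look up in the
DNS, the reputation scale and thresholds are invented, and
content_spam_score is a placeholder for a real content filter.

```python
# Hypothetical stand-in for DNS records: sending domain -> authorized server IPs.
AUTHORIZED_SENDERS = {
    "example.org": {"192.0.2.10", "192.0.2.11"},
    "example.net": {"198.51.100.7"},
}

# Hypothetical reputation scores between 0.0 (bad) and 1.0 (good).
REPUTATION = {"example.org": 0.95, "example.net": 0.20}


def is_authenticated(domain, server_ip):
    """True if the server is listed as authorized to send for the domain."""
    return server_ip in AUTHORIZED_SENDERS.get(domain, set())


def content_spam_score(body):
    """Placeholder content filter: fraction of words in a tiny spam word list."""
    spam_words = {"free", "click", "winner", "viagra"}
    words = body.lower().split()
    hits = sum(word.strip(".,!?") in spam_words for word in words)
    return hits / max(len(words), 1)


def classify(domain, server_ip, body, good_reputation=0.8, threshold=0.2):
    """Accept authenticated, reputable senders; run the content filter otherwise."""
    reputation = REPUTATION.get(domain, 0.0)
    if is_authenticated(domain, server_ip) and reputation >= good_reputation:
        return "ham"  # the more expensive content filter is skipped
    # The threshold can be kept low (sensitive) because reputable
    # authenticated mail never reaches this point.
    return "spam" if content_spam_score(body) >= threshold else "ham"


print(classify("example.org", "192.0.2.10", "Click here for your free report"))   # ham
print(classify("example.org", "203.0.113.5", "Click here for your free report"))  # spam
print(classify("example.net", "198.51.100.7", "Meeting notes attached"))          # ham
```

In the second call the server is not authorized for the claimed domain,
so the message falls through to the content filter and is flagged.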
Sender-supported whitelists and tags
There are a small number of organizations which offer IP whitelisting
and/or licensed tags that can be placed in email (for a fee) to assure
recipients' systems that the messages thus tagged are not spam. This
system relies on legal enforcement of the tag. The intent is for email
administrators to whitelist messages bearing the licensed tag. Examples
include:
Habeas Safelist
Bonded Sender
A potential difficulty with such systems is that the licensing
organization makes its money by licensing more senders to use the tag
-- not by strictly enforcing the rules upon licensees. A further
concern is that senders whose messages are more likely to be considered
spam would accrue a greater benefit by using such a tag. Together,
these factors form a perverse incentive for licensing
organizations to be lenient with licensees who have offended. However,
the value of a license would drop if it were not strictly enforced, and
financial gains due to enforcement of the license itself can provide an
additional incentive for strict enforcement. The Habeas mail classing
system attempts to further address this issue by classing email
according to origin, purpose, and permission. The purpose is to
describe why the email is likely to be permission-based email rather
than spam.
Ham passwords
Another approach for countering spam is to use a "ham password".
Systems that use ham passwords ask unrecognised senders to include in
their email a password that demonstrates that the email message is a
"ham" (not spam) message. Typically the email address and ham password
would be described on a web page, and the ham password would be
included in the "subject" line of an email message. Ham passwords are
often combined with filtering systems, to counter the risk that a
filtering system will accidentally identify a ham message as a spam
message.
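A ham password check combined with a filter can be sketched as follows.
The address book, the password value, and the function names are
hypothetical, and looks_like_spam is only a placeholder for a real
filtering system.

```python
KNOWN_SENDERS = {"alice@example.org"}  # hypothetical list of recognised senders
HAM_PASSWORD = "tangerine42"           # hypothetical password published on a web page


def looks_like_spam(subject, body):
    """Placeholder for a real content filter: flags a few trigger words."""
    trigger_words = {"free", "winner", "click"}
    words = (subject + " " + body).lower().split()
    return any(word.strip(".,!?") in trigger_words for word in words)


def accept_message(sender, subject, body):
    """Decide whether to deliver a message to the inbox."""
    if sender.lower() in KNOWN_SENDERS:
        return True  # recognised senders need no password
    if HAM_PASSWORD in subject.lower():
        return True  # the password overrides the filter's verdict
    return not looks_like_spam(subject, body)


# An unrecognised sender whose wording would trip the filter, with and
# without the ham password in the subject line.
body = "Click the link to claim the free tickets I promised you."
print(accept_message("bob@example.net", "Free tickets [tangerine42]", body))  # True
print(accept_message("bob@example.net", "Free tickets", body))                # False
```

Checking the password before the filter runs is what lets it rescue a
legitimate message that the filter would otherwise misclassify.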