Spamassassin CustomRulesets
Source: www.thismail.org Author: jacky Date: 2006-10-25 17:33:00
CustomRulesets
Disclaimer
Custom or third-party rules described here are not part of the official SpamAssassin distribution. They may have a different license and are not from the Apache Software Foundation.
Available Custom Rulesets
Listed below are several custom rulesets that are available as "drop in" .cf files. To use these rules, just place the file in /etc/mail/spamassassin (if you use spamd, be sure to restart it). Before running these rules, please do the following:
Read any extra info available with the rules, including the comments in the .cf files.
Check to make sure that the default scores in these rules fit your installation. You might want to modify scores.
Make sure to run spamassassin --lint after loading the rules to check them for errors.
Test the new rulesets. Keep an eye on hits from the new rules to determine if the scoring is right for you.
Use at your own risk.
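The installation steps above can be sketched in Python. This is a minimal sketch, not an official tool: the function names `install_ruleset` and `lint_rules` are made up for illustration, the rules directory is the one named in the text, and `spamassassin --lint` is assumed to signal problems through a non-zero exit status. Restarting spamd is left out because the restart command depends on your system.

```python
import shutil
import subprocess
from pathlib import Path

# Directory named in the text above; adjust it for your installation.
RULES_DIR = Path("/etc/mail/spamassassin")


def install_ruleset(cf_file, rules_dir=RULES_DIR):
    """Copy a downloaded .cf file into the rules directory and return its new path."""
    cf_file = Path(cf_file)
    if cf_file.suffix != ".cf":
        raise ValueError(f"{cf_file.name} does not have a .cf extension")
    target = Path(rules_dir) / cf_file.name
    shutil.copy(cf_file, target)
    return target


def lint_rules():
    """Run `spamassassin --lint`; return True if it exits with status 0."""
    result = subprocess.run(
        ["spamassassin", "--lint"], capture_output=True, text=True
    )
    if result.returncode != 0:
        # Show whatever the lint run reported so the problem can be fixed.
        print(result.stdout + result.stderr)
    return result.returncode == 0
```

A typical use would be to call `install_ruleset("backhair.cf")` and then `lint_rules()`, removing the file again if the lint run fails.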
--------------------------------------------------------------------------------
Status Information
Active: Ruleset is actively updated and maintained
Locked: Ruleset is not actively updated, but is fine to run and considered "stable"
Defunct: Ruleset is no longer maintained, may be out of date or have problems
Auto-update: Author/Maintainer has given permission to use scripts to automate the download of the ruleset
Please respect the wishes of the authors and/or the site hosts
--------------------------------------------------------------------------------
antidrug.cf
antidrug.cf is a set of rules designed to catch those pesky "pill spams".
Created by: Matt Kettler
Contact: mkettler_sa@verizon.net
License Type: Artistic/GPL dual
Status: Inactive
Auto-update: Yes, subject to change if Verizon later objects to the practice. Note: at this time the ruleset is not actively being updated.
Available at: http://mysite.verizon.net/mkettler_sa/antidrug.cf
Mirror: N/A
Note: Matt Kettler says "It may not be appropriate for a medical or pharmaceutical environment. If in doubt, adjust the scores of all the rules to 0.01 and see if they fire off on your daily nonspam."
Note: SA 3.0.0 documentation indicates that much of this rule set has been incorporated into that version. This file is unnecessary with SA 3.0.0 or higher and may downgrade any improvements contributed directly to the standard ruleset. ONLY use antidrug if you are stuck on SA 2.6x for some reason.
Sample Results: MasscheckAntidrug (rev 0.65 04/28/2004)
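Matt Kettler's suggestion of setting every score to 0.01 can be automated. The sketch below is only an illustration: the helper `low_score_overrides` is hypothetical, and it assumes that rules in the file are introduced by the usual `header`, `body`, `rawbody`, `uri`, `full`, `meta` or `score` keywords. It collects the rule names from a ruleset and emits one `score NAME 0.01` line for each.

```python
import re

# Matches the start of a rule definition or a score line and captures the rule name.
RULE_LINE = re.compile(
    r"^\s*(?:header|body|rawbody|uri|full|meta|score)\s+([A-Za-z0-9_]+)\b"
)


def low_score_overrides(cf_text, score="0.01"):
    """Return `score NAME 0.01` lines for every rule name found in cf_text."""
    names = []
    for line in cf_text.splitlines():
        match = RULE_LINE.match(line)
        if not match:
            continue
        name = match.group(1)
        # Names starting with "__" are sub-rules that carry no score of their own.
        if name.startswith("__") or name in names:
            continue
        names.append(name)
    return "\n".join(f"score {name} {score}" for name in names)
```

The resulting lines would go into a configuration file that SpamAssassin reads after the ruleset itself, so that the overrides win; check the load order on your own installation.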
backhair.cf
backhair is a set of rules designed to catch those ugly, unsightly HTML tags.
Created by: Jennifer Wheeler
Contact: TBD
License Type: TBD
Status: Locked
Auto-update: No
Available at: http://www.emtinc.net/includes/backhair.cf
Mirror: rulesemporium.com
More information on Jennifer's rules: http://www.emtinc.net/spamhammers.htm
Note: Early versions of Rules Du Jour included this set in its default config. This set is now considered "stable" and is no longer actively updated. Please do not use auto-update scripts.
Note: This is a fairly aggressive ruleset that can hit on UUencoded attachments...
Note: SA 3.0.0 documentation indicates that much of this rule set has been incorporated into that version. This file is unnecessary with SA 3.0.0.
Sample Results: MasscheckBackhair (Version 1.5 2004-01-21)
bogus-virus-warnings.cf
bogus-virus-warnings tries to pick out 'collateral spam' caused by viruses.
Created by: Tim Jackson with contributions from others
Contact: TBD
License Type: TBD
Status: Active
Auto-update: Yes
Available at: http://www.timj.co.uk/linux/bogus-virus-warnings.cf
More information on Tim's rules: http://www.timj.co.uk/linux/sa.php
Note: The main aim is to catch warnings generated by virus scanners along the lines of "you sent us a virus", which are sent to the (usually faked) 'senders' of virus-infected e-mails. Contains many "black-and-white" very-high-scoring rules.
Sample Results: MasscheckBogusVirus (version 1.69 2004-03-04)
chickenpox.cf
chickenpox is a set of rules designed to catch spam like "l.ooks f|or th.is kind of garb+age"
Created by: Jennifer Wheeler
Contact: TBD
License Type: TBD
Status: Locked
Auto-update: No
Available at: http://www.emtinc.net/includes/chickenpox.cf
Mirror: rulesemporium.com
Note: Early versions of Rules Du Jour included this set in its default config. This set is now considered "stable" and is no longer actively updated. Please do not use auto-update scripts.
More information on Jennifer's rules: http://www.emtinc.net/spamhammers.htm
Sample Results: MasscheckChickenpox (Version 1.15 2004-02-06)
Note: Chickenpox rules are BROKEN for non-English text: they treat all accented characters as non-letters!
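The accented-character problem can be shown with a small Python sketch. The two patterns below are hypothetical and are not taken from chickenpox.cf; they only illustrate how an ASCII-only notion of "letter" makes ordinary accented words look like obfuscated text, while a Unicode-aware pattern does not.

```python
import re

# Hypothetical ASCII-only pattern: letter, one non-letter, letter.
ASCII_ONLY = re.compile(r"[a-z][^a-z\s][a-z]", re.IGNORECASE)

# Unicode-aware variant: [^\W\d_] matches any letter, including accented ones,
# and [^\w\s] matches punctuation but not letters, digits or whitespace.
UNICODE_AWARE = re.compile(r"[^\W\d_][^\w\s][^\W\d_]")

for text in ("l.ooks f|or th.is", "déjà vu"):
    # Print whether each pattern flags the text as obfuscated.
    print(text, bool(ASCII_ONLY.search(text)), bool(UNICODE_AWARE.search(text)))
```

Both patterns flag the obfuscated English sample, but only the ASCII-only one also flags the French phrase, because it treats "é" as a non-letter sitting between two letters.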
evilnumbers.cf
evilnumbers is a collection of phone numbers, PO boxes and street addresses harvested from spam.
Created by: Matt Yackley
Contact: sare@yackley.org
License Type: Artistic
Status: Active
Auto-update: Yes - Please try to keep checks down to no more than once every 24 hours
Available at: http://www.rulesemporium.com/rules/evilnumbers.cf
Extras: Localized language packs available at the link below.
Mirror: yackley.org
More information on Matt Yackley's rules: http://www.yackley.org/sa-rules
Sample Results: MasscheckEvilNumbers (Version: 1.12k 03/31/2004)
Malware Block List
The Malware Block List is a free, automated and user-contributed system for checking URLs for the presence of viruses, trojans, worms, or any other software considered malware. The list of URLs that point to malware is available in a format suitable for use with SpamAssassin.
Created by: Andre Correa
Contact: andre.correa@pobox.com
License Type: GPL
Status: Active
Auto-update: Yes - Please try to keep checks down to no more than once every 4 hours
Auto-update: Preferred method http via http://www.malware.com.br/cgi/submit?action=list_sa
More information: http://www.malware.com.br
Note: This link is not a .cf file; you will need to save it with a .cf extension. Please visit the site for information on the automatic updating procedure.
sa-blacklist
sa-blacklist is a large set of blacklist entries of domains and IP addresses.
Created by: William Stearns
Contact: wstearns@pobox.com
License Type: GPL
Status: Active
Auto-update: Yes - Please try to keep checks down to no more than once every 4 hours
Auto-update: Preferred method rsync via rsync.sa-blacklist.stearns.org::wstearns/sa-blacklist/
Available at: http://www.sa-blacklist.stearns.org/sa-blacklist/sa-blacklist.current
Available at: ftp://ftp.sa-blacklist.stearns.org/pub/wstearns/sa-blacklist/sa-blacklist.current
Mirror: ftp.bascom.com
More information on Bill's rules: http://www.sa-blacklist.stearns.org/sa-blacklist/README
Note: These are blacklist entries and will tag emails on their own! This link is not a .cf file; you will need to save it with a .cf extension.
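Saving a download under a .cf name can be scripted. The following is a rough sketch using only the Python standard library; the helper names `cf_name_for` and `fetch_as_cf` are invented here, and the URLs listed on this page may no longer respond, so treat it as an outline rather than a tested downloader.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def cf_name_for(url):
    """Derive a local file name ending in .cf from the last path segment of url."""
    name = Path(urlparse(url).path).name or "ruleset"
    return name if name.endswith(".cf") else name + ".cf"


def fetch_as_cf(url, rules_dir):
    """Download url into rules_dir under a .cf name and return the saved path."""
    target = Path(rules_dir) / cf_name_for(url)
    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())
    return target
```

For example, `cf_name_for` turns a URL ending in `sa-blacklist.current` into the file name `sa-blacklist.current.cf`. Please keep to the check intervals requested by each author when running anything like this on a schedule.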
sa-blacklist-uri.cf
sa-blacklist-uri is a large set of URI blacklist entries.
Created by: William Stearns
Contact: wstearns@pobox.com
License Type: GPL
Status: Active
Auto-update: Yes - Please try to keep checks down to no more than once every 4 hours
Auto-update: Preferred method rsync via rsync.sa-blacklist.stearns.org::wstearns/sa-blacklist/
Available at: http://www.sa-blacklist.stearns.org/sa-blacklist/sa-blacklist.current.uri.cf
Available at: ftp://ftp.sa-blacklist.stearns.org/pub/wstearns/sa-blacklist/sa-blacklist.current.uri.cf
More information on Bill's rules: http://www.sa-blacklist.stearns.org/sa-blacklist/README
Mirror: ftp.bascom.com
Note: The idea behind this list is similar to bigevil, but the entries are pulled together from different spam. These rules are "flat", i.e. one entry per rule, which uses more memory than combining multiple entries into one rule. This should not be an issue if you have lots of memory or a lighter mail load.
Note: Using the SURBL version of this blacklist (http://wiki.apache.org/spamassassin/SURBL) makes spamd use far less memory than using the ruleset itself.
Sample Results: MasscheckBlacklist (2004030403)
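The difference between "flat" rules and a combined rule can be illustrated with a short Python sketch that generates both forms. The rule names and the `.example` domains are placeholders, not entries from sa-blacklist, and the sketch assumes that Python's `re.escape` output is acceptable inside a SpamAssassin `uri` pattern for plain domain names.

```python
import re


def flat_rules(domains, prefix="EXAMPLE_URI"):
    """One `uri` rule per domain (the "flat" layout)."""
    return [
        f"uri {prefix}_{i} /{re.escape(domain)}/i"
        for i, domain in enumerate(domains, start=1)
    ]


def combined_rule(domains, name="EXAMPLE_URI_ALL"):
    """A single `uri` rule whose regex alternates over every domain."""
    alternation = "|".join(re.escape(domain) for domain in domains)
    return f"uri {name} /(?:{alternation})/i"
```

With two domains, `flat_rules` yields two separate rules, while `combined_rule` yields one rule with an alternation. The flat form reports exactly which entry matched, at the cost of one rule per entry.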
sa-random.cf
sa-random searches for spamware mistakes like: %RANDOM_WORD
Created by: William Stearns
Contact: wstearns@pobox.com
License Type: GPL
Status: Active
Auto-update: Yes - Please try to keep checks down to no more than once every 4 hours
Auto-update: Preferred method rsync via rsync.sa-blacklist.stearns.org::wstearns/sa-blacklist/
Available at: http://www.sa-blacklist.stearns.org/sa-blacklist/random.current.cf
Available at: ftp://ftp.sa-blacklist.stearns.org/pub/wstearns/sa-blacklist/random.current.cf
Mirror: ftp.bascom.com
More information on Bill's rules: http://www.sa-blacklist.stearns.org/sa-blacklist/README
Sample Results: MasscheckRandom (release: 2004030501)
tripwire.cf
tripwire searches for 3 characters that shouldn't be together.
Created by: Fred Tarasevicius
Contact: tech2@i-is.com
License Type: TBD
Status: TBD
Auto-update: TBD
Available at: http://www.rulesemporium.com/rules/99_FVGT_Tripwire.cf
Note: These rules are based on the English language. Because of the number of rules that can be triggered, exim users have reported that the resulting header can exceed exim's header size limit. MS Outlook can also have problems with rules that look for "message headers", due to an unknown size limit on the amount of headers it will search.
Sample Results: MasscheckTripwire (Version 1.17)
French Rules
Catches spam written in French.
Created by: Maxime Ritter
Contact: mritter@alussinan.org
License Type: Public Domain
Status: Active
Auto-update: On the mirror (updates of the mirror are automatic)
Available at: http://maxime.ritter.eu.org/Spam/french_rules.cf
GPG-signature: Yes
Mirror: http://airmex.nerim.net/rule-get/french_rules.cf
More information on my site (in French only at the moment): http://maxime.ritter.eu.org/article.php3?id_article=11
Sample Results: None yet.
Romanian Rules
Catches spam written in Romanian or sent by Romanian spammers.
Created by: INTERSOL SRL
License Type: Public Domain
Status: Active
Auto-update: On the mirror (updates of the mirror are automatic)
Available at: http://www.intersol.ro/blacklist_ro.cf
More information on our site (in Romanian only at the moment): http://www.intersol.ro/anti-spam
Airmax.cf
Miscellaneous rules I use. Use them if you find them useful.
Created by: Maxime Ritter
Contact: mritter@alussinan.org
License Type: Public Domain
Status: Active
Auto-update: On the mirror (auto-updated)
Available at: http://maxime.ritter.eu.org/Spam/airmax.cf
GPG-signature: Yes
Mirror: http://airmex.nerim.net/rule-get/airmax.cf
More information on my site (in French only at the moment): http://maxime.ritter.eu.org/article.php3?id_article=11
Sample Results: None yet.
Chinese Rules
Rules to catch spam written in Chinese.
Created by: Quang-Anh Tran, at CCERT Anti-Spam Team
Contact: chenguangying@tsinghua.org.cn
License Type: Apache License
Status: Active
Available at: http://www.ccert.edu.cn/spam/sa/Chinese_rules.cf
More information (in Chinese): http://www.ccert.edu.cn/spam/sa/Chinese_rules.htm
Note: Rules and scores are updated once a week using spam reported to the anti-spam service of CCERT in the last 3 months.
Sample Results: MasscheckChineserules
GEE Whiz Chinese Ruleset
We developed a set of SpamAssassin rules for Simplified Chinese, based on GB2312. They include header rules and phrase rules.
Created by: Zhong(Adam) Wang at Submersion Corporation
Contact: adamwang@submersion.com
License Type: GPL
Status: Active
Available at: GEE Whiz Chinese Ruleset
More detail: http://www.geewhiz.ca
Note: Rules are mass-checked by CCERT.
Sample Results: MasscheckGeeWhizChineseRuleset
I cleaned up the part of the GEE Whiz Chinese Ruleset that takes forever to run under mass-check, and ran the perceptron to rescore the ruleset.
Available at: http://mcli.homelinux.org:8080/apache2-default/spam
Contact: mchun.li@gmail.com
MIME Validation Ruleset
This is a tiny set of rules, designed to find MIME errors commonly encountered in mails sent by the bulk mailers used by spammers.
Created by: Byteplant GmbH
Contact: nstsupport@byteplant.com
License Type: GPL
Status: Active
Available at: http://www.nospamtoday.com/download/mime_validate.cf
Sample Results: None yet.
German Language Ruleset
Catches German-language spam. Please report your German spam with full headers.
Created by: Michael Monnerie ( http://it-management.at )
Contact: sare-german@zmi.at
License Type: Artistic ( http://www.rulesemporium.com/license.txt )
Status: Active
Available at: Rules Du Jour Ruleset name ZMI_GERMAN
Available at: http://zmi.at/x/70_zmi_german.cf
Sample Results: None yet.
--------------------------------------------------------------------------------
Automatic Updates
If you wish to update these rules easily every day (using cron or some other scheduler), look at sa-update and its channels, or at http://saupdates.openprotect.com
If you find these rulesets useful and get tired of downloading updates, Chris Thielen has kindly provided a shell script to update these sets automatically. You can find the script and instructions at: http://www.exit0.us/index.php?pagename=RulesDuJour
Another tool is now available, featuring a GPG check of the rulesets that have a known signature, and an apt-get-like syntax: http://maxime.ritter.eu.org/article.php3?id_article=10
For Windows, Bret Miller has contributed a Windows script for updating these sets. You can download it here: http://mail.wcg.org/~support/default.html#satools
A readme file is provided with instructions for setting it up. It requires ActivePerl 5.8.x (doesn't work right on 5.6.1).
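If you write your own update job instead of using one of the tools above, it should respect the check intervals that the ruleset authors ask for (for example, once every 4 or 24 hours). One simple approach, sketched here under the assumption that the local copy's modification time records the last download, is to skip the download when the file is still fresh. The function name `update_due` is made up for this example.

```python
import os
import time


def update_due(path, min_hours, now=None):
    """Return True if path is missing or was last modified at least min_hours ago."""
    now = time.time() if now is None else now
    try:
        age_seconds = now - os.path.getmtime(path)
    except FileNotFoundError:
        # No local copy yet, so a first download is always allowed.
        return True
    return age_seconds >= min_hours * 3600
```

A scheduled job could call `update_due("/etc/mail/spamassassin/evilnumbers.cf", 24)` and only fetch the ruleset when the result is true.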
Additional collections
Here are some additional collections of custom rulesets:
The SARE Ninjas have a collection of custom rules available at the SpamAssassin Rules Emporium (started by Chris Santerre) - http://www.rulesemporium.com - this collection includes HTML rules, Header abuse rules, ratware rules, specific spammer rules, adult rules, fraud rules, subject rules, business and marketing rules, etc. Several of those rule sets are multi-file rule sets, a practice started by Bob Menschel, allowing you to pick and choose based on the quality or applicability of rules within the MultiFileRuleSets.
The Hebrew SpamAssassin rules project is located at http://www.deltaforce.net/hebrewspam