Anti-Spam Service: Glossary of Terms
address mapping: When email comes in, each recipient address is mapped to a stream. This process is called address mapping.
Bayesian filtering: A statistical technique whereby the system assigns a spam probability based on training from users. Bayesian filtering can greatly improve the accuracy of CanIt-PRO, and makes it harder for spammers to evade filtering. Bayesian email filters take advantage of Bayes' theorem. Bayes' theorem, in the context of spam, says that the probability that an email is spam, given that it has certain words in it, is equal to the probability of finding those certain words in spam email, times the probability that any email is spam, divided by the probability of finding those words in any email.
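As a minimal illustration of the formula described above, the following sketch computes the probability that a message is spam from the three probabilities named in the definition. The function name and the sample figures are hypothetical and are not taken from CanIt-PRO.

```python
def spam_probability(p_words_given_spam, p_spam, p_words):
    """Bayes' theorem: P(spam | words) = P(words | spam) * P(spam) / P(words)."""
    if p_words <= 0:
        raise ValueError("p_words must be greater than zero")
    return p_words_given_spam * p_spam / p_words

# Hypothetical figures: the words appear in 40% of spam, 50% of all mail
# is spam, and the words appear in 25% of all mail.
probability = spam_probability(0.40, 0.50, 0.25)
print(f"P(spam | words) = {probability:.2f}")
```

With these made-up inputs, the words are more common in spam than in mail overall, so the resulting probability is higher than the base spam rate of 50%.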
Bayes Journal: A special table in which the system records that a message is to be trained. Periodically, the anti-spam process goes through the Bayes Journal and applies the pending entries to the Bayes data.
blacklist: A list of email senders, domains, or hosts that are automatically and entirely blocked from sending mail to the filtered address, regardless of the message content.
DNS query: This is a query sent to a Domain Name System (DNS) server to resolve a domain name or hostname into an Internet Protocol (IP) address.
host: This is a mail server on the Internet that sends mail to your address.
one-shot: Many spammers use special software to send bulk mail directly from their PCs. Because spammers want wide distribution, they want each message to be sent as cheaply as possible. Some spam software, therefore, ignores SMTP errors if a message cannot be delivered. This is the motivation behind the "One-Shot" message category. CanIt-PRO can deal very effectively with hit-and-run spam software by sending a temporary failure indication at the MAIL FROM: or RCPT TO: SMTP command when mail from an unknown sender arrives.
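The idea can be sketched as follows: the first attempt from an unknown sender gets a temporary failure, and a later retry is accepted. This is only an illustration of the general technique. The in-memory set, the function name, and the reply texts are assumptions and do not reflect how CanIt-PRO is implemented. (In SMTP, reply codes beginning with 4 indicate a temporary failure.)

```python
# Hypothetical in-memory record of senders that have already tried once.
seen_senders = set()

def smtp_response(sender):
    """Return a temporary failure on the first attempt, acceptance afterwards."""
    if sender not in seen_senders:
        seen_senders.add(sender)
        return "451 Temporary failure, please try again later"
    return "250 OK"

first = smtp_response("someone@example.com")   # unknown sender, first attempt
retry = smtp_response("someone@example.com")   # same sender retries later
print(first)
print(retry)
```

A legitimate mail server queues the message and retries, so it reaches the second branch. Hit-and-run software that ignores the error never does.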
MIME type: Multipurpose Internet Mail Extensions (MIME) is an Internet Standard for the format of email. Virtually all Internet email is transmitted via SMTP in MIME format. Internet email is so closely associated with the SMTP and MIME standards that it is sometimes called SMTP/MIME email. The basic Internet email transmission protocol, SMTP, supports only 7-bit ASCII characters. This effectively limits Internet email to messages which, when transmitted, include only the characters used for the English language. MIME defines mechanisms for sending other kinds of information in email, including text in languages other than English using character encodings other than ASCII as well as 8-bit binary content such as files containing images, sounds, movies, and computer programs.
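As a small sketch of what MIME makes possible, the example below uses Python's standard email package to build a message that combines non-English text with a binary attachment, then lists the MIME type of each part. The addresses, the file name, and the attachment bytes are placeholders (the bytes are not a valid image).

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Greetings"

# Text in a character encoding other than ASCII.
msg.set_content("Schöne Grüße aus München", charset="utf-8")

# Binary content, labelled with the MIME type image/png.
placeholder_bytes = b"\x89PNG\r\n\x1a\n"
msg.add_attachment(placeholder_bytes, maintype="image", subtype="png",
                   filename="logo.png")

part_types = [part.get_content_type() for part in msg.iter_parts()]
print(msg.get_content_type())
print(part_types)
```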
RPTN: This stands for the Roaring Penguin Training Network, and is a mechanism whereby multiple CanIt-PRO installations can share Bayes votes. In the reporting phase, CanIt-PRO installations send reports about whether or not mail they have seen is spam. A report essentially consists of a list of tokens in the mail message and a spam or not-spam flag, depending on how the incident was disposed of. The RPTN server aggregates all of the reports it receives and builds a database of Bayesian statistics from the reports. In the download phase, a CanIt-PRO installation downloads the aggregated data and installs it in its database. This data can subsequently be used for Bayesian analysis.
Sendmail: Sendmail is an open source mail transfer agent (MTA): a computer program for the routing and delivery of email.
spam score: This is a number assigned by the spam-scanning rules. The higher the score, the more "spamlike" the message appears. If you set a spam threshold of 6, any message scoring 6 or higher is flagged or filtered out of the email stream, depending on your settings.
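The threshold comparison described above can be sketched in a few lines. The function name is hypothetical, and the default of 6 simply mirrors the example in the definition.

```python
def exceeds_threshold(score, threshold=6.0):
    """Return True when the score is at or above the threshold."""
    return score >= threshold

at_threshold = exceeds_threshold(6.0)   # exactly at the threshold counts
below = exceeds_threshold(5.9)
print(at_threshold, below)
```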
SPF rules: SPF (Sender Policy Framework) is a more sophisticated version of Mismatch Rules. SPF allows the owners of a domain to assert which hosts are allowed to originate email claiming to be from that domain. For example, the domain aol.com has an SPF record that lists which hosts ordinarily send out AOL mail. If you receive mail from a host not in AOL's list of approved senders, it is probably faked.
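The core check can be sketched as a membership test: is the sending host's IP address inside one of the networks the domain has authorized? Real SPF policies are published in DNS and support more mechanisms than this. The network list below is a hard-coded assumption that uses documentation-only address ranges, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical networks a domain has authorized to send its mail.
AUTHORIZED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def host_is_authorized(host_ip, networks=AUTHORIZED_NETWORKS):
    """Return True if host_ip falls inside any authorized network."""
    address = ipaddress.ip_address(host_ip)
    return any(address in network for network in networks)

approved = host_is_authorized("192.0.2.25")    # inside the first network
forged = host_is_authorized("203.0.113.7")     # not in any listed network
print(approved, forged)
```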
stream: This is a collection of rules and policies. Each stream in CanIt-PRO can have its own rules, settings, thresholds and policies as defined by the user.
token: This roughly corresponds to a word. In addition to single-word tokens, the filter keeps track of token pairs, which can greatly increase the accuracy of Bayesian filtering.
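A rough sketch of producing single-word tokens and token pairs is shown below. The splitting rule (lowercase runs of letters, digits, and apostrophes) is an assumption chosen for illustration and is not the tokenizer CanIt-PRO uses.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens plus adjacent token pairs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    pairs = [f"{first} {second}" for first, second in zip(words, words[1:])]
    return words + pairs

tokens = tokenize("Buy cheap watches")
print(tokens)
```

Pairs let a filter tell apart phrases whose individual words are harmless on their own.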
training corpus: Each time a message is marked as spam or not-spam, CanIt-PRO updates counters for each token and token pair in the message. The training statistics are unique for each stream; each stream therefore has its own training set and own notion of what is and isn’t spam. The set of messages on which CanIt-PRO is trained is called the training corpus.
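The per-stream counters described above might look like the following sketch, where each stream keeps separate spam and not-spam counts for every token. The data structure, the function name, and the stream names are illustrative assumptions, not CanIt-PRO's actual storage.

```python
from collections import Counter, defaultdict

# stream name -> {"spam": Counter, "not-spam": Counter}
training = defaultdict(lambda: {"spam": Counter(), "not-spam": Counter()})

def train(stream, tokens, is_spam):
    """Update the token counters for one stream only."""
    label = "spam" if is_spam else "not-spam"
    training[stream][label].update(tokens)

# Two streams disagree about the same token.
train("alice", ["cheap", "watches"], is_spam=True)
train("bob", ["cheap", "flights"], is_spam=False)

print(training["alice"]["spam"]["cheap"])
print(training["bob"]["spam"]["cheap"])
```

Because the counters are keyed by stream, training in one stream never changes another stream's notion of what is spam.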
trap: This consists of messages that have been held based on the stream's settings. For example, a message can be held because of its spam score, or because it contains a suspicious MIME type.
whitelist: A list of email senders, domains, or hosts that are always automatically allowed to send mail to the filtered address, regardless of the message content.
wild card: In computer (software) technology, a wildcard character can be used to substitute for any other character or characters in a string. The asterisk (*) usually substitutes as a wildcard character for any zero or more characters, and the question mark (?) usually substitutes as a wildcard character for any one character, as in the CP/M, DOS, Microsoft Windows and POSIX (Unix) shells. (In Unix this is referred to as glob expansion.) In SQL, the wildcard characters are percent (%) for zero or more characters, and underscore (_) for one character. In many regular expression implementations, the period (.) is the wildcard character for a single character.
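The shell-style and regular-expression wildcards described above can be tried with Python's standard fnmatch and re modules. The file names are arbitrary examples.

```python
import fnmatch
import re

# Shell-style: * matches zero or more characters, ? matches exactly one.
star_match = fnmatch.fnmatchcase("report.txt", "*.txt")
one_char = fnmatch.fnmatchcase("file1", "file?")
two_chars = fnmatch.fnmatchcase("file10", "file?")   # ? cannot cover two characters

# Regular expression: the period matches any single character.
regex_dot = re.fullmatch(r"file.", "file1") is not None

print(star_match, one_char, two_chars, regex_dot)
```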
wizards: These are a collection of tools for easily configuring certain common scenarios.
Taken from: CanIt-PRO User's Guide for Version 3.0.4, Wikipedia
Current Record: 2653
Create Date: 08-12-2005
Last Reviewed: 04-30-2007
