C. Rossow, T. Czerwinski, C. Dietrich, N. Pohlmann:
“Detecting Gray in Black and White”.
In Proceedings of the MIT Spam Conference,
DNS-based black- and whitelists are heavily used in the fight against spam. However, in certain cases their use can cause conflicts, such as false positives. In our work, we show a method to identify IP addresses that are listed in both black- and whitelists. We term this set of addresses the gray IP area. We then develop a method to classify these senders as either spammers or legitimate mailers. This method is applied in an experiment using well-known black- and whitelists. The results show difficulties in such an automated classification because of legitimate mail servers relaying spam, mail senders behind dynamic IP addresses, and misconfigured MTAs. We conclude that there is no automated mechanism that performs a reasonable classification without manual expert knowledge of all involved mail senders.
Since the early days of email, DNS-based blacklists have been widely used as an anti-spam measure. They allow receivers to filter or deny SMTP traffic from senders whose IP addresses have a bad reputation. High-volume mail receivers in particular tend to use blacklisting, since it is a very efficient filtering mechanism that does not require inspecting the mail content at all. In addition, professional services conveniently operate high-quality DNS-based blacklists and supply them to mail receivers. Thus, to date, blacklisting is the main mechanism (or at least one of the main mechanisms) used by email operators to filter unsolicited messages.
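To make the lookup mechanism concrete, the following is a minimal sketch of how a receiver might query a DNS-based list for an IPv4 sender address. It assumes the common convention that the address octets are reversed and prepended to the list's zone, and that a listed address resolves while an unlisted one does not. The zone name `dnsbl.example.org` and the function names are hypothetical and are not taken from the lists used in this work.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query for an IPv4 address in a DNS-based list.

    Assumed convention: reverse the octets and append the list zone,
    e.g. 192.0.2.99 in zone dnsbl.example.org -> 99.2.0.192.dnsbl.example.org
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve) -> bool:
    """Return True if the query name resolves, i.e. the address is listed.

    `resolve` is any callable that takes a hostname and either returns an
    address or raises OSError when the name does not resolve. Note that
    other lookup failures (e.g. timeouts) also raise OSError and are
    treated as "not listed" in this simplified sketch.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False
```

For a live lookup, `socket.gethostbyname` from the Python standard library could be passed as `resolve`. Keeping the resolver injectable lets the name construction be checked without network access.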
However, blacklists have always been prone to false positives. In the context of blacklists used as an anti-spam tool, false positives are entries that wrongly lead to blocking legitimate mail submissions. To mitigate that risk, blacklists are usually combined with whitelists. Such whitelists include IP addresses or mail servers that are considered to send legitimate mail, even if a certain spam ratio is tolerated. As a consequence, these cross-listed mail servers may always initiate SMTP connections to submit mail, and a spam check is done at later stages, if at all. The experience of mail experts shows that this mechanism works well in most cases. But if the interaction of black- and whitelists fails, legitimate mail senders can be blocked from transmitting mail until someone intervenes manually.
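The gray IP area defined in the abstract, the set of addresses listed in both a blacklist and a whitelist, can be illustrated as a plain set intersection. The sketch below assumes that the list contents are already available as sets of address strings. The function names and labels are hypothetical, and the code only illustrates the definition, not the classification method developed in this work.

```python
def gray_ip_area(blacklisted: set, whitelisted: set) -> set:
    """Addresses that appear on both lists (the cross-listed addresses)."""
    return set(blacklisted) & set(whitelisted)


def classify_listing(ip: str, blacklisted: set, whitelisted: set) -> str:
    """Label an address by list membership only (illustrative labels)."""
    on_black = ip in blacklisted
    on_white = ip in whitelisted
    if on_black and on_white:
        return "gray"  # conflict: both lists claim the address
    if on_black:
        return "blacklisted"
    if on_white:
        return "whitelisted"
    return "unlisted"
```

An address labeled "gray" here is exactly the kind of conflict discussed above: membership alone cannot decide whether the sender is a spammer or a legitimate mailer.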
Although the coarse granularity of blacklists has been known to anti-spam experts for years, to the best of our knowledge, we are the first to explore cross-listed IP addresses systematically. We see a need for a more detailed discussion due to recent and current spam trends that partially undermine IP-reputation-based anti-spam mechanisms. First, in the early days of spam, unsolicited mails were sent through open relays with fixed IP addresses. Nowadays, the majority of spam is sent by zombie computers that are part of a botnet and have dynamically assigned IP addresses [5, 7]. This makes it difficult for blacklists to maintain a complete set of spamming IP addresses [14, 11, 6]. Second, current malware tends to steal data from home users, including credentials for legitimate SMTP servers. Spammers then use the stolen credentials to relay unsolicited mail through SMTP smarthosts that otherwise send mostly legitimate mail. Third, malware can even sign up for new accounts by evading the CAPTCHA mechanisms of free mailers. These accounts can likewise be misused to send spam from an otherwise well-reputed mail source. As a last item in this still incomplete list, a conflict occurs if users instruct a primary mail account to automatically forward all received mail to a secondary mail account located on a different mail server. In this situation, not only legitimate mail but also spam is typically forwarded to the secondary account, which contaminates the primary mail server's reputation.