In my previous blog post, I introduced the concept of open quarantine. This week, I’d like to explore the two phases of email filtering that make up the open quarantine process.
The notion of open quarantine depends on being able to perform a tripartite classification of messages into good, bad and undetermined, where the first two categories have a close to negligible probability of containing misclassified messages. For emails, this is practically attainable, and can be done in many ways.
One good approach to performing the Phase 1 classification involves computing four scores for each message, corresponding to its trust, reputation, authenticity, and risk.
Based on these four scores, messages are classified as good, bad, or undetermined. The rationale behind the bad classification is as follows:
A high trust score combined with a low authenticity score is indicative of messages in which an attacker impersonates a party with which the recipient (or the organization) has a working relationship. On the other hand, a high reputation score combined with a low authenticity score is common for attacks in which well-known brands are impersonated, where the recipient does not necessarily have a relationship with these brands. Very low trust and reputation scores correspond to “fly-by-night” operations; these are commonly used for large-volume attacks. Ransomware attacks and well-established BEC attacks would typically exhibit high risk scores.
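The score combinations described above can be sketched in code. This is a minimal illustration, not a production classifier: the 0.0–1.0 score ranges and the `high`/`low` cutoffs are hypothetical values that a real deployment would tune against labeled traffic.

```python
# Sketch of the Phase 1 tripartite classification. Score ranges (0.0-1.0)
# and cutoff values are hypothetical, for illustration only.

def classify_phase1(trust, reputation, authenticity, risk,
                    high=0.8, low=0.2):
    """Map four per-message scores to good / bad / undetermined."""
    # Likely impersonation of a trusted party or a well-known brand:
    # high trust or reputation, but the message fails authentication.
    if (trust >= high or reputation >= high) and authenticity <= low:
        return "bad"
    # "Fly-by-night" sender with dangerous content.
    if trust <= low and reputation <= low and risk >= high:
        return "bad"
    # Authenticated mail from a trusted, low-risk source.
    if trust >= high and authenticity >= high and risk <= low:
        return "good"
    # Everything else proceeds to Phase 2 scrutiny.
    return "undetermined"

print(classify_phase1(0.9, 0.5, 0.1, 0.5))  # → bad (impersonation pattern)
print(classify_phase1(0.9, 0.9, 0.9, 0.1))  # → good
print(classify_phase1(0.5, 0.5, 0.5, 0.5))  # → undetermined
```

The key property of the cutoffs is that the good and bad buckets should have a close to negligible misclassification probability; everything ambiguous falls through to undetermined.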
Phase 2 of the email filtering depends on the outcome of the Phase 1 filtering, and may involve in-depth database lookups, manual review, automated messaging to the apparent sender, and more. To help clarify the approach:
High Risk of Spoofing: While DMARC deployment is on the rise, we are far from universal deployment of this de facto standard. As a result, email spoofing is still a reality organizations have to deal with. Roughly half of all attempts to pose as somebody else involve spoofing. For emails that the Phase 1 review identifies as undetermined due to a low authenticity score, more thorough scrutiny should be performed.
Automated analysis can identify senders that are particularly vulnerable to spoofing attacks, as DMARC records are publicly available. This corresponds to email from senders whose organizations do not have a DMARC reject policy in place. Messages that are at high risk of having been spoofed can be validated by generating an automated message to the apparent sender, requesting a confirmation that he or she sent the message. If an affirmative reaction to this message is observed, the initial message is classified as good; if a negative reaction is received, it is classified as bad. Heuristics can be used to classify messages for which no response arrives after a set time has elapsed; for example, a message with a reply-to address not previously associated with the sender, or containing high-risk content, could be classified as spoofed if there is no affirmative reaction within ten minutes of the transmission of the automated validation request.
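The vulnerability check above boils down to inspecting the domain's published DMARC policy. Below is a hedged sketch that parses the policy tag out of a raw DMARC TXT record; in practice the record would first be fetched via a DNS TXT query for `_dmarc.<domain>`, which is omitted here to keep the example self-contained.

```python
# Sketch of the spoofing-vulnerability check: a sender domain without a
# DMARC "reject" policy is easier to spoof. Works on raw TXT record text;
# fetching the record over DNS is out of scope for this sketch.

def parse_dmarc_policy(txt_record):
    """Return the p= policy ('none', 'quarantine', 'reject') or None."""
    if not txt_record or not txt_record.strip().lower().startswith("v=dmarc1"):
        return None  # no valid DMARC record published
    for tag in txt_record.split(";"):
        name, _, value = tag.strip().partition("=")
        if name.lower() == "p":
            return value.strip().lower()
    return None

def spoofing_risk(txt_record):
    """High risk unless the domain enforces a reject policy."""
    return "low" if parse_dmarc_policy(txt_record) == "reject" else "high"

print(spoofing_risk("v=DMARC1; p=reject; rua=mailto:agg@example.com"))  # → low
print(spoofing_risk("v=DMARC1; p=none"))  # monitoring only → high
print(spoofing_risk(None))                # no record at all → high
```

Note that a `p=none` record, while better than nothing for monitoring purposes, does not instruct receivers to discard failing mail, so for this purpose it is grouped with having no record.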
High Risk of Impersonation: The Phase 1 email filtering may indicate a higher than normal risk of impersonation. Consider, for example, an email received from a sender that is trusted by neither the recipient nor the organization, and that does not have a good reputation in general, but whose display name is similar to the display name of a trusted party or a party with high reputation. This, by itself, is not a guarantee that the email is malicious. Therefore, additional scrutiny of the message is beneficial.
Automated analysis can be used to identify some common benevolent and malicious cases. One common benevolent case involves a sender for which the display name and user name match, and where the sender’s domain is one for which account creation is controlled. A common malevolent case corresponds to a newly created domain, especially if the domain is similar to the domain of the trusted user to which the sender’s display name is similar. There are additional heuristic rules that are useful to identify likely benevolent and malevolent cases. However, a large portion of display names and user names do not match any of these common cases, whether the message is good or bad; for these, manual review of the message contents can be used to help make a determination.
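A simple version of the display-name heuristic can be sketched with a normalized string-similarity ratio. The trusted-party list, the 0.85 similarity cutoff, and the set of recently registered domains are all illustrative assumptions; a real system would draw these from its own relationship graph and from WHOIS/registration data.

```python
# Sketch of the display-name impersonation heuristics. The cutoff and
# the recently-registered-domains set are hypothetical inputs.
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Normalized similarity between two display names (0.0-1.0)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def impersonation_verdict(display_name, sender_domain, trusted_parties,
                          recently_registered_domains, cutoff=0.85):
    """Flag senders whose display name closely matches a trusted party
    but whose domain is not that trusted party's domain."""
    for trusted_name, trusted_domain in trusted_parties:
        if sender_domain == trusted_domain:
            continue  # same domain: not a lookalike concern here
        if name_similarity(display_name, trusted_name) >= cutoff:
            # A newly created domain is the common malevolent case.
            if sender_domain in recently_registered_domains:
                return "likely-malicious"
            return "needs-review"  # ambiguous: escalate (e.g., manual review)
    return "no-match"

trusted = [("Pat Smith", "acme.com")]
print(impersonation_verdict("Pat Smith", "acrne-corp.com", trusted,
                            {"acrne-corp.com"}))  # → likely-malicious
print(impersonation_verdict("Pat Smith", "gmail.com", trusted, set()))
# → needs-review
```

The `needs-review` branch corresponds to the large portion of cases the heuristics cannot settle, which the post routes to manual review or to the confirmation request described next.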
Another helpful approach is to send an automated request to the trusted party whose name matches the sender’s name, asking him or her to confirm whether the email from the new identity was legitimately sent. For example, the request may say:
“Recently, <recipient> received an email from a sender with a name similar to yours. If you just sent that email, please click the link below, copy in the subject line of the email, and click submit. Doing this will cause your email to be immediately delivered, and fast-track the delivery of future emails sent from the account.”
High Risk of Account Take-Over: The Phase 1 email filtering may indicate a higher than normal risk of a take-over of the sender’s account. For example, one such indication is an email with high trust, authenticity and risk scores: an email likely to have been sent from the account of a trusted party, but whose content indicates potential danger. If the source of potential danger is an attachment, then this can be scrutinized, including both an anti-virus scan and processing of any text contents of the attachment to identify high-risk storylines. Similarly, a suspect URL can be analyzed by automatically visiting the site and determining whether it causes automated software downloads, or has a structure indicative of a phishing webpage. The system can also attempt to identify additional indications of risk; for example, by determining whether the sender of the suspect email is associated with a recent traffic anomaly. If the sender has communication relationships with a large number of users protected by the system, and an unusual number of these received emails from the sender in the recent past, then this increases the probability that an account take-over has taken place.
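The structural URL analysis mentioned above can be illustrated with a few static checks that need no live visit to the site. The specific signals below (IP-address hosts, deep subdomain nesting, userinfo obfuscation, a protected brand name buried in a subdomain) are common phishing heuristics offered as examples; a deployed system would combine many more, including dynamic analysis of the page itself.

```python
# Sketch of static URL-structure checks indicative of a phishing page.
# The signal list and the protected_brands set are illustrative only.
import re
from urllib.parse import urlparse

def url_risk_signals(url, protected_brands=("paypal", "microsoft")):
    signals = []
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
        signals.append("ip-address-host")       # raw IP instead of a domain
    elif host.count(".") >= 3:
        signals.append("deep-subdomain-nesting")  # e.g. a.b.c.example.net
    if "@" in parsed.netloc:
        signals.append("userinfo-obfuscation")  # user@host trick in the URL
    # A protected brand appearing in the host, but not in the actual
    # registered domain (approximated here as the last two labels).
    registered = ".".join(host.split(".")[-2:])
    for brand in protected_brands:
        if brand in host and brand not in registered:
            signals.append("brand-in-subdomain")
    return signals

print(url_risk_signals("http://paypal.login.example.attacker.net/verify"))
# → ['deep-subdomain-nesting', 'brand-in-subdomain']
print(url_risk_signals("https://www.example.com/"))  # → []
```

Each triggered signal would contribute to the Phase 2 risk score discussed below, rather than blocking the message outright.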
A Phase 2 risk score is computed using these types of methods. If the cumulative risk score falls below a low-risk threshold, then the message is deemed safe, and the second phase concludes. If the cumulative score exceeds a high-risk threshold, then the message is determined to be dangerous, and a protective filter action is taken. If the score is between these two thresholds, then additional analysis may be performed. For example, the message can be sent for manual review, potentially after being partially redacted to protect the privacy of the communication. Another approach involves automatically contacting the sender using a second channel (such as SMS) to request a confirmation that the sender intended to send the message. Based on the results of the manual review, the potential response of the sender, and other related results, a filtering decision is made.
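The two-threshold routing just described can be summarized in a few lines. The thresholds and the simple averaging of signal scores are hypothetical; a production system would likely weight signals by their reliability.

```python
# Sketch of the Phase 2 two-threshold decision. Threshold values and the
# unweighted average are illustrative assumptions.

def phase2_decision(signal_scores, low_threshold=0.3, high_threshold=0.7):
    """Combine Phase 2 risk signals and route the message."""
    cumulative = sum(signal_scores) / max(len(signal_scores), 1)
    if cumulative < low_threshold:
        return "deliver"   # deemed safe; second phase concludes
    if cumulative > high_threshold:
        return "block"     # dangerous; protective filter action taken
    return "escalate"      # manual review or second-channel confirmation

print(phase2_decision([0.1, 0.2, 0.1]))  # → deliver
print(phase2_decision([0.9, 0.8, 0.9]))  # → block
print(phase2_decision([0.5, 0.6]))       # → escalate
```

The middle band is what makes the quarantine "open": rather than silently dropping ambiguous mail, the system buys time for manual review or a second-channel confirmation before committing to a verdict.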
Check back in next week, when I discuss the user experience of the recipient and its relation to the neutralization of messages that are classified as undetermined.