Email Fraud: Refining Predictive Models to Stop the Next Email Attack

Siobhan McNamara August 22, 2018 Email Security
New research finds that malicious emails hit your business at an average rate of one every six minutes. Predictive models point to new ways to quash the threat, but how can they account for random, outside factors?

With apologies to Thomas Dolby, there’s new hope that email fraudsters everywhere could soon be screaming, “They blunted me with science!” Data science, that is.

As we discussed in parts one and two of this series, breakthrough research from data scientists at Agari has unearthed a surprising pattern hidden in business email compromise (BEC), whale- and spear-phishing and other email-based attacks.

After accounting for other variables, it’s now clear that businesses receive a new malicious email at an average rate of one every six minutes.

Yet as alarming as this attack frequency may be, that’s not the jaw-dropping part. As it turns out, this average attack rate is consistent across all organizations, of all sizes, across all industries.

Why does it matter? Because this level of consistency may soon lead to new enhancements in the Agari solutions organizations depend on to short-circuit a growing number of advanced email threats. As it stands now, the business world can use all the help it can get.

Not Your Grandfather’s Email Attack

Indeed, email has emerged as a cybercriminal’s best friend. While businesses everywhere have spent billions to keep hackers out of their systems, fraudsters have learned to circumvent those defenses by exploiting simple human nature.

Using advanced identity deception techniques and highly personalized messages that easily slip past the email security solutions most businesses use today, these criminals leverage social engineering to fool recipients into making payments or exposing sensitive information.

The damage is mounting. Today, more than 95% of data breaches start with a malicious email, helping to fuel cybercriminal activities that are expected to contribute to more than $3 trillion in business losses worldwide this year.

Unlike traditional email security systems, Agari solutions leverage advanced machine learning to analyze people, relationships and behaviors to effectively block malicious emails.

What if a predictive, time-based data model could be applied to fine-tune the efficacy of these solutions even further?

Known-Knowns Meet Unknowns

In part one, we established an average six-minute interval between the arrival of one malicious email and the next, with 90% falling within an average 16-minute window.

To be clear, this is just an average. Some malicious emails arrive in quick succession, followed by longer time gaps. What’s more, each organization experiences a unique cadence to these attacks: no two organizations receive malicious emails at the same intervals. Yet they all share that same average six-minute time gap, no matter their overall volume of email.
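To make the difference between an average gap and an individual gap concrete, here is a minimal Python sketch that measures inter-arrival times from a list of timestamps. The arrival times are made up for illustration, and the helper names `interarrival_minutes` and `nearest_rank_percentile` are hypothetical; this is not the code behind the research.

```python
from datetime import datetime, timedelta


def interarrival_minutes(timestamps):
    """Return the gaps, in minutes, between consecutive arrivals."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() / 60.0 for a, b in zip(ordered, ordered[1:])]


def nearest_rank_percentile(values, pct):
    """Smallest value with at least `pct` percent of values at or below it.

    `pct` is an integer from 1 to 100; integer arithmetic avoids rounding surprises.
    """
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceiling of pct * n / 100
    return ordered[rank - 1]


# Illustrative data only: minutes after 09:00 at which a malicious email arrived.
start = datetime(2018, 8, 1, 9, 0)
offsets = [0, 1, 2, 3, 15, 16, 18, 40, 41, 42, 60]
arrivals = [start + timedelta(minutes=m) for m in offsets]

gaps = interarrival_minutes(arrivals)
mean_gap = sum(gaps) / len(gaps)
p90_gap = nearest_rank_percentile(gaps, 90)

print(f"mean gap: {mean_gap:.1f} min, 90th percentile gap: {p90_gap:.1f} min")
```

In this toy data the mean gap is six minutes even though most gaps are only a minute or two and a few are much longer, which is the bursty cadence described above. For comparison, if arrivals followed a constant-rate Poisson process (an assumption the research does not state), a six-minute mean would put 90% of gaps within about 6 × ln(10) ≈ 13.8 minutes.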

In part two, we used this insight to develop a data model that featured a two-minute time lag to determine whether the prior few minutes of incoming email activity would accurately predict the next minute of activity. When plotted against one company’s real-world data on incoming email over a 45-day period, this model anticipated the arrival of malicious emails with a remarkable level of accuracy. That means every two minutes of data served as a good predictor of the next two minutes’ volume of malicious messages.
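The series does not spell out the form of the model, so the following Python sketch is only a stand-in for the general idea of a lagged predictor. It assumes per-minute counts of malicious emails, uses the two preceding minutes as features, and predicts the next minute as their simple average. The counts and the function names (`lagged_pairs`, `predict_from_lags`, `mean_absolute_error`) are hypothetical.

```python
def lagged_pairs(counts, lag=2):
    """Pair each minute's count with the `lag` counts immediately before it."""
    return [(counts[i - lag:i], counts[i]) for i in range(lag, len(counts))]


def predict_from_lags(previous):
    """Stand-in predictor: the average of the lagged counts."""
    return sum(previous) / len(previous)


def mean_absolute_error(counts, lag=2):
    """Average absolute gap between the lagged prediction and the actual count."""
    pairs = lagged_pairs(counts, lag)
    errors = [abs(predict_from_lags(prev) - actual) for prev, actual in pairs]
    return sum(errors) / len(errors)


# Illustrative per-minute counts with two short bursts.
per_minute = [0, 0, 4, 6, 5, 0, 0, 0, 3, 5, 4, 0]

pairs = lagged_pairs(per_minute)
mae = mean_absolute_error(per_minute)

print(f"{len(pairs)} lagged examples, mean absolute error {mae:.2f} emails/minute")
```

A predictor like this does well inside a burst and inside a quiet stretch, and worst at the minute a burst starts or stops. A fitted model, such as an autoregression, would replace `predict_from_lags` while keeping the same lagged-feature layout.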

A key aspect of the attacks caught our eye: Even within the average six-minute interval, when malicious emails did arrive, they tended to do so in batches that showed up as pronounced spikes. This is why the model with the two-minute lag works so well. Next, we wanted to know what underlies this pattern. Are these coordinated campaigns? Are they in response to news involving the company?

Campaign-Centric Threat?

As it happens, we did find some of the answers in shared subject lines, “from” addresses and IP addresses. By removing messages that carried these clear campaign indicators, we cut our model’s error rate in half. What’s more, once those messages are removed, the remaining large spikes in malicious messages conform to typical business hours.

This may indicate that campaigns sharing this handful of obvious attributes are less targeted and perhaps sent from different geographies without concern for email arrival time.
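One simple way to separate out such campaign traffic, sketched here in Python under our own assumptions, is to flag any message whose subject line, “from” address or IP address also appears on at least one other message. The field names, the sample messages and the function `split_campaign_traffic` are hypothetical, and the rule actually used in the research may differ.

```python
from collections import Counter

# Attributes treated as campaign indicators (field names are assumptions).
CAMPAIGN_FIELDS = ("subject", "sender", "ip")


def split_campaign_traffic(messages, fields=CAMPAIGN_FIELDS):
    """Split messages into (campaign, remainder).

    A message counts as campaign traffic if any indicator field has a
    value that appears on more than one message in the batch.
    """
    counts = {f: Counter(m[f] for m in messages) for f in fields}
    campaign, remainder = [], []
    for m in messages:
        shared = any(counts[f][m[f]] > 1 for f in fields)
        (campaign if shared else remainder).append(m)
    return campaign, remainder


# Illustrative messages using reserved example domains and documentation IP ranges.
messages = [
    {"subject": "Invoice overdue", "sender": "billing@example.com", "ip": "192.0.2.10"},
    {"subject": "Invoice overdue", "sender": "accounts@example.net", "ip": "198.51.100.7"},
    {"subject": "Quick favor", "sender": "ceo@example.org", "ip": "198.51.100.7"},
    {"subject": "Wire request", "sender": "cfo@example.com", "ip": "203.0.113.5"},
    {"subject": "Updated W-2", "sender": "hr@example.net", "ip": "203.0.113.99"},
]

campaign, remainder = split_campaign_traffic(messages)
print(f"{len(campaign)} campaign messages, {len(remainder)} remaining")
```

The remainder is the traffic a time-based model would then be refit on.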

[IMAGE]

To better understand the remaining attack pattern, we recalibrated our model to capture work hours only, hoping to see if there was an even more defined pattern during the block of time when both legitimate and malicious email volumes are highest.
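Restricting the data to work hours can be as simple as a timestamp filter. The sketch below assumes that timestamps are already in the recipient organization’s local time and that work hours mean Monday to Friday, 09:00 to 17:00; neither definition comes from the research, and `in_work_hours` is a hypothetical helper.

```python
from datetime import datetime


def in_work_hours(ts, start_hour=9, end_hour=17):
    """True for Monday-Friday timestamps from start_hour up to, not including, end_hour."""
    return ts.weekday() < 5 and start_hour <= ts.hour < end_hour


# Illustrative arrival times (22 August 2018 was a Wednesday, 25 August a Saturday).
arrivals = [
    datetime(2018, 8, 22, 8, 59),
    datetime(2018, 8, 22, 9, 0),
    datetime(2018, 8, 22, 16, 59),
    datetime(2018, 8, 22, 17, 0),
    datetime(2018, 8, 25, 10, 0),
]

work_hour_arrivals = [ts for ts in arrivals if in_work_hours(ts)]
print(f"kept {len(work_hour_arrivals)} of {len(arrivals)} arrivals")
```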

[IMAGE]

In this instance, our two-minute model was less predictive of new malicious emails. Indeed, the malicious emails appeared to be more random, with unique subject lines, senders and IP addresses. This points to two possibilities:

There is a common incentive motivating a large number of independent fraudsters to send malicious emails to a given organization at one time.

These emails are part of coordinated campaigns from the same criminals, and these campaigns are far more sophisticated than spam ever was.

News-Driven Events?

That first hypothesis fit with our notion that news coverage about the company could drive malicious email traffic by getting word out to a large number of fraudsters about opportunities to defraud the organization. After all, when an organization announces that it has completed a round of funding, for instance, it tends to receive a lot more phishing emails.

[IMAGE]

To test that, we analyzed the volume of media coverage. Ultimately, we could not identify any discernible patterns to support that notion, which leaves our second hypothesis. And while the prospect that these are all highly coordinated attacks is far more intriguing, it also represents a much larger threat to organizations.

A Never-Ending Battle

So, where does that leave us? After plotting a number of models to test a variety of hypotheses, we could account for only some of the underlying reasons organizations large and small receive malicious emails with that same average six-minute interval.

That means that while we can’t yet identify whether a given malicious email is part of a highly coordinated campaign, we’re getting closer. It is also strong validation of Agari’s approach of designing machine learning models to catch fraud. However opaque the forces behind these batch attacks may be, Agari’s solutions are designed to catch them regardless: we catch anything that looks different from ‘good’ and ‘normal’.

The dataset generated by the two trillion or so emails we process annually is just one of the extraordinary weapons in our arsenal. Our industry-leading expertise and AI-powered solutions apply behavioral science to identify and infer relationships in order to successfully recognize and neutralize incoming email attacks.

And our data scientists are continuously refining our data models to enhance the way we help brands across the globe stay ahead of cybercriminals and defeat BEC, phishing and other forms of email fraud. This culture of innovation is at the heart of everything we do.

With that in mind, I hope this series has given you at least a small glimpse into the kind of analysis we do on a continuous basis. We are growing fast, and as we do, our data becomes more diverse and our models more sophisticated in distinguishing between good and malicious behavior. With such a large global customer base, I believe we will be able to defeat advanced email attacks, closing off the primary entry point for hackers into organizations.

That is my prediction – and I’m sticking to it!
