Email Security Blog

Exploring the Ashley Madison Dataset

John Wilson October 2, 2015 Cybercrime, DMARC

I first heard about the Ashley Madison breach on July 15, 2015, in a post by Brian Krebs. I immediately wondered what the fallout of such a breach would be. Would Ashley Madison’s new tagline be “1 million divorces and counting!”? Would the perpetrators try to profit from the stolen data, perhaps through blackmail? I never imagined I’d soon have the chance to explore the dataset myself, after Forbes published a link to the data.

Most breaches are not discovered immediately. If you look at the creation timestamps for members, you find that since January 1 between 25K and 30K new records were created each day, right up until February 23rd at 12:26:13. (The timezone is unknown but is likely either UTC or Toronto time.) Could this be the precise moment the data was pilfered? Or did the thieves merely stumble upon an older backup taken at that time? These are interesting questions, but I actually had another purpose in mind.

At my previous company, I needed a large volume of realistic email messages in order to create some DKIM test cases. Fortunately, I stumbled upon the Enron email corpus, released as part of the public record after the company imploded. I wondered if the Ashley Madison data dump could be similarly useful.

One question I’ve had for a while is “What percentage of consumer email addresses are protected using the DMARC standard, broken down by country?” On the surface this seems like an easy question: if you have an email address at one of the major consumer mailbox providers that enforce DMARC, you are protected, and if you use some other service, you are not. It’s actually more complicated than that: Yahoo manages email for Rogers, AT&T, SBC, and a number of other entities, and all 5 million+ Google Apps domains support DMARC reporting and enforcement. A year or two ago I wrote a script that takes a list of domains, performs an MX lookup on each one, and then determines whether the domain supports DMARC enforcement on inbound email. My script should get me part way there, but to do the country breakdown I was going to need a representative consumer email list that included each consumer’s country.
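The MX-based check described above can be sketched roughly as follows. This is a minimal illustration, not the original script: the suffix list, the function names, and the `lookup_mx` callback are all assumptions made for the example, and the DNS query itself is left to the caller so the sketch runs offline.

```python
from typing import Callable, Dict, Iterable, List

# Assumed, illustrative MX suffixes for providers the post names as DMARC-aware
# (Google Apps and Yahoo-managed mail). This is not the author's actual list.
DMARC_MX_SUFFIXES = ("google.com", "yahoodns.net")


def mx_supports_dmarc(mx_hosts: Iterable[str],
                      suffixes: Iterable[str] = DMARC_MX_SUFFIXES) -> bool:
    """Return True if any MX host belongs to a provider on the suffix list."""
    suffix_list = [s.lower() for s in suffixes]
    for host in mx_hosts:
        name = host.rstrip(".").lower()  # DNS answers often end with a dot
        if any(name == s or name.endswith("." + s) for s in suffix_list):
            return True
    return False


def classify_domains(domains: Iterable[str],
                     lookup_mx: Callable[[str], List[str]]) -> Dict[str, bool]:
    """Map each domain to True/False using a caller-supplied MX lookup."""
    return {d.lower(): mx_supports_dmarc(lookup_mx(d)) for d in domains}


# Stand-in for a real DNS query so the example runs offline (made-up records).
FAKE_MX = {
    "example-apps-customer.test": ["aspmx.l.google.com."],
    "example-isp.test": ["mx1.example-isp.test."],
}

result = classify_domains(FAKE_MX, lambda d: FAKE_MX.get(d, []))
```

Matching on the MX host rather than on the address domain is what lets a domain hosted by a larger provider be counted as covered, which is the complication the paragraph above points out.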

The recent Ashley Madison data dump includes all that and much more. Using the leaked data, I could theoretically figure out DMARC coverage by country, gender, or even sexual proclivities. For the purposes of this post, I’ll stick to a breakdown by country. One existing estimate puts typical DMARC coverage in the USA at around 85% for consumer mailboxes, and about 60% globally. Let’s see if that holds true in the Ashley Madison data.

After following the Forbes link using Tor, I obtained a .torrent file, which I used to download the dataset. After installing MySQL, I imported the data into relational tables and began my investigation. I extracted the domain portion of all email addresses and ran my shell script to determine DMARC coverage for each domain. After importing these results back into MySQL, I was finally able to perform my analysis.
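The domain-extraction step could look something like the sketch below. The post does this with MySQL and a shell script; Python is used here only for illustration, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def email_domain(address: str) -> Optional[str]:
    """Return the lower-cased domain of an address, or None if malformed."""
    # Split on the last "@" so an unusual local part cannot confuse the result.
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


def domain_counts(addresses: Iterable[str]) -> Counter:
    """Count addresses per domain, skipping malformed entries."""
    counts = Counter()
    for address in addresses:
        domain = email_domain(address)
        if domain is not None:
            counts[domain] += 1
    return counts
```

Collapsing the list to unique domains first means each domain needs only one MX lookup, however many addresses share it.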

For the Ashley Madison dataset, 90.68% of all email addresses linked to US-based members support the DMARC standard today. That’s more than 5 points higher than the 85% estimate. Global coverage is 87.26%, which is much better than the estimate of 60%.
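Percentages like these reduce to a grouped ratio: covered addresses divided by total addresses, per country. The sketch below shows that arithmetic under assumed inputs. The row layout, the names, and the sample data are invented for illustration and do not come from the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def coverage_by_country(rows: Iterable[Tuple[str, str]],
                        dmarc_domains: Dict[str, bool]) -> Dict[str, float]:
    """Percent of addresses per country whose domain is marked DMARC-capable."""
    totals = defaultdict(int)
    covered = defaultdict(int)
    for country, domain in rows:
        totals[country] += 1
        # Domains missing from the lookup table are treated as not covered.
        if dmarc_domains.get(domain.lower(), False):
            covered[country] += 1
    return {c: round(100.0 * covered[c] / totals[c], 2) for c in totals}


# Made-up rows purely for illustration; not taken from the dataset.
rows = [("US", "a.test"), ("US", "a.test"), ("US", "b.test"), ("CA", "b.test")]
flags = {"a.test": True, "b.test": False}
coverage = coverage_by_country(rows, flags)
```

Note that the ratio is taken over addresses, not over unique domains, so a single large provider can dominate a country's figure.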

Below is a breakdown by country, showing the percentage of Ashley Madison member addresses hosted at a DMARC-compliant provider:

[Table: percentage of member addresses at DMARC-compliant providers, by country]

That’s all for today, but there are a lot more juicy stories hidden in this data… stay tuned!
