Exploring the Ashley Madison Dataset

John Wilson October 2, 2015 Cybercrime, DMARC
I first heard about the Ashley Madison breach on July 15, 2015, in a post by Brian Krebs. I immediately wondered what the fallout of such a breach would be. Would Ashley Madison’s new tagline be “1 million divorces and counting!”? Would the perpetrators try to profit from the stolen data, perhaps through blackmail? I never imagined I’d soon have the chance to explore the dataset myself, after Forbes published a link to the data.

Most breaches are not discovered immediately. If you were to look at the creation timestamp for members, you would find that, since January 1, between 25K and 30K new records were created each day, right up until February 23rd at 12:26:13. (The timezone is unknown but likely to be either UTC or Toronto time.) Could this be the precise moment the data was pilfered? Or did the thieves merely stumble upon an older backup taken at that time? These are interesting questions, but I actually had another purpose in mind.

At my previous company I needed a large volume of realistic email messages in order to create some DKIM test cases. Fortunately I stumbled upon the Enron email corpus released as part of the public record after the company imploded. I wondered if perhaps the Ashley Madison data dump could be similarly useful.

One question I’ve had for a while is “What percentage of consumer email addresses are protected using the DMARC standard, broken down by country?” On the surface this seems like an easy question: if your address is at one of a handful of large consumer mailbox providers, you are protected by DMARC, and if you use some other service, you are not. It’s actually more complicated than that: Yahoo manages email for Rogers, AT&T, SBC, and a number of other entities, and all 5 million+ Google Apps domains support DMARC reporting and enforcement. A year or two ago I wrote a script that takes a list of domains, performs an MX lookup on each one, and then determines if the domain supports DMARC enforcement on inbound email. My script should get me part way there, but to do the country breakdown I was going to need a representative consumer email list that included each consumer’s country.
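The approach described above (look up each domain's MX hosts, then decide whether the receiving provider enforces DMARC) might be sketched roughly as follows. This is not the original script. The `ENFORCING_MX_SUFFIXES` table, the function names, and the injectable `resolve_mx` callback are assumptions made for illustration, and the two suffixes are only examples of MX hosts run by Google and Yahoo, not a complete list.

```python
from typing import Callable, Iterable, List

# Hypothetical, deliberately incomplete table: MX hostname suffixes operated by
# mailbox providers assumed to enforce DMARC on inbound mail.
ENFORCING_MX_SUFFIXES = ("google.com", "yahoodns.net")


def mx_enforces_dmarc(mx_host: str,
                      suffixes: Iterable[str] = ENFORCING_MX_SUFFIXES) -> bool:
    """Return True if the MX host falls under one of the listed suffixes."""
    host = mx_host.rstrip(".").lower()
    return any(host == s or host.endswith("." + s) for s in suffixes)


def domain_is_covered(domain: str,
                      resolve_mx: Callable[[str], List[str]]) -> bool:
    """Treat a domain as covered if any of its MX hosts enforces DMARC.

    resolve_mx is supplied by the caller (for example, a thin wrapper around
    a DNS library) and returns the MX hostnames for the domain.
    """
    return any(mx_enforces_dmarc(mx) for mx in resolve_mx(domain))


# Stand-in resolver with canned answers so the sketch runs without a network.
FAKE_MX = {
    "example-apps-domain.test": ["aspmx.l.google.com."],
    "example-isp.test": ["mx.example-isp.test."],
}


def fake_resolver(domain: str) -> List[str]:
    return FAKE_MX.get(domain, [])


if __name__ == "__main__":
    for d in FAKE_MX:
        print(d, domain_is_covered(d, fake_resolver))
```

Matching on the MX host rather than the address's own domain is what lets an ISP-branded or Google Apps domain be credited to the provider that actually receives its mail. Whether "any MX host" or "the lowest-preference MX host" is the right test is a judgment call this sketch does not settle.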

The recent Ashley Madison data dump includes all that and much more. Using the leaked data, I could theoretically figure out DMARC coverage by country, gender, or even sexual proclivities. For the purposes of this post, I’ll stick to a breakdown by country. One existing estimate claims that typical DMARC coverage in the USA is around 85% for consumer mailboxes, and about 60% globally. Let’s see if that holds true in the Ashley Madison data.

After following the Forbes link using Tor, I obtained a .torrent file, which I used to download the dataset. After installing MySQL, I imported the data into relational tables and began my investigation. I extracted the domain portion of all email addresses and ran my shell script to determine DMARC coverage for each domain. After importing these results back into MySQL, I was finally able to perform my analysis.
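As a rough illustration of the domain-extraction step, the sketch below pulls the domain out of each address and counts addresses per domain. It is a Python stand-in, not the MySQL and shell workflow actually used; the function names and the sample addresses are invented for the example.

```python
from collections import Counter
from typing import Iterable, Optional


def email_domain(address: str) -> Optional[str]:
    """Return the lowercased domain of an address, or None if it looks malformed."""
    # rpartition splits on the last "@", so a stray "@" in the local part is tolerated.
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or "." not in domain:
        return None
    return domain.lower().rstrip(".")


def domain_counts(addresses: Iterable[str]) -> Counter:
    """Count addresses per domain, skipping malformed entries."""
    return Counter(d for d in map(email_domain, addresses) if d is not None)


if __name__ == "__main__":
    # Made-up sample addresses, not taken from any dataset.
    sample = ["Alice@Example.com", "bob@example.com",
              "carol@mail.example.org", "not-an-address"]
    for domain, n in domain_counts(sample).most_common():
        print(domain, n)
```

Deduplicating to one row per domain before doing any DNS work keeps the number of lookups proportional to the number of distinct domains rather than the number of members.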

For the Ashley Madison dataset, 90.68% of all email addresses linked to US-based members support the DMARC standard today. That’s more than 5 points higher than the 85% estimate. Global coverage is 87.26%, which is much better than the estimate of 60%.
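Percentages like these come down to a per-country ratio: addresses hosted at a DMARC-protected domain divided by all addresses for that country, times 100. A minimal sketch of that aggregation, assuming each member has already been reduced to a (country, domain) pair and that a set of covered domains is available; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def coverage_by_country(members: Iterable[Tuple[str, str]],
                        covered_domains: Set[str]) -> Dict[str, float]:
    """Percentage of member addresses per country hosted at a covered domain.

    members yields (country, email_domain) pairs; covered_domains holds the
    domains previously classified as DMARC-protected.
    """
    totals: Dict[str, int] = defaultdict(int)
    covered: Dict[str, int] = defaultdict(int)
    for country, domain in members:
        totals[country] += 1
        if domain in covered_domains:
            covered[country] += 1
    # Every country in totals has at least one row, so no division by zero.
    return {c: 100.0 * covered[c] / totals[c] for c in totals}


if __name__ == "__main__":
    # Made-up rows for illustration only.
    rows = [("US", "a.test"), ("US", "b.test"), ("US", "a.test"),
            ("US", "a.test"), ("CA", "b.test")]
    for country, pct in sorted(coverage_by_country(rows, {"a.test"}).items()):
        print(f"{country}: {pct:.2f}%")
```

Note that the ratio is weighted by addresses, not by distinct domains, so a single large provider can dominate a country's figure.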

Below is a breakdown by country, showing the percentage of Ashley Madison member addresses hosted at a DMARC-compliant provider:

That’s all for today, but there are a lot more juicy stories hidden in this data. Stay tuned!
