Articles - Cyber Security - November 26, 2019

Data-Enriched Profiles on 1.2B People Exposed in Gigantic Leak

Although the data was legitimately scraped by legally operating firms, the security and privacy implications are numerous.

An open Elasticsearch server has exposed the rich profiles of more than 1.2 billion people to the open internet.

First found on October 16 by researchers Bob Diachenko and Vinny Troia, the database contains more than 4 terabytes of data. It consists of scraped information from social media sources like Facebook and LinkedIn, combined with names, personal and work email addresses, phone numbers, Twitter and GitHub URLs, and other data commonly available from data brokers – i.e., companies that specialize in supporting targeted advertising, marketing and messaging services.

Taken together, the profiles provide a 360-degree view of individuals, including their employment and education histories. All of the information was unprotected, with no login needed to access it.

“It is a comprehensive dataset collected from B2B [business-to-business] lead-generation companies’ lists,” Diachenko told Threatpost via Twitter.

If accessed by cybercriminals, the data, which includes scores of related accounts tied to each individual, could be used for highly effective, targeted phishing attacks, business email compromises and identity theft, among other things.

“Information like this is extremely useful to criminals as a starting point in hacking a number of related accounts and also lends itself the potential for increased credential stuffing attacks,” Carl Wearn, head of e-crime at Mimecast, said via email. “This information obviously also provides a fantastic treasure trove of information for the means of industrial, political and state-related espionage and there are multiple malicious uses for the data leaked from this breach.”

For affected consumers, remediation is no picnic, either.

“Data breaches that expose information such as phone numbers to personal accounts like email or social accounts are just as serious as ones that expose payment information,” Zack Allen, director of threat operations at ZeroFOX, told Threatpost. “Luckily for payment information, you can change your credit card, or your password to your accounts. But what can victims of this breach do when their phone number and Facebook profile is leaked? Changing your phone number can cost money with your carrier, you also have to update all of your contacts with your new phone number, plus all of your two-factor accounts.”

Data Broker Sources

Diachenko and Troia’s investigation uncovered that the data sets came from two separate lead-generation companies, whose business it is to assemble highly detailed profiles of individuals: People Data Labs (PDL) and OxyData[.]io.

“The majority of the data spanned four separate data indexes, labeled ‘PDL’ and ‘OXY,’ with information on roughly 1 billion people per index,” the researchers wrote in a writeup on Friday. “Each user record within the databases was labeled with a ‘source’ field that matched either PDL or Oxy, respectively.”

When the researchers notified both companies, each said the server in question did not belong to it. However, the data certainly appeared to.

“In order to test whether or not the data belonged to PDL, we created a free account on their website which provides users with 1,000 free people lookups per month,” the researchers explained. “The data discovered on the open Elasticsearch server was almost a complete match to the data being returned by the People Data Labs API. To confirm, we randomly tested 50 other users and the results were always consistent.”
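The spot check the researchers describe, sampling records and comparing them field by field against what the vendor's API returns, can be illustrated with a minimal Python sketch. It is not the researchers' actual tooling: the record layout (plain dictionaries keyed by a shared ID), the function names, and the 0.9 "near-complete match" threshold are all assumptions made for illustration, and no real API is called.

```python
import random


def field_match_ratio(exposed: dict, reference: dict) -> float:
    """Share of the reference record's fields that appear, with the same
    value, in the exposed record. Returns 0.0 for an empty reference."""
    if not reference:
        return 0.0
    matches = sum(
        1
        for key, value in reference.items()
        if key in exposed and exposed[key] == value
    )
    return matches / len(reference)


def spot_check(exposed_by_id, reference_by_id, sample_size=50,
               threshold=0.9, seed=0):
    """Randomly sample IDs present in both sources and report, per ID,
    whether the pair is a near-complete match.

    The sample size of 50 mirrors the figure quoted in the article; the
    threshold is a hypothetical cutoff, not one stated by the researchers.
    """
    shared_ids = sorted(set(exposed_by_id) & set(reference_by_id))
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = rng.sample(shared_ids, min(sample_size, len(shared_ids)))
    return {
        record_id: field_match_ratio(
            exposed_by_id[record_id], reference_by_id[record_id]
        ) >= threshold
        for record_id in sample
    }
```

If every sampled ID comes back `True`, the two sources are consistent on that sample, which is the kind of evidence the researchers cite. It shows the data matches, not who owns the server.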

OxyData, meanwhile, sent Diachenko a copy of his profile, and the data fields also matched.

The researchers said they were unsure how the data came to be collected in the now-closed database. Could it belong to a customer of both PDL and OxyData, they wondered? Or had the data been stolen and placed in the storage bucket by hackers? The only clues as to the owner of the server were the IP address (35.199.58.125) and the fact that it was hosted with Google Cloud.

Liability and Privacy Concerns

While the incident is not a data breach per se (but rather a story of yet another misconfigured server), it brings up two different concerns. First, what liability do the data originators (PDL and OxyData) have to the people whose profiles were exposed? And second, even though the information is aggregated from allegedly public sources, what does this kind of “data enrichment” mean from a privacy perspective?

To the first concern, Kelly White, CEO at RiskRecon, believes that the lead-generation companies are on the hook for the exposure.

“Data…is easily and perfectly…
