A Tale of 3 Data 'Leaks': Clubhouse, LinkedIn, Facebook
Confusion Over Hacking, Scraping and Amassing Highlights Data Lockdown Imperative
Criminals love to amass and sell vast quantities of user data, but not all data leaks necessarily pose a risk to users. Even so, the ease with which would-be attackers can amass user data is a reminder to organizations to lock down inappropriate access as much as possible.
That's a takeaway experts offer after large tranches of data recently became available for sale or for free. The data allegedly was obtained from three social networks: Clubhouse, LinkedIn and Facebook. Scammers can use such data to target individuals via social engineering attacks, and phishers can use it to craft lures, among other potential threats.
Clubhouse - a startup social media network accessed via an app - and LinkedIn have both confirmed that large amounts of their user data have appeared online. But both services say the data, which is being offered for sale on darknet forums, was scraped from public-facing pages. So what buyers would be paying for is access to all of this public information in one place.
The story is different, however, with the latest Facebook data breach to come to light. Earlier this month, 533 million users' details - including phone numbers that were set to not display on their profiles - were being offered for free online after having been available for purchase. In response, Facebook said attackers had obtained the data "not through hacking our systems but by scraping it from our platform," apparently by abusing an API that Facebook built to allow users to find each other.
Experts say the resulting records, linking people's names, email addresses, phone numbers and more, are a potential gold mine for fraudsters and phishers (see: Facebook Tries to 'Scrape' Its Way Through Another Breach).
Ireland's Data Protection Commission is probing the breach, in line with its authority to enforce the EU's General Data Protection Regulation. Facebook says it's attempting to trace the posted information back, and it has suggested that the data dump may include information amassed from multiple sources, not all of them involving private information held by the social network and its ancillary services.
LinkedIn: 'Not a Data Breach'
While a Facebook feature appears to have exposed private data for more than a half-billion users, the story looks different for LinkedIn and Clubhouse.
Last week, a cybercrime forum seller began advertising 500 million LinkedIn records, offering 2 million of the records as a sampler for $2 in forum credits and access to all records for a four-figure sum, CyberNews first reported. The seller said the profiles included "emails, phone and other details."
In a statement released on Thursday, LinkedIn said the data involves only information that is already publicly accessible via its site and may have been combined with information from other sites.
"We have investigated an alleged set of LinkedIn data that has been posted for sale and have determined that it is actually an aggregation of data from a number of websites and companies," LinkedIn says. "It does include publicly viewable member profile data that appears to have been scraped from LinkedIn. This was not a LinkedIn data breach, and no private member account data from LinkedIn was included in what we've been able to review."
In other words, while seeing so much user data get amassed in one place might be concerning - and of use to social engineers and others - this information was already in circulation.
Clubhouse Data Also Scraped
The same appears to be true for Clubhouse: Information from about 1.3 million of its user profiles was posted on a cybercrime forum on or around Saturday. The poster said that the data had been scraped from Clubhouse using one of its APIs.
Clubhouse is an iOS-based app that enables users to set up virtual audio chat rooms, in which most participants simply listen in. The service, which launched early last year, is still invite-only, but the Guardian reports that buzz over Clubhouse has been building, especially after Tesla CEO Elon Musk used it in February to host a popular chat.
The scraped Clubhouse data includes name and username, user ID, profile photo, number of followers, number of other Clubhouse users followed, an account creation date, who invited the user to the platform and sometimes Instagram and Twitter handles.
The data does not include phone numbers, email addresses or other sensitive information.
In a statement posted to Twitter on Sunday, Clubhouse denied that it had been breached or hacked after reports emerged that user data had appeared on the cybercrime forum.
This is misleading and false. Clubhouse has not been breached or hacked. The data referred to is all public profile information from our app, which anyone can access via the app or our API. https://t.co/I1OfPyc0Bo - Clubhouse (@joinClubhouse), April 11, 2021
Clubhouse officials didn't immediately respond to a request for further comment.
Expert View: The API Challenge
The posted Clubhouse data poses no risk to users, says Jane Manchun Wong, a Hong Kong-based software engineer and security researcher who often blogs about unreleased features in popular applications.
"The kind of data gathered here is no different than going to someone's Clubhouse profile and taking a screenshot," Wong says.
The data was likely scraped using one of Clubhouse's "private" APIs or one that is used by its app to retrieve data, Wong says. Whoever downloaded the data may have simply cycled through user IDs sequentially, she says.
Not seeing any private info in this "leaked data" of Clubhouse
The user IDs are numerical. So it just seems like someone scraped the data by hitting Clubhouse's private API, iterating from user ID 1 to beyond https://t.co/MBWG46JmCB - Jane Manchun Wong (@wongmjane), April 11, 2021
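To illustrate why numerical, sequential user IDs make this kind of enumeration easy, here is a minimal Python sketch. It does not reflect Clubhouse's actual API or ID scheme; the function names (`issue_sequential`, `issue_random`, `hit_rate`) are hypothetical, and the "service" is just an in-memory set of issued identifiers. It compares how often a client that simply counts upward from 1 lands on a valid ID under each scheme.

```python
import secrets


def issue_sequential(n):
    """Hypothetical service that hands out IDs 1, 2, 3, ... as strings."""
    return {str(i) for i in range(1, n + 1)}


def issue_random(n):
    """Hypothetical service that hands out random 128-bit URL-safe tokens."""
    return {secrets.token_urlsafe(16) for _ in range(n)}


def hit_rate(issued, guesses):
    """Fraction of guessed IDs that correspond to a real, issued ID."""
    guesses = list(guesses)
    hits = sum(1 for g in guesses if g in issued)
    return hits / len(guesses)


# A client that just counts upward, as described in the tweet above.
counting_guesses = [str(i) for i in range(1, 1001)]

sequential_rate = hit_rate(issue_sequential(1000), counting_guesses)
random_rate = hit_rate(issue_random(1000), counting_guesses)
```

In this toy setup, every counting guess hits a real profile when IDs are sequential, and none do when IDs are long random tokens. Random identifiers do not make public profile data private, though. They only remove the shortcut of walking the ID space in order.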
Services generally use rate limiting and other defensive measures to ensure their APIs aren't abused. Wong says that if the data was obtained by iterating through numerical user IDs, Clubhouse should enable rate limiting on its private API, if it has not already done so, because its users have an expectation of privacy.
But even with rate limiting, amassing all of this information would still be possible. "It'll only be slower, but it can still be done," Wong says.
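Wong's point can be made concrete with a small Python sketch. The code below is an assumption-laden illustration, not anything Clubhouse is known to run: `TokenBucket` is a generic per-client rate limiter, and `hours_to_enumerate` is a hypothetical helper that estimates how long a client held to a given sustained rate would need to fetch a given number of profiles.

```python
import time
from typing import Optional


class TokenBucket:
    """Generic per-client token bucket: short bursts allowed, sustained rate capped."""

    def __init__(self, rate_per_sec: float, capacity: int, now: Optional[float] = None):
        self.rate = rate_per_sec          # tokens added per second
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)     # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if one request may proceed at time `now`."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(float(self.capacity), self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def hours_to_enumerate(num_profiles: int, rate_per_sec: float) -> float:
    """Hours needed to fetch num_profiles at a sustained rate_per_sec."""
    return num_profiles / rate_per_sec / 3600.0
```

Under these assumptions, a single client limited to one request per second would need roughly 361 hours, or about 15 days, to walk through 1.3 million profiles. Spreading the work across 10 clients would cut that to about 36 hours. The limiter raises the cost of scraping but does not prevent it.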
Troy Hunt, creator of the free Have I Been Pwned data breach notification service, says APIs pose this paradox: If developers want to make users discoverable to other users, it's difficult to ensure that the underlying API will only be used for that purpose - in other words, by only the right users and for the right reasons.
"If you provide an API, regardless what you protect with rate limiting," expect that whatever data it touches "will be aggregated," Hunt says. "You work on the assumption of it being abused."