Travel Card Data Release Risked Australians' Privacy
Transport Agency Criticized for Violating Privacy Law
An Australian transport authority has been put on notice for releasing a data set comprising nearly 2 billion public transport travel records without sufficiently ensuring that travelers couldn’t be identified.
The Office of the Victorian Information Commissioner (OVIC) has issued a compliance notice to Public Transport Victoria, which is part of the state’s Department of Transport. The department has been ordered to create a data governance program and review its policies for releasing data.
The commissioner found the department violated two parts of the state’s privacy law: disclosing information for a purpose for which it wasn’t collected and failing to protect personal information.
“This incident demonstrates why it is dangerous to rely on de-identification alone when sharing and releasing data,” writes Annan Boag, an assistant commissioner with the OVIC.
OVIC says that although the Department of Transport disagrees that the data set posed a privacy risk, it will take the actions recommended in the compliance notice.
Chris Culnane, a lecturer in the School of Information Systems at the University of Melbourne who was on the team of researchers that uncovered the privacy concerns, tells Information Security Media Group that there “hadn’t been a lot of effort” to protect people from being identified.
Taking the Myki
At issue is the release of three years of travel records for Victoria’s Myki card, which is the state’s travel card used for buses, trams and trains.
The data covered virtually all public transport travelers in Victoria - 1.8 billion travel records for 15.1 million Myki cards between July 2015 and June 2018. The data was released in July 2018 for the Melbourne Datathon, Australia’s largest event focused on finding innovative uses for data.
Almost immediately, data security experts warned that it would be possible to examine travel records and identify people. That posed a real danger because figuring out who took just one trip opens a door to three years of that person’s travel records.
Three academics from the University of Melbourne downloaded the data and by September 2018 had re-identified themselves. They also identified a member of the Victorian Parliament “through nothing more than their tweets about travelling on public transport,” Culnane writes in a blog post.
A separate analysis by researchers with Data61, which is part of Australia’s CSIRO national research agency, determined there was “a high risk that some individuals may be re-identified by linking the data set with other information sources,” according to the OVIC’s report.
The only step taken to de-identify the data was the removal of individual Myki card IDs, Culnane writes. But all trips taken on one card were still linked. Travel data, which is generated when someone touches a transport card to a reader, was accurate to the second and included location information. The data set did not, however, contain names or addresses.
The transport department also didn’t redact the type of transport card. There are 74 types of cards, some of which are relatively rare, such as federal police travel passes and cards for state and federal lawmakers. The University of Melbourne team found, for example, that there were only seven cards reserved for federal parliamentarians.
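The risk posed by rare card types can be illustrated with a simple count. The following is a minimal sketch, not the researchers' method: the card-type labels, the sample counts other than the seven parliamentarian cards, and the threshold `k` are assumptions made for illustration.

```python
from collections import Counter

def rare_card_types(card_types, k=10):
    """Return {card_type: count} for card types held by fewer than k cards."""
    counts = Counter(card_types)
    return {card_type: n for card_type, n in counts.items() if n < k}

# Hypothetical listing with one entry per card (illustrative values only).
cards = (
    ["full_fare"] * 500
    + ["concession"] * 120
    + ["federal_parliamentarian"] * 7
)

# Any type that falls below the threshold narrows a search to a handful of people.
print(rare_card_types(cards, k=10))
```

A check of this kind, run before release, would flag categories small enough that the card type alone nearly identifies the holder.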
The researchers found they could identify themselves based on just two known touch events, which could be correlated by looking at transport logs connected with an online Myki card account.
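The kind of linkage described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the researchers' actual code: the record layout `(card_key, timestamp, stop)`, the stop names and the sample trips are all hypothetical.

```python
from datetime import datetime

def cards_matching(records, known_events):
    """Return the pseudonymous card keys whose trips include every known event.

    records: iterable of (card_key, timestamp, stop) tuples from the released data.
    known_events: iterable of (timestamp, stop) tuples known from another source.
    """
    touches_by_card = {}
    for card_key, ts, stop in records:
        touches_by_card.setdefault(card_key, set()).add((ts, stop))
    wanted = set(known_events)
    # A card is a candidate only if it contains all of the known touch events.
    return {card for card, touches in touches_by_card.items() if wanted <= touches}

# Hypothetical released records: card IDs are removed, but trips remain linked per card.
records = [
    ("card_A", datetime(2018, 3, 5, 8, 14, 9), "Stop 1"),
    ("card_A", datetime(2018, 3, 9, 17, 42, 31), "Stop 2"),
    ("card_B", datetime(2018, 3, 5, 8, 14, 9), "Stop 1"),
    ("card_B", datetime(2018, 3, 9, 18, 1, 55), "Stop 3"),
    ("card_C", datetime(2018, 3, 6, 7, 50, 2), "Stop 1"),
]

# Two touch events taken from, say, a person's own online account log.
known = [
    (datetime(2018, 3, 5, 8, 14, 9), "Stop 1"),
    (datetime(2018, 3, 9, 17, 42, 31), "Stop 2"),
]

print(cards_matching(records, known))
```

In this toy data, two cards share the first event, but only one contains both. Once a single card matches, every other trip linked to it is exposed as well.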
“It’s not difficult to see how information like this could be used for nefarious purposes – for stalking by a jealous ex-partner, a rejected date or something equally serious,” Culnane writes.
Protecting Data: Differential Privacy
Culnane says that applying differential privacy techniques to the Myki data would have prevented re-identification risks. That involves using mathematical techniques to “perturb” the data so it is less linkable to individuals, he says.
For example, in early 2017, New South Wales’ transport agency applied differential privacy techniques when it released data related to the Opal card, which is the travel card in that state.
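One textbook way to perturb data is the Laplace mechanism, which adds random noise to aggregate counts before they are published. The sketch below is a generic illustration and is not necessarily what any transport agency used; the function names, the `epsilon` value and the assumption that each traveler changes a count by at most `sensitivity` are all choices made for this example.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with the same mean
    # follows a Laplace distribution centered on zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon."""
    if rng is None:
        rng = random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical example: publish how many cards touched on at one stop in one interval.
rng = random.Random(2018)
print(noisy_count(true_count=42, epsilon=0.5, rng=rng))
```

A smaller `epsilon` means more noise and stronger protection, but a less accurate published count.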
With the Opal card data set, it wasn’t possible to correlate where someone got on and then off public transport, Culnane says. Touch on or touch off events were binned in 15-minute intervals. Also, it wasn’t possible to determine different trips by the same person, according to a study co-authored by Culnane.
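Binning of this kind amounts to rounding each timestamp down to the start of its interval. The following is a minimal sketch, assuming touch events carry a full timestamp; the function name and the sample time are hypothetical, and the Opal release may have applied further steps not shown here.

```python
from datetime import datetime

def bin_timestamp(ts, minutes=15):
    """Round a timestamp down to the start of its interval.

    Assumes `minutes` divides 60 evenly (for example 5, 15 or 30).
    """
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)

# Hypothetical touch-on event recorded to the second.
touch_on = datetime(2018, 7, 2, 8, 37, 42)
print(bin_timestamp(touch_on))  # 2018-07-02 08:30:00
```

After binning, every touch event in the same quarter hour carries the same timestamp, so a single known trip time no longer singles out one record.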
While those measures meant much stronger privacy protections for travelers, “the problem is it does reduce the utility of the data,” Culnane says.
OVIC: Low Ongoing Risk
OVIC says that the risk now to Victorians who used public transport over the three-year period is low. That’s because online accounts for registered Myki cards only show the last six months of journey data, so those accounts would no longer overlap with the data that was released. Leveraging those logs was a particularly easy way to correlate trips in the data set.
As a result of the concerns, the Myki data was taken offline on Sept. 26, 2018, after the Datathon ended. The OVIC says those who were identified in the data have been contacted.
But even without the Myki online logs, Culnane says it may still be possible to identify people based on the uniqueness of someone’s travel pattern, such as unusual trips or perhaps the absence of trips during a certain period.
He says it’s possible the data set is still circulating because it was shared with 190 teams that participated in the Datathon. “I would imagine some people have it,” he says. “I would hope they don’t publish it.”