The confidential patient data of 2.9 million Australians can be re-identified, without decryption, a report from University of Melbourne researchers has found.
The dataset from Australia's Medicare Benefits Scheme and Pharmaceutical Benefits Scheme released in August 2016 can be traced back to the individual, academics from the university's School of Computing and Information Systems found.
The data was removed by the Health Department a month after its initial release, after the researchers warned the government that practitioner details could be decrypted.
The data includes the de-identified medical billing records of 10% of Australians from 1984 to 2014, and reveals what medication patients were on and whether they were seeing a psychologist.
The federal Department of Health said it was not aware of anyone being identified, but that the dataset was removed immediately last year when it was referred to the Australian Information and Privacy Commissioner.
Dr. Vanessa Teague, who co-authored the report, warned that other de-identified government data, from the Census or from Centrelink, could be vulnerable to similar types of exposure.
"We found that patients can be re-identified, without decryption, through a process of linking the unencrypted parts of the record with known information about the individual such as medical procedures and year of birth," Teague said in a statement.
"We need a much more controlled release in a secure research environment, as well as the ability to provide patients greater control and visibility over their data."
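The linking process Teague describes can be illustrated with a minimal sketch. This is not the researchers' method or code: the records, field names and the `candidates` helper below are all hypothetical, made-up values chosen only to show how each extra known fact narrows the pool of matching records, with the encrypted identifier never being decrypted.

```python
# Toy illustration of a linkage attack on a "de-identified" dataset.
# All records, field names and values are invented for this sketch;
# they do not reflect the actual MBS/PBS release format.

released = [
    {"record_id": "enc_01", "year_of_birth": 1975,
     "events": {("childbirth", 2004), ("knee surgery", 2011)}},
    {"record_id": "enc_02", "year_of_birth": 1975,
     "events": {("childbirth", 2004)}},
    {"record_id": "enc_03", "year_of_birth": 1982,
     "events": {("childbirth", 2004), ("knee surgery", 2011)}},
    {"record_id": "enc_04", "year_of_birth": 1975,
     "events": {("knee surgery", 2011)}},
]


def candidates(records, year_of_birth, known_events):
    """Return IDs of records consistent with what is already known.

    A record matches when its year of birth is equal and it contains
    every known (procedure, year) pair. The encrypted record_id is
    only reported, never decoded.
    """
    return [
        r["record_id"]
        for r in records
        if r["year_of_birth"] == year_of_birth
        and known_events <= r["events"]  # subset test
    ]


# Each additional known fact shrinks the candidate set.
by_birth_year = candidates(released, 1975, set())
with_one_event = candidates(released, 1975, {("childbirth", 2004)})
with_two_events = candidates(
    released, 1975, {("childbirth", 2004), ("knee surgery", 2011)}
)

print(len(by_birth_year), len(with_one_event), len(with_two_events))
```

In this toy data, year of birth alone leaves three candidate records, one known procedure leaves two, and two known procedures leave a single record. Once only one record matches, the rest of that person's billing history in the dataset is exposed.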
The Greens' digital rights spokesperson, senator Jordon Steele-John, called on the government to improve the security of publicly available data.
"Given 10% of Australians are included in this historical data, this public release can effectively be viewed as a data breach on the grandest scale," Steele-John said.
"Legislating against misuse of this kind of data will not stop it occurring, especially when it is this easy to re-identify individuals' records. What are the implications for other publicly released data sets that are supposedly 'de-identified' and secure?"