The challenge
Accessing user data safely
More and more, organisations are collecting data about their users and customers. This data is then fed into sophisticated analytics, including machine learning algorithms, to unlock insightful information leading to higher value services and products.
The question is how organisations can then provide safe access to this data internally, or even share the data externally for societal or commercial benefit. The same question applies to different organisations sharing data with each other, which they have a strong incentive to do, provided it can be done safely.
Most data custodians recognise the privacy and confidentiality risks in using and sharing their data both within and outside their organisations. However, there is no consistent and repeatable methodology or related tool for data custodians to confidently measure and understand the level of such risks in their data for the purpose of sharing or releasing it.
Our response
Re-identifier Risk Ready Reckoner (R4)
We have designed quantitative and qualitative methodologies for assessing privacy and confidentiality risk, with appropriate assessment metrics and frameworks, to understand the risks of sharing or releasing data, or even just providing access to a wider internal audience. These tools draw on information theory and stochastic models to estimate the residual risks associated with sharing sensitive data.
For example, one of our metrics measures the re-identification risk for an individual event or transaction, based on factors such as uniqueness, uniformity and/or linkability. Another of our metrics quantifies the risk of deducing a non-reported value in an aggregated data report.
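To make the idea of a uniqueness-based metric concrete, here is a minimal sketch in Python. It is not the R4 metric: it assumes a simple, commonly used proxy in which a record's risk is one divided by the number of records sharing the same combination of quasi-identifiers. The function name, the field names and the sample records are all hypothetical.

```python
from collections import Counter


def uniqueness_risk(records, quasi_identifiers):
    """Return a per-record risk score: 1 / size of the group of records
    that share the same quasi-identifier values (assumed proxy, not R4's metric)."""
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    group_sizes = Counter(keys)
    return [1.0 / group_sizes[key] for key in keys]


# Hypothetical sample data for illustration only.
records = [
    {"age": 34, "postcode": "2000", "sex": "F"},
    {"age": 34, "postcode": "2000", "sex": "F"},
    {"age": 51, "postcode": "2600", "sex": "M"},
    {"age": 29, "postcode": "3000", "sex": "F"},
]

risks = uniqueness_risk(records, ["age", "postcode", "sex"])

# Fraction of records that are unique on the chosen quasi-identifiers.
unique_fraction = sum(1 for risk in risks if risk == 1.0) / len(risks)
```

A score of 1.0 marks a record that is unique on the chosen fields and therefore the easiest to single out. Uniformity and linkability would need further inputs, such as the distribution of values or an external dataset to link against, and are not modelled here.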
We have also developed software, such as our Re-identifier Risk Ready Reckoner (R4), to implement these metrics and methodologies. R4 generates quantified risk assessments and displays them on a working dashboard. It also provides data treatment options, such as binning and perturbation, to help data custodians mitigate these risks, and then re-assesses the risk in the treated data.
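The assess, treat and re-assess cycle can be illustrated with a short Python sketch. It is a toy example under stated assumptions, not R4's implementation: risk is approximated as one divided by the size of the smallest group of identical rows, binning replaces an exact age with a ten-year band, and perturbation adds bounded uniform noise. All names and values are hypothetical.

```python
import random
from collections import Counter


def max_risk(rows):
    """Worst-case risk: 1 / size of the smallest group of identical rows
    (assumed proxy for re-identification risk)."""
    group_sizes = Counter(rows)
    return max(1.0 / group_sizes[row] for row in rows)


def bin_age(age, width=10):
    """Binning: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def perturb(value, scale, rng):
    """Perturbation: add uniform noise in [-scale, +scale]."""
    return value + rng.uniform(-scale, scale)


# Hypothetical (age, postcode) rows: every row is unique before treatment.
raw = [(34, "2000"), (36, "2000"), (51, "2600"), (58, "2600")]
treated = [(bin_age(age), postcode) for age, postcode in raw]

risk_before = max_risk(raw)      # assess
risk_after = max_risk(treated)   # re-assess after binning

# Perturbing a numeric attribute with a seeded generator for repeatability.
rng = random.Random(0)
incomes = [52000.0, 61000.0, 75000.0]
noisy_incomes = [perturb(income, 1000.0, rng) for income in incomes]
```

In this example, binning halves the worst-case risk because each row now shares its values with one other row. The cost is a loss of precision in the age field, which is the utility trade-off a data custodian has to weigh.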
The results
Improving awareness of privacy and confidentiality risk
Our work is improving awareness of privacy and confidentiality risk in data and helping in the management of that risk across the data ecosystem.
Our privacy and confidentiality risk frameworks and R4 software have been used extensively in several commercial engagements, identifying and measuring re-identification risks in so-called de-identified data pending release (or in some cases already released), as well as inference risks of not-reported data in confidential financial reports.
These engagements demonstrate the impact of our work. We have observed cases where data custodians adjusted their approach to making data available because they better appreciated the risk it carries. In other cases, guided by our framework, data custodians applied targeted transformations to the data to reduce the residual risks, while still maintaining an acceptable level of utility, before releasing it.
Find out more: Information Security and Privacy