
The challenge

Perceptual hashing

Harmful and illegal-to-possess data are continuously seized and generated. For example, investigations into online child abuse and exploitation routinely produce investigator-labelled corpora of more than 100,000 images or videos. These datasets are inherently offensive, psychologically harmful, and illegal to transmit or possess.

In a child exploitation investigation, law enforcement officers have only a couple of days to review a large volume of material before reporting to court in support of a conviction.

The original method required officers to view thousands of images, comparing photo files to identify similarities. In 2018, a method known as "perceptual hashing" was introduced. It uses algorithms to look for similarities between the content of images, producing a compact digital fingerprint for each file that can be used to identify various forms of material.
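To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the "average hash". It is an illustration only, not the algorithm used by investigators or by Data Airlock. The function names (`downscale`, `average_hash`, `hamming_distance`) and the 8x8 grid size are assumptions chosen for the example, and the image is assumed to be a grayscale grid whose sides are multiples of the grid size.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0-255 (assumed input format)


def downscale(img: Image, size: int = 8) -> List[List[float]]:
    """Shrink an image to size x size by averaging equal blocks.

    Assumes the image height and width are multiples of `size`.
    """
    block_h = len(img) // size
    block_w = len(img[0]) // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [
                img[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(img: Image, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if the cell is brighter than the mean."""
    cells = [p for row in downscale(img, size) for p in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for p in cells:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count differing bits; a small distance suggests visually similar images."""
    return bin(a ^ b).count("1")
```

Two images whose fingerprints differ in only a few bits are likely to show similar content, even if one has been resized, recompressed or slightly brightened, which is what lets a tool flag likely matches without a person comparing every file by eye.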

Our response

Using AI and ML to scan images

The Data Airlock platform uses Artificial Intelligence (AI) and Machine Learning (ML) to scan through and filter confronting images faster than the previous methods, whilst also keeping analytics secure and restricted.

Data Airlock focuses on three key principles: protecting people from data, protecting data from people, and analysing sensitive data in a safe and secure manner.

The results

Developing new algorithms to scan sensitive data

The design enables researchers to apply new algorithms to sensitive data without being exposed to the data, using a Model-to-Data (MTD) paradigm: information is kept in secure vaults, and only manually vetted algorithms are permitted to operate on the data, in isolated environments called airlocks.

Full analytical capability is achieved while keeping data custodians in absolute control. Researchers receive updates during executions and vetted outputs on completion for evaluation and action. Data Airlock's composition also allows trusted third parties to host the system securely.
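The Model-to-Data flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not Data Airlock's actual design or API: the `Vault` and `Airlock` classes, the `vet` and `run` methods, and the output policy in `vet_output` (only numeric summary values are released) are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Algorithm = Callable[[List[bytes]], dict]


@dataclass
class Vault:
    """Holds the sensitive records; researchers never get a reference to it."""
    records: List[bytes]


def vet_output(result: dict) -> dict:
    """Hypothetical output policy: release numeric summaries only, drop everything else."""
    return {
        key: value
        for key, value in result.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


@dataclass
class Airlock:
    """Isolated environment in which approved algorithms meet the data."""
    vault: Vault
    approved: Dict[str, Algorithm] = field(default_factory=dict)

    def vet(self, name: str, algorithm: Algorithm) -> None:
        """Stands in for the custodian's manual review and approval step."""
        self.approved[name] = algorithm

    def run(self, name: str) -> dict:
        """Run an approved algorithm on the vault data and return vetted output only."""
        if name not in self.approved:
            raise PermissionError(f"algorithm {name!r} has not been vetted")
        raw_result = self.approved[name](self.vault.records)
        return vet_output(raw_result)
```

In this sketch the researcher submits a function and receives only what `vet_output` lets through, so the algorithm travels to the data rather than the data travelling to the researcher, and the custodian decides both what runs and what leaves.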

Graphic of the Data Airlock Technology with a variety of buttons showing technology areas and the interconnection between public and secure virtualisation, and sensitive virtualisation.

Since its inception, this project has attracted attention from a range of government agencies, putting Data61 at the forefront of this area, with the potential for future revenue streams.

Data Airlock will help law enforcement agencies draw on talent from the public to make law enforcement more efficient and accurate, thereby helping to swiftly remove predators from society.

