Apple responds to highlighted 'flaws' in iPhone CSAM scanning tech

How to use widgets on your iPhone Home screen (Image credit: Joseph Keller / iMore)

What you need to know

  • Security researchers have found the source code for Apple's CSAM detection.
  • Initial reports suggest that there may be flaws in the technology.

Reports indicate that Apple's CSAM technology may be flawed, after the code for the system was allegedly found in iOS 14.

The Verge reports:

Researchers have found a flaw in iOS's built-in hash function, raising new concerns about the integrity of Apple's CSAM-scanning system. The flaw affects the hashing system, called NeuralHash, which allows Apple to check for exact matches of known child abuse imagery without possessing any of the images or gleaning any information about non-matching pictures.

A Reddit user posted reverse-engineered code allegedly for the new CSAM system, stating: "Believe it or not, this algorithm already exists as early as iOS 14.3, hidden under obfuscated class names. After some digging and reverse engineering on the hidden APIs, I managed to export its model (which is MobileNetV3) to ONNX and rebuild the whole NeuralHash algorithm in Python. You can now try NeuralHash even on Linux!"
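For readers curious what that looks like in practice, here is a rough sketch of how an ONNX-exported model like the one Ygvar describes could be run in Python to produce a short binary fingerprint. The file names, input size, normalization, and the seed-matrix step are illustrative assumptions, not Apple's published pipeline or Ygvar's actual code.

```python
# Hypothetical sketch: computing a NeuralHash-style fingerprint from a model
# exported to ONNX. The file names ("neuralhash.onnx", "seed.npy"), input size,
# normalization, and post-processing are assumptions for illustration only.
import numpy as np
import onnxruntime as ort
from PIL import Image

def load_image(path, size=360):
    # Resize and normalize to the range the exported model is assumed to expect.
    img = Image.open(path).convert("RGB").resize((size, size))
    arr = np.asarray(img).astype(np.float32) / 255.0 * 2.0 - 1.0  # scale to [-1, 1]
    return arr.transpose(2, 0, 1)[np.newaxis, :]                  # NCHW batch of one

def neural_hash(model_path, seed_path, image_path):
    session = ort.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    embedding = session.run(None, {input_name: load_image(image_path)})[0].flatten()
    # Project the embedding with a fixed seed matrix and keep only the sign bits,
    # yielding a short binary fingerprint.
    seed = np.load(seed_path)  # assumed shape: (hash_bits, embedding_dim)
    bits = (seed @ embedding) >= 0
    return "".join("1" if b else "0" for b in bits)

if __name__ == "__main__":
    print(neural_hash("neuralhash.onnx", "seed.npy", "photo.jpg"))
```

The key property of such a fingerprint is that visually similar photos should map to the same or nearby bit strings, which is what makes the robustness claims below testable in the first place.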

According to Asuhariet Ygvar, testing indicates the CSAM technology "can tolerate image resizing and compression, but not cropping or rotations". This is strange given the technical assessment provided by Apple, which states:

Apple has produced a technology that can compute fingerprints from pictures. These fingerprints are very small compared to pictures. When two fingerprints match, it is very likely that the pictures match. Simple operations like resizing, cropping, or compressing a picture will not change its fingerprint.
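The tension between those two claims is straightforward to probe. The sketch below uses a simple difference hash (dHash) as a stand-in perceptual hash (it is not NeuralHash) and counts how many bits flip when an image is resized, cropped, or rotated. Its numbers say nothing about Apple's system, but this is the kind of measurement behind tolerance claims like Ygvar's.

```python
# Illustrative robustness check using a simple difference hash (dHash) as a
# stand-in perceptual hash. This is NOT NeuralHash; it only shows how tolerance
# to resizing versus cropping or rotation can be measured. "photo.jpg" is a
# placeholder and should be larger than 100x100 pixels for the crop below.
import numpy as np
from PIL import Image

def dhash(img, hash_size=8):
    # Grayscale, shrink, then compare adjacent pixels to form a 64-bit string.
    small = img.convert("L").resize((hash_size + 1, hash_size))
    pixels = np.asarray(small, dtype=np.int16)
    return (pixels[:, 1:] > pixels[:, :-1]).flatten()

def hamming(a, b):
    # Number of differing bits between two hashes.
    return int(np.count_nonzero(a != b))

original = Image.open("photo.jpg")
variants = {
    "resized": original.resize((original.width // 2, original.height // 2)),
    "cropped": original.crop((50, 50, original.width - 50, original.height - 50)),
    "rotated": original.rotate(15, expand=True),
}

base = dhash(original)
for name, variant in variants.items():
    print(f"{name}: {hamming(base, dhash(variant))} of 64 bits changed")
```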

Another concern raised about the tech is collisions, where two different images generate the same hash. This could, in theory, be used to fool the system into detecting images that don't actually contain CSAM. However, as The Verge explains, this would require "extraordinary efforts to exploit" and wouldn't get past Apple's manual review process:

Generally, collision attacks allow researchers to find identical inputs that produce the same hash. In Apple's system, this would mean generating an image that sets off the CSAM alerts even though it is not a CSAM image since it produces the same hash as an image in the database. But actually generating that alert would require access to the NCMEC hash database, generating more than 30 colliding images, and then smuggling all of them onto the target's phone. Even then, it would only generate an alert to Apple and NCMEC, which would easily identify the images as false positives.
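To see why collisions exist at all, consider the toy example below. It uses a deliberately tiny 16-bit average hash and random noise "images", so by the birthday paradox two of them quickly end up sharing a hash. This says nothing about NeuralHash's actual hash length or collision rate; it only illustrates why collisions are an expected property of any short perceptual hash, which is also the point Apple makes in its response below.

```python
# Toy demonstration of hash collisions: with a deliberately tiny 16-bit average
# hash, random "images" collide quickly via the birthday paradox. Real systems
# use far longer hashes; this only illustrates the concept.
import numpy as np

rng = np.random.default_rng(0)

def tiny_ahash(img):
    # Average hash over a 4x4 grid of 8x8 blocks: 16 bits total.
    blocks = img.reshape(4, 8, 4, 8).mean(axis=(1, 3))
    bits = (blocks > blocks.mean()).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

seen = set()
for i in range(100_000):
    img = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
    h = tiny_ahash(img)
    if h in seen:
        print(f"Two different images share hash 0x{h:04x} after {i + 1} tries")
        break
    seen.add(h)
```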

Ygvar said they hoped that the source code would help researchers "understand NeuralHash algorithm better and know its potential issues before it's enabled on all iOS devices."

In response to these revelations, Apple told iMore that the representation of reverse-engineering in this instance isn't accurate, and that the company has designed its NeuralHash algorithm to be publicly available so that security researchers can investigate it. It also states that the version being analyzed in the story is a generic version of its NeuralHash technology, not the final version coming to detect CSAM in iCloud Photos. Apple says perceptual hashes by definition can be fooled into thinking that two different images are the same, and that CSAM scanning's security takes this into account.

Apple also states that collisions are expected and don't undermine the security of the system. For starters, the on-device CSAM hash database is encrypted, so it wouldn't be possible for an attacker like the one described above to generate collisions against known CSAM. Apple further notes that when the CSAM threshold is crossed, a second independent perceptual hash algorithm analyzes photos matched against known CSAM. This second algorithm runs server-side and wouldn't be available to attackers. From Apple:

"This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database."

This safeguard is key in ensuring that your account can't be flagged because of images that don't contain CSAM, but might trigger an alert because the hashes match.
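Put together, the safeguards Apple describes amount to a layered check: nothing happens below the match threshold, matched photos are then re-verified server-side with an independent hash, and only confirmed matches move forward. The sketch below shows that flow in simplified form. The function names, the SHA-256 stand-in for the second perceptual hash, and the 30-image threshold (taken from The Verge's description) are illustrative assumptions, and the real system involves additional cryptographic machinery that this omits.

```python
# Simplified sketch of the layered checks described above. The stand-in hash,
# names, and control flow are illustrative assumptions, not Apple's design.
import hashlib

THRESHOLD = 30  # illustrative figure, from The Verge's description of the system

def independent_hash(image_bytes: bytes) -> str:
    # Stand-in for the second, server-side perceptual hash (not perceptual here).
    return hashlib.sha256(b"independent:" + image_bytes).hexdigest()

def evaluate_account(matched_images: list, server_hash_db: set) -> str:
    # 1. Below the threshold nothing happens: no alert, no review.
    if len(matched_images) <= THRESHOLD:
        return "no action"
    # 2. Each on-device match is re-checked with the independent server-side
    #    hash, so an image crafted to collide with NeuralHash alone is dropped.
    confirmed = [img for img in matched_images
                 if independent_hash(img) in server_hash_db]
    if len(confirmed) <= THRESHOLD:
        return "no action"
    # 3. Only then is the case escalated for human review.
    return "escalate to human review"

# Example: 40 adversarial images that fooled NeuralHash but not the second hash.
fakes = [f"fake-{i}".encode() for i in range(40)]
print(evaluate_account(fakes, server_hash_db=set()))  # prints "no action"
```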

Finally, Apple again emphasized that its CSAM detection is subject to human review, so even if enough collisions triggered an alert, reviewers could identify them as false positives should your account be flagged because you had been sent images with hashes that matched the CSAM database but were not in fact CSAM material.

Stephen Warwick
News Editor

Stephen Warwick has written about Apple for five years at iMore and previously elsewhere. He covers all of iMore's latest breaking news regarding all of Apple's products and services, both hardware and software. Stephen has interviewed industry experts in a range of fields including finance, litigation, security, and more. He also specializes in curating and reviewing audio hardware and has experience beyond journalism in sound engineering, production, and design. Before becoming a writer, Stephen studied Ancient History at university and also worked at Apple for more than two years. Stephen is also a host on the iMore show, a weekly podcast recorded live that discusses the latest breaking Apple news and features fun trivia about all things Apple. Follow him on Twitter @stephenwarwick9